Local mode (no network)

ogong-provider local runs a standalone inference server on your own machine: no OGONG network, no tunnel, and no account. It serves an OpenAI-compatible API and an Ollama-compatible API on the same port, so existing tools can point at it unchanged. It is the lowest-friction way to get started, and a drop-in local runner in its own right.

Serve a model

# Serve one model from a .gguf path (a catalog name works too and is downloaded on demand into ~/.ogong-provider/models)
ogong-provider local --model ~/.ogong-provider/models/llama-3.2-3b-instruct.gguf

# Serve several at once, each routable by id
ogong-provider local \
  --model llama-3.2-3b-instruct \
  --model qwen2.5-7b-instruct

Flag	Default	Meaning
`--model <path-or-name>`	-	a `.gguf` path or a catalog name; repeatable
`--mmproj <gguf>`	-	multimodal projector for a single vision model
`--n-ctx <n>`	`8192`	context size
`--listen <addr>`	`127.0.0.1:11434`	bind address (Ollama’s port by default)
`--upstream <url>`	-	adapter mode: forward to an existing OpenAI server instead of spawning one (mutually exclusive with `--model`)
`--served-models <json>`	-	curated on-demand (LRU) set covering every modality; takes precedence over `--model`
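
Because `--model` is repeatable, a small wrapper can serve everything in a models directory. The following is a minimal sketch, not part of ogong-provider: the function name `serve_dir` and the `DRY_RUN` variable are hypothetical, and it assumes every `.gguf` file in the directory is a servable model (not, say, a projector file meant for `--mmproj`). It uses only the `local` subcommand and the `--model` flag from the table above.

```shell
# serve_dir DIR: run `ogong-provider local` with one --model per .gguf in DIR.
# Hypothetical helper. Set DRY_RUN=1 to print the command instead of running it.
serve_dir() {
  dir=$1
  set -- ogong-provider local          # reuse "$@" as the argument list
  for f in "$dir"/*.gguf; do
    [ -e "$f" ] || continue            # the glob matched nothing
    set -- "$@" --model "$f"
  done
  if [ "$#" -eq 2 ]; then              # still only "ogong-provider local"
    echo "serve_dir: no .gguf files in $dir" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 serve_dir ~/.ogong-provider/models` prints the command line that would be run, which is a cheap way to check it before starting the server.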

Because it binds Ollama’s default port (127.0.0.1:11434) and speaks Ollama’s API, anything configured for Ollama works against it with no changes.
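
If you move the server with `--listen`, or use an OpenAI client that defaults to the hosted API, most tools can be redirected through environment variables. The sketch below assumes the default address; `OGONG_ADDR` is a hypothetical variable name used only for this example. `OLLAMA_HOST` is read by the `ollama` CLI, and `OPENAI_BASE_URL` and `OPENAI_API_KEY` are read by the official OpenAI Python and Node SDKs. That the local server ignores the key's value is an assumption based on local mode needing no account.

```shell
# Address the server listens on; matches the --listen default.
OGONG_ADDR="127.0.0.1:11434"

# Ollama clients: the `ollama` CLI reads OLLAMA_HOST.
export OLLAMA_HOST="http://${OGONG_ADDR}"

# OpenAI SDKs: the official clients read OPENAI_BASE_URL.
export OPENAI_BASE_URL="http://${OGONG_ADDR}/v1"

# The SDKs require some key to be set. Assumption: local mode ignores its value.
export OPENAI_API_KEY="unused-local-key"
```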

Use it

# OpenAI-style
curl http://127.0.0.1:11434/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{"model":"llama-3.2-3b-instruct","messages":[{"role":"user","content":"hi"}]}'

# Ollama-style
curl http://127.0.0.1:11434/api/chat \
  -d '{"model":"llama-3.2-3b-instruct","messages":[{"role":"user","content":"hi"}]}'
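
For scripting, it can help to wrap the Ollama-style call in a function that builds the JSON body for you. This is a rough sketch under stated assumptions: the helper names (`json_escape`, `chat_body`, `ask`) and the `OGONG_ADDR` variable are hypothetical, and it assumes the server honors Ollama's `"stream": false` option to return a single JSON object. The escaping handles only backslashes and double quotes, so prompts containing newlines or control characters need a real JSON tool.

```shell
# Escape backslashes and double quotes for use inside a JSON string.
# Limitation: newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# chat_body MODEL PROMPT: print a single-message chat request body.
chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# ask MODEL PROMPT: send the request to the Ollama-style endpoint.
ask() {
  curl -sS "http://${OGONG_ADDR:-127.0.0.1:11434}/api/chat" \
    -H 'content-type: application/json' \
    -d "$(chat_body "$1" "$2")"
}
```

With a server running, `ask llama-3.2-3b-instruct 'Say "hi" in French'` sends the request and prints the raw JSON response.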

One-shot terminal chat

For a quick REPL without even setting up a server:

ogong-provider run llama-3.2-3b-instruct --system "You are concise."

This spawns the engine and streams replies to prompts read from stdin. Press Ctrl-D or Ctrl-C to quit. No account or config is required.

Managing models

ogong-provider pull --list        # browse the catalog
ogong-provider pull <name|url>    # download into ~/.ogong-provider/models

When you’re ready to contribute to the network, the same binary joins it; see Provider node.