Local mode (no network)
ogong-provider local runs a standalone inference server on your own machine with no OGONG
network, no tunnel, and no account. It serves an OpenAI-compatible API and an
Ollama-compatible API on the same port, so existing tools point at it unchanged. This is
the zero-friction on-ramp, and a drop-in local runner in its own right.
Serve a model
# Serve one model from a local .gguf path
# (a catalog name works too; it is downloaded on demand into ~/.ogong-provider/models)
ogong-provider local --model ~/.ogong-provider/models/llama-3.2-3b-instruct.gguf
# Serve several at once, each routable by id
ogong-provider local \
  --model llama-3.2-3b-instruct \
  --model qwen2.5-7b-instruct
| Flag | Default | Meaning |
|---|---|---|
| --model <path-or-name> | - | a .gguf path or a catalog name; repeatable |
| --mmproj <gguf> | - | multimodal projector for a single vision model |
| --n-ctx <n> | 8192 | context size |
| --listen <addr> | 127.0.0.1:11434 | bind address (Ollama’s port by default) |
| --upstream <url> | - | adapter mode: forward to an existing OpenAI server instead of spawning one (mutually exclusive with --model) |
| --served-models <json> | - | curated on-demand (LRU) set covering every modality; takes precedence over --model |
Because it binds Ollama’s default port (127.0.0.1:11434) and speaks Ollama’s API, anything
configured for Ollama works against it with no changes.
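One practical consequence of sharing Ollama’s port: if an Ollama daemon is already listening on 127.0.0.1:11434, the bind will presumably fail, and --listen would need a different address. A minimal Python sketch for checking the port first; the helper name port_in_use is hypothetical and not part of ogong-provider.

```python
import socket


def port_in_use(host: str = "127.0.0.1", port: int = 11434, timeout: float = 0.5) -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    try:
        # A successful connect means another process is listening there.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or unreachable: treat the port as free.
        return False
```

If port_in_use() returns True before you start the server, either stop the other process or pass another address via --listen.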
Use it
# OpenAI-style
curl http://127.0.0.1:11434/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{"model":"llama-3.2-3b-instruct","messages":[{"role":"user","content":"hi"}]}'
# Ollama-style
curl http://127.0.0.1:11434/api/chat \
  -d '{"model":"llama-3.2-3b-instruct","messages":[{"role":"user","content":"hi"}]}'
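The same two calls can be made from Python using only the standard library. This is a sketch under assumptions rather than an official client: the helper names are hypothetical, the response parsing assumes the usual OpenAI (choices[0].message.content) and Ollama (message.content) shapes, and "stream": false is assumed to be honored as in Ollama so the reply arrives as a single JSON object.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:11434"  # default --listen address


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for the given API path."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(req: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def chat_openai(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """One-turn chat via the OpenAI-style endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body = post_json(build_request("/v1/chat/completions", payload, base_url))
    # Assumed: standard OpenAI chat-completions response shape.
    return body["choices"][0]["message"]["content"]


def chat_ollama(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """One-turn chat via the Ollama-style endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # assumed to be honored, as in Ollama: one JSON object, not a stream
    }
    body = post_json(build_request("/api/chat", payload, base_url))
    return body["message"]["content"]
```

With the server running, chat_openai("llama-3.2-3b-instruct", "hi") should return the reply text. Pass base_url if you changed --listen.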
One-shot terminal chat
For a quick REPL without even setting up a server:
ogong-provider run llama-3.2-3b-instruct --system "You are concise."
This spawns the engine and streams replies to prompts read from stdin. Press Ctrl-D or Ctrl-C to quit. No account or config is needed.
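Because prompts are read from stdin, the command can in principle be scripted. A rough Python sketch, assuming each input line is treated as one prompt and that end of input ends the session the way Ctrl-D does (neither is confirmed above); build_run_argv and ask_all are hypothetical helper names.

```python
import subprocess
from typing import List, Optional


def build_run_argv(model: str, system: Optional[str] = None) -> List[str]:
    """Assemble the argv for `ogong-provider run`, using only the flags shown above."""
    argv = ["ogong-provider", "run", model]
    if system is not None:
        argv += ["--system", system]
    return argv


def ask_all(model: str, prompts: List[str], system: Optional[str] = None) -> str:
    """Feed prompts on stdin, one per line (assumed), and return everything printed."""
    result = subprocess.run(
        build_run_argv(model, system),
        input="\n".join(prompts) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with a non-zero status
    )
    return result.stdout
```

For example, ask_all("llama-3.2-3b-instruct", ["hi"], system="You are concise.") would run the same command as above with a single prompt piped in.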
Managing models
ogong-provider pull --list # browse the catalog
ogong-provider pull <name|url> # download into ~/.ogong-provider/models
When you’re ready to contribute to the network, the same binary joins it; see Provider node.