LiteLLM Gateway (removed)¶
This component is no longer part of the stack.
LiteLLM was planned as an optional unified API gateway that would have aggregated all Ollama inference servers behind a single OpenAI-compatible endpoint (load balancing, circuit breaker, fallback chains).
The service was never activated in production (LITELLM_URL remained commented out) and was therefore removed from docker-compose.yml.
The orchestrator communicates directly with the configured Ollama servers via the INFERENCE_SERVERS defined in .env.
Archived: April 2026