First-Time Setup¶

After installation, the Admin UI automatically redirects first-time visitors to the Setup Wizard. This wizard runs whenever no inference servers are configured.

Why Inference Servers Are Required¶

MoE Sovereign is a Multi-Model Orchestrator — it routes requests to specialized LLM models and merges their answers. Without at least one inference server (an LLM API endpoint), the system cannot route, plan, or respond to any queries.

The minimum viable configuration is:

1 inference server (Ollama, OpenAI, LiteLLM, vLLM, or any OpenAI-compatible API)
1 Judge model (the merger/synthesizer — can be the same model used for experts)

A separate Planner model is optional but recommended for large multi-GPU setups.

Supported Backend Types¶

Type	Example URL	Notes
Ollama (local)	`http://localhost:11434/v1`	Recommended for self-hosted GPU setups
OpenAI API	`https://api.openai.com/v1`	Use your OpenAI API key as the token
LiteLLM	`http://localhost:4000/v1`	Proxy for multiple providers
vLLM	`http://localhost:8000/v1`	High-performance inference server
Any OpenAI-compatible	`http://<host>:<port>/v1`	API must match the `/v1/chat/completions` schema

Wizard Walkthrough¶

Step 1 — Welcome¶

The first screen explains what you will need. No input required — click Continue.

flowchart TD
    W["<b>First-Time Setup — Step 1 of 4 (25%)</b><br/><br/><b>Welcome to MoE Sovereign</b><br/><br/>What you will need:<br/>• At least one inference server<br/>• A Judge model (the merger)<br/>• Optionally, a Planner model<br/><br/>Backend options: Ollama · OpenAI · vLLM / Custom<br/><br/>→ Continue"]

Step 2 — Inference Servers¶

Add one or more LLM inference servers. Each row represents one server node.

Field	Description	Example
Name	A short identifier (used in logs and model assignments)	`RTX`, `GPU1`, `OpenAI`
URL	Full base URL of the OpenAI-compatible API	`http://192.168.1.10:11434/v1`
GPUs	Number of GPUs on this node (informational)	`2`
API Type	`Ollama` or `OpenAI` — controls how the API is called
API Key / Token	Authentication token (use `ollama` for local Ollama)

Local Ollama

If running Ollama on the same host as MoE Sovereign, use the Docker network address: http://host.docker.internal:11434/v1 or the host IP http://192.168.x.x:11434/v1

Click Add server to add more rows. Click Continue when done.

Step 3 — Core Models¶

Select the Judge model and optionally a Planner model.

Judge model — the merger/synthesizer. This model receives all expert responses and produces the final answer. Choose your most capable model.

Planner model — the routing brain. This model decomposes incoming requests and decides which experts to call. If left blank, the Judge model is used for planning as well.

Setting	Recommendation
Single-GPU setup	Use the same model for Judge and Planner
Multi-GPU setup	Dedicate a fast small model (e.g. 8B) to planning, large model (e.g. 70B) to judging

Model names

For Ollama, model names follow the name:tag format (e.g. qwen2.5:72b, llama3.1:8b). For OpenAI, use the full model identifier (e.g. gpt-4o, gpt-4-turbo).

Step 4 — Public Access URLs¶

Optional. Skip this step if using MoE Sovereign locally.

Field	Description	Example
Base URL	API endpoint shown in response links	`https://moe-sovereign.org:8088`
Public Admin URL	URL of the Admin UI (if using Caddy)	`https://admin.moe-sovereign.org`
Public API URL	URL of the API endpoint (if using Caddy)	`https://api.moe-sovereign.org`

Click Launch Dashboard to finish.

After the Wizard¶

Once the wizard completes, the orchestrator restarts and applies the new configuration. The dashboard becomes available with all features enabled.

Recommended next steps:

Expert Templates → /templates — create model assignment templates for your users
User Management → /users — add users and assign templates
Claude Code Profiles → /profiles — configure agentic coding presets
Monitoring → / (dashboard) — verify all containers are green

Repeating the Wizard¶

The wizard triggers automatically whenever INFERENCE_SERVERS is empty. To re-run it:

Go to the Configuration tab in the dashboard
Clear the Inference Servers table and save
The next page load will redirect to /setup

Or set it manually via the .env file:

sudo docker compose exec moe-admin bash -c \
  'sed -i "s/^INFERENCE_SERVERS=.*/INFERENCE_SERVERS=[]/" /app/.env'
sudo docker compose restart moe-admin

Minimum Viable Config Example¶

For a single Ollama server with one large model:

Setting	Value
Server Name	`Local`
Server URL	`http://192.168.1.10:11434/v1`
API Type	Ollama
API Key	`ollama`
Judge Model	`qwen2.5:72b`
Judge Endpoint	`Local`
Planner Model	(leave blank)

This gives you a fully functional MoE Sovereign instance using a single model for all roles. Expert Templates can later assign specialized models per category.