MoE API — User Guide¶

Sovereign Multi-Model Orchestrator Internal AI platform · Access documentation for API users

Table of Contents¶

Access & First Login
User Portal — Overview
Managing API Keys
Using the API
Token Budget & Consumption
Change Profile & Password
Errors & FAQ

Receiving credentials¶

Your account is created by the administrator. You will receive:

Username (e.g. max.mustermann)
Initial password (set by the admin)
URL of the User Portal (e.g. http://moe.intern:8088/user/login)

Open the User Portal in your browser: http://<server>:8088/user/login
Enter username and password
You will be redirected to the Dashboard
Change your password immediately under Profile & Password

Note: Your account has no permissions by default. The administrator must explicitly grant access rights (models, modes, skills). Contact the admin if needed.

2. User Portal — Overview¶

The portal is available at: http://<server>:8088/user/

Section	URL	Description
Dashboard	`/user/dashboard`	Budget status, recent activity, API keys
Billing	`/user/billing`	Token consumption by model/mode
Usage History	`/user/usage`	All requests with token count
API Keys	`/user/keys`	Create & revoke keys
Profile	`/user/profile`	Display name, email, password

Dashboard at a glance¶

The dashboard shows:

Budget bars for daily, monthly, and total limits
Green (< 70%), Orange (70–90%), Red (> 90%)
14-day chart with daily token consumption
Active API keys with timestamp of last use

3. Managing API Keys¶

Why API keys?¶

The MoE API cannot be used directly via browser — you need an API key for each application (Claude Code, Open WebUI, custom scripts).

Create a new key¶

Navigate to API Keys (/user/keys)
Enter a label, e.g. Claude Code Laptop
Click Create key
The full key is displayed once — copy it immediately!

moe-sk-<48_hex_chars_replace_me_with_real_key>

Important: After closing the window, the key is never fully visible again. Only the prefix (e.g. moe-sk-a3f8...) remains for identification.

Revoke a key¶

If a key is compromised or no longer needed:

Go to API Keys
Click Revoke next to the relevant key
The key is immediately invalid (Valkey cache is invalidated)

Recommendations¶

Create one key per device / application
Name keys descriptively (Claude Code Server, Open WebUI, Python Script)
Rotate keys regularly (every 90 days recommended)

4. Using the API¶

Endpoint¶

http://<server>:8002

The platform provides two compatible API interfaces:

Interface	Endpoint	Usage
Anthropic Messages API	`/v1/messages`	Claude Code, Anthropic SDK
OpenAI Chat Completions API	`/v1/chat/completions`	Open WebUI, native LLMs, OpenAI SDK

Important: Claude Code communicates exclusively via the Anthropic Messages API (via ANTHROPIC_BASE_URL). The OpenAI-compatible API is intended for native LLM access (Open WebUI, custom scripts using the openai SDK).

Authentication¶

Pass the API key as an Authorization: Bearer header or as an x-api-key header:

# Anthropic Messages API — for Claude Code
curl http://<server>:8002/v1/messages \
  -H "Authorization: Bearer moe-sk-xxxxxxxx..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain Docker Compose."}]
  }'

# OpenAI Chat Completions API — for native LLMs / Open WebUI
curl http://<server>:8002/v1/chat/completions \
  -H "x-api-key: moe-sk-xxxxxxxx..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.3:70b@N04-RTX",
    "messages": [{"role": "user", "content": "Explain Docker Compose."}]
  }'

Configuration in Claude Code¶

Claude Code uses the Anthropic Messages API. Set the following environment variables:

export ANTHROPIC_BASE_URL=http://<server>:8002
export ANTHROPIC_API_KEY=moe-sk-xxxxxxxx...

Or persistently in ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://<server>:8002",
    "ANTHROPIC_API_KEY": "moe-sk-xxxxxxxx..."
  }
}

Claude Code forwards your requests to the MoE Orchestrator, which uses the configured Claude Code Profile (cc_profile) for tool execution and routing.

Configuration in Open WebUI (native LLMs)¶

Open WebUI uses the OpenAI-compatible API:

Go to Settings → Connections → OpenAI API
API Base URL: http://<server>:8002/v1
API Key: moe-sk-xxxxxxxx...

Python (Anthropic SDK)¶

import anthropic

client = anthropic.Anthropic(
    api_key="moe-sk-xxxxxxxx...",
    base_url="http://<server>:8002",
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)

Python (OpenAI SDK — for native LLMs)¶

from openai import OpenAI

client = OpenAI(
    api_key="moe-sk-xxxxxxxx...",
    base_url="http://<server>:8002/v1",
)

response = client.chat.completions.create(
    model="llama3.3:70b@N04-RTX",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Available Model IDs¶

Which models you can use depends on your permissions (see section 5). Ask your administrator which model IDs have been enabled for you.

Typical Claude model IDs (for Claude Code / Anthropic Messages API):

claude-sonnet-4-6 — Standard (MoE orchestration)
claude-opus-4-6 — Extended MoE orchestration
claude-haiku-4-5-20251001 — Fast & compact

Native LLM IDs follow the format model:tag@server, e.g. llama3.3:70b@N04-RTX.

5. Token Budget & Consumption¶

What is a token budget?¶

Your account has limits for:

Limit	Description	Reset
Daily	Max tokens per day	Midnight (UTC)
Monthly	Max tokens per month	First of month
Total	Lifetime limit (if configured)	No reset

1 token ≈ 0.75 words in English, approx. 0.5 words in German. A typical chat request consumes 500–3,000 tokens.

Budget exceeded?¶

When your budget is exhausted:

The API responds with HTTP 429 Too Many Requests
The budget bar appears in red in the portal
Contact the administrator for an increase

View consumption¶

Under Billing you see:

Consumption today / this month / total
Breakdown by model and mode
How many tokens remain

Under Usage History you find:

Every individual request with timestamp
Prompt tokens, completion tokens, total
Status (ok / budget_exceeded / error)

Permissions¶

By default all access is blocked. Unlocked resources are:

expert_template — Expert configuration package (defines which LLMs are used for which domains)
cc_profile — Claude Code integration profile (tool model, MoE mode, reasoning settings)
model_endpoint — Native LLMs on which inference server (OpenAI API access)
moe_mode — Processing mode (native, moe_orchestrated, moe_reasoning)
skill — Claude Code skills available to you
mcp_tool — MCP tools (precision calculator etc.)

6. Change Profile & Password¶

Navigate to Profile & Password (/user/profile)
Change display name and/or email
To change the password: enter new password in both fields (min. 8 characters)
Click Save

Note: The username cannot be changed yourself — contact the admin if needed.

7. Errors & FAQ¶

`401 Unauthorized`¶

Cause: API key invalid, revoked, or not present. Solution: Check in the portal whether your key is active. Create a new key if needed.

`429 Too Many Requests`¶

Cause: Daily or monthly token budget exhausted. Solution: Wait until reset (midnight / first of month) or contact the admin.

`403 Forbidden`¶

Cause: No permission for the requested model, mode, or skill. Solution: Ask the administrator to grant the corresponding permission.

Check capitalization in the username
Ensure your account is not blocked (the admin can check this)
Use the browser console (F12) for error details

Key forgotten / lost¶

There is no way to view an existing key again. Create a new key and revoke the old one.

Who is the administrator?¶

For account questions, budget increases, or permissions, contact the responsible person in your organization (IT department or the MoE platform operator).

MoE Sovereign Orchestrator — Internal — As of April 2026