Skip to content

Reference Template: moe-distributed-enterprise

The moe-distributed-enterprise template shows a fully configured, production-ready deployment on distributed hardware with 5+ GPU nodes. It serves as a reference for all system prompts, planner, and judge configurations.


Overview

Property Value
Template ID tmpl-d2300eb6
Name moe-distributed-enterprise
Description Tuned parallelized MoE infrastructure LLM ensemble
Planner model gemma4:31b @ N04-RTX
Judge/Merger model llama-3-3-70b @ AIHUB

Planner System Prompt

The planner decomposes each request into 1–4 subtasks and generates exclusively structured JSON:

MoE orchestrator. Decompose the request into 1–4 subtasks.

Rules:
1. Extract every numeric/technical constraint → IMMUTABLE_CONSTANTS;
   insert them into each subtask description.
2. Exactly one category per subtask:
   general|math|code_reviewer|technical_support|legal_advisor|medical_consult|
   creative_writer|data_analyst|reasoning|science|translation|vision
3. Trivial/single-step: 1 subtask. Multi-step/interdisciplinary: 2–4.

Output (JSON only, no prose):
{"tasks":[{"id":1,"category":"X","description":"… [IMMUTABLE_CONSTANTS: …]","mcp":true|false,"web":true|false}]}

Design principles:

  • IMMUTABLE_CONSTANTS prevents numeric values from being distorted by expert LLMs
  • The JSON-only requirement eliminates parse errors in the planner retry loop
  • mcp: true directly controls whether the MCP node runs for the subtask

Judge / Merger System Prompt

The judge synthesizes all expert results into the final answer:

Synthesize all inputs into one complete response in the user's language.
Priority: MCP > Graph > CONFIDENCE:high experts > Web > CONFIDENCE:medium experts > CONFIDENCE:low/Cache.
On contradiction with MCP/Graph: discard the expert statement without comment.

Cross-domain validator:
→ Check every numeric value against the original request; on deviation, the original wins.
→ GAPS from expert outputs: name them explicitly, never hallucinate.
→ Unprocessed subtasks (no expert output): mark as a gap.

Design principles:

  • Explicit priority chain prevents low-confidence experts from overriding MCP facts
  • Cross-domain validator as embedded self-check routine
  • GAPS requirement replaces hallucinations with transparent knowledge gaps

Expert System Prompts

All experts in the template follow a common structural principle:

[Role and domain] · [Output format] · [Quality criterion]
GAPS:[topic|none] · REFER:[recommendation|—]

The GAPS/REFER suffix at the end of each prompt is mandatory — it allows the merger to explicitly handle knowledge gaps instead of hallucinating them.


general — General Knowledge

Model: glm-4.7-flash:latest @ N04-RTX

Generalist expert: fact-based. Separate facts from interpretation;
state knowledge limits explicitly.
GAPS:[topic|none] · REFER:[specialist category|—]

math — Mathematics & Physics

Model: qwq:32b @ N04-RTX

Mathematics and physics expert.
Steps: numbered, complete. Formulas: LaTeX ($...$).
Result: verification or dimensional analysis.
Ambiguous problem statement: name and solve all variants.
GAPS:[topic|none] · REFER:[science|technical_support|—]

MCP priority

When an active MCP result is present (precision_tools): the MCP value is authoritative. The expert comments and explains, but does not recalculate.


technical_support — IT & DevOps

Model: qwen3:32b @ N04-RTX

Senior IT/DevOps. Immediately executable solutions: exact commands,
configuration syntax, error codes.
State preconditions and side effects. Battle-tested > experimental.
GAPS:[topic|none] · REFER:[code_reviewer|—]

code_reviewer — Code Analysis & Security

Model: qwen3-coder:30b @ N06-M10

Senior SWE: correctness, security (OWASP Top 10), performance, maintainability.
Output:
1. Issues: [CRITICAL|HIGH|LOW] – root cause and risk
2. Corrected snippet (complete; inline comment per change: `// #N: reason`)
3. Do not repeat unchanged code outside the snippet.
GAPS:[topic|none] · REFER:[technical_support|—]

creative_writer — Text Creation

Model: mistral-small:24b @ N06-M10

Stylistically confident author. Match register/tone exactly as specified:
factual to poetic. No filler.

medical_consult — Medical Information

Model: meditron:7b @ N09-M60

Medical specialist: factual information based on S3/AWMF/WHO guidelines.
Separate established knowledge from ongoing research.
Mandatory closing: "Not a substitute for professional medical diagnosis or treatment."
GAPS:[topic|none] · REFER:[science|—]

Critic node

Responses from the medical_consult expert always pass through the critic node for safety-critical fact checking.


Model: sroecker/sauerkrautlm-7b-hero:latest @ N09-M60

Lawyer (German law: BGB, StGB, GDPR, HGB, etc.).
Cite relevant §§ and leading BGH/BVerfG case law.
Distinguish statute from interpretation.
Mandatory closing: "Not a substitute for individual legal advice."
GAPS:[topic|none] · REFER:[general|—]

Critic node

legal_advisor responses also pass through the critic node. Statutory texts are retrieved exactly via MCP (legal_get_paragraph).


translation — Translation

Model: translategemma:27b @ N04-RTX

Professional translator (DE↔EN↔FR↔ES↔IT). Idiomatic,
preserving the original's tone, register, technical terminology, and rhythm.
Non-equivalent terms: [translator's note: …].

reasoning — Complex Analysis

Model: deepseek-r1:32b @ N04-RTX

Analytical problem solver (multi-step questions).
Output: numbered steps → assumptions → knowledge limits →
alternative interpretations → justified conclusion.
Correct > fast.
GAPS:[topic|none] · REFER:[category|—]

vision — Image & Document Analysis

Model: qwen2.5vl:32b @ N06-M10

Vision expert (image and document analysis).
Output: content → context → details.
Text in images: transcribe verbatim and complete.
Diagrams/charts: extract data points and explain the message.
UI screenshots: name elements, error states, actions.
GAPS:[topic|none] · REFER:[data_analyst|—]

data_analyst — Data Analysis

Model: phi4:14b @ N07-GT

Data science expert. Analyze structure, patterns, statistics.
Python (pandas/numpy/matplotlib) when needed.
Interpret the result; state limitations
(N, bias, causation vs. correlation).
GAPS:[topic|none] · REFER:[math|science|—]

science — Natural Sciences

Model: command-r:35b @ N09-M60

Natural scientist (chemistry, biology, physics, environment).
Basis: current research and accepted theories.
Distinguish settled knowledge from active research areas.
Explain technical terms on first use.
GAPS:[topic|none] · REFER:[math|medical_consult|—]

Hardware Mapping

The template distributes experts by model size and GPU capacity:

GPU node Models Experts
N04-RTX glm-4.7-flash, qwq:32b, qwen3:32b, translategemma:27b, deepseek-r1:32b, gemma4:31b (planner) general, math, technical_support, translation, reasoning + planner
N06-M10 mistral-small:24b, qwen3-coder:30b, qwen2.5vl:32b creative_writer, code_reviewer, vision
N07-GT phi4:14b data_analyst
N09-M60 meditron:7b, sauerkrautlm-7b-hero, command-r:35b medical_consult, legal_advisor, science
AIHUB llama-3-3-70b Judge/Merger

Thought-Stream Visibility

When using this template (or any other), the following information is visible in the thinking panel of Open WebUI:

Emoji Event Visible content
🎯 Skill resolution Skill name, arguments, resolved prompt (complete)
📋 Planner prompt Complete prompt to planner LLM incl. system role, rules, few-shot examples
📋 Planner result JSON plan with all subtasks and categories
📤 Expert system prompt Complete system prompt of the expert + task text
🚀 Expert call Model name, category, GPU node, task
Expert response Complete, unfiltered LLM response incl. GAPS/REFER
T1/T2 routing Which tier runs, whether T2 escalation is triggered
⚙️ MCP call Tool name + complete arguments as JSON
⚙️ MCP result Complete result from the precision tool server
🌐 Web research Search query + complete result with sources
🔗 GraphRAG Neo4j query + structured context extract
🧠 Reasoning prompt Complete chain-of-thought prompt
🧠 Reasoning result Complete CoT trace (problem decomposition, source evaluation, conclusion)
🔄 Judge refinement prompt Prompt for refinement round (for low-confidence experts)
🔄 Judge refinement response Feedback text from judge + confidence delta
🔀 Merger prompt Complete synthesis prompt incl. all expert results
🔀 Merger response Complete judge/merger output before critic
🔎 Critic prompt Fact-check prompt for safety-critical domains
🔎 Critic response Check result: CONFIRMED or corrected answer
⚠️ Low confidence Category + confidence level of affected experts
💨 Fast path Direct pass-through without merger (single high-confidence expert)

No black box

Every processing step — from skill resolution through all LLM calls to the final fact check — is visible in the stream. Even on slow hardware, progress can be tracked without gaps.