Intelligence & Learning
This section documents the two mechanisms by which MoE Sovereign goes beyond simple retrieval and generation: Causal Reasoning and Context Extension.
Both address fundamental limitations of small local LLMs — not by replacing them, but by restructuring how information flows around them.
Contents
| Document | What it covers |
|---|---|
| Causal Learning Loop | How the system learns world-rules and procedural dependencies from its own responses, stores them in Neo4j, and surfaces them back into future queries |
| Context Extension | How the MoE architecture effectively extends the usable context window for coding agents and large-document workflows, despite the small native context of local LLMs |
| Graph-Based Knowledge Accumulation | How the system actively maintains its own knowledge base: synthesis persistence (novel insights → :Synthesis Neo4j nodes) and graph linting (orphan cleanup + contradiction resolution via background LLM) |
| Memory Palace | Domain-scoped retrieval via metadata filters, isolated expert memory with expert_domain tagging in ChromaDB and Neo4j, and Claude Code auto-save hooks that persist session knowledge before context loss |
| CLI Agent Integration | Architectural analysis of how execution-loop agents (Aider, Open Interpreter) and infra-orchestrators (Hermes, Continue.dev) leverage all four MoE core components simultaneously — with delta table, Mermaid data-flow diagrams, and measured thresholds from the implementation |
| 7B Ensemble Capability | Measured benchmark results showing that 8 domain-specialist 7–9B models on legacy Tesla M10 hardware achieve GPT-4o mini class performance (6.11/10 on MoE-Eval) with full data sovereignty — overnight stability run, per-category analysis, and comparison to public cloud models |
| Agentic Re-Planning Loop | How MoE Sovereign autonomously detects knowledge gaps after synthesis and re-plans with targeted tool calls — enabling multi-step GAIA L3-class reasoning without manual prompt chaining |
Design Philosophy
Local LLMs typically have two hard constraints:
- Small context windows (4k–32k tokens) make it impossible to fit entire codebases, long documents, or rich conversation histories into a single prompt.
- No persistent memory across requests — every call starts cold, with no knowledge of previous answers.
MoE Sovereign addresses both constraints architecturally:
- Context is distributed: multiple experts each receive a focused subset of relevant information, rather than one model receiving everything at once.
- Memory is structured: factual and procedural knowledge extracted from responses is stored in Neo4j and retrieved in future requests as structured `[Knowledge Graph]` and `[Procedural Requirements]` blocks.
- Insights compound: novel multi-source syntheses produced by the merger are captured as `:Synthesis` nodes, enriching future graph traversals. A background linting process resolves contradictions and removes orphaned nodes.
- Memory is domain-aware: each piece of stored knowledge is tagged with its source expert (`expert_domain`). The planner can attach metadata filters to its plan so that downstream retrieval is scoped to the relevant namespace; for example, a `code_reviewer` query does not surface medical facts. Session-level knowledge from external tools (Claude Code) flows into the same pipeline via the `/v1/memory/ingest` endpoint and hook scripts.
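The domain-aware retrieval described in the list above can be illustrated with a minimal in-memory sketch. This is not the MoE Sovereign implementation: the real system stores knowledge in ChromaDB and Neo4j, whereas `MemoryItem`, `retrieve`, `render_block`, the substring matching, and the block layout here are all hypothetical stand-ins. Only the `expert_domain` tag, the `code_reviewer` namespace, and the `[Knowledge Graph]` block name come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One stored fact (hypothetical structure for illustration)."""
    text: str
    expert_domain: str  # tag naming the expert that produced this fact


def retrieve(store, query_terms, metadata_filter=None):
    """Return items containing any query term, optionally scoped by metadata."""
    metadata_filter = metadata_filter or {}
    hits = []
    for item in store:
        # Drop items whose tags do not match every key in the filter.
        if any(getattr(item, key, None) != value
               for key, value in metadata_filter.items()):
            continue
        lowered = item.text.lower()
        if any(term.lower() in lowered for term in query_terms):
            hits.append(item)
    return hits


def render_block(items):
    """Format retrieved items as a prompt block (the layout is an assumption)."""
    return "\n".join(["[Knowledge Graph]"] + [f"- {item.text}" for item in items])


store = [
    MemoryItem("parse_config must validate input before use", "code_reviewer"),
    MemoryItem("Dosage input must be checked against patient weight", "medical"),
]

# The same query, with and without a planner-supplied metadata filter.
scoped = retrieve(store, ["input"], {"expert_domain": "code_reviewer"})
unscoped = retrieve(store, ["input"])
print(render_block(scoped))
```

Without the filter, both facts match the query term; with it, only the `code_reviewer` fact reaches the prompt block, which is the scoping behaviour the list describes.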
The result is a system where the collective context is larger than any individual LLM can hold — knowledge compounds over time, is organised by domain, and persists across tool sessions rather than being lost after each response.
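To make the "collective context" idea concrete, the following sketch distributes tagged chunks across experts so that each expert stays within its own budget while the total exceeds it. It is an illustration under stated assumptions, not the actual routing logic: `approx_tokens` is a crude word-count stand-in for a real tokenizer, and `distribute_context`, the `docs_writer` domain, and the budget value are hypothetical.

```python
def approx_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def distribute_context(chunks, budget_tokens):
    """Group (domain, text) chunks per expert, capping each expert at budget_tokens."""
    per_expert = {}
    used = {}
    for domain, text in chunks:
        cost = approx_tokens(text)
        if used.get(domain, 0) + cost > budget_tokens:
            continue  # skip chunks that would overflow this expert's window
        per_expert.setdefault(domain, []).append(text)
        used[domain] = used.get(domain, 0) + cost
    return per_expert, used


chunks = [
    ("code_reviewer", "def load(path): return open(path).read()"),
    ("code_reviewer", "Callers must close file handles after reading"),
    ("docs_writer", "The load helper reads a whole file into memory"),
]

budget = 12
per_expert, used = distribute_context(chunks, budget_tokens=budget)

# Each expert stays within its own budget, yet the combined context is larger.
total = sum(used.values())
print(used, total)
```

No single expert sees more than `budget` tokens, but the sum across experts is larger than any one window, which is the sense in which the collective context exceeds what an individual model can hold.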