Deployment Overview¶
MoE Sovereign ships as a single OCI image, wrapped in multiple deployment formats, and parameterised by three profiles — so the same artefact runs everywhere from a Raspberry Pi inside a Proxmox LXC to a multi-AZ OpenShift cluster, without code forks or feature loss.
The universal deployment principle¶
```mermaid
flowchart TB
    subgraph ART["One Artefact"]
        IMG[["moe-sovereign/orchestrator:x.y.z<br/>multi-stage, non-root UID 1001<br/>OCI-compliant, read-only rootfs"]]
    end
    subgraph PROF["Three Profiles<br/>(env: MOE_PROFILE)"]
        SOLO["solo<br/>• 1 replica<br/>• sub-1 GiB RAM<br/>• embedded data tier"]
        TEAM["team<br/>• 2+ replicas<br/>• bundled Bitnami subcharts<br/>• homelab / SMB"]
        ENT["enterprise<br/>• HA, HPA, PDB<br/>• external data clusters<br/>• multi-tenant JWT"]
    end
    subgraph WRAP["Four Wrappers"]
        W1["LXC<br/>deploy/lxc/setup.sh"]
        W2["Compose<br/>docker-compose.yaml<br/>+ .solo / .enterprise overrides"]
        W3["Podman<br/>Quadlet + kube play"]
        W4["Helm<br/>charts/moe-sovereign"]
    end
    IMG --> SOLO & TEAM & ENT
    SOLO --> W1 & W2
    TEAM --> W2 & W3 & W4
    ENT --> W3 & W4
    W4 --> K3S["k3s / k8s"]
    W4 --> OCP["OpenShift<br/>(Route instead of Ingress)"]
    classDef art fill:#eef2ff,stroke:#6366f1,font-weight:bold;
    classDef prof fill:#f0fdf4,stroke:#16a34a;
    classDef wrap fill:#fef3c7,stroke:#d97706;
    classDef tgt fill:#fce7f3,stroke:#db2777;
    class IMG art;
    class SOLO,TEAM,ENT prof;
    class W1,W2,W3,W4 wrap;
    class K3S,OCP tgt;
```
Nothing in the code path changes between profiles. The image is identical byte-for-byte; only the environment and the surrounding wrapper differ. This is what guarantees "no performance or functional loss across all layers".
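Because configuration is entirely env-var driven, a wrapper script can validate the requested profile before starting the container. A minimal sketch; the guard itself is illustrative and not part of the shipped image, and `team` is just an example value:

```shell
# Illustrative pre-flight guard: a wrapper script could validate
# MOE_PROFILE before handing it to the container runtime.
valid_profile() {
  case "$1" in
    solo|team|enterprise) return 0 ;;   # the three profiles the image accepts
    *) return 1 ;;
  esac
}

MOE_PROFILE="${MOE_PROFILE:-team}"      # example value; any of the three is legal
if valid_profile "$MOE_PROFILE"; then
  echo "launching with profile: $MOE_PROFILE"
else
  echo "unknown MOE_PROFILE: $MOE_PROFILE" >&2
  exit 1
fi
```

Failing fast here keeps a typo like `MOE_PROFILE=prod` from silently falling back to some default inside the container.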
Choosing your tier¶
```mermaid
flowchart LR
    Q{{What do you have?}}
    Q --> A[Single Proxmox LXC<br/>or Debian VM]:::low
    Q --> B[One Docker host<br/>or homelab server]:::mid
    Q --> C[k3s / Podman<br/>on a small cluster]:::mid
    Q --> D[Kubernetes cluster<br/>with separate data tier]:::hi
    Q --> E[OpenShift<br/>with SCCs + Routes]:::hi
    A --> A1["Use <b>lxc</b> wrapper<br/>Profile: <code>solo</code>"]
    B --> B1["Use <b>compose</b> wrapper<br/>Profile: <code>team</code>"]
    C --> C1["Use <b>podman kube play</b><br/>Profile: <code>team</code>"]
    D --> D1["Use <b>Helm chart</b><br/>Profile: <code>enterprise</code>"]
    E --> E1["Use <b>Helm chart</b> with<br/><code>openshift.enabled=true</code>"]
    classDef low fill:#ecfdf5,stroke:#059669;
    classDef mid fill:#fef9c3,stroke:#ca8a04;
    classDef hi fill:#fef2f2,stroke:#dc2626;
```
| Tier | Wrapper | Profile | Typical target | RAM footprint |
|---|---|---|---|---|
| Hobbyist / Edge | `deploy/lxc/setup.sh` | `solo` | Proxmox LXC, Raspberry Pi 5, Debian VM | ~1.5 GiB |
| Homelab / SMB | `docker-compose.yaml` | `team` | 1 Docker host | ~6 GiB |
| Rootless clusters | `podman kube play` | `team` | Podman 4.4+ on a few hosts | ~6 GiB |
| Enterprise k8s | `charts/moe-sovereign` | `enterprise` | k3s / k8s / OpenShift | variable |
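As a concrete example of the Helm rows above, installing the team tier could look like the following; the release name `moe` and the `moe` namespace are arbitrary choices, not mandated by the chart:

```shell
# Team tier on k3s: chart path and values file as laid out in the repo.
helm install moe charts/moe-sovereign \
  --namespace moe --create-namespace \
  -f charts/moe-sovereign/values-team.yaml

# OpenShift: same chart, rendering Routes instead of Ingress.
helm install moe charts/moe-sovereign \
  --namespace moe --create-namespace \
  --set openshift.enabled=true
```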
Directory layout¶
The deployment assets live at the repository root:
```text
moe-infra/
├── Dockerfile                # multi-stage, non-root, OCI
├── docker-compose.yaml       # existing team-profile stack
├── charts/
│   └── moe-sovereign/        # Helm chart
│       ├── Chart.yaml        # Bitnami conditional subcharts
│       ├── values.yaml       # profile: enterprise (default)
│       ├── values-solo.yaml
│       ├── values-team.yaml
│       └── templates/        # 14 Helm templates
└── deploy/
    ├── lxc/setup.sh          # Proxmox/Debian bootstrap
    ├── podman/
    │   ├── systemd/moe-orchestrator.container  # Quadlet unit
    │   └── kube.yaml         # podman kube play manifest
    └── alloy/
        ├── alloy.river       # universal Grafana Alloy config
        └── alloy.systemd.service  # LXC service unit
```
What every wrapper delivers¶
Regardless of which wrapper you use, all four of these guarantees hold:
- Non-root execution — UID 1001, capabilities dropped to `ALL`, `no_new_privs` set.
- Read-only root filesystem — writable paths are `emptyDir` (k8s), `tmpfs` (Podman), or bind-mounts (LXC). Runtime code cannot mutate the image.
- W3C `traceparent` propagation — a request that enters the orchestrator on LXC and fans out to a Kafka cluster on k8s retains the same trace ID, so logs correlate across tiers in a single Loki query.
- Env-var driven configuration — `MOE_PROFILE`, `MOE_LOGS_DIR`, `MOE_CACHE_DIR`, `MOE_EXPERTS_DIR`, `KAFKA_URL`, `REDIS_URL`, `POSTGRES_CHECKPOINT_URL`, `NEO4J_URI`, `CHROMA_HOST`, `JWT_ISSUER`, `JWT_AUDIENCE`. No hardcoded hostnames, no baked-in paths.
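These guarantees map directly onto container-runtime flags. A sketch for rootless Podman, where the `KAFKA_URL` hostname is a hypothetical placeholder and `x.y.z` stands for the tag you pulled:

```shell
# Each flag mirrors a guarantee from the list above:
#   --user 1001                          -> non-root execution
#   --cap-drop ALL / no-new-privileges   -> capabilities dropped, no_new_privs
#   --read-only --tmpfs /tmp             -> read-only rootfs, writable tmpfs
#   -e ...                               -> env-var driven configuration
podman run --rm \
  --user 1001 \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --read-only --tmpfs /tmp \
  -e MOE_PROFILE=team \
  -e KAFKA_URL=kafka.internal:9092 \
  moe-sovereign/orchestrator:x.y.z
```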
Maturity & Test Status¶
**Not all deployment targets have been tested equally.** Docker Compose is the primary, production-tested deployment method; the other wrappers are prepared but have varying levels of real-world validation.
| Wrapper | Status | Test Environment | Notes |
|---|---|---|---|
| Docker Compose | Tested | Production (5-node GPU cluster) | Primary deployment method. All features validated. |
| LXC / Proxmox | Tested | Proxmox CT with `nesting=1`, `fuse=1` | Docker-in-LXC works with correct cgroup2 config. GPU passthrough requires additional setup. |
| Podman (rootless) | Planned | macOS (Podman Desktop) | Prepared but not yet validated. UID mapping and GPU access are known challenges. |
| K3s | Planned | 3-node cluster (netcup VPS) | Helm chart prepared. Requires shared storage (Longhorn recommended). Internet-connected nodes only. |
| Kubernetes (managed) | Untested | No cluster available | Helm chart provided; community validation welcome. |
| OpenShift | Untested | No cluster available | SecurityContextConstraints and Route configuration documented but not validated. Contributions welcome. |
LXC Configuration Reference¶
For Docker-in-LXC on Proxmox, the container requires these settings in `/etc/pve/lxc/<CTID>.conf`:

```text
features: fuse=1,mount=nfs;cifs,nesting=1
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
```
For GPU passthrough (NVIDIA), add:

```text
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
```
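After restarting the CT, the passthrough can be sanity-checked from inside the container; this assumes the NVIDIA driver userland (which provides `nvidia-smi`) is installed in the CT:

```shell
# The device nodes bound in the config above should be visible in the CT:
ls -l /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm

# With the driver userland installed, this should list the passed-through GPU:
nvidia-smi
```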
K3s Storage Considerations¶
For multi-node K3s deployments, stateful services (Neo4j, PostgreSQL, Valkey, ChromaDB) require persistent volumes. Options:
- Longhorn (recommended): K3s-native distributed block storage. Works across internet-connected nodes but adds latency.
- NFS: Simple but not recommended over WAN for database workloads.
- Local path provisioner: Pin stateful pods to a single node. Simplest but no redundancy.
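For the recommended Longhorn option, a typical installation uses the project's official Helm repository (`longhorn-system` is Longhorn's documented default namespace; check longhorn.io for the current chart version):

```shell
# Install Longhorn, the distributed block storage for the stateful services.
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system --create-namespace
```

Once Longhorn's `longhorn` StorageClass exists, point the stateful subcharts' persistence settings at it; the exact values keys depend on the bundled Bitnami subcharts.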