This page is no longer a graveyard of half-remembered ambition. The audit phase is complete enough that the roadmap can be narrower and more honest.
Sanctum’s core feature set is mechanically proven. The remaining work is not “make the Living Force real.” The remaining work is making Sanctum reproducible, legible, and releaseable without depending on one Mac Mini and one operator’s memory.
There is one deliberate gate before that work resumes: a one-week stabilization window. See Stability Window. If Sanctum cannot remain boring for seven days, it has not earned fresh ambition.
Sanctum is now mechanically stable as a single-operator system, but it is not yet productized in the stronger sense of being portable, repeatable, and supportable without inheriting Bert’s machine shape.
Stability: 8/10
The runtime, audit wall, calibration checks, and live-system probes now behave like real infrastructure instead of aspirational architecture.
Recoverability: 7.5/10
Self-heal paths are real and tested, but some of that resilience still depends on bespoke restart scripts and machine-local supervision conventions.
Reproducibility: 3.5/10
Too much of the system still assumes /Users/neo, local keychain state, ~/Projects/* repos, launchd behavior, and adjacent private runtime surfaces.
Productization readiness: 4.5/10
Sanctum is ready for continued hardening on the current machine and maybe a tightly controlled second-machine rollout, but not yet for external beta users or a supportable install story.
The practical conclusion is simple: the feature surface is no longer the bottleneck. The bottleneck is turning a highly capable personal runtime into a portable operator product.
Freeze the architecture and let it behave. Phase 2 work stays blocked until Sanctum survives a seven-day soak with the audit wall, runtime checks, and docs build still clean.
Turn sanctumctl from a strong operator surface into a genuinely reproducible install path. A fresh compatible machine should be able to bootstrap the checked-in slice, render manifests, sync calibration artifacts, and run the audit wall without hidden shell lore.
Separate machine-specific state from the core product surface. Paths, host capabilities, LaunchAgent assumptions, and runtime conventions need a clearer boundary so Sanctum can target more than one personal setup without pretending every machine is identical.
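A minimal sketch of what that boundary could look like: core code reads one per-host profile instead of assuming the machine. Every name and field below is hypothetical, not the current config surface.

```python
# Hypothetical sketch: isolate machine-specific state behind one profile object
# so core code never hard-codes /Users/neo, repo locations, or launchd conventions.
from dataclasses import dataclass
from pathlib import Path
import json

@dataclass
class HostProfile:
    home: Path          # operator home directory on this machine
    repos_root: Path    # where the ~/Projects/* style checkouts live
    supervisor: str     # "launchd", "systemd", ...
    secrets: str        # secret backend name, e.g. a keychain or a file store

    @classmethod
    def load(cls, path: Path) -> "HostProfile":
        raw = json.loads(path.read_text())
        return cls(
            home=Path(raw["home"]),
            repos_root=Path(raw["repos_root"]),
            supervisor=raw.get("supervisor", "launchd"),
            secrets=raw.get("secrets", "macos-keychain"),
        )

# Core code asks the profile, never the machine:
# profile = HostProfile.load(Path("~/.sanctum/host.json").expanduser())
# calibration_dir = profile.repos_root / "sanctum" / "calibration"
```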
Make every release rerunnable instead of ceremonial. The current test wall is strong; the next step is one release-grade entrypoint that runs the audits, runtime checks, and docs build, then generates a coherent release note from the verified state. One possible shape is sketched after the deliverables below.
Deliverables: release script or workflow, version stamp, audit summary, docs build gate.
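One possible shape for that entrypoint, sketched as a single fail-fast script. The subcommand names are assumptions, not the current sanctumctl surface.

```python
#!/usr/bin/env python3
"""Hypothetical release gate: run every check, then emit a release note
from what actually passed. Command names are placeholders."""
import datetime
import subprocess
import sys

GATES = [
    ("audit wall", ["sanctumctl", "audit", "--all"]),
    ("runtime checks", ["sanctumctl", "check", "runtime"]),
    ("docs build", ["sanctumctl", "docs", "build"]),
]

def main() -> int:
    passed = []
    for name, cmd in GATES:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            # A release is not rerunnable if it can ship with a red gate.
            print(f"FAIL {name}:\n{proc.stderr}", file=sys.stderr)
            return 1
        passed.append(name)
    stamp = datetime.date.today().isoformat()
    note = [f"Release {stamp}"] + [f"- {name}: passed" for name in passed]
    print("\n".join(note))
    return 0

if __name__ == "__main__":
    sys.exit(main())
```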
sanctum-cli is productization-ready as of v0.7.1 (12 top-level commands, 4 cloud backends, 201 / 201 tests, R2 default with $0 egress, GitHub Tier 0, sanctum onboard --recipe family --yes runs end-to-end in ~3-5 minutes). v0.8 is packaging + polish so people who aren’t Bertrand can install it.
Deliverables: brew formula + homebrew-sanctum tap, Sigstore-signed releases with verified self-update, atomic-replace flow for cloud_backup, public README + demo recording, sanctum dashboard TUI for the wow-factor demo. Target: May.
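The atomic-replace deliverable is the usual write-temp-then-swap pattern, so a crashed run never leaves a half-written backup where a good one used to be. A minimal local-filesystem sketch of the idea (the real flow targets the cloud backends, and the helper name is illustrative):

```python
import os
import tempfile
from pathlib import Path

def atomic_replace(dest: Path, payload: bytes) -> None:
    """Write payload next to dest, fsync, then rename over it.
    Readers only ever see the old file or the complete new one."""
    fd, tmp = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, dest)  # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp)
        raise
```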
Public Demo Path
Build the reveal layer. Sanctum needs a clean, intentional demo path that shows the parts worth believing: watchdog self-heal, Code Forge rollback, Jocasta context retrieval, Tech Lookout dispatch, and Force Flow delivery. The point is not theatrics. The point is making the real system legible.
Deliverables: scripted walkthrough, stable demo fixtures, screenshots or clips, one page tying the sequence together.
Cross-Repo Contract Cleanup
Reduce implicit coupling across sanctum, ~/.sanctum, openclaw-skills, and the support repos. Adjacent systems should declare what Sanctum expects from them instead of relying on shared memory and ambient convention.
The Kitchen Loop core is now implemented in the workspace slice. The next work is expansion: deeper L4 state-delta checks, broader scenario coverage, and tighter runtime attachment for real household automations and memory drift controls.
Why later: the mechanism exists; the remaining work is coverage depth rather than foundational productization.
Monorepo or Workspace Consolidation
There is still a plausible future where more of the Rust and support tooling lives in a tighter workspace. That may improve release ergonomics, but it is a secondary move. First the system needs clearer contracts; then it can choose a cleaner home.
Why later: consolidation without stronger boundaries just relocates confusion.
Context-Aware Automation Expansion
Features like Frigate-backed alert semantics and richer household automation remain appealing. They are real product features, but they are downstream of making Sanctum easier to install, verify, and trust.
Why later: new features should follow product discipline, not precede it.
Qwen3-TTS — Local Yoda Voice
Done 2026-04-19. Qwen3-TTS via mlx-audio replaces XTTS-v2 for Yoda’s voice agent. The XTTS plist is now .retired on disk; Qwen3-TTS runs as com.sanctum.yoda-tts-worker on :8008 (workers.tts_server Python module). Zero-shot voice cloning from reference audio. Lower memory pressure than XTTS at equal quality.
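For callers, the worker looks like a small local HTTP service. A minimal client sketch follows; the :8008 port is from the note above, but the route and payload shape are assumptions, not the actual workers.tts_server contract.

```python
import requests  # assumes the worker exposes a simple HTTP JSON endpoint

TTS_URL = "http://127.0.0.1:8008/synthesize"  # hypothetical route name

def yoda_say(text: str, out_path: str = "/tmp/yoda.wav") -> str:
    # Zero-shot cloning: the worker already holds the Yoda reference audio,
    # so the caller only sends text (payload shape is an assumption).
    resp = requests.post(TTS_URL, json={"text": text}, timeout=60)
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)
    return out_path

# yoda_say("Ready, the system is.")
```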
Firewalla Gold Pro Migration
Export current Firewalla Purple config and migrate to Gold Pro. Export/import scripts exist (tools/firewalla-export.sh, tools/firewalla-import.sh). Bridge has GET /export endpoint. SSH iptables fallback handles the current firmware’s broken policy:delete API — verify this is still needed on Gold Pro firmware. A sketch of capturing the export through the bridge follows below.
Why later: Purple still works. Migration when hardware arrives.
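A hedged sketch of capturing that export ahead of the swap; the bridge host, port, and response format are assumptions, not what tools/firewalla-export.sh actually does.

```python
import datetime
import json
import urllib.request

BRIDGE_EXPORT = "http://firewalla-bridge.local:8080/export"  # hypothetical host and port

def snapshot_firewalla_config(out_dir: str = ".") -> str:
    """Fetch the Purple's exported config and write a dated JSON snapshot
    so the Gold Pro import starts from a known-good source."""
    with urllib.request.urlopen(BRIDGE_EXPORT, timeout=30) as resp:
        config = json.load(resp)
    stamp = datetime.date.today().isoformat()
    path = f"{out_dir}/firewalla-purple-{stamp}.json"
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return path
```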
Full Local Voice Pipeline
Replace all cloud dependencies in the voice agent: Deepgram STT → local Whisper via mlx-audio, Claude LLM → local Qwen3.6-35B-A3B-4bit via sanctum-mlx. Combined with Qwen3-TTS, this yields a zero-cloud voice pipeline. Phone calls to Yoda with no data leaving the haus. The target shape is sketched below.
Why later: each component needs to be fast enough individually first. Qwen3-TTS speed is the current bottleneck.
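The target pipeline reduces to three local hops. A sketch under loose assumptions: every port, route, and payload below is a placeholder for whatever the real local services expose, and the mTLS layer is omitted.

```python
import requests

STT_URL = "http://127.0.0.1:8010/transcribe"            # local Whisper, hypothetical port
LLM_URL = "http://127.0.0.1:8900/v1/chat/completions"   # sanctum-server front; real calls use mTLS
TTS_URL = "http://127.0.0.1:8008/synthesize"            # Qwen3-TTS worker, hypothetical route

def voice_turn(wav_bytes: bytes) -> bytes:
    """One zero-cloud round trip: caller audio in, Yoda audio out."""
    text = requests.post(STT_URL, data=wav_bytes, timeout=30).json()["text"]
    reply = requests.post(LLM_URL, json={
        "model": "local",
        "messages": [{"role": "user", "content": text}],
    }, timeout=60).json()["choices"][0]["message"]["content"]
    return requests.post(TTS_URL, json={"text": reply}, timeout=60).content
```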
Council fallback — restored 2026-04-25
The dead council-guardian.sh::activate_fallback() automation was retired (script went 257 → 203 lines, three structural reasons documented). The replacement lives in the Smart Router cathedral where it always belonged: com.sanctum.mlx-py-fallback plist runs Python mlx_lm.server with mlx-community/Qwen3.5-9B-4bit on 127.0.0.1:8901 (plain HTTP loopback — mTLS is sanctum-server’s job at :8900), and services.sanctum_server.backends.council-secure.fallback_urls lists http://127.0.0.1:8901/v1 after the existing MBP shadow URL.
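The failover order that configuration implies, reduced to a sketch: try the primary, then each fallback_urls entry in turn. Client shape, URL names, and error handling here are illustrative, not sanctum-server internals, and mTLS client auth is omitted.

```python
import requests

PRIMARY = "https://127.0.0.1:8900/v1"   # sanctum-server front; real calls present a client cert
FALLBACKS = [
    "http://mbp.tailnet:8900/v1",       # existing MBP shadow URL (placeholder hostname)
    "http://127.0.0.1:8901/v1",         # local Python mlx_lm.server fallback
]

def complete(prompt: str) -> str:
    body = {"model": "council", "messages": [{"role": "user", "content": prompt}]}
    for base in [PRIMARY, *FALLBACKS]:
        try:
            r = requests.post(f"{base}/chat/completions", json=body, timeout=90)
            r.raise_for_status()
            return r.json()["choices"][0]["message"]["content"]
        except requests.RequestException:
            continue  # next URL in the chain; the real router adds health checks and backoff
    raise RuntimeError("all council backends down")
```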
End-to-end drill on Mini (2026-04-25): Rust primary warm 0.51 s; failover (Rust booted out) cold-load 75 s on first request, 1.07 s warm thereafter; Rust restored 0.51 s. Quality: "2 + 2 equals 4." from Rust, "2 + 2 = **4**" from Python — both correct, minor stylistic difference (different model).
Memory cost: ~5 GB resident (Python holds the 9B model warm). Mini swap settled at ~10 GB after the cache cap was applied — ~/.sanctum/bin/mlx-server-with-caps.py wraps mlx_lm.server and calls mx.set_cache_limit(1 GB) + mx.clear_cache() before launch (mlx_lm.server has no equivalent flag and MLX honors no env var for this). Verified bounded: swap stayed flat at 10.85 GB across a 5-request burst. The wrapper is the same pattern as the Rust binary’s --metal-cache-limit-mb flag.
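The wrapper pattern, reduced to its essence. This is a minimal reconstruction of the idea behind ~/.sanctum/bin/mlx-server-with-caps.py, not a copy of the actual script.

```python
#!/usr/bin/env python3
"""Cap the MLX Metal buffer cache before mlx_lm.server starts, since the
server exposes no flag and MLX reads no env var for this."""
import runpy
import mlx.core as mx

mx.set_cache_limit(1 * 1024 * 1024 * 1024)  # 1 GB cap on reusable Metal buffers
mx.clear_cache()                            # drop anything already cached

# Hand control to the stock server; it parses --model/--port from sys.argv,
# and the limit set above applies to the whole process.
runpy.run_module("mlx_lm.server", run_name="__main__")
```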
Receipts: project_failover_drill_2026_04_25.md in agent memory.
Gemma 4 (and other archs) in sanctum-mlx
Today sanctum-mlx::LoadedModel dispatches qwen3_5 and qwen3_5_moe only — anything else fails at config parse and falls back to Python mlx_lm.server. A gemma3.rs loader (Gemma 3/4 share the same Rust shape) plus a third LoadedModel::Gemma3 variant would land Gemma on the same fast path Qwen rides. Estimated 1–2 days of focused work. The fused TurboQuant attention kernel would need a small adapter — FullAttention::forward integration currently lives in qwen3_5.rs.
The honest reason it’s later, not next: this is vendor diversification and abstraction maturity, not the resilience win it sounds like. The shared substrate (MLX runtime, Metal driver, custom kernels, launchd, mTLS) is what actually fails — a second loader doesn’t help when the kernel takes both models down. The Python fallback already covers “the Rust path is broken” for any model upstream mlx-lm supports, including Gemma. Build it when there’s a concrete workload reason (Gemma 4 LoRAs that want fast serving, or a council member that needs Gemma’s tokenizer), not for generic resilience.
The actually-resilience-improving moves come first: exercise the Python fallback regularly (a drill is worth more than a duplicate loader), document the failover decision points, harden the shared kernel layer.
A native iPadOS app that turns any iPad into a full Sanctum control surface. Not a web view in a frame — a real app that feels like it belongs on the device.
Vision: Dock an iPad at any satellite location (chalet, office, cabin) and control the entire galaxy. Voice input via on-device mic, camera feeds, climate zones, agent conversations, security dashboard — all through one interface that works offline when Tailscale drops and syncs when it reconnects.
Core features:
Voice-first: wake word, local speech-to-text, response via AirPlay to HomePods
Dashboard: Camera feeds (Blink, Ring), HVAC zones, alarm status, agent health
Agent chat: Talk to any Jedi directly — routes to the right model via Smart Router
Offline mode: Full functionality on local network without internet
Multi-site: Switch between Manoir, Chalet, and any future satellite with one tap
Kiosk mode: Guided Access lockdown for always-on kitchen/hallway display
Tech stack: SwiftUI + local MLX inference + Home Assistant WebSocket + Sanctum API
Why Phase 3: The infrastructure (Smart Router, Model Tournament, 3-tier routing) must be solid before building a consumer surface on top. Phase 2 made the foundation trustworthy. Phase 3 makes it beautiful.