On 2026-04-23 the Mini kernel-panicked twice. On 2026-04-24 LM Studio stalled at 99 percent loading a model while the VM’s qemu process sat SIGSTOP’d for two hours in silence. Different symptoms, same disease: multiple memory-heavy services competing for the same unified RAM pool with no admission control. Each service trusted the kernel to find it space. The kernel, overwhelmed, either panicked or paused at random. We discovered the ceiling by hitting it.
The Capacity Doctrine is the immune-system answer: refuse bad loads up front instead of reacting to failures downstream.
Every heavyweight service has a ram_budget_mb number in ~/.sanctum/capacity.yaml. No budget, no admission. The number is measured with vmmap --summary at peak inference, then rounded up. A false refuse is strictly cheaper than an OOM freeze.
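The measure-then-round-up step could be sketched as below. This is a minimal illustration, not the project's tooling: it assumes the `vmmap --summary` output contains a line of the form `Physical footprint (peak):  11.8G`, and the function name `budget_from_vmmap` and the 256 MB rounding step are hypothetical choices.

```python
import math
import re

# Assumed line shape in `vmmap --summary` output: "Physical footprint (peak):  11.8G"
_PEAK_RE = re.compile(r"Physical footprint \(peak\):\s+([\d.]+)([KMG])")
_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def budget_from_vmmap(summary_text: str, step_mb: int = 256) -> int:
    """Return a candidate ram_budget_mb: peak footprint rounded up to a multiple of step_mb.

    step_mb is a hypothetical rounding granularity; the source only says "rounded up".
    """
    match = _PEAK_RE.search(summary_text)
    if match is None:
        raise ValueError("no 'Physical footprint (peak)' line found")
    peak_mb = float(match.group(1)) * _TO_MB[match.group(2)]
    return math.ceil(peak_mb / step_mb) * step_mb


# Illustrative text only, not a real measurement.
sample = "Physical footprint:         11.2G\nPhysical footprint (peak):  11.8G\n"
print(budget_from_vmmap(sample))  # 12288
```

Rounding up rather than to the nearest step keeps the error on the side of a false refuse, which the rule treats as the cheaper failure.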
Rule 2: Admission controller gates every load
castellan, listening on port 2189, computes free = pool − Σ phys_footprint(all loaded) − Σ peak_reserve(pinned). The inputs are phys_footprint readings (the same metric jetsam uses), not RSS: pinned services reserve their peak, while everyone else counts only what they hold right now. A load is admitted only if its budget fits that free pool AND none of its exclusive_with neighbours are loaded. A budget shortfall returns HTTP 503 with the shortfall and the eviction order; an exclusion clash returns HTTP 409 with the blocking service name.
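A rough sketch of the admission decision follows, under several assumptions that are not confirmed by the source: pinned services are charged their declared budget as the peak reserve whether or not they are loaded, unpinned services are charged their current footprint only, the exclusion check runs before the budget check, and the response field names (`blocked_by`, `shortfall_mb`) are hypothetical. The service names are placeholders, and the eviction order that the real 503 carries is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    ram_budget_mb: int                      # declared budget; doubles as peak reserve here
    evictable: bool = True                  # False means pinned
    exclusive_with: set = field(default_factory=set)
    loaded: bool = False
    phys_footprint_mb: int = 0              # current footprint, meaningful only when loaded


def free_mb(pool_mb, services, candidate):
    """Pool minus pinned reserves and the live footprint of loaded unpinned services."""
    used = 0
    for svc in services.values():
        if svc.name == candidate:
            continue                        # the candidate's own budget is checked separately
        if not svc.evictable:
            used += svc.ram_budget_mb       # pinned: reserve the peak, loaded or not
        elif svc.loaded:
            used += svc.phys_footprint_mb   # unpinned: only what it holds right now
    return pool_mb - used


def admit(name, pool_mb, services):
    """Return (http_status, detail) for a load request."""
    svc = services[name]
    for other in services.values():
        # Exclusion is symmetric: either side may declare it.
        clash = other.name in svc.exclusive_with or name in other.exclusive_with
        if other.loaded and other.name != name and clash:
            return 409, {"blocked_by": other.name}
    shortfall = svc.ram_budget_mb - free_mb(pool_mb, services, name)
    if shortfall > 0:
        return 503, {"shortfall_mb": shortfall}
    return 200, {}


# Placeholder services and numbers, for illustration only.
services = {
    "vm": Service("vm", 16384, evictable=False),
    "model-a": Service("model-a", 20000, loaded=True, phys_footprint_mb=18000),
    "model-b": Service("model-b", 24000, exclusive_with={"model-a"}),
    "model-c": Service("model-c", 20000),
    "model-d": Service("model-d", 8000),
}
print(admit("model-b", 51200, services))  # (409, {'blocked_by': 'model-a'})
print(admit("model-c", 51200, services))  # (503, {'shortfall_mb': 3184})
print(admit("model-d", 51200, services))  # (200, {})
```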
Rule 3: Nothing critical gets suspended
The council VM, sanctumd, sanctum-mlx, signal-cli, and WindowServer are pinned via evictable: false in capacity.yaml. Castellan programs the kernel’s own arbiter: pinned services get jetsam band critical (190) and stay there. The active loop will never SIGSTOP them; the kernel itself spares them when memory pressure crests.
The admission ledger is not a state file that can drift from reality. On every /status or /services request — and before any /load/{svc} — the daemon re-scans ps output and reconciles against capacity.yaml. A service is loaded iff either:
its process_pattern regex matches some live command line, or
its loaded_check_cmd exits zero within two seconds.
This means the ledger self-corrects across daemon restarts, unclean exits, and manual kill -9 — there is no state to repair because there is no stored state to corrupt.
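The two-part liveness test can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, the process list is taken from `ps -axo command=`, and `loaded_check_cmd` is assumed to be a shell string. Only the regex-or-exit-zero rule and the two-second limit come from the text above.

```python
import re
import subprocess


def live_command_lines():
    """Command line of every running process, one string per process."""
    out = subprocess.run(
        ["ps", "-axo", "command="], capture_output=True, text=True, check=True
    )
    return out.stdout.splitlines()


def is_loaded(process_pattern, loaded_check_cmd, command_lines, timeout_s=2.0):
    """True if the regex matches a live command line or the check command exits zero in time."""
    if process_pattern and any(re.search(process_pattern, line) for line in command_lines):
        return True
    if loaded_check_cmd:
        try:
            result = subprocess.run(
                loaded_check_cmd,
                shell=True,                 # assumption: the config stores a shell string
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return False                    # a hung probe counts as not loaded
        return result.returncode == 0
    return False
```

A caller would take one `live_command_lines()` snapshot per request and pass it to `is_loaded` for each service, so nothing is cached between requests and there is no stored state to drift.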
Exclusion is symmetric and enforced at admission. The higher-priority service wins the tie when both are requested. Priority is also the tiebreaker when the sentinel has to choose between evicting workloads under duress.
| Layer | Job | Mechanism and timing |
| --- | --- | --- |
| Kernel arbitration | Program jetsam priority bands per-PID so the kernel itself shelters Cilghal and sheds Codestral | Every 2s via memorystatus_control; birth-floor via plist JetsamPriority |
| Admission | Refuse loads that exceed headroom | Before anything allocates |
| Monitor | Catch half-loaded states and signal-stopped processes | Every 2s tick: pinned-service probe, drift check, SIGSTOP scan |
| Recovery | Unstick what slipped through | kill -CONT on a stopped service; unload_cmd + load_cmd on a failed probe, rate-limited to one rescue per service per 30 min with backoff after two failures in an hour; deadman SIGCONT if the keeper himself dies |
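The recovery layer's rate limit could look roughly like the sketch below. The 30-minute interval and the "two failures in an hour" trigger come from the text; the class name, the two-hour backoff length, and the choice to apply backoff only when the latest attempt itself failed are assumptions made for illustration.

```python
from collections import defaultdict

RESCUE_INTERVAL_S = 30 * 60      # one rescue per service per 30 min (from the text)
FAILURE_WINDOW_S = 60 * 60       # failures are counted over one hour (from the text)
BACKOFF_S = 2 * 60 * 60          # hypothetical backoff length; the source gives none


class RescueLimiter:
    def __init__(self):
        self._not_before = {}                # service -> earliest time of next rescue
        self._failures = defaultdict(list)   # service -> timestamps of failed rescues

    def may_rescue(self, svc, now):
        """True if a rescue of svc is allowed at time now (seconds)."""
        return now >= self._not_before.get(svc, 0)

    def record(self, svc, now, succeeded):
        """Record a rescue attempt and schedule the earliest next one."""
        if not succeeded:
            self._failures[svc].append(now)
        # Keep only failures that still fall inside the one-hour window.
        recent = [t for t in self._failures[svc] if now - t <= FAILURE_WINDOW_S]
        self._failures[svc] = recent
        backing_off = not succeeded and len(recent) >= 2
        self._not_before[svc] = now + (BACKOFF_S if backing_off else RESCUE_INTERVAL_S)
```

Timestamps are passed in explicitly so the limiter can be driven by the 2s tick with a monotonic clock and exercised in tests without sleeping.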
The layers are independent. The kernel arbitration alone would miss admission cases where everything fits but the wrong thing got loaded. Admission alone would miss a runaway allocation that grew after load. Monitoring alone would miss the load that never should have started. Recovery alone is the old watchdog — it reacts too late. Together they form a control loop where every pathway back to a known-good state is explicit. The Castellan implements all four — see The Castellan for the daemon’s anatomy.
The canonical way to load a heavy service is never launchctl kickstart … directly. It is:
castellan load <service-name>
The CLI posts to /load/{svc}. The daemon runs admission first and only then executes the service’s load_cmd, synchronously, returning 200 once the seat probes healthy. A refusal never reaches load_cmd: the CLI exits non-zero on the 503 or 409, remedy printed to stderr. Direct launchctl calls bypass admission — they’re permitted but discouraged, and the kernel-band layer will still steer jetsam toward the right victim if pressure crests.
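The client side of that exchange might be sketched as follows. This is not the actual CLI: the loopback host, the JSON field names (`shortfall_mb`, `eviction_order`, `blocked_by`), and the specific exit code 1 are assumptions. Only the port, the `/load/{svc}` path, the 200/503/409 statuses, and the non-zero exit with the remedy on stderr come from the text.

```python
import json
import sys
import urllib.error
import urllib.request

CASTELLAN_URL = "http://127.0.0.1:2189"   # port from the text; loopback host is an assumption


def outcome(status, body):
    """Map an admission response to (exit_code, message). Field names are hypothetical."""
    if status == 200:
        return 0, "loaded"
    if status == 503:
        return 1, (f"refused: short by {body.get('shortfall_mb', '?')} MB; "
                   f"eviction order: {body.get('eviction_order', [])}")
    if status == 409:
        return 1, f"refused: blocked by {body.get('blocked_by', '?')}"
    return 1, f"unexpected HTTP {status}"


def load(service):
    """POST /load/{service}, print the result, and return a process exit code."""
    request = urllib.request.Request(f"{CASTELLAN_URL}/load/{service}", method="POST")
    try:
        with urllib.request.urlopen(request) as response:
            status, raw = response.status, response.read()
    except urllib.error.HTTPError as err:     # urllib raises on 4xx and 5xx
        status, raw = err.code, err.read()
    try:
        body = json.loads(raw or b"{}")
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    code, message = outcome(status, body)
    print(message, file=sys.stderr if code else sys.stdout)
    return code
```

Keeping `outcome` separate from the network call means the refusal messages can be checked without a running daemon.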
Each host declares three numbers. The pool is derived:
hosts:
  cabane:                     # slug for the home-server role
    hostname: manoir.local    # the actual machine on that slug
    ram_physical_mb: 65536    # Mac Mini M4 Pro, 64 GiB unified
    ram_reserved_mb: 10240    # macOS + WindowServer + user apps
    ram_safety_mb: 4096       # hard floor, never fill below
    # pool_mb = 65536 − 10240 − 4096 = 51200 MB
The safety margin is non-negotiable. We don’t allocate into it even if admission math says we could — it’s the buffer that keeps the kernel’s own memory-pressure machinery working. Hit the safety margin and you’re in the territory where jetsam starts SIGSTOPping random processes, which is where the 2026-04-24 incident came from.
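The pool derivation is plain subtraction. A minimal sketch, using the three declared numbers from the cabane host (the function name and the sanity check are illustrative additions):

```python
def pool_mb(ram_physical_mb, ram_reserved_mb, ram_safety_mb):
    """Admittable pool: physical RAM minus the OS reserve and the safety floor."""
    pool = ram_physical_mb - ram_reserved_mb - ram_safety_mb
    if pool <= 0:
        raise ValueError("reserve and safety margin leave no pool to admit into")
    return pool


cabane = {"ram_physical_mb": 65536, "ram_reserved_mb": 10240, "ram_safety_mb": 4096}
print(pool_mb(**cabane))  # 51200
```

Because the safety margin is subtracted before any admission math runs, no combination of budgets can be admitted into it.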
Before the doctrine, capacity was a hope. Services started on reboot, the OS sorted it out, and when it didn’t, the watchdog noticed eventually. The 2026-04-23 kernel panic and the 2026-04-24 freeze were both perfectly preventable incidents: we had the information to refuse those loads, we just weren’t asking.
After the doctrine, capacity is a refusal. The admission controller can point at any service, at any moment, and say “not this one, not now, here’s why, here’s the remedy.” That is the Apple-like part. That is the military-grade part.
The system no longer discovers the ceiling by hitting it.