Memory Vault

The memory vault — where knowledge goes to be preserved, indexed, and occasionally forgotten on purpose

The Memory Vault is Sanctum’s shared knowledge base, inspired by how human memory works — which is to say, it forgets things on purpose and considers this a feature. It uses a three-tier temporal architecture (working / short-term / long-term) sitting on top of a four-system backend stack (described below), with a daily consolidation process that acts like sleep: distilling raw observations into lasting knowledge.

At some point during development, we realized we’d built a system that remembers where you left your SSH keys, forgets trivial session data, and dreams at 4 AM. The horror movie writes itself.

Today’s stack is the local git vault, fronted by two access surfaces (a Rust REST service and a Python MCP server), plus the entity graph running on the VM. The doc once described a four-layer design that included a mem0.ai cloud tier; that integration was never wired up, and the privacy posture decided on 2026-05-16 (vault GitHub mirror archived, “the haus does not sync its thoughts to the cloud”) makes wiring it now an explicit policy violation. Honesty over aspiration.

| Layer | Role | Backend | Cost |
| --- | --- | --- | --- |
| Memory Vault | Git-native versioned source of truth, primary cross-agent semantic memory | Local-only git repo at ~/.sanctum/memory/ (GitHub mirror archived 2026-05-16 as part of the privacy scrub; the haus does not sync its thoughts to the cloud) | Free |
| sanctum-memory REST | The read/write data plane over the vault git repo | Rust service on 127.0.0.1:42069, serving /health and /v1/* HTTP routes (search, read, write, recall, ingest, consolidate) | Free |
| memory_vault MCP | Tool surface: how an agent’s tool-call becomes a vault operation | Python server (python -m memory_vault) over stdio, exposing the 8 memory_* tools below | Free |
| Neo4j / Graphiti | Entity relationships & graph queries | Reached via SSH tunnel on local port 31416 (graph runs on the VM, not the host) | Free (self-hosted) |
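
As a rough illustration of the data plane in the table above, the sketch below probes the REST service's /health route from Python. The host, port, and route come from the table. Plain HTTP, a GET request, and "any 2xx status means healthy" are assumptions, and `health_url` and `is_healthy` are hypothetical helper names, not part of Sanctum.

```python
import urllib.request

# Host, port, and route are taken from the table above.
BASE_URL = "http://127.0.0.1:42069"


def health_url(base_url: str = BASE_URL) -> str:
    """Build the health-check URL for the REST data plane."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if /health answers with a 2xx status (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Connection refused, timeout, or an HTTP error status: treat as down.
        return False
```

Because the service binds to 127.0.0.1, a check like this only makes sense when run on the host itself.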
┌─────────────────────────────────────────────────────┐
│ LAYER 1: WORKING MEMORY (per-session, ephemeral) │
│ Agent's context window. Dies when session ends. │
├─────────────────────────────────────────────────────┤
│ LAYER 2: SHORT-TERM MEMORY (inbox/, days-weeks) │
│ inbox/{agent}/ — raw observations, session notes │
│ TTL: 7-30 days. Consolidated or discarded daily. │
├─────────────────────────────────────────────────────┤
│ LAYER 3: LONG-TERM MEMORY (consolidated, permanent)│
│ knowledge/{topic}/ — semantic facts (timeless) │
│ events/YYYY/MM/ — significant episodes │
│ procedures/ — how-to, runbooks │
│ Neo4j/Graphiti — entity relationship graph │
└─────────────────────────────────────────────────────┘
~/.sanctum/memory/
├── inbox/ # Short-term, per-agent
│ ├── claude-code/
│ ├── gemini-cli/
│ ├── openclaw/
│ └── home-assistant/
├── knowledge/ # Long-term semantic
│ ├── devices/
│ ├── network/
│ ├── systems/
│ ├── people/
│ └── preferences/
├── events/ # Long-term episodic
│ └── 2026/03/
├── procedures/ # Long-term procedural
│ ├── troubleshooting/
│ ├── maintenance/
│ └── automation/
├── archive/ # Expired, retained 90 days
└── meta/ # Schema, consolidation logs

Every note has standardized YAML frontmatter. Opinions may vary on whether your haus needs a metadata schema for its thoughts. Opinions are wrong.

---
type: semantic | episodic | procedural | observation | session_summary
source_agent: claude-code | gemini-cli | openclaw | user | system
created: 2026-03-20T17:00:00Z
updated: 2026-03-20T17:00:00Z
last_accessed: 2026-03-20T17:00:00Z
access_count: 3
importance: 0.85
consolidation_status: raw | reviewed | consolidated | archived
ttl_days: null # null = permanent, or integer days
superseded_by: null # path to newer version
tags: [network, infrastructure]
related_entities: [mac-mini, ubuntu-vm]
---
# Note Title
Content with [[wikilinks]] to related notes...
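
To make "schema enforcement" concrete, here is a minimal sketch of a frontmatter check over an already-parsed dictionary. The field names and allowed values are copied from the example above. The function name `validate_frontmatter` and the exact rules (for instance, that ttl_days must be positive) are assumptions, not the vault's actual validator.

```python
from typing import Any, Dict, List

# Allowed values, copied from the frontmatter example above.
NOTE_TYPES = {"semantic", "episodic", "procedural", "observation", "session_summary"}
STATUSES = {"raw", "reviewed", "consolidated", "archived"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_frontmatter(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the metadata looks valid."""
    problems = []
    if meta.get("type") not in NOTE_TYPES:
        problems.append("type must be one of: " + ", ".join(sorted(NOTE_TYPES)))
    if meta.get("consolidation_status") not in STATUSES:
        problems.append("consolidation_status must be one of: " + ", ".join(sorted(STATUSES)))
    importance = meta.get("importance")
    if not _is_number(importance) or not 0.0 <= importance <= 1.0:
        problems.append("importance must be a number between 0.0 and 1.0")
    ttl = meta.get("ttl_days")
    if ttl is not None and (not _is_number(ttl) or int(ttl) != ttl or ttl <= 0):
        problems.append("ttl_days must be null or a positive integer")
    for key in ("tags", "related_entities"):
        if not isinstance(meta.get(key, []), list):
            problems.append(key + " must be a list")
    return problems
```

Returning a list of problems rather than raising on the first one lets a writer fix every issue in a single pass.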

Every note has a computed importance score (0.0 to 1.0) that determines its TTL. The vault is, in essence, deciding what’s worth remembering. A power you’d think we’d reserve for sentient beings, but here we are.

| Importance | TTL | Notes |
| --- | --- | --- |
| > 0.8 | Permanent | Core knowledge, user-stated facts |
| 0.5 - 0.8 | 90 days | Agent-observed patterns |
| 0.3 - 0.5 | 30 days | Single observations |
| < 0.3 | 7 days | Ephemeral session data |

Score formula: source_weight × recency × access_frequency × link_density

Notes accessed 5+ times are protected from expiry regardless of score. If the system keeps coming back to a memory, there’s probably a reason. Even if that reason is paranoia.
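
The scoring and expiry rules above can be sketched as follows. The tier thresholds, the product formula, and the 5-access protection come from this page. Several details are assumptions: each factor is treated as a value already normalized to [0, 1], the product is clamped to that range, and the overlapping tier boundaries (0.5 and 0.3) resolve to the longer TTL. All function names are hypothetical.

```python
from typing import Optional

PROTECTED_ACCESS_COUNT = 5  # notes read this often never expire


def importance_score(source_weight: float, recency: float,
                     access_frequency: float, link_density: float) -> float:
    """Multiply the four factors and clamp the result to [0.0, 1.0]."""
    score = source_weight * recency * access_frequency * link_density
    return max(0.0, min(1.0, score))


def ttl_days_for(importance: float) -> Optional[int]:
    """Map an importance score to a TTL in days; None means permanent."""
    if importance > 0.8:
        return None
    if importance >= 0.5:  # assumption: boundary values take the longer TTL
        return 90
    if importance >= 0.3:
        return 30
    return 7


def is_expired(importance: float, age_days: int, access_count: int) -> bool:
    """Apply the TTL, unless the note is protected by frequent access."""
    if access_count >= PROTECTED_ACCESS_COUNT:
        return False
    ttl = ttl_days_for(importance)
    return ttl is not None and age_days > ttl
```

One consequence of a pure product is that a single factor near zero drags the whole score down, so a note from a trusted source that is never linked or read would still decay.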

The consolidation engine runs daily at 4:17 AM (LaunchAgent: com.sanctum.memory-consolidate). Like sleep for the brain, except it runs on schedule, never hits snooze, and files a report when it’s done.

  1. Scan inbox for notes older than 24 hours
  2. Deduplicate against existing long-term knowledge
  3. Classify and move to the appropriate long-term directory
  4. Recompute importance scores for all active notes
  5. Expire notes that have exceeded their TTL
  6. Enforce caps via bounded archive-eviction (see Hard Caps below)
  7. Clean archive — delete archived notes older than 90 days
  8. Generate report in meta/consolidation-log.md
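
Steps 1 and 7 are both age checks, which can be sketched as below. The 24-hour and 90-day thresholds come from the list above, and the timestamp format matches the frontmatter example. The `archived_at` argument is a hypothetical timestamp (this page does not say how archive age is tracked), and whether the comparisons are strict or inclusive is also an assumption.

```python
from datetime import datetime, timedelta, timezone

INBOX_MIN_AGE = timedelta(hours=24)   # step 1: only consolidate notes older than this
ARCHIVE_RETAIN = timedelta(days=90)   # step 7: delete archived notes older than this


def parse_timestamp(value: str) -> datetime:
    """Parse a frontmatter timestamp such as 2026-03-20T17:00:00Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def ready_for_consolidation(created: str, now: datetime) -> bool:
    """Step 1: an inbox note qualifies once it is more than 24 hours old."""
    return now - parse_timestamp(created) > INBOX_MIN_AGE


def archive_deletable(archived_at: str, now: datetime) -> bool:
    """Step 7: an archived note can be deleted after the 90-day retention window."""
    return now - parse_timestamp(archived_at) > ARCHIVE_RETAIN
```

Passing `now` in explicitly, rather than calling the clock inside each function, keeps a dry run and its tests deterministic.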

Any agent can trigger consolidation manually via the memory_consolidate MCP tool.

Hard Caps

To prevent bloat, each layer has a maximum note count. Without limits, a system that remembers everything eventually remembers nothing useful, a problem familiar to anyone who’s ever hoarded browser tabs.

| Layer | Max Notes | Action at Cap |
| --- | --- | --- |
| Inbox | 300 | Bounded archive-eviction, ≤50/run |
| Knowledge | 1000 | Bounded archive-eviction, ≤20/run |
| Events | 500 | Bounded archive-eviction, ≤100/run |
| Procedures | 200 | Bounded archive-eviction, ≤10/run |

Eviction sorts candidates by (importance ascending, created ascending) and skips anything with importance > 0.8 — permanent knowledge survives cap pressure. Each run is rate-limited so a layer at 6× cap converges over many nightly cycles rather than collapsing in one massacre. Archived notes are retained for 90 days via ARCHIVE_RETAIN_DAYS before deletion — eviction is reversible during that window.
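
The selection rule described above can be sketched in a few lines. The sort key, the importance > 0.8 exemption, and the per-run limit are from this page. The `Note` class and `select_evictions` function are hypothetical names, and the sketch assumes `created` is an ISO 8601 UTC string, so string order matches chronological order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    path: str
    importance: float
    created: str  # ISO 8601 UTC, e.g. 2026-03-20T17:00:00Z


def select_evictions(notes: List[Note], cap: int, max_evict_per_run: int) -> List[Note]:
    """Pick the notes to archive this run: least important first, then oldest."""
    excess = len(notes) - cap
    if excess <= 0:
        return []  # layer is at or under cap, nothing to do
    # Permanent knowledge (importance > 0.8) is never an eviction candidate.
    candidates = [n for n in notes if n.importance <= 0.8]
    candidates.sort(key=lambda n: (n.importance, n.created))
    # Rate limit: never archive more than the per-run budget.
    return candidates[: min(excess, max_evict_per_run)]
```

Note that if a layer holds more permanent notes than its cap allows, this rule never brings it under cap; the sketch simply returns whatever candidates exist.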

| Tool | Description |
| --- | --- |
| memory_search | Full-text search with tag and folder filters |
| memory_read | Read a note (auto-tracks access) |
| memory_write | Create/update with schema enforcement |
| memory_delete | Remove a note |
| memory_list | List notes by folder, tag, or type |
| memory_links | Traverse the wikilink graph |
| memory_consolidate | Trigger consolidation (dry-run by default) |
| memory_health | Vault health metrics dashboard |

On Mem0

Earlier versions of this doc described a mem0.ai cloud tier as the primary cross-agent memory layer, with a planned set of MEM0_* cloud tools and a “push to Mem0” step in nightly consolidation. None of that integration was ever wired up: there’s no mem0 client in the repo, no API key, and no mem0.ai traffic from the host.

When the privacy posture was decided on 2026-05-16 (vault GitHub mirror archived, “the haus does not sync its thoughts to the cloud”), Mem0-as-cloud-tier became incompatible with explicit policy. The architecture quietly converged on the local-only model documented above; the doc just hadn’t caught up. This section exists so anyone reading old commit messages or older drafts knows what happened: the cloud tier is not coming back absent a deliberate privacy-posture reversal.

| Tool | Vault Access |
| --- | --- |
| Gemini CLI | memory_vault MCP over stdio (~/.gemini/settings.json) |
| Claude Code | Not currently wired (mcpServers is empty); reach the vault via the :42069 REST API or vault.sh until the MCP is added |
| OpenClaw agents | SSH skill (vault.sh) |
| Sanctum automations | :42069 REST API (/v1/write, /v1/search); the old GitHub path died with the 2026-05-16 archive |
| Obsidian | File system (~/.sanctum/memory/) |
| Holocron | HausBrainPanel polls /vault/brain from Force Flow (:4077) every 5s; new-note particles fire in real time off an Electron IPC vault-event bridge, not a separate socket |

After the reconciliation pass on 2026-05-29, there are no open implementation gaps. The machinery does what this page says it does: short-term inbox plus long-term knowledge/events/procedures, nightly consolidation with bounded archive-eviction, 90-day reversibility, the memory_* tool surface, and the :42069 data plane.

One thing the caps table doesn’t show: the vault is currently over cap and grinding down. Events sits at roughly 8× its 500-note cap, and the nightly run evicts 100 events per cycle by design (MAX_EVICT_PER_RUN) rather than archiving thousands in one massacre. That’s the bounded-eviction design working, not a steady state — a system converging, not a system at rest.
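
For a sense of how long that convergence takes, here is a back-of-envelope calculation. It rounds "roughly 8×" to exactly 8×, and assumes no new events arrive and that enough notes with importance ≤ 0.8 exist to fill every run's budget, so treat the result as a lower bound. `nights_to_converge` is a hypothetical helper.

```python
import math


def nights_to_converge(current: int, cap: int, max_evict_per_run: int) -> int:
    """Number of nightly runs needed to bring a layer back down to its cap."""
    excess = max(0, current - cap)
    return math.ceil(excess / max_evict_per_run)


# Events: 8 x 500 = 4000 notes, 3500 over cap, 100 evicted per night.
print(nights_to_converge(8 * 500, 500, 100))  # → 35
```

Under those assumptions the events layer needs about five weeks of nightly runs to reach its cap.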

The procedures layer was empty for months and is now seeded — 5 high-signal runbooks grounded in recent incidents live under procedures/{troubleshooting,maintenance,automation}/. The layer grows organically from there as incidents teach the haus new procedures worth remembering.

The only architectural decision left open is whether to ever re-introduce a cloud-sync tier (see On Mem0 above). That is a privacy-posture decision, not an engineering gap.

  • If derivable from code, logs, or sensors — don’t store it
  • If the same fact is stated multiple ways — dedup into one
  • If never accessed in 30 days and importance < 0.5 — archive
  • Raw data goes to time-series DBs, not memory
  • Every episodic memory needs context (who, what, when, why)
  • Every semantic memory should be self-contained