Architecture at a glance

Hot context
User memory
Auto-loaded every turn. Routing rules, preferences, audit-trail tail.
/memories/*.md
Session memory
Per-conversation scratch. Cleared on close.
/memories/session/
Repo memory
Codebase facts. Append-only, verified at write time.
/memories/repo/
Warm context
Reference memory
Loaded on routing trigger. Topic-scoped per project area. Listed but not auto-included.
/memories/reference/
Semantic brain
Tens of thousands of chunks. Fuzzy recall across SDKs, documentation, prior work, and decisions.
vector store · cloud-hosted
Cold-start brief
~3KB summary distilled by a local model from state files, the last N audit rows, and the newest session journal.
coldstart distiller
Cold storage
Draft container
Mutable working backup. 30-day soft-delete window. Versioning + change feed.
object storage · Cool tier
Vault container
WORM-immutable for 7 years. Holds snapshots and one-time promotes. Lifecycle rule moves to Archive tier at 30 days.
object storage · Archive tier
Sealed local
.SEALED.zip with DO-NOT-READ.txt sentinel + .gitignore. Agent searches skip it.
_sealed-brain-backups/
Guardrails
Kill switch
Touch a sentinel file. All billable scripts abort with exit code 99 until it is removed.
.qwen-killswitch
Spend gate
Checks cloud month-to-date spend vs ceiling before any billable op. Fail-closed.
spend-gate.ps1
Cloud budget
Email alerts at 50% / 75% / 100% actual + 100% forecasted. Outside-in safety net.
provider budgets API
Discipline
Audit-trail
Every session writes one row. Open / closed / locked status. Survives all compaction.
audit-trail.md
Session journals
Per-session detail dossier. Concrete, dense, restorable. Archived when stale.
REVIVAL-*.md
Fix-at-root reflex
Encoded rule: on noticing a bug in shared infra, fix the source first; never route around it.
fix-at-root-reflex.md
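
The two local guardrails above (kill switch and spend gate) can be sketched as one pre-flight check. This is a minimal illustration in Python, not the contents of spend-gate.ps1: the function name guard_billable, the get_mtd_spend callback, and the exit code 1 for a spend-gate refusal are all assumptions made for the example. Only the sentinel name and exit code 99 come from the cards above.

```python
from pathlib import Path
from typing import Callable, Optional

KILL_SWITCH = Path(".qwen-killswitch")  # sentinel name taken from the card above
EXIT_KILLED = 99                        # exit code taken from the card above
EXIT_GATED = 1                          # assumed code for a spend-gate refusal


def guard_billable(
    get_mtd_spend: Callable[[], Optional[float]],
    ceiling: float,
    kill_switch: Path = KILL_SWITCH,
) -> int:
    """Return 0 if a billable op may proceed, otherwise a non-zero exit code."""
    # Kill switch: the mere existence of the sentinel blocks every billable op.
    if kill_switch.exists():
        return EXIT_KILLED
    # Spend gate, fail-closed: an error or a missing figure counts as a refusal.
    try:
        spend = get_mtd_spend()
    except Exception:
        return EXIT_GATED
    if spend is None or spend >= ceiling:
        return EXIT_GATED
    return 0
```

A billable script would call guard_billable first and exit with the returned code whenever it is non-zero. The point of the fail-closed branch is that an unreachable billing endpoint blocks the operation rather than letting it through.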

Scaling beyond one workstation: the networked brain

The pattern starts on a single workstation, but the semantic-brain pillar is designed to network. Run the brain as a containerized service exposing HTTP and Server-Sent Events, and any number of agents on any number of machines hit the same recall surface — no per-seat duplication of the corpus, no drift between team members, no re-embedding when a new agent joins.

Brain service
Frozen read collection
Curated corpus baked into the image at build time. Sealed vector segment, fast queries, deterministic across deploys. Restart-safe.
prebuilt-index/ in container
Sidecar write collection
Lazily created at first write in the same persistent client. Holds new memories, ad-hoc corpus loads, per-agent remember() calls. Always writable.
side-collection · same DB file
Merge at query time
Each recall() queries both collections in parallel, merges by distance, returns top-N. Caller never sees the seam.
recall.py merge layer
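
The merge-at-query-time card can be sketched as follows. This is an illustrative guess at the shape of such a layer, not the actual recall.py: the Hit tuple, the Search callable type, and the top_n parameter are assumptions, and the sketch presumes both collections use the same embedding model and distance metric so that distances are directly comparable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Hit = Tuple[float, str]                   # (distance, chunk text); smaller is closer
Search = Callable[[str, int], List[Hit]]  # hypothetical per-collection search function


def recall(query: str, frozen: Search, sidecar: Search, top_n: int = 5) -> List[Hit]:
    """Query both collections in parallel and return the top_n closest hits."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(search, query, top_n) for search in (frozen, sidecar)]
        hits = [hit for future in futures for hit in future.result()]
    # Merge by distance; the caller never learns which collection a hit came from.
    return sorted(hits, key=lambda hit: hit[0])[:top_n]
```

Each collection is asked for top_n results of its own, so the merged list is correct even when every winning hit lives in one collection.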
Multi-agent
Shared HTTP/SSE
Multiple agents (same model or different) connect to one brain URL. Every remember() call by one agent is queryable by every other agent within seconds.
/sse · /tool/recall · /tool/remember
Coordination primitives
Optional layer adds claim() / release() / handoff() / pulse_others() so agents don't step on each other on a shared task.
multi-agent coord wedge
Audit-trail still wins
Even with a shared brain, the canonical decision log is the per-repo audit-trail. Brain is for fuzzy recall; audit-trail is for ground truth.
audit-trail.md (per workspace)
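
The coordination primitives are named above but their semantics are not spelled out, so the following is only one plausible reading: an in-memory claim board where a task has at most one owner. The ClaimBoard class, the argument lists, and the return values are all hypothetical; a real shared brain would need to keep this state server-side rather than in one process.

```python
import threading
from typing import Dict


class ClaimBoard:
    """Hypothetical single-owner task claims, guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owners: Dict[str, str] = {}  # task -> owning agent

    def claim(self, task: str, agent: str) -> bool:
        # Succeeds if the task is free or already held by the same agent.
        with self._lock:
            return self._owners.setdefault(task, agent) == agent

    def release(self, task: str, agent: str) -> bool:
        # Only the current owner may release.
        with self._lock:
            if self._owners.get(task) != agent:
                return False
            del self._owners[task]
            return True

    def handoff(self, task: str, from_agent: str, to_agent: str) -> bool:
        # Transfers ownership in one step so the task is never unowned.
        with self._lock:
            if self._owners.get(task) != from_agent:
                return False
            self._owners[task] = to_agent
            return True

    def pulse_others(self, agent: str) -> Dict[str, str]:
        # Snapshot of the tasks currently held by every other agent.
        with self._lock:
            return {t: a for t, a in self._owners.items() if a != agent}
```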

This is how the same architecture scales from "one developer, one machine, one Copilot" to "a team of humans + agents working a multi-month codebase together" without redesigning anything below the brain layer.

How a session actually runs

  1. Trigger. User types a routing keyword. Agent loads scoped memory, runs brain pulse, fetches latest revival doc.
  2. Work. Edits, searches, runs commands, calls scripts. Branding stamp + version bump on every touched file.
  3. Checkpoint. Every ~5 exchanges, brain checkpoint. Discoveries written to durable memory immediately, not "noted for later."
  4. Touched-folder log. Anything outside the workspace gets registered so the next backup grabs it.
  5. Close. Audit-trail row + REVIVAL doc + brain remember(). Session memory archived.
  6. Overnight. Scheduled tasks run vault sync, weekly brain snapshot, and rolling-window memory archive — all gated by spend-gate.
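
Steps 3 and 5 above can be sketched in a few lines. The row layout (a pipe-delimited Markdown table row with a UTC timestamp) is an assumption made for the example, since the real audit-trail.md format is not shown here; the helper names checkpoint_due and append_audit_row are likewise hypothetical. The three status values come from the audit-trail card earlier.

```python
from datetime import datetime, timezone
from pathlib import Path

CHECKPOINT_EVERY = 5                      # "every ~5 exchanges" from step 3
STATUSES = {"open", "closed", "locked"}   # statuses named on the audit-trail card


def checkpoint_due(exchange_count: int, every: int = CHECKPOINT_EVERY) -> bool:
    """True on every `every`-th exchange, never on exchange 0."""
    return exchange_count > 0 and exchange_count % every == 0


def append_audit_row(path: Path, session_id: str, status: str, summary: str) -> str:
    """Append one row for this session and return the row that was written."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    safe = summary.replace("|", "\\|")    # keep the assumed table layout intact
    row = f"| {stamp} | {session_id} | {status} | {safe} |\n"
    # Append-only: earlier rows are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(row)
    return row
```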