What it replaces

Default agent experience

  • Cold start every chat. "Read the scrollback."
  • Auto-compaction picks what survives. Lossy by surprise.
  • Chat-only memory. Disk reality drifts from agent reality.
  • No cost ceiling. Bug in a loop = morning email from billing.
  • Backups are someone else's job.
  • Audit = scrollback search.

Agent OS

  • Cold start reads a 3KB curated brief. Zero context loss.
  • Compaction can happen safely — durable memory survives it.
  • Memory tiers map to disk, vector brain, and immutable cloud.
  • Three-layer cost gate. Worst case = abort, never overspend.
  • Nightly azcopy + weekly brain snapshot + sealed local copy.
  • Audit trail with one row per session. Always greppable.
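
As a minimal sketch of the last bullet, a one-row-per-session audit trail could be an append-only JSON Lines file. The field names and the `append_session_row` helper are assumptions made for illustration; they are not the reference implementation's format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_session_row(log_path: Path, session_id: str, summary: str, cost_usd: float) -> str:
    """Append exactly one line describing a session and return that line."""
    row = {
        # All field names here are hypothetical.
        "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "session": session_id,
        "cost_usd": cost_usd,
        # Collapse whitespace so the summary never spans multiple lines.
        "summary": " ".join(summary.split()),
    }
    line = json.dumps(row, sort_keys=True)
    # Mode "a" only ever adds to the end of the file; earlier rows are untouched.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Because each session is a single line, something like `grep '"session": "s-001"' audit.jsonl` (file name assumed) is enough to find it.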

How this fits next to existing tools

Plenty of good work exists in this space. The baselines are called out below so the contribution here is precise: Agent OS is the combination, not any one piece.

GitHub Copilot Chat memory
tiered memory · per workspace
The actual host platform behind the reference implementation. Provides the memory tiers Agent OS leans on. What's missing out of the box: durable cold storage, cost gates, the discipline conventions, and the local-model distillation.
Cursor · .cursorrules
project-scoped instructions
Solves preference persistence and routing. No durable journal, no audit trail, no cold-start protocol, no cost gates. Sits at the same tier as the routing rules in /memories/.
Cline / Roo Code
custom-instructions + agentic loops
Strong on agentic tool use. Memory is per-conversation. Same gap as default Copilot: no durable layer between conversations, no cost gate, no immutable audit.
Aider · .aider.chat.history
conversation log + git commits
Audit-as-git is excellent — every change is a commit. Closest existing analog to the audit-trail pillar. Doesn't have the brain, the vault, or the cost gates.
Continue.dev
configurable IDE assistant
Configurable model routing and context providers. Comparable to the routing layer. The memory and audit story is lighter; no cold-storage tier.
LangChain / LlamaIndex memory
programmatic memory abstractions
Library-level building blocks for memory. You'd assemble something like the memory + brain pillars on top of these. They don't speak to discipline conventions or cost gates; that's outside their scope.
AutoGen · CrewAI · smol-agents
multi-agent orchestration
Solves how agents coordinate during a session. Agent OS is concerned with what survives between sessions. Complementary, not competing: wire any of these on top of the brain layer for multi-agent work.

What's distinctive here: the tiered-memory + WORM-vault + fail-closed cost gate + local-model distiller + networked brain — assembled and disciplined as one architecture, with conventions that keep the agent honest. Each individual pillar has prior art. The integration is the point.
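
To make "fail-closed" concrete, here is a minimal sketch of a layered cost gate. The text does not say what the three layers are, so the per-call, per-session, and per-day limits below are assumptions, as are all class and field names.

```python
from dataclasses import dataclass
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when a call is refused; the caller should abort, not retry."""


@dataclass
class CostGate:
    # Hypothetical limits in USD; the three layers are assumed, not from the source.
    per_call_limit: float
    per_session_limit: float
    per_day_limit: float
    session_spent: float = 0.0
    day_spent: Optional[float] = 0.0  # None models a spend ledger that could not be read

    def check(self, estimated_cost: float) -> None:
        """Refuse the call unless every layer can show it is within budget."""
        if self.day_spent is None:
            # Fail closed: unknown spend is treated the same as over budget.
            raise BudgetExceeded("daily spend unknown; refusing to proceed")
        if estimated_cost < 0:
            raise BudgetExceeded("invalid cost estimate; refusing to proceed")
        if estimated_cost > self.per_call_limit:
            raise BudgetExceeded("per-call limit exceeded")
        if self.session_spent + estimated_cost > self.per_session_limit:
            raise BudgetExceeded("per-session limit exceeded")
        if self.day_spent + estimated_cost > self.per_day_limit:
            raise BudgetExceeded("per-day limit exceeded")

    def record(self, actual_cost: float) -> None:
        """Add the cost of a completed call to the running totals."""
        self.session_spent += actual_cost
        if self.day_spent is not None:
            self.day_spent += actual_cost
```

The design choice is that every uncertain case raises: a runaway loop ends in an exception, not in another paid call. A caller would run `gate.check(estimate)` before each model call and `gate.record(actual)` after it.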

A note on which AI you're driving

The reference implementation was field-tested with Claude (Sonnet / Opus class) running inside GitHub Copilot Chat in VS Code. That combination shaped some of the conventions:

Run the same architecture against a different agent and expect to retune. The pattern is portable; the specific cadence isn't. Treat anything in the reference scripts as a starting point, not a contract.