Compare · Recall™ Agent OS

What it replaces

Default agent experience

Cold start every chat. "Read the scrollback."
Auto-compaction picks what survives. Lossy by surprise.
Chat-only memory. Disk reality drifts from agent reality.
No cost ceiling. Bug in a loop = morning email from billing.
Backups are someone else's job.
Audit = scrollback search.

Agent OS

Cold start reads a 3KB curated brief. Zero context loss.
Compaction can happen safely — durable memory survives it.
Memory tiers map to disk, vector brain, and immutable cloud.
Three-layer cost gate. Worst case = abort, never overspend.
Nightly azcopy + weekly brain snapshot + sealed local copy.
Audit trail with one row per session. Always greppable.
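A fail-closed cost gate can be sketched in a few lines. The text above only says there are three layers and that the worst case is an abort; the three layers used here (per-call cap, per-session budget, daily ceiling), the dollar limits, and every name in the block are assumptions for illustration, not the reference implementation.

```python
import math
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when any layer of the gate refuses a call."""


@dataclass
class CostGate:
    # Hypothetical limits; the source does not state what the three layers are.
    per_call_usd: float = 0.50
    per_session_usd: float = 5.00
    per_day_usd: float = 20.00
    session_spent: float = 0.0
    day_spent: float = 0.0

    def authorize(self, estimate_usd):
        """Approve a call only if every layer passes; otherwise abort."""
        # Fail closed: an estimate that cannot be read is a refusal, not a pass.
        try:
            cost = float(estimate_usd)
        except (TypeError, ValueError):
            raise BudgetExceeded("no usable cost estimate") from None
        if not math.isfinite(cost) or cost < 0:
            raise BudgetExceeded("no usable cost estimate")

        # Layer 1: a single call may not exceed its own cap.
        if cost > self.per_call_usd:
            raise BudgetExceeded("per-call cap exceeded")
        # Layer 2: the running session total may not exceed the session budget.
        if self.session_spent + cost > self.per_session_usd:
            raise BudgetExceeded("session budget exceeded")
        # Layer 3: the running daily total may not exceed the daily ceiling.
        if self.day_spent + cost > self.per_day_usd:
            raise BudgetExceeded("daily ceiling exceeded")

        # Record spend only after all three layers have passed.
        self.session_spent += cost
        self.day_spent += cost
        return cost
```

The design choice worth noting: spend is recorded only after every check passes, and anything the gate cannot evaluate raises. A loop with a bug hits `BudgetExceeded` and stops instead of continuing to bill.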

How this fits next to existing tools

Plenty of good work exists in this space. The baselines are called out below so the contribution here is precise: Agent OS is the combination, not any one piece.

GitHub Copilot Chat memory
tiered memory · per workspace

The actual host platform behind the reference implementation. Provides the memory tiers Agent OS leans on. What's missing out of the box: durable cold storage, cost gates, the discipline conventions, the local-model distillation.

Cursor · .cursorrules
project-scoped instructions

Solves preference persistence and routing. No durable journal, no audit trail, no cold-start protocol, no cost gates. Sits at the same tier as the routing rules in /memories/.

Cline / Roo Code
custom-instructions + agentic loops

Strong on agentic tool use. Memory is per-conversation. Same gap as default Copilot: no durable layer between conversations, no cost gate, no immutable audit.

Aider · .aider.chat.history
conversation log + git commits

Audit-as-git is excellent — every change is a commit. Closest existing analog to the audit-trail pillar. Doesn't have the brain, the vault, or the cost gates.
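For contrast with commit-level audit, the one-row-per-session idea behind the audit-trail pillar might look like the sketch below. It assumes a tab-separated text file; the column set (timestamp, session id, cost, summary) and the function names are hypothetical, since the page does not specify the row format.

```python
from datetime import datetime, timezone


def audit_row(session_id, summary, cost_usd, now=None):
    """Build one tab-separated line describing a finished session."""
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    # Collapse tabs and newlines in free text so a session never spans two rows.
    clean = " ".join(str(summary).split())
    return f"{stamp}\t{session_id}\t{cost_usd:.2f}\t{clean}"


def append_row(path, row):
    """Append the row to the log; existing rows are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(row + "\n")
```

Because each session is exactly one line of plain text, a plain `grep` for a session id or a date prefix is enough to query the log.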

Continue.dev
configurable IDE assistant

Configurable model routing and context providers. Comparable to the routing layer. The memory and audit story is lighter; no cold-storage tier.

LangChain / LlamaIndex memory
programmatic memory abstractions

Library-level building blocks for memory. You'd assemble something like the memory + brain pillars on top of these. They don't speak to discipline conventions or cost gates; that's outside their scope.

AutoGen · CrewAI · smol-agents
multi-agent orchestration

Solves how agents coordinate during a session. Agent OS is concerned with what survives between sessions. Complementary, not competing: wire any of these on top of the brain layer for multi-agent work.

What's distinctive here: the tiered-memory + WORM-vault + fail-closed cost gate + local-model distiller + networked brain — assembled and disciplined as one architecture, with conventions that keep the agent honest. Each individual pillar has prior art. The integration is the point.

A note on which AI you're driving

The reference implementation was field-tested with Claude (Sonnet / Opus class) running inside GitHub Copilot Chat in VS Code. That combination shaped some of the conventions:

The /memories/ tier scopes (user / session / repo) match Copilot Chat's memory-tool surface. Other hosts expose memory differently — the pattern still applies, the file paths and load triggers may not.
MCP tool calling, the specific way read_file / grep_search / run_in_terminal compose, and the brain's HTTP/SSE bridge were tuned against Claude's tool-use behavior. Other models route through tools with different latency, different parallelism, and different willingness to read large context.
The cold-start protocol assumes the agent will actually read a 3KB brief instead of paraphrasing it. That's been more reliable with Claude than with smaller or earlier-generation models in the same harness.
Branding stamps, audit-trail discipline, and the fix-at-root reflex are conventions encoded in instruction files. They depend on the host honoring instruction files at all — which varies a lot across editors, models, and product versions.
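The cold-start step in the list above can be made mechanical on the host side, whatever the model does with the text afterwards. A minimal sketch, assuming the brief is a single UTF-8 file and that the 3KB figure is enforced as a hard cap; the function name and the choice to raise rather than truncate are assumptions, not taken from the reference scripts.

```python
from pathlib import Path

# The 3KB figure comes from the text; treating it as a hard limit is an assumption.
MAX_BRIEF_BYTES = 3 * 1024


def load_brief(path):
    """Return the brief text, or raise if it is missing or over budget."""
    data = Path(path).read_bytes()
    if len(data) > MAX_BRIEF_BYTES:
        # Refuse instead of truncating: a silently clipped brief is lossy by surprise.
        raise ValueError(
            f"brief is {len(data)} bytes; limit is {MAX_BRIEF_BYTES}"
        )
    return data.decode("utf-8")
```

A size check like this only guarantees the brief stays small enough to be read whole. Whether a given model actually reads it instead of paraphrasing it is the part that still needs retuning per host.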

Run the same architecture against a different agent and expect to retune. The pattern is portable; the specific cadence isn't. Treat anything in the reference scripts as a starting point, not a contract.