Hardware reality · Recall™ Agent OS

By the numbers

<$1

Azure Blob storage

7 yr

WORM retention

~3 KB

Cold-start brief

Independent fail-safes

Profile-by-profile breakdown

Everything in Agent OS besides local-model distillation (memory tiers, vault, guardrails, audit-trail, continuity discipline) works on any machine that runs your editor. The table below is specifically about the distillation pillar.

Profile	Local model	Distill speed	Verdict
Workstation, 16 GB+ VRAM RTX 4080/5080/5090, M3/M4 Max, similar	14B coder model, full quality	~3–15 sec / summary	Reference-implementation experience.
Mid-range laptop, 8 GB VRAM RTX 3060/4060, M2 Pro, etc.	7B coder model, q4 quantized	~10–30 sec / summary	Workable. Summaries shorter, slightly noisier.
Integrated graphics no discrete GPU	3B model on CPU, or hosted small model	~30–90 sec / summary	Slow but functional. Or skip local entirely and use a cheap hosted model — the spend gate keeps it bounded.
Phone-class compute	—	—	Not the target. The pattern needs some available compute for distillation.

The five non-compute pillars don't care about the GPU. If you can run an editor and reach object storage, you can run Memory + Vault + Guardrails + Brain backup + Continuity unchanged. Local compute is the cherry on top, not the load-bearing wall.

Cost expectation

Reference implementation, on the documented setup, has run for months at well under $1/month in cloud spend. The line items:

Object storage (Cool tier): $0.01/GB/month for the active draft container. ~200 GB ≈ $2/mo nominal, but Archive lifecycle drops most of it to $0.00099/GB/month within 30 days.
Vector brain (cloud-hosted): a small container app instance, ~$0.30/mo for hobbyist usage. Network egress dwarfs compute; tens of millions of recall queries fit comfortably in the free tier of most provider networks.
Local compute: $0.00. The local model never bills.
Egress / API calls: negligible. The agent's expensive token stream is paid by the host editor, not Agent OS.

The reference spend-gate script ships with a $25/month ceiling as the default circuit breaker. It's not a target spend or a recommendation — it's a number that makes a runaway script hit a wall instead of a billing email. Replace it with whatever number you want; the gate just needs to fail-closed before it touches money.