By the numbers

<$1
Azure Blob storage
7 yr
WORM retention
~3 KB
Cold-start brief
3
Independent fail-safes

Profile-by-profile breakdown

Everything in Agent OS besides local-model distillation (memory tiers, vault, guardrails, audit-trail, continuity discipline) works on any machine that runs your editor. The table below is specifically about the distillation pillar.

ProfileLocal modelDistill speedVerdict
Workstation, 16 GB+ VRAM
RTX 4080/5080/5090, M3/M4 Max, similar
14B coder model, full quality ~3–15 sec / summary Reference-implementation experience.
Mid-range laptop, 8 GB VRAM
RTX 3060/4060, M2 Pro, etc.
7B coder model, q4 quantized ~10–30 sec / summary Workable. Summaries shorter, slightly noisier.
Integrated graphics
no discrete GPU
3B model on CPU, or hosted small model ~30–90 sec / summary Slow but functional. Or skip local entirely and use a cheap hosted model — the spend gate keeps it bounded.
Phone-class compute Not the target. The pattern needs some available compute for distillation.

The five non-compute pillars don't care about the GPU. If you can run an editor and reach object storage, you can run Memory + Vault + Guardrails + Brain backup + Continuity unchanged. Local compute is the cherry on top, not the load-bearing wall.

Cost expectation

Reference implementation, on the documented setup, has run for months at well under $1/month in cloud spend. The line items:

The reference spend-gate script ships with a $25/month ceiling as the default circuit breaker. It's not a target spend or a recommendation — it's a number that makes a runaway script hit a wall instead of a billing email. Replace it with whatever number you want; the gate just needs to fail-closed before it touches money.