The dumb, tireless driver — Box 2 of The Machine (the deterministic control plane), the piece that was missing for code work. It is the deliberately-stupid loop that keeps work moving without a human pressing the button. It spends zero model tokens; the model only runs inside the worker step.
Part of Frontier Infra — sibling to AVL (view), AAR (proof), and Conductor (the ops driver). machine-driver is the code-work driver: the same dumb-loop shape as Conductor — checkpoint/resume, per-task isolation, verify-by-result — pointed at repos instead of a help-desk queue.
Conformance to the six-box spec: see `CONFORMANCE.md` · contributing/agents: `AGENTS.md`.
- State: `goal.json` on disk. The goal never lives only in a context window.
- Driver: the `while` loop. Deterministic, no judgment, runs for days.
- Worker: one fresh process per task (`worker_cmd`). The only place judgment is spent.
- Verify: the `verify_cmd` exit code is ground truth. Pass → done. Fail → re-queue (fresh attempt) → block and surface after `max_attempts`. It cannot silently skip.
- Autonomy: `mode` is `"propose"` (leave the diff for review) or `"commit"` (proceed). Fail-closed to propose.
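The five pieces above can be sketched as one small loop. This is a minimal illustration, not the contents of `driver.py`: the function name `run_goal`, the `status` and `attempts` task fields, the default of three attempts, the `save` checkpoint hook, and the choice to continue to the next task after a block are all assumptions made for the example. The commit/propose step itself is left out; the sketch only shows the mode failing closed.

```python
import shlex
import subprocess


def run_goal(goal, run=subprocess.run, save=lambda goal: None):
    """Drive each task to 'done' or 'blocked'. No judgment lives here."""
    # Autonomy: anything other than a known mode fails closed to "propose".
    mode = goal.get("mode") if goal.get("mode") in ("propose", "commit") else "propose"
    max_attempts = goal.get("max_attempts", 3)  # default value is an assumption
    verify_argv = shlex.split(goal["verify_cmd"])

    for task in goal["tasks"]:
        # Hypothetical bookkeeping fields; tasks already 'done' are skipped on resume.
        task.setdefault("status", "queued")
        task.setdefault("attempts", 0)
        while task["status"] == "queued":
            task["attempts"] += 1
            # Worker: a fresh process per attempt; {task} is filled per argument,
            # so the task text never passes through a shell.
            worker_argv = [arg.replace("{task}", task["goal"])
                           for arg in shlex.split(goal["worker_cmd"])]
            run(worker_argv, cwd=goal["repo"])
            # Verify: the exit code alone decides the outcome.
            passed = run(verify_argv, cwd=goal["repo"]).returncode == 0
            if passed:
                task["status"] = "done"
            elif task["attempts"] >= max_attempts:
                task["status"] = "blocked"  # surfaced, never silently skipped
            save(goal)  # State: checkpoint after every transition
    return mode
```

Passing `run` and `save` as parameters keeps the loop testable without spawning real processes; a real driver would write `goal.json` in `save`.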
- Copy `goal.example.json` → `goal.json`. Point `repo` at a real repo.
- Set `worker_cmd` to your coding agent, e.g. `claude -p "{task}" --permission-mode acceptEdits`.
- Set `verify_cmd` to your ground truth, e.g. `npm test` (or the goal-contract gate, or `cargo test`, etc.).
- Break the goal into a few small `tasks` (each one a fresh-context burst).
- `python3 driver.py goal.json`. Watch it take a step, get verified, take the next.
Start in "propose". Turn the dial to "commit" one goal at a time, as the verifier earns your trust. That is Part 5 — a setting, not a rebuild.
goal.json now carries two extra blocks, both deterministic, neither adds a model token:
- `contract` (Box 0): `definition_of_done`, `acceptance_tests`, `immutable`, `autonomy_ceiling`, `proposed_by`, `ratified_by`. The driver refuses to commit unless the contract is independently ratified (`ratified_by != proposed_by`). The Council is the natural ratifier of the target; ground-truth tests stay the verifier. `autonomy_ceiling`: `0` = propose, `1+` = commit-allowed.
- `budget` (Box 5 governor): `max_worker_runs` / `max_wall_seconds`. Breach → HALT + operator alert. This is the control the 131-duplicate incident lacked.
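Both gates reduce to a few comparisons, which is why they cost no model tokens. The sketch below is one plausible reading of the rules above, not the driver's actual code: the function names `effective_mode` and `budget_halt_reason` are hypothetical, and treating "limit reached" (rather than "limit exceeded") as a breach is an assumption.

```python
import time


def effective_mode(goal):
    """Return "commit" only when every gate passes; otherwise "propose"."""
    contract = goal.get("contract") or {}
    ratified_by = contract.get("ratified_by")
    # Independent ratification: a ratifier exists and is not the proposer.
    independently_ratified = bool(ratified_by) and ratified_by != contract.get("proposed_by")
    # autonomy_ceiling: 0 = propose, 1+ = commit-allowed (missing counts as 0).
    ceiling_allows_commit = contract.get("autonomy_ceiling", 0) >= 1
    if goal.get("mode") == "commit" and independently_ratified and ceiling_allows_commit:
        return "commit"
    return "propose"  # fail closed


def budget_halt_reason(budget, worker_runs, started_at, now=None):
    """Return the name of the breached limit, or None if the run may continue."""
    now = time.time() if now is None else now
    max_runs = budget.get("max_worker_runs")
    max_wall = budget.get("max_wall_seconds")
    # Assumption: reaching a limit counts as a breach (checked before each dispatch).
    if max_runs is not None and worker_runs >= max_runs:
        return "max_worker_runs"
    if max_wall is not None and now - started_at >= max_wall:
        return "max_wall_seconds"
    return None
```

A driver would call `budget_halt_reason` before each dispatch and, on a non-`None` result, halt and alert the operator instead of starting another worker.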
Every transition appends a hash-chained receipt to aar.jsonl — this is the driver-loop audit trail (a tamper-evident sequence of dispatch/requeue/halt), not yet a canonical signed AAR; see Iteration 2. Block / quarantine / budget-halt fire a Telegram alert (TELEGRAM_BOT_TOKEN / TELEGRAM_CHAT_ID), never a silent stdout line.
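To make "hash-chained" concrete, here is a minimal sketch of one way such a receipt log can work. The field names (`prev`, `event`, `hash`), the all-zero genesis value, and the JSON serialization are assumptions for illustration; the real `aar.jsonl` layout may differ. Each receipt's hash covers the previous receipt's hash, so editing or dropping an earlier line breaks every later link.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty log


def make_receipt(prev_hash, event):
    """Bind an event to the receipt before it by hashing both together."""
    body = {"prev": prev_hash, "event": event}
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {**body, "hash": hashlib.sha256(encoded).hexdigest()}


def append_receipt(path, event):
    """Append one receipt to a JSONL file, chaining it to the last line."""
    prev_hash = GENESIS
    try:
        with open(path, encoding="utf-8") as fh:
            lines = [line for line in fh if line.strip()]
        if lines:
            prev_hash = json.loads(lines[-1])["hash"]
    except FileNotFoundError:
        pass  # first receipt in a new log
    receipt = make_receipt(prev_hash, event)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(receipt, sort_keys=True) + "\n")
    return receipt


def chain_is_intact(receipts):
    """Recompute every link; any edited or reordered receipt fails the check."""
    prev_hash = GENESIS
    for receipt in receipts:
        expected = make_receipt(receipt["prev"], receipt["event"])["hash"]
        if receipt["prev"] != prev_hash or receipt["hash"] != expected:
            return False
        prev_hash = receipt["hash"]
    return True
```

A chain like this is tamper-evident only relative to its latest hash; it carries no signature, which is why the text treats it as loop telemetry and not as the proof layer.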
The hash-chain in aar.jsonl is loop telemetry, not the proof layer. The real proof is
the org's own standard: a per-task, Ed25519-signed Agent Attestation Record
(../agentcontrolplane, agentscontrolplane.org). Because AAR is our standard, the driver
must emit it — that's the conformance proof, not a nicety.
Two artifacts, split cleanly:
- `driver-log.jsonl`: rename of `aar.jsonl`; the hash-chained loop audit (dispatch / requeue / halt). Stays.
- `aar/<task-id>.json`: NEW. One canonical signed AAR per verified/contradicted task.
The verify step already produces L2-shaped material — verifier ≠ worker for free:
| AAR field | from the driver |
|---|---|
| `aar` | `"0.02"` |
| `subject` | the worker: `did:web:<org>:machine-driver` |
| `principal` | the signing org: `did:web:<org>` (= `sig.by`) |
| `task` | `{ "id": task.id, "claim": task.goal }` |
| `verdict` | `"verified"` (verify exit 0) · `"rejected"` (non-zero) |
| `ground_truth` | `"confirmed"` (exit 0) · `"contradicted"` (non-zero) |
| `reason` | one line, e.g. `"verify_cmd exited 0 against repo HEAD"` |
| `checks` | `[{ source: repo, query: verify_cmd, observed_at: now, response_sha256: sha256(verify_output), excerpt: tail }]` |
| `verifier` | `{ id: did:web:<org>:<verifier>, independence: "same_principal" }`; `id != subject` ⇒ L2 |
| `issued` | `now()` |
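The mapping in the table can be sketched as a single pure function. This is an illustration built only from the table: the function name `build_aar`, its parameters, the ISO 8601 timestamp format, and the five-line excerpt length are assumptions, and the record is unsigned (the `sig` field is added later by the signer).

```python
import hashlib
from datetime import datetime, timezone


def build_aar(task, repo, verify_cmd, exit_code, verify_output,
              subject, principal, verifier_id, now=None):
    """Shape an unsigned per-task record from one verify result."""
    # Timestamp format is an assumption; the table only says "now".
    now = now or datetime.now(timezone.utc).isoformat()
    passed = exit_code == 0
    # Excerpt: the tail of the verifier output (length chosen for illustration).
    tail = "\n".join(verify_output.splitlines()[-5:])
    return {
        "aar": "0.02",
        "subject": subject,
        "principal": principal,
        "task": {"id": task["id"], "claim": task["goal"]},
        "verdict": "verified" if passed else "rejected",
        "ground_truth": "confirmed" if passed else "contradicted",
        "reason": f"verify_cmd exited {exit_code} against repo HEAD",
        "checks": [{
            "source": repo,
            "query": verify_cmd,
            "observed_at": now,
            "response_sha256": hashlib.sha256(verify_output.encode("utf-8")).hexdigest(),
            "excerpt": tail,
        }],
        # id != subject is what makes the verifier independent of the worker.
        "verifier": {"id": verifier_id, "independence": "same_principal"},
        "issued": now,
    }
```

Keeping the builder free of I/O means the same verify result always yields the same record for a fixed `now`, which makes it easy to test before any key is involved.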
Integration (driver stays pure-Python; shell out to our own signer):
- Build the record from the verify result → write `aar/<task-id>.json`.
- `node ../agentcontrolplane/tools/aar.mjs sign aar/<task-id>.json --priv <key>` → adds `sig` (Ed25519, JCS-canonical, `sig.by = principal`).
- self-test: `node ../agentcontrolplane/tools/aar.mjs verify aar/<task-id>.json` → expect `→ conformance: L2`.
New goal.json keys: aar_tool (path to aar.mjs), aar_priv (key path), subject, principal. If absent ⇒ skip AAR (still write driver-log.jsonl), so the driver runs keyless.
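The keyless fallback and the shell-out can be sketched together. The `sign` and `verify` invocations mirror the commands quoted above; everything else is an assumption for illustration, including the function name `sign_and_check`, the rule that all four keys must be present, and matching the substring `conformance: L2` in the tool's standard output.

```python
import subprocess


def sign_and_check(goal, record_path, run=subprocess.run):
    """Sign one AAR file and self-test it; return None when running keyless."""
    tool, priv = goal.get("aar_tool"), goal.get("aar_priv")
    # Assumption: any missing key means "skip AAR"; driver-log.jsonl is still written.
    if not (tool and priv and goal.get("subject") and goal.get("principal")):
        return None
    # Argument lists (no shell) keep paths with spaces or metacharacters safe.
    run(["node", tool, "sign", record_path, "--priv", priv], check=True)
    result = run(["node", tool, "verify", record_path],
                 check=True, capture_output=True, text=True)
    # Assumption: the verifier prints its conformance level on stdout.
    return "conformance: L2" in result.stdout
```

With `check=True`, a non-zero exit from either command raises `subprocess.CalledProcessError`, so a signing failure cannot pass silently.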
The one human-gated input (irreducibly yours): the signing identity + key.
Generate it with `node ../agentcontrolplane/tools/aar.mjs keygen --did did:web:titaniumcomputing.com:machine-driver --out-priv secrets/machine-driver.jwk.json --out-did <domain>/.well-known/did.json`, then publish `did.json`. The worker is replaceable; the principal is not.
Conformance target: L2 now (signed + independent verifier + committed evidence); L3 (transparency log) later. Mirror how Conductor already emits AAR so the two siblings match.
Iteration-1 hardened + smoke-tested (happy path emits AAR · governor halts + alerts · no-verifier ⇒ propose-only). To reach Level 3–4 conformance next:
- canonical signed AAR: see Iteration 2 above; shape per-task records and sign via our own `aar.mjs` → L2;
- Council = Box-0 ratifier: wire `roundtable.sh` to draft/ratify the contract before the first real run;
- idempotency enforcement against a real board/gateway (reject the duplicate ACTIVE row, not just record the key);
- feed `tasks` from GitHub Issues / the ArgentOS board instead of a hand-written list;
- keep task selection dumb (rules table only if ordering ever needs it; never a model in the loop).