
feat(agentbox.yaml): idempotent tasks + replacement engine (render/carry) #80

Closed
madarco wants to merge 12 commits into main from streamline-agentbox-yaml


madarco (Owner) commented Jun 7, 2026

What

Streamlines the common agentbox.yaml patterns so projects stop hand-rolling marker guards and sed commands. Three declarative primitives:

1. idempotent: tasks

Tasks re-run on every box start, so they must be idempotent. Now declarative:

  • idempotent: true — supervisor stores a marker keyed by a hash of the resolved command (/var/lib/agentbox/tasks/<name>, on rootfs, captured by checkpoints, off /workspace). Editing the command re-runs it.
  • idempotent: { check: <cmd> } — probe-first; exit 0 = skip, no marker. The right guard for state outside the checkpoint (e.g. a containerized DB), where a filesystem marker would desync.
  • run-task --force bypasses both.
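The marker gate described above can be sketched roughly as follows. This is a minimal illustration, not the supervisor's actual code: the helper names (`commandHash`, `markerMatches`, `writeMarker`, `shouldRun`), the SHA-256 choice, and the marker file layout are all assumptions, and it writes to a throwaway temp directory rather than /var/lib/agentbox/tasks.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: hash the resolved command so that editing it invalidates the marker.
function commandHash(command: string): string {
  return createHash("sha256").update(command).digest("hex");
}

// True when a marker for this task exists and stores the hash of the current command.
function markerMatches(markerDir: string, taskName: string, command: string): boolean {
  const file = join(markerDir, taskName);
  return existsSync(file) && readFileSync(file, "utf8").trim() === commandHash(command);
}

// Record a successful run (assumed layout: one file per task, containing the hash).
function writeMarker(markerDir: string, taskName: string, command: string): void {
  mkdirSync(markerDir, { recursive: true });
  writeFileSync(join(markerDir, taskName), commandHash(command) + "\n");
}

// A task runs when forced, or when no matching marker is found.
function shouldRun(markerDir: string, taskName: string, command: string, force = false): boolean {
  return force || !markerMatches(markerDir, taskName, command);
}

// Demo against a temp directory standing in for the real state dir.
const demoDir = mkdtempSync(join(tmpdir(), "agentbox-markers-"));
const firstBoot = shouldRun(demoDir, "install", "pnpm install");
writeMarker(demoDir, "install", "pnpm install");
const warmBoot = shouldRun(demoDir, "install", "pnpm install");
const afterEdit = shouldRun(demoDir, "install", "pnpm install --frozen-lockfile");
const forced = shouldRun(demoDir, "install", "pnpm install", true);
console.log({ firstBoot, warmBoot, afterEdit, forced });
```

Keying the marker on the command hash rather than on the task name alone is what makes "editing the command re-runs it" fall out for free.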

2. Replacement engine (render + carry)

A declarative sed alternative, pure engine in @agentbox/core (re-exported by @agentbox/ctl):

  • agentbox-ctl render <src> — in-box: --env (whitelist {{AGENTBOX_*}} placeholders), --rules, --rule, --rule-regex, --out/--in-place.
  • carry replaceEnvs / replace / rules — host-side rendering of carried files to a temp before copy (file-only; original host file untouched).
  • top-level replacements: — reusable named rule-sets referenced by both.

Placeholders are a fixed whitelist (predictable; secrets never substitutable); arbitrary substitutions go through explicit rules.
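A rough sketch of the two-stage idea (explicit rules, then whitelisted placeholders) is shown here. It is not the engine in @agentbox/core: the `Rule` shape, the function names, the two-entry whitelist, and the rules-before-placeholders ordering are assumptions made for illustration.

```typescript
// Assumed rule shape: a literal string or a RegExp, replaced by a fixed string.
type Rule = { from: string | RegExp; to: string };

// Assumed whitelist: only these names are ever substituted.
const ALLOWED_PLACEHOLDERS = new Set<string>(["AGENTBOX_BOX_HOST", "AGENTBOX_BOX_NAME"]);

// Apply rules in order. String rules replace every occurrence; RegExp rules
// follow their own flags (add "g" to replace all matches).
function applyRules(text: string, rules: Rule[]): string {
  let out = text;
  for (const rule of rules) {
    out =
      typeof rule.from === "string"
        ? out.split(rule.from).join(rule.to)
        : out.replace(rule.from, () => rule.to);
  }
  return out;
}

// Substitute {{AGENTBOX_*}} tokens that are both whitelisted and present in the
// context; any other token is left untouched, so secrets cannot leak in.
function renderPlaceholders(text: string, context: Record<string, string>): string {
  return text.replace(/\{\{(AGENTBOX_[A-Z_]+)\}\}/g, (whole: string, key: string) => {
    const value = context[key];
    return ALLOWED_PLACEHOLDERS.has(key) && value !== undefined ? value : whole;
  });
}

function renderSketch(text: string, rules: Rule[], context: Record<string, string>): string {
  return renderPlaceholders(applyRules(text, rules), context);
}

// Demo: a rule rewrites a hard-coded host into a placeholder, which is then resolved.
const envTemplate = "APP_URL=http://optima.localhost:3000\nOTHER={{AGENTBOX_NOT_WHITELISTED}}\n";
const renderedEnv = renderSketch(
  envTemplate,
  [{ from: /optima\.localhost/g, to: "{{AGENTBOX_BOX_HOST}}" }],
  { AGENTBOX_BOX_HOST: "demo-box.example.test" },
);
console.log(renderedEnv);
```

Running rules before the placeholder pass lets a rule emit a placeholder, while unknown `{{AGENTBOX_*}}` names stay literal instead of silently expanding to an empty string.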

Verified end-to-end on a real project (Evinto/optima)

Rewrote optima's agentbox.yaml to use all three (install → idempotent: true, env → render --rules, restore/seed → idempotent: { check }) and ran a full box:

  • .env rendered with BETTER_AUTH_URL/NEXT_PUBLIC_APP_URL pinned to the box host + generated secret ✓
  • install marker written; stop/start → idempotent: marker matches — skip
  • seed → idempotent: check passed (skip); restore re-runs (no schema) ✓
  • postgres + Next.js dev reach ready ✓

The e2e surfaced (and this PR fixes) a real bug: the marker dir /var/lib/agentbox was root-owned but the daemon runs as vscode (EACCES → silent re-run). Fixed by baking+chowning /var/lib/agentbox on all providers (docker/hetzner/vercel/e2b) plus a writable <logDir>/state fallback (excluded from checkpoint-cleanup truncation).
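The fallback can be pictured with a sketch like the one below. It assumes the resolver simply tries to create the preferred directory and falls back to `<logDir>/state` on any failure; `resolveStateDirSketch` is a hypothetical name, not the function shipped in the PR. The demo forces a failure by using a regular file as the parent of the preferred path.

```typescript
import { accessSync, constants, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical resolver: prefer the configured dir, fall back to <logDir>/state.
function resolveStateDirSketch(preferred: string, logDir: string): string {
  try {
    mkdirSync(preferred, { recursive: true });
    accessSync(preferred, constants.W_OK); // created but read-only also counts as unusable
    return preferred;
  } catch {
    const fallback = join(logDir, "state");
    mkdirSync(fallback, { recursive: true });
    return fallback;
  }
}

// Demo in a scratch directory: a regular file blocks creation of the preferred path.
const scratch = mkdtempSync(join(tmpdir(), "agentbox-state-"));
const blocker = join(scratch, "not-a-dir");
writeFileSync(blocker, "");

const fellBack = resolveStateDirSketch(join(blocker, "agentbox"), join(scratch, "log"));
const usedPreferred = resolveStateDirSketch(join(scratch, "var-lib-agentbox"), join(scratch, "log"));
console.log({ fellBack, usedPreferred });
```

Resolving the directory once at init, rather than catching write errors per marker, is what turns a silent re-run into a deterministic choice of path.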

Docs / skills

agentbox-yaml.mdx, in-box-supervisor.md, features.md, JSON schema + drift fixtures, and the agentbox-setup skill (now teaches the declarative fields over manual markers/sed) all updated in-PR.

Tests

New unit coverage for the engine, idempotent (marker/check/force/fallback), carry parse/resolve/render. Full suites green (core 20, ctl 221, sandbox-core 34, cli 489).


Note

Medium Risk
Changes supervisor task re-run behavior and host-side carry rendering at box create; misconfigured idempotent checks could skip needed seeds or re-run heavy installs, but scope is confined to box boot and file copy paths.

Overview
Adds declarative idempotent tasks and a shared text-replacement engine so agentbox.yaml can skip hand-rolled marker files and sed.

Tasks: idempotent: true stores a command-hash marker under /var/lib/agentbox/tasks/<name> (rootfs, checkpointed, not under /workspace); idempotent: { check: … } probes first and skips on exit 0 with no marker (for state outside checkpoints, e.g. containerized DB seeds). The supervisor implements the gate in TaskRunner; run-task --force bypasses it. Images now create/chown /var/lib/agentbox on all providers, with a writable <logDir>/state fallback and checkpoint cleanup that preserves that fallback.

Replacements: New @agentbox/core replace.ts (whitelist {{AGENTBOX_*}} + ordered rules), top-level replacements: in config/schema, in-box agentbox-ctl render, and carry file options replaceEnvs / replace / rules rendered host-side via renderCarryEntries before docker/cloud copy. The carry gate/resolver loads replacements and expands named rule refs.

Docs/skills (agentbox-setup, agentbox-yaml.mdx, features, supervisor) and tests/schema drift are updated accordingly. Removes an obsolete Claude-memory seed plan; adds an unrelated future user-defined shims design doc only.

Reviewed by Cursor Bugbot for commit 5b30be4. Configure here.

madarco added 5 commits June 7, 2026 09:48
…I + carry)

- tasks: idempotent: true (command-hash marker) | { check } (probe)
- shared pure replacement engine in @agentbox/core (env placeholders + rules)
- agentbox-ctl render CLI (declarative sed alternative)
- carry: replaceEnvs / replace / rules (host-side, file-only)
- top-level replacements: reusable named rule-sets
- schema + drift fixtures + unit tests
- agentbox-yaml.mdx: idempotent field, replacements section, carry replace fields, placeholder table
- in-box-supervisor.md + features.md: implementation notes
- agentbox-setup skill: teach idempotent:/render/replaceEnvs over manual markers + sed
- agentbox-info skill: one-line pointer to the new declarative fields

The \.optima\.localhost (leading-dot) regex wouldn't match a bare optima.localhost; use optima\.localhost -> {{AGENTBOX_BOX_HOST}}.

E2E on a real box surfaced EACCES writing /var/lib/agentbox (root-owned, daemon runs as vscode), so idempotent: true silently re-ran every boot.

- supervisor: resolve a writable stateDir at init, falling back to <logDir>/state
  (always daemon-writable, on rootfs/checkpointed, off /workspace) when the
  configured dir isn't creatable — works on every provider without an image bake
- Dockerfile.box: mkdir+chown /var/lib/agentbox to vscode so docker uses the
  clean default path
- test: fallback path covered
- bake /var/lib/agentbox on hetzner/vercel/e2b base images (was docker-only),
  so idempotent markers use the clean path on every provider, not the fallback
- checkpoint-cleanup: exclude /var/log/agentbox/state from truncation so the
  marker fallback survives a checkpoint
- core: single deriveBoxHost() shared by both placeholder-context builders
- carry-render: dedup the wantsRender predicate; build the log suffix with join
- carry.ts: fold parseRulesRefs/parseExclude into one parseStringList
- carry-resolve: drop redundant replaceFields guard
- carry-gate: read agentbox.yaml once, parse carry + replacements from one text
vercel Bot commented Jun 7, 2026

The latest updates on your projects.

Project: agentbox-web | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Jun 7, 2026 2:57pm


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Comment thread packages/ctl/src/supervisor.ts Outdated
const idem = this.spec.idempotent;
if (!idem) return null;
if (idem.kind === 'check') {
  return (await this.runCheck(idem.command, cwd)) ? 'check passed' : null;

Check probes ignore placeholders

Medium Severity

idempotent: { check: … } runs the probe string verbatim via bash -c, but the PR’s docs and skills show checks like grep -q '{{AGENTBOX_BOX_HOST}}' …. Those {{AGENTBOX_*}} tokens are only expanded by agentbox-ctl render / carry rendering, not by the supervisor, so the probe searches for literal braces and never exits 0 after a successful render—warm boots keep re-running the task instead of skipping.


Owner Author

Good catch — the supervisor runs the check verbatim via bash -c and never expands {{…}} (render-only). Fixed in 1749f78: dropped the unnecessary check from the naturally-idempotent render example, and documented that check probes are plain shell (use $AGENTBOX_BOX_NAME). The optima e2e config was unaffected (its checks use plain psql).
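To make the failure mode concrete, here is a small sketch of a probe run verbatim through `bash -c`. It assumes `bash` and `grep` are on the PATH; `checkPasses` and the `ENV_FILE` variable are hypothetical stand-ins, and only `$AGENTBOX_BOX_NAME` is a name taken from the thread.

```typescript
import { spawnSync } from "node:child_process";
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Run the probe exactly as written; exit code 0 means "already done, skip the task".
function checkPasses(command: string, extraEnv: Record<string, string>): boolean {
  const probe = spawnSync("bash", ["-c", command], { env: { ...process.env, ...extraEnv } });
  return probe.status === 0;
}

// A file that has already been rendered: the placeholder is gone, the real name is in.
const probeDir = mkdtempSync(join(tmpdir(), "agentbox-check-"));
const renderedFile = join(probeDir, ".env");
writeFileSync(renderedFile, "APP_URL=http://demo-box.example.test\n");

const probeEnv = { AGENTBOX_BOX_NAME: "demo-box", ENV_FILE: renderedFile };

// Braces are not expanded by the shell, so this greps for the literal token and fails.
const bracesProbe = checkPasses(`grep -q '{{AGENTBOX_BOX_HOST}}' "$ENV_FILE"`, probeEnv);

// A plain shell variable is expanded by bash, so this probe succeeds.
const shellVarProbe = checkPasses(`grep -q "$AGENTBOX_BOX_NAME" "$ENV_FILE"`, probeEnv);

console.log({ bracesProbe, shellVarProbe });
```

With the brace form the probe never exits 0 after a successful render, which matches the reported symptom of the task re-running on every warm boot.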

Cursor Bugbot: example checks used grep -q '{{AGENTBOX_BOX_HOST}}', but the supervisor runs the probe verbatim via bash -c and never expands {{...}} (render-only), so it matched literal braces and re-ran every boot. Drop the unnecessary check from the naturally-idempotent render example, and document that check probes use shell vars ($AGENTBOX_BOX_NAME).

The stateDir-fallback test used /proc/nope as the unwritable path; mkdir under /proc behaves differently on Linux (slow) than macOS and timed out in CI. Use a regular file as the parent dir (deterministic ENOTDIR, instant, cross-platform) and add a 20s testTimeout cushion for the two-cycle idempotent tests.
madarco added 2 commits June 7, 2026 12:45
- services: image: form (ports/env/args/container_name) synthesizes the
  docker start-or-run shell; command|image mutually exclusive; reused by name
- render: {{AGENTBOX_AUTO_SECRET}} (fresh per render) and :name (persisted at
  <stateDir>/secrets/<name>, reused) — replaces openssl rand in env tasks
- shared resolveWritableStateDir (state-dir.ts) backs markers + secrets
- schema (oneOf command/image + dependentRequired) + drift fixtures
- unit tests (config image synth/xor, secret per-render vs persisted)
- docs + agentbox-setup skill

So a replacement rule can emit an {{AGENTBOX_AUTO_SECRET}} token that the secret pass then resolves in a single render (e.g. 'your-secret-here=>{{...}}'). Verified e2e: optima's env task renders env.example with a box-host rule + a persisted secret in one pass.
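The per-render versus persisted distinction might look like the sketch below. The token spelling `{{AGENTBOX_AUTO_SECRET:name}}` is inferred from the ":name" wording in the commit message and is an assumption, as are `resolveSecretsSketch`, the 32-byte hex format, and the file mode.

```typescript
import { randomBytes } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical secret pass: an unnamed token gets a fresh value on every render,
// a named token is generated once and reused from <stateDir>/secrets/<name>.
function resolveSecretsSketch(text: string, stateDir: string): string {
  const token = /\{\{AGENTBOX_AUTO_SECRET(?::([A-Za-z0-9_-]+))?\}\}/g;
  return text.replace(token, (_whole: string, secretName?: string) => {
    if (!secretName) return randomBytes(32).toString("hex");
    const dir = join(stateDir, "secrets");
    const file = join(dir, secretName);
    if (existsSync(file)) return readFileSync(file, "utf8").trim();
    const secret = randomBytes(32).toString("hex");
    mkdirSync(dir, { recursive: true });
    writeFileSync(file, secret + "\n", { mode: 0o600 });
    return secret;
  });
}

// Demo: a rule first rewrites the template's dummy value into a named token,
// then the secret pass resolves every token in the same render.
const secretsStateDir = mkdtempSync(join(tmpdir(), "agentbox-secrets-"));
const exampleEnv = "AUTH_SECRET=your-secret-here\nSESSION_KEY={{AGENTBOX_AUTO_SECRET}}\n";

function renderEnvOnce(): string {
  const afterRules = exampleEnv.split("your-secret-here").join("{{AGENTBOX_AUTO_SECRET:auth}}");
  return resolveSecretsSketch(afterRules, secretsStateDir);
}

const firstRender = renderEnvOnce();
const secondRender = renderEnvOnce();
console.log(firstRender.split("\n")[0] === secondRender.split("\n")[0]); // named secret is reused
```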

Clearer, less jargon (AgentBox is unreleased, so a clean rename, no alias). Renames the YAML key, schema property, TS types (RunOnceSpec/parseRunOnce/TaskSpec.runOnce), supervisor skip logic + log lines, docs, skills, and the services-and-tasks guide (now recommends run_once over hand-rolled markers).

The marker write was fire-and-forget, so a task reached 'done' before its marker hit disk: a CI race (slower fs) and a latent durability gap (a crash in between would lose the marker and re-run next boot). Await the write, then setState('done').
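The ordering fix amounts to something like this sketch, where the state only flips after the marker write has resolved. `TaskSketch` and `completeTask` are hypothetical names; the real supervisor's `setState` is represented here by a plain field assignment.

```typescript
import { mkdir, mkdtemp, readFile, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

type TaskSketch = { name: string; state: "running" | "done" };

// Await the marker write first; only then mark the task as done. With a
// fire-and-forget write, an observer could see "done" with no marker on disk.
async function completeTask(task: TaskSketch, markerDir: string, hash: string): Promise<void> {
  await mkdir(markerDir, { recursive: true });
  await writeFile(join(markerDir, task.name), hash + "\n");
  task.state = "done";
}

// Demo: once the task reports "done", the marker is guaranteed to be readable.
async function demoCompletion(): Promise<{ state: string; marker: string }> {
  const dir = await mkdtemp(join(tmpdir(), "agentbox-await-"));
  const task: TaskSketch = { name: "install", state: "running" };
  await completeTask(task, dir, "abc123");
  const marker = (await readFile(join(dir, "install"), "utf8")).trim();
  return { state: task.state, marker };
}

const completion = demoCompletion();
completion.then((outcome) => console.log(outcome));
```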

Per review, group the container config under image: instead of flat sibling keys. image: is now either a bare ref string (image: redis:7) or a mapping { name, ports, env, args, container_name }. Container env moves to image.env (top-level env on an image service is rejected). Schema/tests/docs/skill + optima updated.
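One way to read the new shape is as a string-or-mapping union that gets normalized during parsing. The sketch below is illustrative only: the field types (for example `ports` as a string array), the error messages, and the `validateServiceSketch` name are assumptions, while the field names and the two rejection rules come from the commit messages.

```typescript
// Assumed field types; only the field names are taken from the commit message.
type ImageMapping = {
  name: string;
  ports?: string[];
  env?: Record<string, string>;
  args?: string[];
  container_name?: string;
};
type ImageField = string | ImageMapping;

type ServiceInput = {
  command?: string;
  image?: ImageField;
  env?: Record<string, string>;
};

// A bare ref such as "redis:7" becomes { name: "redis:7" }; a mapping passes through.
function normalizeImage(image: ImageField): ImageMapping {
  return typeof image === "string" ? { name: image } : image;
}

// Returns the normalized image config, or null for a plain command service.
function validateServiceSketch(service: ServiceInput): ImageMapping | null {
  if (service.command !== undefined && service.image !== undefined) {
    throw new Error("command and image are mutually exclusive");
  }
  if (service.image === undefined) return null;
  if (service.env !== undefined) {
    throw new Error("top-level env is rejected on an image service; use image.env");
  }
  return normalizeImage(service.image);
}

const bareRef = validateServiceSketch({ image: "redis:7" });
const mapped = validateServiceSketch({
  image: { name: "postgres:16", ports: ["5432:5432"], env: { POSTGRES_PASSWORD: "dev" } },
});
console.log(bareRef, mapped);
```

Normalizing early means the code that synthesizes the docker start-or-run shell only ever has to handle the mapping form.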

madarco commented Jun 7, 2026

Integrated into nightly via fast-forward (f87ad48..4173323, commits 987bcb4…4173323d6). Closing — the work is on nightly; no separate main merge needed for now.

@madarco madarco closed this Jun 7, 2026