111 changes: 82 additions & 29 deletions apps/cli/share/agentbox-setup/SKILL.md
@@ -46,35 +46,56 @@ Look at `/workspace`:
- **Tasks** = one-shot. `pnpm install`, DB migrations, codegen, fixture loaders, install apt packages. Wire dependent services with `needs:` so they wait for the task to finish successfully.
- Names: must match `[A-Za-z0-9_-]+`. Task names and service names share a namespace — no collisions.
- No cycles in `needs:`.
- **Always generate a dependency-install task** and make it the root of the `needs:` graph (every service that needs deps gets `needs: [install, …]`). Future boxes start from a snapshot of the final filesystem so they won't need this, but updates or moving to a cloud provider might need to rebuild the container from scratch. The filesystem can then be captured later with `agentbox-ctl checkpoint --set-default`. The task must be **idempotent and self-healing**: `agentbox-ctl` re-runs pending tasks on every box stop/start (the daemon dies with the container and is relaunched), so a plain `rm -rf node_modules && install` would wipe + reinstall on every start. Guard the rebuild with a marker file *inside* `node_modules` (the `.agentbox-installed` convention AgentBox uses internally): rebuild only when the marker is absent (fresh box), and be a fast no-op once it exists. Detect the package manager from the lockfile; never hardcode `pnpm`. See the worked example below.
- **Always generate a dependency-install task** and make it the root of the `needs:` graph (every service that needs deps gets `needs: [install, …]`). Future boxes start from a snapshot of the final filesystem so they won't need this, but updates or moving to a cloud provider might need to rebuild the container from scratch. The filesystem can then be captured later with `agentbox-ctl checkpoint --set-default`. The task must be **idempotent**: `agentbox-ctl` re-runs pending tasks on every box stop/start (the daemon dies with the container and is relaunched), so an unguarded install would reinstall on every start. The clean way is the **`run_once: true`** field: the supervisor stores a marker keyed by a hash of the command and skips warm boots automatically (the marker lives at `/var/lib/agentbox/tasks/<name>`, on the box rootfs, captured by checkpoints, never polluting `/workspace`). Editing the command re-runs it. Detect the package manager from the lockfile; never hardcode `pnpm`. See the worked example below.
- **Add a comment to the beginning** of the file explaining what you did and what issues you encountered, so that a future run can use this information if the project evolves and the `agentbox.yaml` file needs updating.
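
The `run_once: true` marker behaviour described above can be sketched roughly as follows. This is an illustration only, not the supervisor's code: the names `commandDigest` and `shouldRunTask`, the choice of SHA-256, and storing the digest as the marker contents are all assumptions; the text only states that the marker is keyed by a hash of the command.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical: the digest stored in the marker file (SHA-256 is an assumption).
function commandDigest(command: string): string {
  return createHash('sha256').update(command, 'utf8').digest('hex');
}

// True when the task should run: there is no marker yet (fresh box), or the
// stored digest no longer matches because the command was edited.
function shouldRunTask(command: string, storedMarker: string | undefined): boolean {
  return storedMarker !== commandDigest(command);
}

const marker = commandDigest('pnpm install --frozen-lockfile');
console.log(shouldRunTask('pnpm install --frozen-lockfile', undefined)); // fresh box
console.log(shouldRunTask('pnpm install --frozen-lockfile', marker)); // warm boot
console.log(shouldRunTask('pnpm install', marker)); // edited command
```

Keying the marker on the command text rather than on the task name is what lets an edited command re-run without any manual cleanup.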

### Stateful services: data persistence & re-seeding (read this for databases)

**Declare a containerized dependency with the `image:` service form** — AgentBox
generates the `docker start`-or-`run` shell (no hand-written `docker run … || docker
start …`). The container runs in the box's dockerd; a published port is reachable
from other in-box services at `127.0.0.1:<host port>`:

```yaml
services:
postgres:
image: # bare string (image: postgres:17-alpine) or a mapping:
name: postgres:17-alpine
ports: ["5432:5432"]
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: app
args: "-c max_connections=200" # string or ["-c","max_connections=200"]
container_name: app_db # optional; default = service name
ready_when: { port: 5432 }
restart: always
```

The container is reused by name across box stop/start. (Changing `image`/`env`
reuses the existing container as-is; `docker rm <container_name>` + `agentbox-ctl
reload` to apply.) Install the DB client the migrate/seed tasks need (e.g.
`postgresql-client`) in the `install` task and reach the DB over TCP — don't
`docker exec` the container (nested exec fails with a `setns` error in a box).

**A checkpoint does NOT capture docker-in-docker data.** `agentbox checkpoint` is a `docker commit` of the box's writable filesystem (the system + `/workspace`). The in-box `dockerd` keeps its storage in a *separate* per-box volume (`/var/lib/docker`), which is **not** part of that image — it's fresh on every new box and wiped on `agentbox destroy`. So a database or cache you run as a **docker container** (e.g. `docker run … postgres`) starts **empty on every new box** created from a checkpoint (every `agentbox claude` / `agentbox create`), even though `/workspace` and any marker files you wrote were restored. (A DB run as a **native process** with its data dir on the box filesystem — e.g. `postgres -D /var/lib/postgresql/data` — *is* captured by the checkpoint, since it lives in the writable layer.)

**Consequence for migrate/seed tasks of a containerized DB: do not gate them on a filesystem marker.** A marker like `node_modules/.agentbox-installed` is correct for deps (they live in `/workspace`, which the checkpoint captures), but **wrong** for DB data living in a docker volume: the marker is restored from the checkpoint while the DB is empty, so a marker-guarded seed wrongly skips and the app boots against an empty database. Instead, **gate on the actual data** — connect to the DB and check whether a sentinel table/row exists, and seed only when it's missing:
**Consequence for migrate/seed tasks of a containerized DB: do NOT use `run_once: true` (the marker form).** A command-hash marker is correct for deps (they live in `/workspace`, which the checkpoint captures), but **wrong** for DB data living in a docker volume: the marker is restored from the checkpoint while the DB is empty, so a marker-guarded seed wrongly skips and the app boots against an empty database. Instead use the **`run_once: { check: <cmd> }`** form — the probe runs first and the seed runs unless the probe exits 0, and **no marker is written** (the DB is the source of truth). Gate on the actual data:

```yaml
seed:
# Re-seed when the DB is empty. The postgres data lives in the in-box
# docker volume, which is NOT captured by `agentbox checkpoint` — so a box
# started from a checkpoint has the workspace warm but an empty DB. We can't
# use a filesystem marker here (it would be restored while the DB is blank);
# instead probe the DB and seed only if the data is absent. Fast no-op once
# Re-seed when the DB is empty. The postgres data lives in the in-box docker
# volume, which is NOT captured by `agentbox checkpoint` — so a box started
# from a checkpoint has the workspace warm but an empty DB. The marker form
# would be restored while the DB is blank and wrongly skip; the `check` probe
# gates on the data itself. Exit 0 = already seeded, skip. Fast no-op once
# the data is present.
command: |
set -e
export PGPASSWORD=postgres
# Probe for existing data. If the table is missing the query errors,
# stderr is suppressed, stdout is empty, the grep fails — so we seed.
if psql -h 127.0.0.1 -p 5432 -U postgres -d app -tAc \
"SELECT EXISTS (SELECT 1 FROM users LIMIT 1)" 2>/dev/null | grep -q t; then
echo "data present — skip seed"
exit 0
fi
pnpm db:seed
command: pnpm db:seed
needs: [install, migrate]
run_once:
check: |
export PGPASSWORD=postgres
psql -h 127.0.0.1 -p 5432 -U postgres -d app -tAc \
"SELECT EXISTS (SELECT 1 FROM users LIMIT 1)" 2>/dev/null | grep -q t
```
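
The difference between the two `run_once` forms can be summarised in a small decision function. This is a sketch under stated assumptions: `RunOnce`, `Decision`, and `decide` are hypothetical names, the probe is injected as a function returning an exit code so the example stays pure, and "write the marker whenever the task runs" simplifies away failure handling.

```typescript
// Hypothetical model of the `run_once:` field: absent, `true`, or `{ check }`.
type RunOnce = true | { check: string } | undefined;

interface Decision {
  run: boolean; // should the task command execute?
  writeMarker: boolean; // should a command-hash marker be stored afterwards?
}

function decide(
  runOnce: RunOnce,
  markerMatches: boolean, // does a stored marker match the current command?
  probe: (cmd: string) => number, // returns the check command's exit code
): Decision {
  // No guard: the task runs on every start.
  if (runOnce === undefined) return { run: true, writeMarker: false };
  // Marker form: skip when the marker already matches.
  if (runOnce === true) return { run: !markerMatches, writeMarker: !markerMatches };
  // Check form: exit 0 means "already done"; a marker is never written.
  return { run: probe(runOnce.check) !== 0, writeMarker: false };
}

// A box restored from a checkpoint: the marker is present but the DB is empty.
console.log(decide(true, true, () => 1)); // marker form skips the seed
console.log(decide({ check: 'psql ...' }, true, () => 1)); // check form seeds
```

The last two lines show the failure mode the text warns about: with a restored marker and an empty database, only the `check` form decides to seed.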

**Lifecycle nuance (this is why the data check, not a marker, is right):**
@@ -148,22 +169,19 @@ tasks:
# Idempotent install. /workspace is the container's writable filesystem, so
# node_modules persists across pause/stop/start and is captured by
# `agentbox checkpoint`. The host's node_modules is macOS-native and is
# never copied in, so force a clean Linux build the first time — but skip
# on every subsequent box start (agentbox-ctl re-runs pending tasks after
# stop/start). Adjust the lockfile detection to the project's package
# manager.
# never copied in, so the first Linux install runs; `run_once: true` then
# skips it on every subsequent box start (the supervisor stores a marker
# keyed by a hash of the command). Adjust the lockfile detection to the
# project's package manager.
install:
command: |
set -e
MARKER=node_modules/.agentbox-installed
[ -f "$MARKER" ] && { echo "deps installed (marker present) — skip"; exit 0; }
apt-get update && apt-get install -y postgresql-client
rm -rf node_modules
sudo apt-get update && sudo apt-get install -y postgresql-client
if [ -f pnpm-lock.yaml ]; then
corepack enable >/dev/null 2>&1 || true
pnpm install --frozen-lockfile || pnpm install
fi
touch "$MARKER"
run_once: true

migrate:
command: pnpm db:migrate
@@ -258,6 +276,41 @@ On Vercel: this actually STOPS the sandbox, so warn the user about it. Also the

- For Next.js/Vite/TanStack projects, make sure to also forward the websocket used for hot reload.

- Services and settings such as Flask, Next.js, BETTER_AUTH_URL, and NEXT_PUBLIC_APP_URL should use the <boxname>.localhost URL for local development, so that the host uses the same URL as the box.
- Services and settings such as Flask, Next.js, BETTER_AUTH_URL, and NEXT_PUBLIC_APP_URL should use the `<boxname>.localhost` URL for local development, so that the host uses the same URL as the box. Render this automatically instead of hand-writing `sed`; see section 6c.

- The `install` task above uses `run_once: true`, so it is a no-op on warm boots. Do **not** wrap it in a manual marker check too. To force a one-off rebuild, run `agentbox-ctl run-task install --force` (which bypasses the run_once marker), or edit the command (a changed command invalidates the hash and re-runs).

## 11. Pin URLs / render config files (env, secrets)

Many apps hard-code a hostname (e.g. `optima.localhost`) or read a gitignored `.env`. Instead of long `sed` commands in a task, use the built-ins:

- **`agentbox-ctl render <src>`** — a declarative `sed` for files already in the workspace. `--env` substitutes `{{AGENTBOX_*}}` placeholders; `--rules <name>` applies a named rule-set from the top-level `replacements:` block; `--rule 'from=>to'` / `--rule-regex 'pat=>repl'` are inline. Write to `--out <path>` (or `--in-place`). The whitelist placeholders are `{{AGENTBOX_BOX_NAME}}`, `{{AGENTBOX_BOX_HOST}}` (= `<boxname>.localhost`), `{{AGENTBOX_BOX_ID}}`, `{{AGENTBOX_BOX_KIND}}`, `{{AGENTBOX_HOST_WORKSPACE}}`, `{{AGENTBOX_PROJECT_ROOT}}`.

Render a gitignored `.env` from a committed `env.example` on every boot, pinning the URLs to this box:

```yaml
replacements:
  box-host:
    - { from: 'optima\.localhost', to: '{{AGENTBOX_BOX_HOST}}', regex: true } # {{AGENTBOX_BOX_HOST}} = <box>.localhost

tasks:
  env:
    # The render is idempotent (the rules re-pin the same lines every boot), so
    # no `run_once:` guard is needed: it self-corrects on a checkpoint-started
    # box that carries a different box's host in .env.
    command: agentbox-ctl render apps/saas/env.example --out apps/saas/.env --env --rules box-host
```
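
What the `render` command above does to the file can be approximated in a few lines. This is a minimal sketch, not the `agentbox-ctl` implementation: `renderText` and the local `ReplaceRule` interface are hypothetical stand-ins modelled on the YAML shape, and applying the rules before expanding placeholders is an assumption inferred from the example rule, whose `to` value itself contains a placeholder.

```typescript
// Local stand-in for a replacement rule, mirroring the YAML `{ from, to, regex }`.
interface ReplaceRule {
  from: string;
  to: string;
  regex?: boolean;
}

// The whitelisted placeholders listed in the text.
const PLACEHOLDERS = [
  'AGENTBOX_BOX_NAME',
  'AGENTBOX_BOX_HOST',
  'AGENTBOX_BOX_ID',
  'AGENTBOX_BOX_KIND',
  'AGENTBOX_HOST_WORKSPACE',
  'AGENTBOX_PROJECT_ROOT',
];

function renderText(text: string, rules: ReplaceRule[], env: Record<string, string>): string {
  let out = text;
  // Assumption: rules run first, in order, so a rule may emit a placeholder.
  for (const rule of rules) {
    out = rule.regex
      ? out.replace(new RegExp(rule.from, 'g'), rule.to)
      : out.split(rule.from).join(rule.to);
  }
  // Expand whitelisted placeholders; anything else is left untouched.
  return out.replace(/\{\{(AGENTBOX_[A-Z_]+)\}\}/g, (whole: string, name: string) => {
    const value = env[name];
    return PLACEHOLDERS.includes(name) && value !== undefined ? value : whole;
  });
}

const rules: ReplaceRule[] = [
  { from: 'optima\\.localhost', to: '{{AGENTBOX_BOX_HOST}}', regex: true },
];
const rendered = renderText('BETTER_AUTH_URL=http://optima.localhost:3000', rules, {
  AGENTBOX_BOX_HOST: 'demo.localhost',
});
console.log(rendered);
```

Because the rule always rewrites the same line to the current box host, running the render on every boot converges on the same output, which is why the task needs no `run_once:` guard.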

Note: a `run_once: { check: <cmd> }` probe runs verbatim via `bash -c` with the box env, so use shell vars like `$AGENTBOX_BOX_NAME`, NOT `{{…}}` placeholders (those are only expanded by `render`/carry, never by the supervisor).

**Generated secrets:** put `{{AGENTBOX_AUTO_SECRET}}` in the template for a value like `BETTER_AUTH_SECRET` instead of shelling out to `openssl rand`. Unnamed → a fresh 32-byte base64url secret each render (stable when you render the template→`.env` once). `{{AGENTBOX_AUTO_SECRET:better-auth}}` → generated once, persisted at `/var/lib/agentbox/secrets/<name>`, reused on every render (stable even if you render every boot). Example `env.example` line: `BETTER_AUTH_SECRET="{{AGENTBOX_AUTO_SECRET:better-auth}}"`.
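
The unnamed versus named secret behaviour might be modelled like this. It is a sketch only: `expandAutoSecrets` is a hypothetical name, the in-memory `Map` stands in for the files under `/var/lib/agentbox/secrets/<name>`, and the character set allowed in a secret name is an assumption.

```typescript
import { randomBytes } from 'node:crypto';

// 32 random bytes encoded as unpadded base64url, as described in the text.
function freshSecret(): string {
  return randomBytes(32).toString('base64url');
}

// `store` stands in for the persisted secrets directory.
function expandAutoSecrets(template: string, store: Map<string, string>): string {
  const pattern = /\{\{AGENTBOX_AUTO_SECRET(?::([A-Za-z0-9_-]+))?\}\}/g;
  return template.replace(pattern, (_whole: string, name?: string) => {
    // Unnamed: a new value on every render.
    if (name === undefined) return freshSecret();
    // Named: generate once, then reuse the stored value.
    let secret = store.get(name);
    if (secret === undefined) {
      secret = freshSecret();
      store.set(name, secret);
    }
    return secret;
  });
}

const store = new Map<string, string>();
const line = 'BETTER_AUTH_SECRET="{{AGENTBOX_AUTO_SECRET:better-auth}}"';
console.log(expandAutoSecrets(line, store) === expandAutoSecrets(line, store)); // named: stable
```

The named form is the safer default for a template rendered on every boot, since an unnamed secret would rotate each time and invalidate existing sessions.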

- **`carry:` + `replaceEnvs`/`replace`/`rules`** — for a host-only file (e.g. a real `.env` with secrets that never lives in the repo), carry it in and render it host-side in one step (file entries only):

- The `install` task is intentionally a no-op once `node_modules/.agentbox-installed` exists. Do **not** remove the marker guard to "force a fresh install" — that reinstalls on every box start. To force a one-off rebuild, delete `node_modules` (or just the marker) then run `agentbox-ctl reload`.
```yaml
carry:
  - src: ~/secrets/optima.env
    dest: /workspace/apps/saas/.env
    replaceEnvs: true
    rules: [box-host]
```
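
How a carry entry's `rules:` references and inline `replace:` rules combine can be sketched as below. The real helper is `resolveRuleRefs` from `@agentbox/ctl`, whose body is not shown in this diff, so `expandCarryRules` and the local `ReplaceRule`/`CarryItem` shapes here are hypothetical stand-ins; in particular, raising an error on an unknown rule-set name is an assumption. The ordering (named rule-sets first, then inline rules) and the file-only restriction follow the `carry-resolve.ts` changes in this PR.

```typescript
interface ReplaceRule {
  from: string;
  to: string;
  regex?: boolean;
}

interface CarryItem {
  src: string;
  dest: string;
  replaceEnvs?: boolean;
  replace?: ReplaceRule[];
  rules?: string[];
}

function expandCarryRules(
  item: CarryItem,
  replacements: Record<string, ReplaceRule[]>,
  where: string,
  isDirectory: boolean,
): ReplaceRule[] {
  const hasReplaceOpts = Boolean(item.replaceEnvs || item.replace || item.rules);
  if (isDirectory && hasReplaceOpts) {
    throw new Error(`${where}: replaceEnvs/replace/rules are file-only`);
  }
  const out: ReplaceRule[] = [];
  // Named rule-sets first, in the order they are referenced.
  for (const name of item.rules ?? []) {
    const set = replacements[name];
    // Assumption: an unknown reference is an error rather than silently ignored.
    if (set === undefined) throw new Error(`${where}.rules: unknown rule-set "${name}"`);
    out.push(...set);
  }
  // Inline rules are appended after the named ones.
  out.push(...(item.replace ?? []));
  return out;
}

const replacements = {
  'box-host': [{ from: 'optima\\.localhost', to: '{{AGENTBOX_BOX_HOST}}', regex: true }],
};
const item: CarryItem = {
  src: '~/secrets/optima.env',
  dest: '/workspace/apps/saas/.env',
  replaceEnvs: true,
  rules: ['box-host'],
};
console.log(expandCarryRules(item, replacements, 'carry[0]', false));
```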
2 changes: 1 addition & 1 deletion apps/cli/share/host-skills/agentbox-info/SKILL.md
@@ -237,7 +237,7 @@ Per-project numeric index (`1`, `2`, …) and friendly name (`review`, `smoke`)
2. **Use `-i` whenever the user asks for parallel agent work** rather than spawning multiple foreground sessions. Then point them at `agentbox dashboard` to watch progress.
3. **Pick the provider deliberately.** `docker` is the fast default. `--provider hetzner` gives a real VPS (heavier, isolated, requires `agentbox prepare --provider hetzner` once). `--provider vercel` is the managed cloud option.
4. **Cross-check before recommending a command.** If a flag isn't listed here, run `agentbox <command> --help` (it's safe and read-only) before suggesting it to the user.
5. **`/agentbox-setup` is a different skill.** It runs *inside* a box to generate `/workspace/agentbox.yaml`. Don't conflate it with `/agentbox` (host-side fork) or this reference skill.
5. **`/agentbox-setup` is a different skill.** It runs *inside* a box to generate `/workspace/agentbox.yaml`. Don't conflate it with `/agentbox` (host-side fork) or this reference skill. When authoring `agentbox.yaml`, prefer the declarative `run_once: true` / `run_once: { check }` task field over hand-rolled marker/probe guards, and `agentbox-ctl render` / carry `replaceEnvs` over `sed` for pinning env URLs to `{{AGENTBOX_BOX_HOST}}`.

## Reference

15 changes: 13 additions & 2 deletions apps/cli/src/lib/carry-gate.ts
@@ -1,7 +1,8 @@
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';
import { log } from '@clack/prompts';
import { loadEffectiveConfig } from '@agentbox/config';
import { loadCarrySection } from '@agentbox/ctl';
import { parseCarrySection, parseReplacementsSection } from '@agentbox/ctl';
import type { ResolvedCarryEntry } from '@agentbox/core';
import { promptForCarry } from '../carry-prompt.js';
import { resolveCarry } from './carry-resolve.js';
@@ -37,13 +38,23 @@ export async function runCarryGate(args: CarryGateArgs): Promise<CarryGateResult
const emit = args.onLog ?? (() => {});
const yamlPath = join(args.projectRoot, 'agentbox.yaml');

const items = await loadCarrySection(yamlPath);
// Read agentbox.yaml once; parse both the carry and replacements sections
// from the same text (a single readFile + parse).
let yamlText = '';
try {
yamlText = await readFile(yamlPath, 'utf8');
} catch (err) {
if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
}
const items = parseCarrySection(yamlText);
if (items.length === 0) return { decision: 'approve', entries: [] };

const cfg = await loadEffectiveConfig(args.projectRoot);
const replacements = parseReplacementsSection(yamlText);
const resolved = await resolveCarry(items, {
projectRoot: args.projectRoot,
maxBytes: cfg.effective.box.cpMaxBytes,
replacements,
});
if (resolved.errors.length > 0) {
const msg = ['carry: refused to proceed:', ...resolved.errors.map((e) => ` - ${e}`)].join('\n');
31 changes: 29 additions & 2 deletions apps/cli/src/lib/carry-resolve.ts
@@ -2,7 +2,7 @@ import { realpath, stat } from 'node:fs/promises';
import { homedir } from 'node:os';
import { isAbsolute, join, normalize, relative, resolve } from 'node:path';
import { BUILT_IN_DEFAULTS } from '@agentbox/config';
import type { CarryItem } from '@agentbox/ctl';
import { resolveRuleRefs, type CarryItem, type ReplaceRule } from '@agentbox/ctl';
import { effectiveExcludes, isPathExcluded, toTarExcludes } from './dir-breakdown.js';

/**
@@ -31,6 +31,10 @@ export interface ResolvedCarryEntry {
symlinkInfo?: 'safe' | 'outside-home';
/** tar `--exclude` patterns applied when packing a dir entry. */
exclude?: string[];
/** Substitute `{{AGENTBOX_*}}` placeholders host-side before copy (file only). */
replaceEnvs?: boolean;
/** Final replacement rules (named refs already expanded). File only. */
replace?: ReplaceRule[];
}

export interface ResolveOptions {
@@ -44,6 +48,8 @@ export interface ResolveOptions {
* built-in `box.cpMaxBytes` when omitted.
*/
maxBytes?: number;
/** Top-level `replacements:` rule-sets, for expanding carry `rules:` refs. */
replacements?: Record<string, ReplaceRule[]>;
}

export interface ResolveResult {
@@ -61,14 +67,15 @@ export async function resolveCarry(
const home = opts.homeDir ?? homedir();
const cap = opts.maxBytes ?? BUILT_IN_DEFAULTS.box.cpMaxBytes;
const projectRoot = opts.projectRoot;
const replacements = opts.replacements ?? {};

const entries: ResolvedCarryEntry[] = [];
const errors: string[] = [];

for (const [i, item] of items.entries()) {
const where = `carry[${String(i)}]`;
try {
const entry = await resolveOne(item, { projectRoot, home, cap, where });
const entry = await resolveOne(item, { projectRoot, home, cap, where, replacements });
entries.push(entry);
} catch (err) {
errors.push(err instanceof Error ? err.message : String(err));
@@ -83,6 +90,7 @@ interface OneCtx {
home: string;
cap: number;
where: string;
replacements: Record<string, ReplaceRule[]>;
}

async function resolveOne(item: CarryItem, ctx: OneCtx): Promise<ResolvedCarryEntry> {
@@ -98,12 +106,25 @@ async function resolveOne(item: CarryItem, ctx: OneCtx): Promise<ResolvedCarryEn
const rawDest = item.dest;
const absDest = item.dest;

// Expand named rule-set refs + inline rules into a single ordered list.
const hasReplaceOpts = !!(item.replaceEnvs || item.replace || item.rules);
const replaceRules: ReplaceRule[] = [
...resolveRuleRefs(item.rules ?? [], ctx.replacements, `${ctx.where}.rules`),
...(item.replace ?? []),
];
const replaceFields = {
...(item.replaceEnvs ? { replaceEnvs: true } : {}),
...(replaceRules.length > 0 ? { replace: replaceRules } : {}),
};

let st: Awaited<ReturnType<typeof stat>>;
try {
st = await stat(absSrc);
} catch (err) {
if ((err as NodeJS.ErrnoException).code === 'ENOENT') {
if (optional) {
// A missing optional entry is skipped at transfer time, so replace
// options are moot — don't carry them onto the tombstone.
return {
rawSrc,
rawDest,
@@ -142,6 +163,11 @@ async function resolveOne(item: CarryItem, ctx: OneCtx): Promise<ResolvedCarryEn
}

if (st.isDirectory()) {
if (hasReplaceOpts) {
throw new Error(
`${ctx.where}: replaceEnvs/replace/rules are file-only (src "${absSrc}" is a directory)`,
);
}
// Default heavy-dir excludes + the entry's own patterns, applied to both
// the size accounting and the copy step so the cap weighs only what lands.
const tokens = effectiveExcludes(item.exclude ?? [], true);
@@ -184,6 +210,7 @@ async function resolveOne(item: CarryItem, ctx: OneCtx): Promise<ResolvedCarryEn
...(item.user !== undefined ? { user: item.user } : {}),
optional,
...(symlinkInfo ? { symlinkInfo } : {}),
...replaceFields,
};
}
