A deterministic ralph-loop orchestrator for the BMAD-METHOD implementation phase
Plain Python drives the loop — pick story → implement → adversarially review → verify → commit — while LLMs do only the creative work, inside disposable, fresh-context coding-agent sessions you can attach to and watch.
The live TUI dashboard — run picker, sprint tree, deferred-work ledger, per-story task table, and a tailing journal. Jump to the TUI tour ↓
A tour of the dashboard — walking the runs table, unfolding the sprint tree, opening a deferred-work entry, answering a decision a past sweep left unanswered, typing a story into the start-run modal, a sweep blocked on a decision, and scrolling the policy editor out to its worktree-isolation + config-seeding knobs. More on the TUI ↓
Inspired by the original bmad-automator (a separate, legacy project), it takes a token-optimized approach in which the orchestrator is ordinary code rather than an LLM session in the control loop:
- 🧠 No LLM in the control loop. Story selection, retry budgets, gates, and completion checks are code, not prompts — so they're deterministic, debuggable, and free.
- 📡 No pane-scraping. Coding-agent hooks (`Stop`/`SessionStart`/`SessionEnd`/`PreCompact`) write structured event files the orchestrator watches; skills in automation mode write a machine-readable `result.json` at the end of each workflow.
- 🔍 Trust nothing, verify everything. After each session the orchestrator checks artifacts on disk: spec frontmatter status, baseline-commit match (recorded independently, a cheap LLM-lie detector), non-empty diff, sprint-status sync, and your test/lint commands before any commit.
- 📒 One source of truth. `sprint-status.yaml` is owned by the BMAD skills; the orchestrator only ever reads it.
- 🪟 Fresh context per step. Dev and review are separate sessions: review never inherits the implementer's context, so there's no anchoring bias.
- ♻️ Resumable & multi-agent. Every run is a resumable state machine on disk, and a generic tmux adapter drives `claude`, `codex`, or `gemini` (mix per stage).
- 🌿 Optional worktree isolation. Opt in (`[scm] isolation = "worktree"`) and each story runs in its own git worktree/branch and merges back locally, so your main checkout stays free while a run is in flight.
- Python 3.11+, tmux, and a supported coding CLI: `claude` by default; `codex` and `gemini` via profiles.
- A BMAD v6 project (`_bmad/bmm/config.yaml`, a `sprint-status.yaml` from `bmad-sprint-planning`) with the automator skill module from this repo installed (`bmad-auto-dev`, `bmad-auto-review`, `bmad-auto-sweep`; see Installing the skill module). Standard BMAD skills stay untouched.
uv sync --extra tui # core is pyyaml-only; [tui] adds the dashboard
cd /path/to/your/bmad/project
bmad-auto init # installs bmad-auto-* skills + hooks + .automator/policy.toml + gitignore
bmad-auto validate # preflight: config, sprint-status, git, tmux, CLI, hooks
bmad-auto run --dry-run # print the plan without spawning anything
bmad-auto run # go
bmad-auto tui # …or drive everything from the dashboard

One-time setup: if the coding CLI has never run in the target project, start it once (`claude`) and accept the workspace-trust dialog (and any hooks-approval prompt) before `bmad-auto run`. Spawned sessions can't answer first-run dialogs, and a pending dialog reads as a session timeout to the orchestrator.
| Command | What it does |
|---|---|
| `bmad-auto init` | Install the bundled `bmad-auto-*` skills, the hook relay, `.automator/policy.toml`, and a runs-dir gitignore. `--cli <profile>` (repeatable) targets specific agents; `--no-skills` / `--force-skills` control skill copying. |
| `bmad-auto validate` | Preflight every prerequisite: BMAD config, sprint-status, git, tmux, CLI binary, hook registration. |
| `bmad-auto run` | Drive the dev → review → verify → commit loop. `--epic N`, `--story KEY`, `--max-stories N`, `--dry-run`. |
| `bmad-auto sweep` | Triage + execute open `deferred-work.md` entries. `--no-prompt`, `--decisions-only`, `--max-bundles N`, `--repeat`, `--max-cycles N`, `--dry-run`. |
| `bmad-auto resume <run-id>` | Continue a run paused at a gate, escalation, or interruption. |
| `bmad-auto resolve <run-id>` | Resolve a CRITICAL escalation: open an interactive resolve agent to fix the frozen spec, then re-arm the story and resume. `--story KEY`, `--no-interactive`, `--resume` / `--no-resume`. |
| `bmad-auto decisions` | Answer deferred-work decisions earlier sweeps left unanswered (skipped by `--no-prompt`, or an abandoned interactive sweep). Recorded so the next sweep acts on them without re-asking. `--list` shows them without answering. |
| `bmad-auto list` (`ls`) | List every run/sweep with its short ref, type, and status; the short ref is the handle you pass to the commands below. |
| `bmad-auto status [<run-id>]` | Run + sprint summary with per-story token totals (plus a count of decisions awaiting an answer). |
| `bmad-auto attach [<run-id>]` | tmux-attach to a run's live agent session. |
| `bmad-auto stop <run-id>` | Stop a live run: the engine and its agent tmux session. |
| `bmad-auto delete <run-id>` | Delete a run directory. `--force` stops the run first if it is still live. |
| `bmad-auto archive <run-id>` | Compress a run into `.automator/archive` and remove the run dir. `--force` stops the run first if it is still live. |
| `bmad-auto cleanup` | Remove leftover tmux artifacts for the current project: kill `bmad-auto-<id>` sessions for finished/stopped/interrupted runs (and orphans whose run dir is gone) and close parked `bmad-auto-ctl` windows. `--dry-run` lists without killing. Live runs, and any session/window belonging to another project, are never touched. |
| `bmad-auto clean` | Reclaim disk from concluded runs per `[cleanup]`: tear down git worktrees a mid-flight stop orphaned (freeing their Unity `Library/` + MCP-server builds), trim the heavy `worktrees/` tree from runs kept for history (they stay viewable in the TUI), and archive/delete runs past the retention window. Only finished/stopped runs are touched; `--dry-run` previews, `--keep <run-id>` protects, `--retain N` overrides the window, `--hard` deletes instead of archiving. |
| `bmad-auto tui` | The interactive dashboard (needs the `[tui]` extra). `--low-frame-rate` caps it to 15fps + disables animations (fixes repaint tearing over slow/SSH links; also `[tui] low_frame_rate`). |
| `bmad-auto probe-adapter <cli>` (`collect-adapter-data`) | Collect + sanitize the data needed to finalize a CLI adapter profile (hook payload shape, transcript location/format, token schema). Default is a zero-launch scan; `--probe` opts into a live capture. `--transcript`, `--session-dir`, `--binary` (CLIs with no profile yet), `--out`, `--json`. See the adapter authoring guide. |

Every command takes --project <dir> (default: the current directory). Any <run-id> may be a
partial — the tail after the last - (e.g. a1b2), shortened to any prefix that stays unique;
bmad-auto list shows each run's short ref.
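
The partial-ref rule above can be sketched in a few lines of Python. This is only an illustration of the matching rule as described here, not the tool's actual resolver; the function name `resolve_run_ref` and the example run-id format are assumptions.

```python
def resolve_run_ref(ref: str, run_ids: list[str]) -> str:
    """Resolve a full run id, or a unique prefix of its short ref, to one run id.

    Hypothetical helper: the short ref is taken to be the tail after the
    last "-" in the run id, as the text above describes.
    """
    if ref in run_ids:
        return ref  # a full id always wins
    matches = [rid for rid in run_ids if rid.rsplit("-", 1)[-1].startswith(ref)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no run matches {ref!r}")
    raise LookupError(f"{ref!r} is ambiguous: {sorted(matches)}")


# Made-up run ids, for illustration only.
runs = ["20250101-120000-a1b2", "20250102-093000-a1c9"]
print(resolve_run_ref("a1b", runs))
```

A prefix such as `a1` matches both example runs, so the sketch raises instead of guessing.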
uv sync --extra tui # textual + tomlkit + pyte
bmad-auto tui

A live, read-only dashboard over everything below, and a launcher for new runs. It's the fastest way to understand what the orchestrator is doing.
The left column stacks the runs table (newest auto-selected), an expandable sprint tree (epics → stories/retro, completed items checked green), and the deferred-work ledger (severity colour-coded). The right column shows the selected run's header (status, epic, task counts, cost-weighted token total), a per-story table (phase · dev attempts · review cycles · tokens · commit/defer info), and tabs tailing the journal, the active session's pane log, and the ATTENTION file.
Sweeps run as their own [sweep]-tagged runs. When an attended sweep hits a "needs human decision" item it blocks on its own terminal prompt; the dashboard spots the decision-pending journal event and raises a banner + toast — press a to attach to the sweep's window, answer, and detach.
Unattended sweeps (--no-prompt) skip decisions, and an attended one can be abandoned mid-way — those answers would otherwise be lost. The Deferred Work pane shows the outstanding count (— N to answer (d)); press d (or run bmad-auto decisions) to walk each one. A close is applied immediately; a build / keep-open is saved to .automator/decisions.json and consumed by the next sweep with no re-prompt.
enter on any ledger row opens the full entry; r / s open modals to launch a run or sweep (epic, story, max-stories, dry-run).
Press g to edit .automator/policy.toml in a form grouped by section — comment-preserving (tomlkit), validated with the engine's own parser before saving, with unset keys showing their defaults as placeholders. Every section starts collapsed with a one-line description; ctrl+e expands/collapses all at once.
| Key | Action |
|---|---|
| `r` / `s` | start a run / sweep (modal for epic, story, max-stories, dry-run…) |
| `e` | resume the selected paused/interrupted run |
| `R` | resolve a run paused at an escalation (interactive, then re-arm) |
| `d` | answer deferred-work decisions past sweeps left unanswered |
| `a` | attach to the live agent session (or the orchestrator window) |
| `x` | stop the selected live run (engine + agent session) |
| `D` / `A` | delete / archive the selected run (force-stops a live run first) |
| `c` | clean up tmux sessions/windows for finished & stopped runs |
| `v` | run `bmad-auto validate`, output in a modal |
| `g` | settings editor for `.automator/policy.toml` |
| `M` / `q` | toggle theme (light/dark mode) / quit |

The TUI is an observer/launcher, never the engine. Runs started with r/s are detached bmad-auto processes in windows of a dedicated tmux session (bmad-auto-ctl), so they survive a TUI exit or crash; the dashboard watches runs purely through the run-dir artifacts the engine writes atomically, so runs started from a plain shell show up identically. Launch and attach need tmux; the dashboard itself does not. Pid-based liveness is local-only — a run whose engine died shows interrupted (press e); runs on other hosts show unknown.
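
A minimal sketch of what a pid-based, local-only liveness check can look like on a POSIX host. It assumes the run recorded the engine's pid and hostname and was last seen running; the function name `run_liveness` is hypothetical and this is not the dashboard's actual code.

```python
import os
import socket


def run_liveness(pid: int, host: str) -> str:
    """Classify a run recorded as running: 'live', 'interrupted', or 'unknown'."""
    if host != socket.gethostname():
        return "unknown"  # a pid means nothing on another machine
    try:
        os.kill(pid, 0)  # signal 0 checks existence only; nothing is delivered
    except ProcessLookupError:
        return "interrupted"  # the engine process is gone
    except PermissionError:
        return "live"  # the process exists but belongs to another user
    return "live"


print(run_liveness(os.getpid(), socket.gethostname()))
```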
📖 See `docs/tui-guide.md` for the full guide: layout, every key and modal, status glyphs, the settings field reference, and troubleshooting. Vector (SVG) versions of every screenshot live in `docs/images/`.
sprint-status.yaml: 1-2-account-mgmt: ready-for-dev
│
├─ DEV tmux window: claude "/bmad-auto-dev 1-2-account-mgmt"
│ bmad-auto-dev: plans a 1.5–4k-token spec,
│ auto-approves it, implements, syncs sprint → review,
│ writes result.json … Stop hook signals the orchestrator
├─ VERIFY spec exists · status in-review · baseline matches · diff non-empty
│ · run [verify].commands (pytest, ruff…) — a broken build never
│ reaches review; a failure spawns a fix session fed the output
├─ REVIEW fresh window: claude "/bmad-auto-review <spec>"
│ static prefilter → 3 layers (Blind Hunter / Edge Case Hunter /
│ Acceptance Auditor) → verify findings against code → triage →
│ auto-apply patches → ledger → defer ambiguity → done when clean
│ (bounded loop, default 3 cycles)
├─ VERIFY spec done · sprint done · run [verify].commands again — a failure
│ routes a feedback-driven dev fix session, then a fresh review cycle
└─ COMMIT orchestrator commits (then, under [scm] isolation = "worktree",
merges the unit branch back into the target branch locally);
epic boundary → gate / retro notification
Failure handling: bounded dev retries (verify-command failures keep the tree and feed the failing output to the next session via --feedback; other failures roll back to baseline), plateau-defer when review won't converge (story skipped, spec stashed into the run dir, deferred-work.md additions preserved, run continues), and typed escalations — CRITICAL pauses the run and notifies you (desktop + ATTENTION file), PREFERENCE is journaled and the run continues.
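
The retry rule can be read as a small decision function. The sketch below covers only the dev-retry branch and assumes `attempt` counts attempts already made; what happens once the budget is spent is left to the caller, since the text above does not spell it out. The `Action` enum and function name are hypothetical.

```python
from enum import Enum


class Action(Enum):
    RETRY_WITH_FEEDBACK = "keep the tree, pass the failing output via --feedback"
    RETRY_FROM_BASELINE = "roll back to baseline, then retry"
    OUT_OF_ATTEMPTS = "retry budget exhausted"


def after_dev_failure(verify_failed: bool, attempt: int, max_dev_attempts: int = 2) -> Action:
    """Pick the next step after a failed dev attempt (attempt is 1-based)."""
    if attempt >= max_dev_attempts:
        return Action.OUT_OF_ATTEMPTS
    if verify_failed:
        return Action.RETRY_WITH_FEEDBACK  # verify-command failure: the work is worth keeping
    return Action.RETRY_FROM_BASELINE      # any other failure: start clean


print(after_dev_failure(verify_failed=True, attempt=1).name)
```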
Resolving a CRITICAL escalation: the escalated story is parked in a terminal escalated phase — resume skips it. To un-stick it, run bmad-auto resolve <run-id> (or press R in the TUI). That opens an interactive resolve agent seeded with the escalation and the frozen spec; you converse with it to disambiguate the spec, it records the resolution, and on your confirmation the orchestrator re-arms the story (escalated → pending, spec status reset to ready-for-dev) and resumes — a clean rebuild against the corrected spec, then on through the rest of the sprint. Already fixed the spec yourself? bmad-auto resolve <run-id> --no-interactive skips straight to re-arm + resume.
Skills accumulate an append-only ledger (deferred-work.md, DW-<n> entries) of split-off goals, pre-existing review findings, and items deferred as "needs human decision". bmad-auto sweep processes it:
bmad-auto sweep [--no-prompt] [--decisions-only] [--max-bundles N] [--repeat] [--max-cycles N] [--dry-run]
│
├─ TRIAGE fresh window: claude "/bmad-auto-sweep"
│ verifies EVERY open entry against the actual code (ledger
│ statuses are unreliable) and returns a machine-validated
│ partition: already-resolved (orchestrator closes them, with
│ evidence) · bundles (cohesive buildable groups) · blocked ·
│ skip · decisions (frozen-block renegotiations, scope reversals)
├─ DECIDE interactive runs walk you through each decision on the
│ terminal (build / close / keep-open per option, with a
│ recommendation); answers land in the ledger as `decision:`
│ lines. Unattended runs skip this and leave decisions open.
└─ BUNDLES each bundle runs the normal pipeline: bmad-auto-dev (--dw-bundle)
→ bmad-auto-review → verify commands → commit. The review gate also
checks every bundle entry is `status: done` in the ledger.
Answering missed decisions later. An unattended sweep (--no-prompt) skips decisions, and an interactive one can be abandoned before you answer them all — those answers would otherwise be lost, since triage re-derives the decision set from the ledger every run. bmad-auto decisions (or press d in the TUI) surfaces every decision past sweeps left unanswered, reconstructed from their triage output, and lets you answer them out of band. A close is applied immediately; a build/keep-open is saved to .automator/decisions.json and consumed by the next sweep (build → bundle, keep-open → recorded) with no re-prompt. --list shows them without answering; bmad-auto status reports the outstanding count.
Sweeps are their own resumable runs (bmad-auto resume <id>). [sweep] auto in the policy fires an unattended sweep automatically at epic boundaries or run end; a failed/paused child sweep never interrupts the parent run.
Bundle dev sessions can themselves append new deferred entries (split-off goals, review findings). With [sweep] repeat (or --repeat) the sweep re-triages after each cycle and keeps going on that newly generated work, stopping when a cycle completes nothing addressable — nothing closed as already-resolved or by decision, no bundle done — or at max_cycles. Bundles that failed in an earlier cycle and entries a human chose to keep open are never re-bundled.
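
The stop condition for a repeating sweep is easy to get wrong, so here is a minimal sketch of it. The `CycleResult` type and `run_sweep` function are hypothetical; the real engine also tracks failed bundles and kept-open entries, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CycleResult:
    closed_resolved: int = 0     # entries closed as already-resolved
    closed_by_decision: int = 0  # entries closed by a human decision
    bundles_done: int = 0        # bundles that completed

    @property
    def progressed(self) -> bool:
        return (self.closed_resolved + self.closed_by_decision + self.bundles_done) > 0


def run_sweep(run_cycle: Callable[[int], CycleResult],
              repeat: bool = False, max_cycles: int = 5) -> int:
    """Run sweep cycles and return how many ran."""
    cycles = 0
    while True:
        cycles += 1
        result = run_cycle(cycles)
        # Stop when not repeating, when a cycle did nothing addressable, or at the cap.
        if not repeat or not result.progressed or cycles >= max_cycles:
            return cycles


outcomes = iter([CycleResult(bundles_done=1), CycleResult(closed_resolved=2), CycleResult()])
print(run_sweep(lambda n: next(outcomes), repeat=True))
```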
The orchestrator drives its own forks of the BMAD dev/review skills — your standard BMAD install is never modified. The five skills are bundled in the bmad-auto wheel (canonical source: src/automator/data/skills/, BMAD module code bauto) so bmad-auto init lays them down for you:
| Skill | Role |
|---|---|
| `bmad-auto-dev` | unattended implementation (fork of `bmad-quick-dev`) |
| `bmad-auto-review` | unattended adversarial review (fork of `bmad-code-review`) |
| `bmad-auto-resolve` | interactive CRITICAL-escalation resolution (`/bmad-auto-resolve <story>`) |
| `bmad-auto-sweep` | deferred-work ledger triage (automation-only) |
| `bmad-auto-setup` | registers the module in `_bmad/` config + help |

Via uv + bmad-auto init (self-sufficient). Installing the tool and running init is all you need — init installs the bmad-auto-* skills into .claude/skills/ (claude) and/or .agents/skills/ (codex/gemini) for the CLIs you select, alongside the hooks and policy:
# latest from main (tracks HEAD — newest features, less stable):
uv tool install "bmad-auto[tui] @ git+https://github.com/bmad-code-org/bmad-auto.git"
# OR a pinned release tag (reproducible — recommended for day-to-day use):
uv tool install "bmad-auto[tui] @ git+https://github.com/bmad-code-org/bmad-auto.git@v0.5.1"
bmad-auto init --project /path/to/project --cli claude # add --cli codex/gemini as needed
claude "/bmad-auto-setup accept all defaults" # registers _bmad/ config + help

The `[tui]` extra pulls in the dashboard/settings UI (textual); drop it for a headless install. `bmad-auto --version` confirms what you've got. Existing skill dirs are left untouched (`--force-skills` to overwrite a stale copy, `--no-skills` to manage skills yourself).
Easiest — let the setup skill do it. Re-running /bmad-auto-setup (or /bmad-auto-setup upgrade) on an already-installed project performs the two-step ritual for you: it detects the existing install, upgrades the tool with --reinstall, re-lays the per-project skills with --force-skills, and re-stamps config — then reports the before → after version.
claude "/bmad-auto-setup upgrade"

Manual: the two steps it runs. Use these directly for non-Claude CLIs, CI, or scripting. Upgrading is two steps: the tool and the per-project skill copies, which `init` froze at install time and a tool upgrade does not touch:
# 1. upgrade the tool. --reinstall is required for a git source: a plain
# `uv tool upgrade` reuses the cached commit and won't pull new code.
uv tool upgrade bmad-auto --reinstall # follows main or your pinned tag
# to move to a newer tag, re-run install with the new ref:
# uv tool install --force "bmad-auto[tui] @ git+https://github.com/bmad-code-org/bmad-auto.git@v0.5.1"
# 2. re-lay the refreshed skills into EACH project that uses bmad-auto:
bmad-auto init --project /path/to/project --force-skills

Your `.automator/policy.toml` is left untouched on upgrade: new keys are optional and fall back to their defaults, so configs survive. Check the CHANGELOG / releases for what changed between tags.
To remove bmad-auto from a project, see Uninstalling — it reverses what init laid down (state, skills, hooks, gitignore) and uninstalls the tool.
Via the BMAD-method installer. The installer also copies the five bmad-auto-* skills into your project (but not the orchestrator tool). Finish setup with /bmad-auto-setup, which installs the tool from Git, asks which coding CLIs to drive, registers their hooks (init skips the already-present skills), and runs the preflight:
claude "/bmad-auto-setup accept all defaults"

See `docs/setup-guide.md` for the full walkthrough: choosing CLIs, installing the tool and TUI together or separately, and initializing codex/gemini.
The skills must be installed together: bmad-auto-review writes deferred-work entries per bmad-auto-dev/deferred-work-format.md (sibling skill directory). If you carry _bmad/custom/bmad-quick-dev.toml or bmad-code-review.toml customization overrides, duplicate them as bmad-auto-dev.toml / bmad-auto-review.toml — overrides are keyed by skill directory name.
To pull in upstream BMAD improvements, diff the upstream skill against the fork (diff -r <bmad-install>/bmad-quick-dev src/automator/data/skills/bmad-auto-dev) and merge manually; the forks keep the upstream file structure to make this easy.
bmad-auto init writes this template; running engines snapshot it at start, so edits apply to new runs and resumes (edit it live from the TUI with g).
[gates]
mode = "per-epic" # none | per-epic | per-story-spec-approval
retrospective = "notify" # never | notify | auto
[limits]
max_review_cycles = 3
max_dev_attempts = 2
session_timeout_min = 90
stop_without_result_nudges = 1 # times to re-prompt a session that stopped with no result.json
max_tokens_per_story = 2000000
cache_read_weight = 0.1 # cache reads bill at ~0.1x input everywhere; 1.0 = count raw
[verify]
commands = ["pytest -q", "ruff check ."]
[notify]
desktop = true # desktop notification on gate pauses / escalations
file = true # append the same alerts to the run's ATTENTION file
[review]
enabled = true # false = skip the separate review session; the dev pass
# runs quick-dev's own internal triple-review and finalizes to done
[adapter]
name = "claude" # CLI profile: claude | codex | gemini | custom
model = "" # empty = CLI default
cleanup_session_on_finish = true # kill the run's tmux session when it finishes (false keeps it for inspection)
# extra_args replaces the profile's default bypass flags when set:
# extra_args = ["--permission-mode", "bypassPermissions"]
# Optional per-stage overrides — run the review pass on a different CLI/model
# than the dev pass. Unset keys inherit from [adapter] when the stage runs the
# same client; switching client falls back to that profile's defaults (model
# and extra_args are client-specific).
# [adapter.dev]
# model = "opus"
# [adapter.review]
# name = "codex"
# model = "gpt-5-codex"
# [adapter.triage] # sweep triage stage
# model = "opus"
[sweep]
auto = "never" # never | per-epic | run-end (auto sweeps never prompt)
max_bundles = 5 # bundles executed per sweep; triage excess truncated
max_triage_attempts = 2 # triage validation retries before escalating
max_migration_attempts = 2 # legacy-ledger migration retries before escalating
repeat = false # re-triage after each cycle, continue on new deferred work
max_cycles = 5 # safety cap on cycles per sweep run when repeat = true
[cleanup] # disk reclamation for .automator/runs (terminal runs only)
run_retention = 10 # newest concluded runs kept whole; older ones trimmed/archived by `clean` (0 = none)
retention_days = 0 # 0 = off; else also keep runs newer than N days regardless of count
trim_artifacts = true # drop the heavy worktrees/ tree from concluded runs (run stays viewable in the TUI)
archive_old = true # archive (vs hard-delete) runs past the window
auto_clean_on_finish = true # reconcile worktrees leaked by a mid-flight stop at each run/sweep start
clean_tmp = true # let engine plugins clean their /tmp scratch on finish (e.g. Unity MCP zips)
[scm] # source-control isolation + merge-back; defaults = work in place
isolation = "none" # none | worktree
branch_per = "story" # story | run (worktree mode only; "run" forces delete_branch = false)
target_branch = "" # "" = the branch checked out at run start
merge_strategy = "merge" # ff | merge | squash (how a unit branch lands on the target)
delete_branch = true # delete the unit branch after a successful merge
keep_failed = true # keep a failed unit's worktree + branch mounted for inspection
failed_diff_max_mb = 5 # per-file cap (MB) for untracked files in a kept-failed unit's changes.patch
failed_diff_unlimited = false # true = no size cap on the failed-unit diff (warns when active)
commit_message_template = "" # {story_key} / {run_id} substituted; empty = built-in default
max_parallel = 1 # units in flight at once (parallel fan-out unbuilt; values > 1 clamp to 1)
seed_adapter_defaults = true # copy each loaded adapter's gitignored MCP/CLI configs into the worktree
worktree_seed = [] # extra project-relative gitignored files to seed, on top of adapter defaults
[tui]
low_frame_rate = false # true = cap to 15fps + disable animations (= bmad-auto tui --low-frame-rate)

Gate modes: `none` runs everything unattended; `per-epic` (default) pauses at epic boundaries; `per-story-spec-approval` pauses after each spec is written so you approve it before implementation is reviewed.
Review: [review].enabled = false drops the separate fresh-context review session; the dev pass instead runs bmad-quick-dev's own internal triple-review (Blind Hunter / Edge Case Hunter / Acceptance Auditor) and finalizes the story straight to done — one session per story instead of two, verify commands still gating the commit. Governs deferred-work sweeps too.
bmad-auto init (without --cli) registers hooks for every CLI profile the policy references, so a dual-client setup needs no extra flags.
By default work happens in place on the checked-out branch ([scm] isolation = "none" — byte-for-byte the prior behavior). Set isolation = "worktree" and each story (and each sweep bundle) runs in its own git worktree on a dedicated automator/<run_id>[/<story>] branch cut from the target branch, then merges back into the target locally (merge_strategy = ff / merge / squash). The main checkout stays free while a run is in flight, and run state never moves into a worktree — .automator/ always lives in the main repo.
- `branch_per`: `story` (a branch per story) or `run` (one shared branch across the run; this forces `delete_branch = false` so the shared branch survives between units).
- `target_branch`: the branch every unit merges into; empty means the branch checked out at run start. A configured branch is created if missing (a detached HEAD or unborn repo pauses the run rather than merging onto an unreferenced commit).
- `keep_failed` (default on): a deferred/escalated unit's worktree + branch stay mounted for inspection, and its full diff (tracked + untracked) is preserved to `run_dir/failed/<unit>/changes.patch`. `failed_diff_max_mb` caps the per-file size of untracked files in that patch (oversized files skipped with a marker); `failed_diff_unlimited` lifts the cap.
- `commit_message_template`: when set, the message used for story/bundle commits (`{story_key}` / `{run_id}` substituted).
- `seed_adapter_defaults` (default on) / `worktree_seed`: a worktree checks out tracked files only, so a project's gitignored MCP/CLI configs (`.mcp.json`, `.claude/settings.json`, `.codex/config.toml`, `.gemini/settings.json`) are absent from a fresh worktree; without them an isolated session can't reach its MCP server and stalls on readiness. With `seed_adapter_defaults` on, each loaded adapter's own configs are copied in from the main repo before the session launches (the defaults live in each CLI profile's `seed_files`); `worktree_seed` adds extra project-relative paths on top. Seeding is copy-when-absent and runs before the signal-hook merge, so a seeded `settings.json` keeps its real content and just gains the Stop hook, and the seeded paths are shielded from the unit's `git add -A`.
Merge-back is always serialized — max_parallel is a validated knob clamped to 1 until parallel fan-out lands. PRs aren't created automatically; open them by hand from the unit branches afterward if you want them.
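
The copy-when-absent seeding rule described for `seed_adapter_defaults` / `worktree_seed` can be sketched as follows. This is a simplified illustration with a hypothetical function name; it skips the hook merge and the `git add -A` shielding entirely.

```python
import shutil
from pathlib import Path


def seed_worktree(main_repo: Path, worktree: Path, rel_paths: list[str]) -> list[str]:
    """Copy gitignored config files into a worktree without overwriting anything.

    Returns the relative paths that were actually copied.
    """
    copied = []
    for rel in rel_paths:
        src, dst = main_repo / rel, worktree / rel
        if not src.is_file() or dst.exists():
            continue  # nothing to seed, or the worktree already has its own copy
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel)
    return copied
```

Because existing files are never overwritten, running the seed step twice is harmless.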
For a monorepo or any layout where the git root differs from the project dir, set an optional repo_root key in _bmad/bmm/config.yaml — it decouples where git/code work happens from where run state lives (defaults to the project dir).
The orchestrator is extensible through a plugin system — a general layer that adapts the run/sweep cycle without touching the core loop. A plugin is a folder-drop plugin.toml manifest (metadata, declarative [hooks.<stage>] shell commands, a [[settings]] schema, and an optional in-process [python] module), bundled under automator/data/plugins/<name>/ and overridable per project at .automator/plugins/<name>/. At every run/sweep lifecycle stage a plugin can observe, veto (defer / pause / skip), and mutate a shared context; a zero-plugin run pays nothing (O(1) no-op fast path) and stays byte-identical to before.
Two trust tiers: a data-only / declarative plugin (settings + shell hooks) takes effect as soon as its folder is discovered, while a plugin that ships an in-process [python] module is never imported unless its name is listed in [plugins] enabled in .automator/policy.toml — dropping a folder in never runs code. Every hook is failure-isolated: a raise is caught, journalled, and disables that instance for the rest of the run rather than crashing it. A plugin's [[settings]] render in the TUI settings editor and persist under [plugins.<name>].
[plugins]
enabled = ["unity"] # only these plugins' [python] modules load

See Writing a bmad-auto plugin for the manifest, hook, stage, settings, trust, and workflow reference; a complete worked example ships under `examples/plugins/guardrails/`.
A niche game-engine layer — built on the plugin system — for projects whose dev/sweep cycle needs the agent to drive a live engine Editor — e.g. a Unity project the agent manipulates through an Editor MCP (IvanMurzak/Unity-MCP or CoplayDev/unity-mcp). It's off by default; normal projects never list it in [plugins] enabled and nothing changes. Unity ships bundled at automator/data/plugins/unity/, overridable per project under .automator/plugins/unity/.
The core constraint: a live Editor MCP can only act on the folder its Editor has open, and Unity binds one Editor per folder and can't be repointed live. So editor_mode is coupled to [scm] isolation:
- `shared` (default; requires `isolation = "none"`): the agent works in place on the project your warm Editor already has open. Zero relaunches, full live MCP, the Editor stays open across stories. Before each unit runs, a readiness gate blocks until the Editor + MCP report ready (so a session never starts against a half-open Editor); if it never comes up the unit is deferred with an `ATTENTION` notice instead of failing mid-session.
- `per_worktree` (requires `isolation = "worktree"`): one managed Editor per worktree, run serially. For each unit a setup hook makes the fresh worktree a usable Unity project (launches its own Editor on the worktree path, writes the worktree's `.mcp.json`, primes the worktree's `Library` with a reflink/CoW copy of the warm main `Library` so the import is incremental, not a crash-prone cold reimport), the readiness gate waits for it, the agent drives it, then a teardown hook quits that Editor on completion and on pause/escalation, so it never outlives its worktree. The MCP server's generated skill tree is gitignored (absent from a fresh checkout), so the plugin seeds it into each worktree via `seed_globs`. If setup fails the unit is deferred rather than run against no Editor.
Enable shared mode (the recommended Unity workflow) in .automator/policy.toml:
[plugins]
enabled = ["unity"]
[plugins.unity]
editor_mode = "shared" # requires [scm] isolation = "none" (the default)
mcp = "ivanmurzak" # ivanmurzak | coplaydev
unity_path = "" # explicit Editor binary for per_worktree; "" = auto-detect
ready_timeout_sec = 600
ready_grace_sec = -1 # delay before the first readiness probe; -1 = auto

All five `[plugins.unity]` keys are editable in the TUI settings editor (`g`) under the Unity plugin's section (shown once `unity` is in `[plugins] enabled`). To run a project on a different engine, or reshape the Unity plugin, see Writing a Game Engine plugin (manifest schema, lifecycle hooks, a minimal Godot example) and Writing a plugin for a specific Editor MCP (IvanMurzak vs CoplayDev, readiness probing, per-worktree isolation, and the full `BMAD_AUTO_UNITY_*` env-var reference). The legacy `[engine]` block still loads (it's folded onto `[plugins.unity]` with a deprecation warning) but will be removed in a future release; migrate to `[plugins] enabled = ["unity"]`.
The readiness gate runs the plugin's ready_cmd (unity_ready.py), which for ivanmurzak shells out to the Unity-MCP CLI's wait-for-ready (with an explicit --timeout, since the CLI's own default is only 120s) and for coplaydev does a connectivity check against the MCP server. It first waits ready_grace_sec for the Editor to start before probing — -1 (the default) auto-picks 120s for a cold per_worktree Editor and 0s for a warm shared one — then retries so a fast connection-refused against a not-yet-listening Editor doesn't abort the gate; the grace counts against ready_timeout_sec. The exact CLI name/subcommand and endpoint move between MCP releases — verify against your installed version and override ready_cmd (or the whole plugin) under .automator/plugins/unity/ if they differ.
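
The grace-then-retry behaviour can be sketched independently of any MCP CLI. Below, `probe` stands in for whatever check the plugin's `ready_cmd` performs; the function names, the 2-second retry interval, and the injectable `clock`/`sleep` parameters are assumptions for the example, not the bundled `unity_ready.py`.

```python
import time
from typing import Callable


def resolve_grace(ready_grace_sec: float, editor_mode: str) -> float:
    """-1 means auto: 120s for a cold per_worktree Editor, 0s for a warm shared one."""
    if ready_grace_sec >= 0:
        return ready_grace_sec
    return 120.0 if editor_mode == "per_worktree" else 0.0


def wait_until_ready(probe: Callable[[], bool], timeout_sec: float = 600.0,
                     grace_sec: float = 0.0, interval_sec: float = 2.0,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Wait out the grace period, then probe until ready or the timeout passes."""
    deadline = clock() + timeout_sec  # the grace counts against the timeout
    sleep(min(grace_sec, timeout_sec))
    while True:
        try:
            if probe():
                return True
        except ConnectionError:
            pass  # connection refused: the Editor is not listening yet, keep trying
        if clock() + interval_sec > deadline:
            return False
        sleep(interval_sec)
```

Treating a refused connection as "not ready yet" rather than as a failure is what keeps a fast error from aborting the gate.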
For per_worktree, set editor_mode = "per_worktree" with [scm] isolation = "worktree". The bundled Unity plugin wires the worktree-Editor lifecycle against the IvanMurzak CLI (open / setup-mcp / close, which key off the project path with auto port detection — verified against v0.81.1). A fresh worktree has no Library (it's gitignored), and opening Unity on an empty Library forces a cold full reimport that crashes the import workers on a real project — so the setup hook primes the worktree's Library with a reflink/CoW copy of your warm main Library (<repo>/Library), near-instant on btrfs/xfs, making the import incremental; it falls back to a deep copy, then to a symlinked empty cache under the gitignored .automator/cache/, off-CoW or when no warm Library exists. Tune this with BMAD_AUTO_UNITY_LIBRARY_SEED / …_SEED_MODE (and BMAD_AUTO_UNITY_LIBRARY_CACHE for the fallback cache root — see the Game Engine MCP guide for the full env reference); a Unity Accelerator helps further, and unity_path pins the Editor binary. A cold worktree Editor takes time to launch and import — bump ready_grace_sec/ready_timeout_sec if your project's first import runs long. CoplayDev's single shared-server model isn't wired for a managed per-worktree launch — point worktree_setup_cmd/worktree_teardown_cmd at your own scripts under .automator/plugins/unity/, or use shared mode.
Everything about a run lives in .automator/runs/<run-id>/ (gitignored): state.json (resumable engine state), journal.jsonl (every decision), events/ (hook signals), tasks/<id>/ (per-session prompt + result + escalations), logs/ (raw pane output, debugging only), deferred/ (stashed specs from deferred stories), resolve/<story>/ (escalation context.json + the resolve agent's resolution.json), ATTENTION (human-readable alerts).
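Because the run directory is plain files, it can be inspected with a few lines of Python. The helper below is a hypothetical sketch, not part of bmad-auto: it assumes only that journal.jsonl holds one JSON object per line and that ATTENTION exists when there is something to read. The entry schema is not documented here, so the code does not rely on any field names.

```python
import json
from pathlib import Path


def tail_journal(run_dir: Path, last: int = 5) -> list[dict]:
    """Return the last few journal entries, skipping blank or unparsable lines."""
    journal = run_dir / "journal.jsonl"
    if not journal.exists():
        return []
    entries: list[dict] = []
    for line in journal.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a final line that is still being written
        if isinstance(obj, dict):
            entries.append(obj)
    return entries[-last:]


def needs_attention(run_dir: Path) -> bool:
    """True when the run has left a human-readable alert behind."""
    return (run_dir / "ATTENTION").exists()
```

Usage would look like `tail_journal(Path(".automator/runs") / run_id)`, where `run_id` is the directory name of the run you want to inspect.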
Token usage is read from each CLI's local session transcript (selected by the profile's usage_parser) and aggregated per story (bmad-auto status).
Each run drives its agents inside a dedicated tmux session, bmad-auto-<run-id>. It is torn down automatically when the run finishes (disable with [adapter] cleanup_session_on_finish = false to inspect agent windows afterwards), and stop always kills it. A paused or interrupted run keeps its session for resume, which clears any stale session and spins up a fresh one. Sessions left behind by older runs — or by a cleanup_session_on_finish = false policy — can be swept any time with bmad-auto cleanup (or c in the TUI).
One generic driver (adapters/generic_tmux.py) runs any coding CLI that fits the tmux-injection + hook-signal transport; everything CLI-specific lives in a declarative profile (adapters/profile.py). Built-in profiles ship as TOML in automator/data/profiles/:
| Profile | Status | Notes |
|---|---|---|
| claude | supported | reference implementation |
| codex | supported, E2E-verified | Codex ≥ 0.139. No slash expansion in the initial prompt; the profile renders $skill-name mentions (plus a "use subagents as needed" nudge) instead. No SessionEnd hook; window-death fallback covers crashes. |
| gemini | supported, E2E-verified | Gemini CLI ≥ 0.46 (hooks on by default since then). Launches with -i to stay interactive; AfterAgent maps to canonical Stop. Usage parser validated against real chat logs. |
| copilot | supported, E2E-verified | GitHub Copilot CLI (the copilot binary, GA ≥ 2026-02), not the VS Code extension. Launches with -i to stay interactive; turn-end is agentStop (per response turn); --allow-all-tools for unattended runs. The copilot-events usage parser reads token totals from the trailing session.shutdown line, so the profile waits a short grace (usage_grace_s = 8) before tallying. Pin a capable model (see below). |
Copilot — pin a capable model: Copilot's free default (GPT-5 mini) is unreliable for the multi-step dev/review skills — it silently skips steps mid-workflow and fails the story. Set a capable model in policy, e.g. [adapter] model = "claude-sonnet-4-6" (passed through as --model), for end-to-end reliability. Because Copilot fires agentStop per response turn, a thorough multi-turn review needs more than one nudge to finish; the profile ships stop_without_result_nudges = 5, and you can tune it per stage (e.g. [adapter.review] stop_without_result_nudges = …). Both knobs are editable in the settings TUI under [adapter].
On budgets: agentic sessions are dominated by cache reads (80–90%+ of raw tokens), which every supported vendor bills at ~0.1x base input. The max_tokens_per_story check therefore uses a cost-weighted total — cache reads count at limits.cache_read_weight (default 0.1) — while displayed totals stay raw. Set the weight to 1.0 to budget raw tokens.
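The cost-weighted total is simple arithmetic, sketched below. The function names and the two-bucket split (cache reads versus everything else) are assumptions for illustration; the real check may track more token categories.

```python
def weighted_total(raw_total: int, cache_read: int, cache_read_weight: float = 0.1) -> float:
    """Count cache reads at a discount; every other token counts in full."""
    return (raw_total - cache_read) + cache_read * cache_read_weight


def exceeds_budget(
    raw_total: int,
    cache_read: int,
    max_tokens_per_story: int,
    cache_read_weight: float = 0.1,
) -> bool:
    """Compare the weighted total, not the raw one, against the story budget."""
    return weighted_total(raw_total, cache_read, cache_read_weight) > max_tokens_per_story
```

For a story with 1,000,000 raw tokens of which 900,000 are cache reads, the default weight gives 100,000 + 90,000 = 190,000 budgeted tokens, so a 500,000-token limit is not hit. With the weight set to 1.0 the same story counts as the full 1,000,000 and exceeds that limit.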
Shared prerequisites: the bmad-auto-* skills must be present in .agents/skills/ (codex and gemini read it; Claude Code reads .claude/skills/), and each CLI must have been run once interactively in the project for auth/trust — bmad-auto init --cli codex --cli gemini installs the skills into .agents/skills/, registers the hook relay, and prints the per-CLI first-run steps.
Adding a CLI without touching Python: drop a TOML file in <project>/.automator/profiles/<name>.toml (same fields as the built-ins: binary, prompt_template, bypass flags, a [hooks] block picking one of the config dialects claude-settings-json / codex-hooks-json / gemini-settings-json / copilot-settings-json, and a native→canonical event map). The hook relay script and orchestrator are CLI-agnostic — each registration passes the canonical event name as the script argument. A CLI whose hook config clones one of the existing dialects (the ecosystem trend) needs nothing else; a genuinely different transport gets its own adapter class instead (see the opencode HTTP+SSE design stub in adapters/opencode_http.py).
Finalizing a profile: the facts a profile needs that live in no doc — the CLI's exact hook payload shape, its transcript location/format, and the token schema a usage_parser reads — are collected and sanitized by bmad-auto probe-adapter <cli> (a zero-launch scan by default, or --probe for a live capture). The adapter authoring guide walks through using it end to end.
Cursor CLI is currently blocked on two gaps, for whoever picks it up: token usage is not exposed anywhere (hooks, JSON output, or on-disk chats), and slash-command expansion of the initial prompt argument is unverified — its sessionStart/stop hooks do fire in the CLI, so a profile using the window-death fallback plus usage_parser = "none" is feasible.
uv sync --all-extras # adds pytest, ruff, pytest-asyncio (+ the [tui] extra)
uv run pytest -q # unit + engine scenarios (mock adapter) + tmux integration
uv run ruff check src tests scripts
Regenerating the screenshots in this README: they're rendered headlessly from a populated mock project (no live engine needed); see scripts/gen_screenshots.py.
uv sync --extra tui
uv run python scripts/gen_screenshots.py # writes docs/images/*.svg + *.png (PNG needs `resvg` on PATH)
uv run python scripts/gen_demo.py # writes docs/images/demo.gif (needs `resvg` + `ffmpeg`)
The hero demo GIF (docs/images/demo.gif) is generated the same headless way: gen_demo.py drives the read-only TUI through a scripted walkthrough and stitches the frames with ffmpeg. (scripts/record-demo.sh is an alternative that records a real live run via VHS or asciinema, if you'd rather show actual agent sessions.)
- docs/FEATURES.md: full feature & functionality list and the capability matrix (feature → problem addressed).
- docs/setup-guide.md: installing the module + the /bmad-auto-setup walkthrough.
- docs/tui-guide.md: the complete TUI reference.
- src/automator/data/skills/README.md: the bauto skill module overview.
- docs/ROADMAP.md: planned/deferred orchestrator work and the rationale behind it.
Contributions are welcome. Start with CONTRIBUTING.md — for anything bigger than a typo or small bug fix, talk to a maintainer on Discord first. By participating you agree to our Code of Conduct. To report a vulnerability, see SECURITY.md.
bmad-auto is released under the MIT License, © BMad Code, LLC. The BMad name and brand are trademarks of BMad Code, LLC and are not covered by the MIT License — see TRADEMARK.md.





![The settings editor with the [scm] section expanded: isolation, branch_per, merge_strategy, the seed-adapter-configs switch, and the extra-worktree-seed-files field.](/bmad-code-org/bmad-auto/raw/main/docs/images/settings-scm.png)