Merged
2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
@@ -12,7 +12,7 @@
"name": "bauto",
"source": "./src/automator/data/skills",
"description": "Automation-mode skills driven by the bmad-auto orchestrator: unattended dev (bmad-auto-dev), adversarial review (bmad-auto-review), and deferred-work sweep triage (bmad-auto-sweep)",
-"version": "0.6.1",
+"version": "0.6.2",
"author": {
"name": "pinkyd"
},
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,28 @@ All notable changes to `bmad-auto` are documented here. The format is based on
[Semantic Versioning](https://semver.org/spec/v2.0.0.html). While the project is pre-1.0,
breaking changes may land in a minor release.

## [0.6.2] — 2026-06-21

### Added

- **`bmad-auto probe-adapter` (alias `collect-adapter-data`).** A self-service command that
collects and sanitizes everything needed to finalize a CLI adapter profile — the hook payload
shape, transcript location/format, and token-usage schema for a `usage_parser` — so a user of
any coding CLI can paste back a clean, content-free report. A default zero-launch **scan** reads
on-disk conventions; opt-in `--probe` does a live capture in an ephemeral workspace. All output
passes through one audited PII sanitizer (token counts and field names survive; paths, prose, and
emails are redacted).
- **GitHub Copilot CLI profile.** Bundled `copilot` profile (Copilot CLI ≥ 2026-02): `-i`
interactive launch, VS Code-compatible `Stop` hook, `--allow-all-tools` for unattended runs.
Still pending live E2E and a `usage_parser` — `probe-adapter` captures the token schema to write
one.

### Docs

- **Adapter authoring guide.** New [adapter authoring guide](docs/adapter-authoring-guide.md)
walks through finalizing a CLI profile with `probe-adapter` (scan vs probe, the PII model, and
the parser-writing loop); `probe-adapter` is added to both command references.

## [0.6.1] — 2026-06-20

### Added
@@ -429,6 +451,7 @@ enforced in CI.
implementation phase, driven by a Python control loop with hook-based session transport and
resumable on-disk run state.

[0.6.2]: https://github.com/bmad-code-org/bmad-auto/releases/tag/v0.6.2
[0.6.1]: https://github.com/bmad-code-org/bmad-auto/releases/tag/v0.6.1
[0.6.0]: https://github.com/bmad-code-org/bmad-auto/releases/tag/v0.6.0
[0.5.0]: https://github.com/bmad-code-org/bmad-auto/releases/tag/v0.5.0
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -182,7 +182,7 @@ Keep messages under 72 characters. Each commit = one logical change.
- **Tests** live under `tests/`; add or update them for behavior changes. The mock adapter lets most of the loop run without a live CLI.
- **Skills** ship as markdown under `src/automator/data/skills/` (the `bmad-auto-*` automation skills).
- **Plugins** extend the orchestrator via a `plugin.toml` manifest — see the [plugin authoring guide](docs/plugin-authoring-guide.md).
- **New coding CLIs** are usually a TOML profile, not Python — see the CLI adapter section in the [README](README.md).
- **New coding CLIs** are usually a TOML profile, not Python — see the CLI adapter section in the [README](README.md) and the [adapter authoring guide](docs/adapter-authoring-guide.md) (use `bmad-auto probe-adapter` to collect the hook/transcript/token data a profile needs).

---

52 changes: 28 additions & 24 deletions README.md

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions docs/FEATURES.md
@@ -113,8 +113,10 @@ See [README.md](../README.md) for the narrative overview and [setup-guide.md](se

- Generic tmux adapter drives any CLI fitting the tmux-injection + hook-signal transport; CLI specifics live in declarative TOML profiles.
- Supported, E2E-verified: `claude` (reference), `codex` (≥ 0.139), `gemini` (≥ 0.46).
- Bundled but pending live E2E verification: `copilot` (GitHub Copilot CLI ≥ 2026-02; VS Code-compatible `Stop` hook, `-i` interactive launch, `--allow-all-tools`).
- Per-stage CLI/model overrides: run dev on one CLI/model, review on another (`[adapter.dev]`, `[adapter.review]`, `[adapter.triage]`).
- Add a CLI without touching Python: drop a TOML profile in `.automator/profiles/<name>.toml` (binary, prompt template, bypass flags, hook dialect, native→canonical event map).
- `bmad-auto probe-adapter` collects + sanitizes the data needed to finalize/add a profile (hook payload shape, transcript location/format, token schema): a zero-launch scan by default, opt-in `--probe` for live capture. See the [adapter authoring guide](adapter-authoring-guide.md).

### Budgeting & cost tracking

@@ -170,4 +172,5 @@ See [README.md](../README.md) for the narrative overview and [setup-guide.md](se
- `bmad-auto cleanup` — remove leftover tmux artifacts for finished/stopped runs.
- `bmad-auto clean` — reclaim disk from concluded runs per `[cleanup]`: tear down worktrees a mid-flight stop orphaned, trim heavy `worktrees/` from runs kept for history, archive/delete past the retention window (`--dry-run`, `--keep`, `--retain N`, `--hard`).
- `bmad-auto tui` — the interactive dashboard (`--low-frame-rate` for slow/SSH links).
- `bmad-auto probe-adapter <cli>` (`collect-adapter-data`) — collect + sanitize adapter-finalization data for a CLI profile; default zero-launch scan, opt-in `--probe` live capture.
- Every command takes `--project <dir>` (default: current directory). Any `<run-id>` accepts a partial — the tail after the last `-`, shortened to any unique prefix.
1 change: 1 addition & 0 deletions docs/README.md
@@ -11,6 +11,7 @@ guides below go deeper, roughly in the order you'll need them.

## Extending bmad-auto

- **[Finalizing a CLI adapter profile](adapter-authoring-guide.md)** — using `bmad-auto probe-adapter` to collect + sanitize the hook payload shape, transcript location, and token schema a new CLI profile needs.
- **[Writing a bmad-auto plugin](plugin-authoring-guide.md)** — the plugin system: `plugin.toml` manifest, hooks, lifecycle stages, settings, the trust model, and workflow injection, with a worked walkthrough.
- **[Writing a Game Engine plugin](game-engine-plugin-guide.md)** — the game-engine layer (built on the plugin system): driving a live engine Editor, the `editor_mode` ↔ `[scm] isolation` coupling, a minimal Godot example.
- **[Writing a plugin for a specific Editor MCP](game-engine-mcp-guide.md)** — Editor-MCP specifics for the bundled Unity plugin: IvanMurzak vs CoplayDev, readiness probes, `per_worktree` isolation, and the full `BMAD_AUTO_*` env-var reference.
164 changes: 164 additions & 0 deletions docs/adapter-authoring-guide.md
@@ -0,0 +1,164 @@
# Finalizing a CLI adapter profile with `probe-adapter`

bmad-auto drives any coding CLI that fits the **tmux-injection + hook-signal**
transport through one generic adapter (`adapters/generic_tmux.py`); everything
CLI-specific lives in a declarative **TOML profile** (`adapters/profile.py`). The
[README adapter section](../README.md#other-coding-clis) covers the profile fields
and how to drop one in without touching Python.

The hard part of a new profile isn't the TOML — it's the **facts that live in no
doc**: the CLI's exact hook payload shape (field names and casing, whether
`session_id` / `transcript_path` / `cwd` are present), where it writes its session
transcript and in what format, and the token-usage schema a `usage_parser` has to
read. Historically the only way to get these was to hand a volunteer a manual
recipe and ask them to sanitize the output by hand — error-prone and PII-risky.

**`bmad-auto probe-adapter`** (alias `collect-adapter-data`) pulls all of that and
runs it through an audited sanitizer, so a user of any coding CLI can run one
command and paste back a clean, content-free report.

```bash
bmad-auto probe-adapter <cli> --project . # default: zero-launch scan
bmad-auto probe-adapter <cli> --probe --project . # opt-in live capture
```

---

## Two modes

Both modes emit the **same single sanitized report** (markdown to stdout, or to a
file with `--out`; add `--json` for a machine-readable block).

### SCAN (default — no process launch)

Runs `<binary> --version` / `--help`, locates the newest **already-existing**
session transcript by convention, reads the declared hook config, and infers the
token schema from the transcript. Works whenever you've used the CLI before, with
zero execution risk. This is the right first step for any CLI that already has a
profile (claude/codex/gemini/copilot) or that you've run by hand.

### PROBE (`--probe` — opt-in live capture)

In an ephemeral `mkdtemp` workspace, `probe` registers a full-payload capture hook
for every native event in the profile, launches **one trivial content-free turn**
(`Reply with exactly: OK`) in a tmux window, captures each hook event's complete
payload, locates the transcript, then tears everything down. Use it to confirm the
**exact hook payload shape** and that the CLI actually **accepts the hook dialect**
your profile declares — facts scan can't see without running the CLI.

`--probe` needs a known profile (it uses the profile's hook dialect and event map).
If `tmux` or the binary is missing, probe degrades gracefully to a scan.
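
The fallback described above can be pictured as a small pre-flight check. This is a minimal sketch under stated assumptions, not the command's actual code: `choose_mode` is a hypothetical helper, and it assumes "missing" simply means "not found on `PATH`".

```python
import shutil


def choose_mode(want_probe: bool, binary: str) -> str:
    """Return "probe" only when a live capture was requested and looks possible."""
    if not want_probe:
        return "scan"
    # Assumption: both tmux and the CLI binary must be resolvable on PATH.
    missing = [name for name in ("tmux", binary) if shutil.which(name) is None]
    if missing:
        print(f"warning: {', '.join(missing)} not found; falling back to scan")
        return "scan"
    return "probe"
```

A caller would branch on the returned string, so a missing dependency produces a warning and a scan report rather than a hard failure.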

---

## PII safety model

The report is built to be **safe to paste into an issue or PR**. A single audited
sanitizer (`src/automator/sanitize.py`) is the only chokepoint:

- **numbers, booleans, and `null` pass through** — token _counts_ are not PII;
- **dict keys are kept verbatim** — field names and casing are the whole point of
a payload probe;
- every **leaf string** is `$HOME`→`~` redacted and then kept **only if** it looks
like a short machine identifier (e.g. `claude-opus-4-8`, `session-abc_123`);
anything else — prose, code, paths, emails — becomes `<redacted:str>`;
- **list lengths are preserved**, contents are scrubbed element by element;
- `--help` / `--version` text and log tails have the home dir and any emails
redacted, with a line cap.
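
To make the rules above concrete, here is a minimal sketch of a recursive scrubber. It is an illustration only: `scrub`, `REDACTED`, and the identifier pattern are hypothetical and do not reproduce `src/automator/sanitize.py`; in particular, "short machine identifier" is assumed to mean at most 64 letters, digits, `-`, and `_`.

```python
import re

# Assumption: a "short machine identifier" is 1-64 chars of letters, digits, "-" and "_".
_IDENT = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")
REDACTED = "<redacted:str>"


def scrub(value, home: str):
    """Recursively scrub a decoded JSON value following the rules listed above."""
    if value is None or isinstance(value, (bool, int, float)):
        return value  # numbers, booleans, and null pass through
    if isinstance(value, str):
        text = value.replace(home, "~") if home else value  # $HOME -> ~ first
        return text if _IDENT.fullmatch(text) else REDACTED
    if isinstance(value, list):
        return [scrub(item, home) for item in value]  # list length is preserved
    if isinstance(value, dict):
        return {key: scrub(item, home) for key, item in value.items()}  # keys verbatim
    return REDACTED  # anything unexpected is treated as unsafe
```

Under this pattern, `claude-opus-4-8` and `session-abc_123` survive, while a path such as `~/proj` is redacted because `~` and `/` fall outside the allowed characters.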

In PROBE mode the raw capture exists **only transiently** inside the temp dir,
which is `rmtree`'d in a `finally` (even on exception or Ctrl-C). The CLI's own
transcript stays in its home dir — the command reads its _structure_, never copies
it. A hidden `--keep-temp` flag retains the raw temp dir for debugging and prints a
loud **"raw retained — do not share"** warning; never paste a `--keep-temp` run.
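
The temp-dir lifecycle described above follows a familiar `mkdtemp` plus `try`/`finally` shape. The sketch below is hypothetical (`run_in_ephemeral_workspace` and the `probe-sketch-` prefix are illustrative names, not the command's internals) and only shows why the cleanup also runs on an exception or Ctrl-C.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_ephemeral_workspace(action, keep_temp: bool = False):
    """Run action(workspace) in a fresh temp dir and remove it afterwards."""
    workspace = Path(tempfile.mkdtemp(prefix="probe-sketch-"))
    try:
        return action(workspace)
    finally:
        # The finally clause also runs for KeyboardInterrupt (Ctrl-C).
        if keep_temp:
            print(f"raw retained at {workspace}: do not share")
        else:
            shutil.rmtree(workspace, ignore_errors=True)
```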

---

## Walkthrough: finalizing a profile

### 1. Draft a profile

Drop a TOML file in `<project>/.automator/profiles/<name>.toml` with the fields
described in the [README adapter section](../README.md#other-coding-clis). The
contract is the `CLIProfile` / `HookSpec` dataclasses in
[`src/automator/adapters/profile.py`](../src/automator/adapters/profile.py): a
`binary`, a `prompt_template`, bypass flags, a `[hooks]` block picking one of the
config dialects (`claude-settings-json` / `codex-hooks-json` /
`gemini-settings-json` / `copilot-settings-json`) and a native→canonical event
map, and a `usage_parser` (start with `"none"` until you've written one).
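
Before scanning, a draft can be sanity-checked against the allowed values. The sketch below mirrors the `USAGE_PARSERS`, `HOOK_DIALECTS`, and `CANONICAL_EVENTS` sets in `src/automator/adapters/profile.py`; the `check_profile_fields` helper itself is hypothetical and is not how the project validates profiles.

```python
# Sets mirrored from src/automator/adapters/profile.py.
USAGE_PARSERS = {"claude-jsonl", "codex-rollout", "gemini-chat", "none"}
HOOK_DIALECTS = {
    "claude-settings-json",
    "codex-hooks-json",
    "gemini-settings-json",
    "copilot-settings-json",
}
CANONICAL_EVENTS = {"SessionStart", "Stop", "SessionEnd", "PreCompact"}


def check_profile_fields(usage_parser: str, hook_dialect: str, event_map: dict) -> list:
    """Return human-readable problems; an empty list means the values are known."""
    problems = []
    if usage_parser not in USAGE_PARSERS:
        problems.append(f"unknown usage_parser: {usage_parser!r}")
    if hook_dialect not in HOOK_DIALECTS:
        problems.append(f"unknown hook dialect: {hook_dialect!r}")
    # event_map is native -> canonical; only the canonical side is constrained.
    for native, canonical in event_map.items():
        if canonical not in CANONICAL_EVENTS:
            problems.append(f"{native!r} maps to unknown canonical event {canonical!r}")
    return problems
```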

### 2. Scan

```bash
bmad-auto probe-adapter <cli> --project .
```

Read three sections of the report:

- **CLI flags** — your profile's launch/bypass flags plus the scrubbed
`--version` / `--help`, so you can confirm the flags you chose exist.
- **Transcript** — the redacted location, format, size, line count, and modified
date of the newest transcript the convention glob found.
- **Token usage schema** — the structural key paths (types only, never values) and
the **token-field candidates** (int leaves whose names look token-ish). When a
real parser is already declared, its parsed counts are shown as a self-check.
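
As a rough illustration of how key paths and token-field candidates could be derived from one transcript record, consider the sketch below. The helper names and the "name contains `token`" heuristic are assumptions made for the example, not the command's actual inference logic.

```python
def key_paths(value, prefix=""):
    """Yield (path, type_name) for every leaf; the values themselves are never kept."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from key_paths(item, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            yield from key_paths(item, f"{prefix}[]")
    else:
        yield prefix, type(value).__name__


def token_field_candidates(record):
    """Int leaves whose final key name mentions "token" (an assumed heuristic)."""
    return sorted(
        {
            path
            for path, type_name in key_paths(record)
            if type_name == "int" and "token" in path.rsplit(".", 1)[-1].lower()
        }
    )
```

Booleans are reported as `bool` rather than `int`, so a flag such as `cached: true` never shows up as a candidate.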

### 3. Probe (confirm the live payload + dialect)

```bash
bmad-auto probe-adapter <cli> --probe --project /tmp/scratch
```

The **Hook payload shape** section now shows, per captured event, the native→
canonical pairing, the payload keys, and the scrubbed payload — so you can confirm
`session_id` / `transcript_path` casing and that the CLI accepted the hook config
for your dialect. If the CLI rejects the config or never fires a hook, the report
says so (with a scrubbed log tail) instead of silently producing nothing.

### 4. Write the `usage_parser`

Turn the report's `token_field_candidates` into a parser in
[`src/automator/tokens.py`](../src/automator/tokens.py), following the existing
ones (`tally` for claude, `tally_codex_rollout`, `tally_gemini_chat`) and
registering it in `read_usage`. The report flags **per-call vs cumulative** as a
human call — a `token_count`-style event that carries running totals (codex) is
read differently from per-message blocks that are summed (claude/gemini). Re-run
scan after wiring the parser: the **parsed counts** self-check should now appear.
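
The per-call versus cumulative distinction changes the arithmetic, which the following sketch makes explicit. Both helpers and the field names are hypothetical; they are not the existing functions in `tokens.py`.

```python
from collections import Counter


def tally_per_message(usage_blocks):
    """Per-message blocks: each block counts one call, so the fields are summed."""
    total = Counter()
    for block in usage_blocks:
        total.update(block)  # Counter.update adds counts field by field
    return dict(total)


def tally_cumulative(running_totals):
    """Running totals: every event already includes the past, so take the last one."""
    return dict(running_totals[-1]) if running_totals else {}
```

Summing a cumulative stream would overcount, and taking only the last per-message block would undercount, which is why the report leaves this decision to a human.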

---

## Flags reference

| Flag | Purpose |
| ------------------- | -------------------------------------------------------------------------------- |
| `--probe` | Opt-in live capture (default is scan). Needs a known profile. |
| `--transcript PATH` | Inspect this exact transcript file, bypassing convention discovery. |
| `--session-dir DIR` | Glob this dir (`**/*.jsonl` then `*.json`, newest) — for custom/unknown CLIs. |
| `--binary NAME` | Binary to probe for a CLI that has no profile yet (enables a reduced report). |
| `--model NAME` | Model passed to the probe turn (PROBE mode). |
| `--timeout SECONDS` | Probe turn timeout (default 90). |
| `--out FILE` | Write the report to a file instead of stdout (the only file the command writes). |
| `--json` | Append a machine-readable JSON block to the report. |
| `--keep-temp` | (hidden, debug) keep the raw probe temp dir — prints a "do not share" warning. |

Exit codes mirror `validate`: `0` whenever a report is produced (warnings are
fine), `1` only when nothing could be produced. An **unknown CLI with `--binary`**
still yields a _reduced_ report (version/help + discovery, no hook events); an
unknown CLI without `--binary` fails and lists the available profiles.
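
For the `--session-dir` row in the table above, one plausible reading of "`**/*.jsonl` then `*.json`, newest" is sketched below. It assumes the second pattern is only a fallback when the first finds nothing, and that "newest" means the latest modification time; `newest_transcript` is a hypothetical name.

```python
from pathlib import Path


def newest_transcript(session_dir):
    """Return the most recently modified match, or None when nothing matches."""
    root = Path(session_dir)
    # Assumption: try the recursive .jsonl pattern first, then top-level .json files.
    for pattern in ("**/*.jsonl", "*.json"):
        matches = [path for path in root.glob(pattern) if path.is_file()]
        if matches:
            return max(matches, key=lambda path: path.stat().st_mtime)
    return None
```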

---

## Worked example: copilot

The bundled `copilot` profile ships with `usage_parser = "none"` — Copilot's
token-usage schema hadn't been captured when the profile landed. That's exactly
the gap `probe-adapter` closes:

```bash
bmad-auto probe-adapter copilot --probe --project /tmp/scratch
```

captures the `Stop` payload (confirming `session_id` / `transcript_path` casing),
locates `~/.copilot/session-state/*/events.jsonl`, and infers its token schema —
the data needed to write a `copilot-*` parser in `tokens.py` and flip the profile's
`usage_parser` off `"none"`. Confirm the `mkdtemp` dir is gone afterward.
2 changes: 1 addition & 1 deletion module.yaml
@@ -1,7 +1,7 @@
code: bauto
name: BMAD Auto Skills
description: "Automation-mode skills driven by the bmad-auto orchestrator: unattended dev (bmad-auto-dev), adversarial review (bmad-auto-review), and deferred-work sweep triage (bmad-auto-sweep)"
-module_version: 0.6.1
+module_version: 0.6.2
default_selected: false
module_greeting: >
BMAD Auto installed — both the four automation skills and the
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "bmad-auto"
-version = "0.6.1"
+version = "0.6.2"
description = "Deterministic ralph-loop orchestrator for the BMAD implementation phase"
readme = "README.md"
license = "MIT"
2 changes: 1 addition & 1 deletion src/automator/__init__.py
@@ -6,4 +6,4 @@
spec files, and the per-run directory under .automator/runs/.
"""

-__version__ = "0.6.1"
+__version__ = "0.6.2"
7 changes: 6 additions & 1 deletion src/automator/adapters/profile.py
@@ -19,7 +19,12 @@
from pathlib import Path

USAGE_PARSERS = {"claude-jsonl", "codex-rollout", "gemini-chat", "none"}
-HOOK_DIALECTS = {"claude-settings-json", "codex-hooks-json", "gemini-settings-json"}
+HOOK_DIALECTS = {
+    "claude-settings-json",
+    "codex-hooks-json",
+    "gemini-settings-json",
+    "copilot-settings-json",
+}
CANONICAL_EVENTS = {"SessionStart", "Stop", "SessionEnd", "PreCompact"}
USER_PROFILES_REL = Path(".automator") / "profiles"

80 changes: 80 additions & 0 deletions src/automator/cli.py
@@ -885,6 +885,56 @@ def cmd_tui(args: argparse.Namespace) -> int:
    return run_tui(project)


def cmd_probe(args: argparse.Namespace) -> int:
    from . import probe as probe_mod
    from .adapters.profile import ProfileError, get_profile

    project = _project(args)
    hints = probe_mod.Hints(
        binary=args.binary,
        transcript=args.transcript,
        session_dir=args.session_dir,
        model=args.model,
    )

    profile = None
    try:
        profile = get_profile(args.cli, project)
    except ProfileError as e:
        if not args.binary:
            print(f"FAIL: {e}", file=sys.stderr)
            return 1
        print(f"  ok: unknown profile {args.cli!r}; reduced report from --binary {args.binary}")

    if args.probe:
        if profile is None:
            print("FAIL: --probe needs a known profile (its hook dialect/events)", file=sys.stderr)
            return 1
Comment on lines +903 to +912

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid printing a success message right before guaranteed --probe failure.

With unknown profile + --binary + --probe, Line 907 prints an “ok” message before Line 911 fails. Gate that message to non-probe mode to keep CLI output consistent.

Proposed fix
     except ProfileError as e:
         if not args.binary:
             print(f"FAIL: {e}", file=sys.stderr)
             return 1
-        print(f"  ok: unknown profile {args.cli!r}; reduced report from --binary {args.binary}")
+        if not args.probe:
+            print(f"  ok: unknown profile {args.cli!r}; reduced report from --binary {args.binary}")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/automator/cli.py` around lines 903 - 912, The success message printed on
line 907 (ok: unknown profile...) will be displayed even when --probe is
subsequently used and will fail, creating contradictory output. Gate this print
statement to only execute when args.probe is False, so the "ok" message is not
printed in scenarios where the --probe check will immediately fail due to the
unknown profile. This keeps the CLI output consistent by avoiding success
messages that are followed by guaranteed failure.

        finding = probe_mod.probe(
            cli=args.cli,
            profile=profile,
            project=project,
            hints=hints,
            timeout_s=args.timeout,
            keep_temp=args.keep_temp,
        )
    else:
        finding = probe_mod.scan(cli=args.cli, profile=profile, project=project, hints=hints)

    report = probe_mod.render_markdown(finding)
    if args.json:
        report = report + "\n\n## JSON\n\n```json\n" + probe_mod.render_json(finding) + "\n```\n"

    if args.out:
        out_path = Path(args.out)
        out_path.write_text(report, encoding="utf-8")

@augmentcode augmentcode Bot Jun 21, 2026

src/automator/cli.py:930 writes the --out report with Path.write_text() but doesn’t handle OSError (missing parent dir, permissions, etc.), so probe-adapter may crash with a traceback instead of returning a clean failure. That’s especially painful since the report content is otherwise already assembled and paste-safe.

Severity: medium
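
A minimal sketch of the handling this comment asks for, under stated assumptions: `write_report` is a hypothetical helper (not part of this PR), and the `FAIL:` message and exit code `1` simply follow the convention the surrounding `cmd_probe` code already uses.

```python
import sys
from pathlib import Path


def write_report(out_path, report: str) -> int:
    """Write the report and return a CLI-style exit code instead of raising."""
    try:
        Path(out_path).write_text(report, encoding="utf-8")
    except OSError as exc:  # missing parent dir, permissions, read-only filesystem, ...
        print(f"FAIL: could not write report to {out_path}: {exc}", file=sys.stderr)
        return 1
    return 0
```

`cmd_probe` could return the non-zero code directly, so a bad `--out` path yields a one-line failure rather than a traceback.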


        print(f" ok: report written to {out_path} ({len(finding.warnings)} warning(s))")
    else:
        print(report)
        print(f" ok: {finding.mode} report for {args.cli} ({len(finding.warnings)} warning(s))")
    return 0


def cmd_init(args: argparse.Namespace) -> int:
    from .install import install_into

@@ -935,6 +985,36 @@ def add(name: str, func, help: str, *, aliases=()) -> argparse.ArgumentParser:
    )
    add("validate", cmd_validate, "preflight checks; exit non-zero on failure")

    probe_p = add(
        "probe-adapter",
        cmd_probe,
        "collect + sanitize adapter-finalization data for a coding CLI",
        aliases=["collect-adapter-data"],
    )
    probe_p.add_argument(
        "cli", help="CLI profile name (claude | codex | gemini | copilot | custom)"
    )
    probe_p.add_argument(
        "--probe",
        action="store_true",
        help="opt-in LIVE capture: launch one trivial content-free turn in a temp "
        "workspace and capture real hook payloads (default: zero-launch scan)",
    )
    probe_p.add_argument(
        "--transcript", help="exact transcript file to inspect (overrides discovery)"
    )
    probe_p.add_argument(
        "--session-dir", help="dir to glob for the newest transcript (custom CLIs)"
    )
    probe_p.add_argument("--binary", help="binary name for a CLI with no profile yet")
    probe_p.add_argument("--model", help="model passed to the probe turn (probe mode)")
    probe_p.add_argument(
        "--timeout", type=float, default=90, help="probe turn timeout (default: 90s)"
    )
    probe_p.add_argument("--out", help="write the report to this file instead of stdout")
    probe_p.add_argument("--json", action="store_true", help="append a machine-readable JSON block")
    probe_p.add_argument("--keep-temp", action="store_true", help=argparse.SUPPRESS)

    run_p = add("run", cmd_run, "run the orchestration loop")
    run_p.add_argument("--epic", type=int, help="only stories from this epic")
    run_p.add_argument("--story", help="story: E-S / E.S, a slug fragment, or full key")