Release 0.7.0#11
Conversation
Replace the bmad-quick-dev fork (interactive workflow + an automation-mode.md decision table that existed only to override that interactivity) with a first-class machine-first skill: lean steps (resolve/plan/implement/review/ finalize), the orchestrator contract stated up front, no greeting/menus/HALTs. Preserves epic-context compilation, previous-story continuity, and the inline three-layer adversarial review: with review.enabled=false the dev session runs that inline triple-review itself before finalizing to done (a judge that did the planning is a better-informed judge); with review.enabled=true the orchestrator runs a separate fresh-context review session instead. Mirrors the upstream draft bmad-code-org/BMAD-METHOD#2498 (renamed bmad-dev-auto -> bmad-auto-dev to match the orchestrator's /bmad-auto-dev invocation). No engine change required. Also make the dev result.json `workflow` an enforced contract instead of a documented-but-ignored string: verify_dev/verify_dev_bundle reject a mismatch against verify.DEV_WORKFLOW ("auto-dev"), and the skill emits "auto-dev" rather than the misleading legacy "quick-dev". Review's "code-review" is documented as informational by design (verify_review is purely disk-derived and is never handed the result.json). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ordering Remediate review comments on PR #9: - verify_dev: the sprint-status gate accepted both 'review' and 'done' regardless of review_enabled, so a skip-review run that left the sprint at 'review' could pass dev verification. Make the expected sprint state conditional (mirrors the spec-status `expected`) and fix the hardcoded error text. Add a regression test + a docstring. (Augment, medium) - step-05-finalize.md / SKILL.md: "result.json is your last action" + "End the turn" contradicted the On Complete hook that runs afterward. Reword so result.json is the last *artifact* and the on_complete hook is an explicit numbered terminal step before ending the turn. (Augment, low) - skills/README.md: intro claimed all skills are upstream forks; bmad-auto-dev is standalone and resolve/sweep are automator-native. Soften the sentence to match the component table. (Augment, low) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- conftest.py: bundle_dev_effect docstring referenced a nonexistent step-04-finalize; the terminal finalize step is step-05-finalize. (Augment) - CHANGELOG: "a judge that did the planning" -> "who" (person referent). (CodeRabbit) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eam #2500 Fold three refinements from bmad-code-org/BMAD-METHOD#2500 into the skill: - Spec Boundaries gain a frozen "Block If" tier enumerating decisions that cannot be made unattended. Triggering one in step-03 (implement) or step-04 (review) escalates CRITICAL (type: block-if); step-02 populates it during planning. Reuses the existing CRITICAL pause path — no engine change. - sync-sprint-status now writes a machine-readable `sprint-sync-skipped` frontmatter warning when a story key is absent (was log-only). - compile-epic-context uses a deterministic fallback sentence when planning artifacts are missing, and drops "Edit freely" from the regenerated banner. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Integrate Alex Verhovsky's generic bmad-dev-auto skill (BMAD-METHOD #2500) as a dev primitive the orchestrator drives, selected via policy.dev.skill (= "bmad-auto-dev" default | "bmad-dev-auto"). The skill self-finalizes its spec on disk; the orchestrator synthesizes the result, owns sprint-status and the deferred-work ledger, and verifies on real on-disk state. - devcontract.py: synthesize_result / parse_auto_run_result / find_result_artifact + reset_spec_status; maps blocked -> CRITICAL, forges workflow:auto-dev. - sprintstatus.advance(): never-regress line editor over STATUS_ORDER, epic-lift. - GenericDevAdapter: synthesizes the result from the on-disk spec (skill writes no result.json), _stop_nudges = 0; cli wires it for the generic dev role. - engine: _generic_dev / _dev_review_enabled / _post_dev_state_sync (sprint + ledger single-writer), B6 spec reset+resume repair prompt. - sweep: generic-path bundles (orchestrator marks the ledger), flat deferred-work normalizer in parse_legacy. Validated end-to-end against real claude sessions (parity vs legacy, blocked -> CRITICAL escalation, B6 repair). Full suite green, trunk clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…auto (0.6.5)
Retire the bundled bmad-auto-dev fork entirely; the orchestrator now drives the
upstream bmad-dev-auto skill unmodified as its sole dev primitive. All
automator-specific concerns (escalation, result synthesis, sprint/ledger
bookkeeping, repair-resume) stay orchestrator-side. bmad-auto-review stays
bundled and runs by default, gated on review.enabled.
- DEV_SKILLS={bmad-dev-auto}; the generic dev path is now the only path —
GenericDevAdapter synthesizes result.json from the spec the skill leaves on
disk; engine/sweep/cli always take the generic branch.
- Removed bmad-auto-dev from MODULE_SKILLS and deleted the bundled fork dir;
relocated deferred-work-format.md into bmad-auto-review/ (a sibling dependency
of the review and sweep skills, not dev).
- Added install.missing_base_skills() preflight wired into validate +
run/sweep/resume: requires the upstream bmad-dev-auto skill, plus the review
hunters (bmad-review-adversarial-general, bmad-review-edge-case-hunter) when
review is enabled — fails loudly with remediation instead of stalling on an
Unknown command.
- Worktree provisioning copies the non-bundled upstream skills from the main
repo's installed tree, since they are no longer in the wheel.
- Bumped 0.6.4 -> 0.6.5 (sync_version.py) and updated docs + CHANGELOG. Full
suite (1008) green, trunk clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mended
Consume the new `followup_review_recommended` spec frontmatter signal from
bmad-dev-auto (BMAD-METHOD #2505). The skill self-reviews inline every story
and flags an independent follow-up review only when its review-driven changes
were significant; the orchestrator now gates its separate bmad-auto-review pass
on that signal.
- devcontract.synthesize_result: carry the flag into the result dict on done.
- model.StoryTask: persist followup_review_recommended (legacy state -> False).
- engine: capture the flag post-dev; _review_and_commit skips the review loop
(journalling review-not-recommended, validating gates + committing) unless
recommended.
- policy.ReviewPolicy.trigger ("recommended" default | "always") + validation +
REVIEW_TRIGGER_MODES; exposed in the TUI via core.toml and policy.toml.
- Cap reuses limits.max_review_cycles (default 3) as the oscillation guard;
no new knob.
Tests: dev-session fixtures emit the flag so the review-flow suite still runs
review under the new default; added devcontract/model/policy/settings-schema/
engine coverage. Docs + CHANGELOG updated. Full suite (1018) green, trunk clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…in-place Re-arming a resolved escalation requests a clean rebuild against the corrected spec; in non-worktree runs that means resetting the in-place tree to the story baseline. With the default scm.rollback_on_failure=false the orchestrator paused for a manual reset but never cleared baseline_commit, so following the notice (git reset --hard, then resume) re-paused on the next resume — an endless loop. - verify.attempt_dirty: true only when a safe_rollback would change anything (tracked diff vs baseline + run-created untracked beyond the snapshot). - _rollback_or_pause no-ops when the tree is already at baseline, so a clean tree (including one the operator just reset) proceeds to the re-drive, and an escalation that left no changes never triggers a spurious prompt. - Manual-recovery notice no longer claims the story "failed"; it states the real cause (resolved-escalation rebuild vs. stopped attempt) via a new StoryTask.rearmed flag set by runs.rearm_escalation. Tests: attempt_dirty unit coverage, a clean-skip test, a wording test, and an end-to-end resolve->resume regression proving the loop is gone. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…flips blocked status verify_dev only set task.spec_file after every check passed, so a dev session that HALTed with a `blocked` spec (the common escalation case) left spec_file unset. rearm_escalation then had no spec path to flip to `ready-for-dev`, so the resolved-escalation re-drive re-HALTed on the still-`blocked` frontmatter — a second loop after the manual-rollback one. The orchestrator now captures the spec the dev session produced (the synthesized result names it even on a HALT — devcontract.synthesize_result) at the escalate and defer decision points, before _escalate/_defer persist state. So re-arm flips the status and the re-drive proceeds into implementation, and a deferred story's spec is stashed as intended. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…m (Phase 1) Introduce the transport-axis seam so a future non-POSIX backend (native-Windows "psmux") can replace tmux without the adapters shelling out directly. WSL is Linux, so this is the seam, not a Windows backend; tmux stays the only backend. - adapters/multiplexer.py: TerminalMultiplexer ABC (full op contract) + cached get_multiplexer() factory. - adapters/tmux_backend.py: TmuxMultiplexer, the only file allowed to invoke tmux. Implements the generic adapter's op subset + the parked-window sh -c trailer (new_parked_window, ready for Phase 2); remaining ops stubbed. - adapters/generic_tmux.py -> generic.py: GenericAdapter routes every tmux body through self.mux; GenericTmuxAdapter alias kept for import stability. - cli._make_adapters: share one get_multiplexer() across roles. - tests: retarget monkeypatches to the backend; add test_multiplexer.py proving the adapter drives only the mux with no tmux on PATH. Behavior-preserving on Linux; runs.py/tui/launch.py/probe.py/tui/data.py deferred to Phase 2. Full suite green (1028), trunk clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…iplexer seam (Phase 2)
Quarantine every tmux invocation behind TmuxMultiplexer: runs.py, tui/launch.py,
probe.py, and tui/data.py now drive sessions/windows/clients through
get_multiplexer() instead of shelling out to tmux directly. After this phase no
file outside adapters/tmux_backend.py issues a raw tmux invocation
(shutil.which("tmux") presence checks aside). Behavior-preserving on Linux.
- tmux_backend: implement the Phase-2 contract surface (kill_session,
list_sessions, session_options, list_windows, window_alive, select_window,
set/unset/show_window_option, attach_target_argv, current_*, detach_client,
switch_client); new_session geometry is now optional (the control session
takes tmux's default size).
- multiplexer: add MultiplexerError (TmuxError subclasses it, so call sites
catch the seam type without importing the backend) and unset_window_option.
- runs/launch/probe/data: delegate; the sh -c parked-window trailer moves into
new_parked_window. tmux backend errors surface as LaunchError at the seam.
- tests: patch the tmux subprocess/which seam on tmux_backend; StubMux gains
unset_window_option.
- docs: finish the adapter-authoring-guide multiplexer seam section (the
transport contract a psmux backend author implements).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…latform_util (Phase 3) Add platform_util.py with portable process primitives so a future native-Windows multiplexer backend can slot in without re-auditing call sites: terminate_pid() (SIGTERM on POSIX, taskkill degrade on Windows), detach_kwargs() (start_new_session vs CREATE_NEW_PROCESS_GROUP), and a guarded SIGKILL constant. Linux/WSL behavior is unchanged. Route runs.stop_run's engine-pid kill through terminate_pid; swap the git-diff --no-index /dev/null literal in verify.py for os.devnull; add a portability ack to the shutil.which-guarded notify-send in gates.py. Unity-plugin POSIX-isms (the SIGKILL/start_new_session call sites that detach_kwargs and SIGKILL are built for) are deferred to Phase 4. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uards (Phase 4) Quarantine the Unity plugin's POSIX-isms behind sys.platform branches while keeping every Linux/WSL fast path byte-identical (none of the Windows branches are exercised yet — no native-Windows backend ships this pass). - unity_teardown.py: keep the /proc scan on Linux; add a psutil-based discovery path for non-Linux (lazy import of the optional `windows` extra with a clear error if missing). SIGKILL guarded via getattr; SIGTERM/SIGKILL routed through terminate/hard-kill helpers (taskkill on Windows); liveness uses psutil.pid_exists on Windows where os.kill(pid, 0) is destructive. - unity_cleanup.py: keep /tmp/<company>/<product> on Linux (Unity's temporaryCachePath); derive from tempfile.gettempdir() elsewhere, with a note that native Windows Unity uses a different cache root (documented follow-up). - unity_setup.py: cp -a --reflink fast path kept on POSIX, else shutil.copytree(dirs_exist_ok=True); symlink fallback wrapped in try/except OSError with a real-dir fallback + log; detach via a start_new_session/CREATE_NEW_PROCESS_GROUP helper. - pyproject.toml: add [project.optional-dependencies] windows = ["psutil>=7.2.2"] (current release); relock uv.lock. - docs: platform-behavior notes in plugin-authoring-guide.md and game-engine-plugin-guide.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…se 5) Add tests/test_portability_guard.py — an AST/byte scan of src/automator that runs in the existing pytest matrix and fails if a hard POSIX dependency sneaks back in after the multiplexer-seam + portability work (Phases 1-4): - no ["tmux", ...] argv outside adapters/tmux_backend.py (the quarantine); - no bare /tmp, /proc, /dev/null literal outside the platform-guarded Unity files (subpath- and shell-string-aware so 2>/dev/null and ~/.gemini/tmp/... don't false-positive); - no bare signal.SIGKILL attribute access (must go through the getattr guard); - no start_new_session=True outside the two detach helpers; - shell=True only in verify.py and plugins/bus.py. Each sanctioned exception (outside the wholesale tmux quarantine) now carries a `portability:` ack on its line, which the guard verifies is present — so exceptions stay deliberate. Adds those acks to the six existing exception sites. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final documentation sweep for the POSIX-decoupling work. Phases 1-5 (seam, portability fixes, Unity, CI guard) landed; this aligns the prose with what shipped. - README: reframe requirements to "tmux backend today, multiplexer- pluggable, WSL supported"; fix generic_tmux.py -> generic.py. - setup-guide: add a platform-prerequisites section (tmux required for the current backend, WSL = supported Linux target, native Windows is a future backend). - FEATURES: describe the CLI vs transport axes; tmux is one backend behind a pluggable seam. - tui-guide: note tmux is the only multiplexer backend today; WSL works. - ROADMAP: mark the multiplexer seam + portability hardening done (v0.6.5); native-Windows (psmux) backend is the remaining open item. - CHANGELOG: terse 0.6.5 entry for the seam + portability hardening. Verify: 1034 tests pass; full trunk check clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mad-auto-review bmad-dev-auto now commits its own work each iteration (BMAD-METHOD#2506) and routes a `done` spec to a fresh step-04 review pass (#2508). The orchestrator's follow-up review is now just `/bmad-dev-auto <done spec>` — no separate review skill — and the review session synthesizes its result from the spec like the dev session does. - verify.finalize_commit: squash the skill's per-iteration commits + the orchestrator's sprint/ledger writes back onto the pre-dev baseline into one story commit; pre_commit/post_commit hooks + message template stay authoritative - _review_and_commit converges on status + followup_review_recommended (was the bmad-auto-review `clean` field); review role gets a spec-synthesizing GenericDevAdapter - retire bmad-auto-review: delete the skill, relocate its deferred-work-format.md into bmad-auto-sweep, fold the two always-needed review hunters into DEV_BASE_SKILLS, scrub all references (install/marketplace/module/docs) - tests: review_effect/bundle_review_effect emit the synthesized shape; finalize_commit + review-adapter-wiring regression tests; sync dogfood bmad-dev-auto copies Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tion gaps
Close the remaining doc gaps after the dev/review skill retirement and the
POSIX-decoupling sweep:
- Fix the stale FEATURES line that still claimed the orchestrator drives "its
own forks of the dev/review skills" — it drives upstream bmad-dev-auto
unmodified and re-invokes it on the done spec for review.
- Expand the adapter authoring guide (retitled "Authoring CLI adapters &
profiles") with a complete CLIProfile/HookSpec field reference and a
"Writing a new adapter class" section for non-tmux transports (CodingCLIAdapter
ABC, the SessionSpec/Handle/Result contracts, required/optional methods).
- Link the three authoring guides from the README Documentation index and
repoint the "Other coding CLIs" profile-fields claim at the guide as the single
canonical schema home.
- Correct the setup-guide uninstall block ("three bundled skills", not five) and
note dev/review are retired.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…strator now dri…
Pure prettier reformat of the new Profile/HookSpec reference tables — no content change. trunk fmt produced it during release prep; release.py only stages version files + CHANGELOG, so it lands here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it. |
|
Warning Review limit reached
More reviews will be available in 26 minutes and 28 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
WalkthroughThis PR moves terminal control behind a multiplexer seam, switches orchestration to upstream ChangesUpstream
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/setup-guide.md (1)
36-42: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick winUpdate the installer count.
The BMAD-method installer no longer copies five bundled skill dirs; the wheel now ships only
bmad-auto-resolve,bmad-auto-sweep, andbmad-auto-setup(src/automator/install.py). Please fix this count so the setup docs don't tell users to expect retireddev/reviewbundles.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/setup-guide.md` around lines 36 - 42, The setup guide still says the BMAD-method installer copies five bundled skill directories, but the installer behavior now only includes the remaining skill bundles. Update the installer count and wording in the setup docs to match the current bundle set referenced by src/automator/install.py, and remove any mention of retired dev/review packages so the description stays aligned with what the installer actually copies.src/automator/runs.py (1)
233-241: 🩺 Stability & Availability | 🟡 MinorRefactor liveness probe to use a portable helper.
The wait loop in
src/automator/runs.pycurrently usesos.kill(pid, 0)to check process existence. While signal 0 is non-destructive on Windows,os.killsemantics for existence checks vary by OS and Python version (raisingPermissionErrororProcessLookupError). Encapsulate this logic inplatform_util.pid_alive()to ensure consistent, portable behavior across the codebase.Current implementation
# src/automator/runs.py:239-242 while time.monotonic() < deadline: try: os.kill(pid, 0) except OSError:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/automator/runs.py` around lines 233 - 241, The liveness check in runs.wait-for-termination still uses os.kill(pid, 0) directly, which should be replaced with the shared portable helper. Update the wait loop in the process termination flow to call platform_util.pid_alive() instead of probing with os.kill, and keep the existing fallback behavior when the helper reports the pid is gone. Use the existing terminate_pid and _STOP_WAIT_S logic as-is, only swapping the existence check to the helper so process liveness is handled consistently across platforms.
🧹 Nitpick comments (2)
tests/test_devcontract.py (1)
145-170: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a direct fallback-artifact test.
find_result_artifact()has special behavior forbmad-dev-auto-result-*.mdfiles without an Auto Run Result heading; the unit tests currently only cover heading-based discovery and ignoring non-heading files.Suggested test
+def test_find_artifact_accepts_no_spec_fallback_prefix(tmp_path): + fallback = tmp_path / "bmad-dev-auto-result-unclear-1234.md" + fallback.write_text( + "---\nstatus: blocked\n---\n\n# BMad Dev Auto Result\n\n" + "Status: blocked\nBlocking condition: unclear intent\n", + encoding="utf-8", + ) + assert devcontract.find_result_artifact(tmp_path, since_ns=0) == fallback +🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_devcontract.py` around lines 145 - 170, Add a focused unit test for find_result_artifact that covers the fallback bmad-dev-auto-result-*.md path: create a matching artifact file without the Auto Run Result heading, set its mtime so it is the newest candidate, and assert devcontract.find_result_artifact returns it. Place the new test alongside the existing test_find_artifact_* cases and use the same tmp_path/_spec patterns so the fallback behavior in find_result_artifact is exercised directly.tests/test_policy.py (1)
18-18: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueOptional: silence RUF043 on
pytest.raises(match=...).The
match=strings contain unescaped regex metacharacters (.), which Ruff flags (RUF043). The current patterns still match (.is any-char), so this is cosmetic — escape the dot or mark the strings raw if you want the lint clean, e.g.match=r"dev\.skill".Also applies to: 21-21, 36-36
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_policy.py` at line 18, The pytest.raises match patterns in test_policy.py use unescaped dots, which triggers RUF043 even though the assertions still work; update the affected match strings in the policy tests to use escaped dots or raw regex strings, and locate the changes in the pytest.raises calls that reference policy.PolicyError and the dev.skill pattern.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Around line 18-23: The CHANGELOG entry for the review flow is inconsistent
about the control path, so reconcile the wording in the bullet that mentions
`followup_review_recommended` and the later review-trigger section to describe a
single mechanism. Update the release note text around `bmad-dev-auto`,
`review.enabled`, and `review.trigger` so it matches the actual behavior and
does not claim both that the flag is no longer used and that it still gates the
flow.
In `@docs/adapter-authoring-guide.md`:
- Around line 71-72: The documentation text contains a typo where “aliveness” is
used instead of the actual contract method name window_alive, which can confuse
adapter authors. Update the wording in the surrounding explanation to refer
consistently to window_alive and keep the liveness description aligned with the
list-windows membership behavior mentioned in this section.
In `@docs/FEATURES.md`:
- Around line 50-54: The capability matrix entry for “Fresh-context adversarial
review” is inconsistent with the rewritten review section. Update the matrix
text in FEATURES.md so it matches the current review flow described by the
review-related bullets and the follow-up review behavior, keeping the layer
count and wording aligned with the Blind Hunter / Edge Case Hunter model used
elsewhere in the doc.
In `@docs/ROADMAP.md`:
- Line 12: The status line in ROADMAP is attributing the multiplexer seam and
portability hardening to the wrong release, so update the foundation version
stamp to match the release that actually introduces those changes. Adjust the
text in the ROADMAP entry so the provenance note aligns with the 0.7.0 release,
and verify the wording around the seam/hardening mention stays consistent with
the version naming used elsewhere.
In `@pyproject.toml`:
- Around line 21-24: The non-Linux teardown path in unity_teardown.py uses
psutil on macOS as well as Windows, but pyproject.toml only exposes it under the
windows extra, so update the dependency extras to a broader non-Linux scope (or
add it to both windows and macos) and make sure the install guidance matches
that new extra. Also adjust the fallback/import error messaging in
unity_teardown.py so the suggested pip install command reflects the actual extra
name used for _lingering_pids_psutil and the non-Linux branch.
In `@src/automator/adapters/generic.py`:
- Around line 291-298: The GenericDevAdapter._result_json method is ignoring the
wait flag and returning too early when the result artifact is not yet flushed to
disk. Update _result_json to mirror the polling logic used in _read_result: when
wait=True, keep retrying devcontract.find_result_artifact with a short sleep
loop until the artifact appears or the deadline is reached, then synthesize the
result; when wait=False, preserve the current immediate return behavior. Keep
the change localized to _result_json and use the existing handle.launched_ns,
spec, and story_key flow.
In `@src/automator/data/plugins/unity/unity_setup.py`:
- Around line 207-222: The fast-copy branch in _copy_library currently treats
any non-Windows system with cp installed as safe for _cp_library_tree, but
macOS’s BSD cp does not support --reflink. Update the platform check around
_copy_library and _cp_library_tree so the cp reflink path is only used on Linux
or when GNU coreutils is explicitly available; otherwise fall back to
_copytree_library or another macOS-safe clone/deep-copy path.
In `@src/automator/data/skills/bmad-auto-setup/SKILL.md`:
- Around line 74-76: Clarify the handoff description in SKILL.md where bmad-auto
orchestrator and bmad-dev-auto are mentioned: the current wording implies
bmad-dev-auto itself performs the follow-up review pass, but the new flow
re-invokes it on the done spec as a separate pass. Update the sentence around
the bmad-auto orchestrator and bmad-dev-auto references so it explicitly says
the orchestrator starts a second bmad-dev-auto run for follow-up review rather
than attributing that pass to the upstream primitive itself.
In `@src/automator/platform_util.py`:
- Around line 31-33: The Windows branch in platform_util’s process-termination
logic resolves taskkill by name, which can be hijacked via PATH or the working
directory. Update the subprocess.run call in the Windows-specific kill path to
invoke the System32 taskkill binary explicitly, keeping the existing win32 check
and taskkill behavior otherwise unchanged.
In `@src/automator/probe.py`:
- Around line 416-417: The probe window handoff in AutomatorProbe still lets
MultiplexerError escape from mux.pipe_pane, which breaks the intended race
tolerance. Update the code around the pipe_pane call in AutomatorProbe to catch
backend failures and fall back to the normal log-less probing path when a window
exits instantly, using the mux.pipe_pane and related probe-window logic as the
fix point.
In `@src/automator/sprintstatus.py`:
- Around line 200-217: The status update flow in story_status()/advance() can
report success even when _set_mapping_value() did not actually rewrite the story
entry, especially for keys that match STORY_RE but are not writable. Make
advance() track whether the specific story_key mapping was changed first, and
only return target when that rewrite succeeded; otherwise preserve the original
status or signal no-op. Keep the epic and last_updated updates gated by the same
changed state in the story-status path so the function does not claim success
unless the story line was actually updated.
In `@src/automator/tui/data.py`:
- Around line 81-90: The `_session_liveness` helper only treats `OSError` as a
failure path, so confirm whether `mux.has_session(session_name(run_id))` can
raise `MultiplexerError` and, if so, catch that exception (or its base) as well
so transient backend failures still return "unknown" instead of escaping through
`liveness()` and `discover_runs()`. Also verify `session_name` is imported and
available in this module since it is used by `_session_liveness`.
In `@src/automator/verify.py`:
- Around line 802-814: The squash-commit flow in verify.py should preserve the
original branch state if the commit step fails: in the helper that runs git
add/reset/commit, capture the current HEAD before the soft reset, and if the
`git commit` call in that sequence fails, restore HEAD back to that saved ref
before raising `GitError`. Use the existing `_git` helper and the commit path
around the `git reset --soft` and `git commit` calls to make the rollback happen
consistently.
---
Outside diff comments:
In `@docs/setup-guide.md`:
- Around line 36-42: The setup guide still says the BMAD-method installer copies
five bundled skill directories, but the installer behavior now only includes the
remaining skill bundles. Update the installer count and wording in the setup
docs to match the current bundle set referenced by src/automator/install.py, and
remove any mention of retired dev/review packages so the description stays
aligned with what the installer actually copies.
In `@src/automator/runs.py`:
- Around line 233-241: The liveness check in runs.wait-for-termination still
uses os.kill(pid, 0) directly, which should be replaced with the shared portable
helper. Update the wait loop in the process termination flow to call
platform_util.pid_alive() instead of probing with os.kill, and keep the existing
fallback behavior when the helper reports the pid is gone. Use the existing
terminate_pid and _STOP_WAIT_S logic as-is, only swapping the existence check to
the helper so process liveness is handled consistently across platforms.
---
Nitpick comments:
In `@tests/test_devcontract.py`:
- Around line 145-170: Add a focused unit test for find_result_artifact that
covers the fallback bmad-dev-auto-result-*.md path: create a matching artifact
file without the Auto Run Result heading, set its mtime so it is the newest
candidate, and assert devcontract.find_result_artifact returns it. Place the new
test alongside the existing test_find_artifact_* cases and use the same
tmp_path/_spec patterns so the fallback behavior in find_result_artifact is
exercised directly.
In `@tests/test_policy.py`:
- Line 18: The pytest.raises match patterns in test_policy.py use unescaped
dots, which triggers RUF043 even though the assertions still work; update the
affected match strings in the policy tests to use escaped dots or raw regex
strings, and locate the changes in the pytest.raises calls that reference
policy.PolicyError and the dev.skill pattern.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5b5141ba-b10b-498f-bf9d-5e6114b9ffce
⛔ Files ignored due to path filters (12)
docs/images/dashboard.pngis excluded by!**/*.pngdocs/images/dashboard.svgis excluded by!**/*.svgdocs/images/demo.gifis excluded by!**/*.gifdocs/images/settings-scm.pngis excluded by!**/*.pngdocs/images/settings-scm.svgis excluded by!**/*.svgdocs/images/settings.svgis excluded by!**/*.svgdocs/images/start-run-modal.pngis excluded by!**/*.pngdocs/images/start-run-modal.svgis excluded by!**/*.svgdocs/images/sweep-decision.pngis excluded by!**/*.pngdocs/images/sweep-decision.svgis excluded by!**/*.svgsrc/automator/data/skills/bmad-auto-setup/assets/module-help.csvis excluded by!**/*.csvuv.lockis excluded by!**/*.lock
📒 Files selected for processing (88)
.claude-plugin/marketplace.json.gitignoreCHANGELOG.mdREADME.mddocs/FEATURES.mddocs/ROADMAP.mddocs/adapter-authoring-guide.mddocs/game-engine-plugin-guide.mddocs/plugin-authoring-guide.mddocs/setup-guide.mddocs/tui-guide.mdmodule.yamlpyproject.tomlscripts/gen_screenshots.pysrc/automator/__init__.pysrc/automator/adapters/generic.pysrc/automator/adapters/multiplexer.pysrc/automator/adapters/tmux_backend.pysrc/automator/cli.pysrc/automator/data/plugins/unity/unity_cleanup.pysrc/automator/data/plugins/unity/unity_setup.pysrc/automator/data/plugins/unity/unity_teardown.pysrc/automator/data/profiles/copilot.tomlsrc/automator/data/settings/core.tomlsrc/automator/data/skills/README.mdsrc/automator/data/skills/bmad-auto-dev/SKILL.mdsrc/automator/data/skills/bmad-auto-dev/automation-mode.mdsrc/automator/data/skills/bmad-auto-dev/compile-epic-context.mdsrc/automator/data/skills/bmad-auto-dev/customize.tomlsrc/automator/data/skills/bmad-auto-dev/spec-template.mdsrc/automator/data/skills/bmad-auto-dev/step-01-clarify-and-route.mdsrc/automator/data/skills/bmad-auto-dev/step-02-plan.mdsrc/automator/data/skills/bmad-auto-dev/step-03-implement.mdsrc/automator/data/skills/bmad-auto-dev/step-04-review.mdsrc/automator/data/skills/bmad-auto-dev/step-05-present.mdsrc/automator/data/skills/bmad-auto-dev/step-auto-finalize.mdsrc/automator/data/skills/bmad-auto-dev/step-oneshot.mdsrc/automator/data/skills/bmad-auto-dev/sync-sprint-status.mdsrc/automator/data/skills/bmad-auto-resolve/SKILL.mdsrc/automator/data/skills/bmad-auto-review/SKILL.mdsrc/automator/data/skills/bmad-auto-review/automation-mode.mdsrc/automator/data/skills/bmad-auto-review/customize.tomlsrc/automator/data/skills/bmad-auto-review/steps/step-01-gather-context.mdsrc/automator/data/skills/bmad-auto-review/steps/step-02-review.mdsrc/automator/data/skills/bmad-auto-review/steps/step-03-triage.mdsrc/automator/data/skills/bmad-auto-review/steps/step-04-present.mdsrc/automator/data/skills/bmad-auto-setup/SKILL.mdsrc/automator/data/skills/bmad-auto-setup/assets/module.yamlsrc/automator/data/skills/bmad-auto-sweep/deferred-work-format.mdsrc/automator/data/skills/bmad-auto-sweep/migration-mode.mdsrc/automator/deferredwork.pysrc/automator/devcontract.pysrc/automator/engine.pysrc/automator/escalation.pysrc/automator/gates.pysrc/automator/install.pysrc/automator/model.pysrc/automator/platform_util.pysrc/automator/plugins/bus.pysrc/automator/policy.pysrc/automator/probe.pysrc/automator/runs.pysrc/automator/sprintstatus.pysrc/automator/sweep.pysrc/automator/tui/data.pysrc/automator/tui/launch.pysrc/automator/verify.pysrc/automator/workspace.pytests/conftest.pytests/test_cli.pytests/test_deferredwork.pytests/test_devcontract.pytests/test_engine.pytests/test_engine_worktree.pytests/test_generic_tmux.pytests/test_install.pytests/test_model.pytests/test_multiplexer.pytests/test_policy.pytests/test_portability_guard.pytests/test_profile.pytests/test_runs.pytests/test_settings_schema.pytests/test_sprintstatus_advance.pytests/test_sweep.pytests/test_tui_data.pytests/test_tui_launch.pytests/test_verify.py
💤 Files with no reviewable changes (20)
- src/automator/data/skills/bmad-auto-dev/step-03-implement.md
- src/automator/data/skills/bmad-auto-review/steps/step-03-triage.md
- src/automator/data/skills/bmad-auto-review/steps/step-02-review.md
- src/automator/data/skills/bmad-auto-dev/SKILL.md
- src/automator/data/skills/bmad-auto-dev/step-auto-finalize.md
- src/automator/data/skills/bmad-auto-dev/step-01-clarify-and-route.md
- src/automator/data/skills/bmad-auto-dev/spec-template.md
- src/automator/data/skills/bmad-auto-dev/compile-epic-context.md
- src/automator/data/skills/bmad-auto-review/automation-mode.md
- src/automator/data/skills/bmad-auto-dev/step-04-review.md
- src/automator/data/skills/bmad-auto-review/SKILL.md
- src/automator/data/skills/bmad-auto-dev/step-02-plan.md
- src/automator/data/skills/bmad-auto-review/customize.toml
- src/automator/data/skills/bmad-auto-dev/step-oneshot.md
- src/automator/data/skills/bmad-auto-dev/customize.toml
- src/automator/data/skills/bmad-auto-dev/sync-sprint-status.md
- src/automator/data/skills/bmad-auto-dev/step-05-present.md
- src/automator/data/skills/bmad-auto-review/steps/step-04-present.md
- src/automator/data/skills/bmad-auto-dev/automation-mode.md
- src/automator/data/skills/bmad-auto-review/steps/step-01-gather-context.md
| # pipe-pane may race a window that dies instantly; tolerate failure. | ||
| self._tmux("pipe-pane", "-t", window_id, "-o", f"cat >> {shlex.quote(str(log_file))}") | ||
| self.mux.pipe_pane(window_id, log_file) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Catch pipe_pane backend failures.
Line 417 can still raise MultiplexerError, despite the comment saying this race is tolerated. Wrap it so instantly-exiting probe windows degrade to normal log-less probing instead of crashing.
Proposed fix
except MultiplexerError:
return None
# pipe-pane may race a window that dies instantly; tolerate failure.
- self.mux.pipe_pane(window_id, log_file)
+ try:
+ self.mux.pipe_pane(window_id, log_file)
+ except MultiplexerError:
+ pass
return window_id📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # pipe-pane may race a window that dies instantly; tolerate failure. | |
| self._tmux("pipe-pane", "-t", window_id, "-o", f"cat >> {shlex.quote(str(log_file))}") | |
| self.mux.pipe_pane(window_id, log_file) | |
| # pipe-pane may race a window that dies instantly; tolerate failure. | |
| try: | |
| self.mux.pipe_pane(window_id, log_file) | |
| except MultiplexerError: | |
| pass |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/automator/probe.py` around lines 416 - 417, The probe window handoff in
AutomatorProbe still lets MultiplexerError escape from mux.pipe_pane, which
breaks the intended race tolerance. Update the code around the pipe_pane call in
AutomatorProbe to catch backend failures and fall back to the normal log-less
probing path when a window exits instantly, using the mux.pipe_pane and related
probe-window logic as the fix point.
Validated all 13 inline findings against the tree; 12 valid, 1 skipped (probe pipe_pane — the seam contract + tmux backend already tolerate the window-death race, so re-wrapping is dead code). Correctness / stability: - generic: GenericDevAdapter._result_json honors `wait` — polls for the spec artifact instead of reading once, so a slow flush isn't a false stall. - sprintstatus: advance() returns the unchanged status (not target) when the story line couldn't be rewritten (quoted/block keys YAML resolves but the line-regex can't), and leaves the file untouched. - tui/data + tmux_backend: liveness degrades to "unknown" on a backend failure — has_session wraps timeout/OSError into TmuxError and _session_liveness catches MultiplexerError — instead of crashing the poll. - verify: finalize_commit restores HEAD if the squash commit fails, so a rejecting hook can't drop the skill commit chain. Security: - Resolve taskkill from %SystemRoot%\System32 (platform_util + unity_teardown x2) so PATH/CWD can't hijack it. macOS portability: - unity_setup: gate the cp --reflink fast path to Linux (BSD cp rejects --reflink, which silently forced cold Unity imports on macOS). - Rename the `windows` extra to `non-linux` (psutil is needed on macOS too); update the error message, docs, and uv.lock. Docs: - Reconcile the CHANGELOG review-trigger wording; fix the aliveness -> window_alive typo; FEATURES matrix 3 -> 2 review layers; correct three stale 0.6.5 -> 0.7.0 version stamps; clarify the follow-up-review handoff in bmad-auto-setup SKILL.md. Adds targeted tests for each behavioral fix. Full suite green (1042); trunk clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addressed CodeRabbit review (commit 11411ff)Validated all 13 inline findings against the tree — 12 addressed, 1 skipped. Full suite green (1042 tests), FixedCorrectness / stability
Security
macOS portability
Docs
Each behavioral fix has a targeted test. Skipped (1)
The docstring-coverage pre-merge warning is left as-is — an advisory threshold, not a correctness issue. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/automator/data/skills/bmad-auto-setup/SKILL.md`:
- Line 74: Update the setup description in SKILL.md so it no longer says the
orchestrator spawns “Claude Code sessions”; in the bmad-auto orchestrator
wording, replace that with adapter-neutral language such as “coding CLI
sessions” or “adapter sessions” to cover claude, codex, and gemini consistently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6f61f86c-fcc6-476b-99f6-e83a88209f5e
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (20)
CHANGELOG.mddocs/FEATURES.mddocs/ROADMAP.mddocs/adapter-authoring-guide.mddocs/game-engine-plugin-guide.mddocs/setup-guide.mdpyproject.tomlsrc/automator/adapters/generic.pysrc/automator/adapters/tmux_backend.pysrc/automator/data/plugins/unity/unity_setup.pysrc/automator/data/plugins/unity/unity_teardown.pysrc/automator/data/skills/bmad-auto-setup/SKILL.mdsrc/automator/platform_util.pysrc/automator/sprintstatus.pysrc/automator/tui/data.pysrc/automator/verify.pytests/test_generic_tmux.pytests/test_sprintstatus_advance.pytests/test_tui_data.pytests/test_verify.py
✅ Files skipped from review due to trivial changes (4)
- CHANGELOG.md
- docs/ROADMAP.md
- docs/game-engine-plugin-guide.md
- docs/adapter-authoring-guide.md
🚧 Files skipped from review as they are similar to previous changes (11)
- tests/test_tui_data.py
- src/automator/sprintstatus.py
- tests/test_sprintstatus_advance.py
- src/automator/tui/data.py
- src/automator/data/plugins/unity/unity_setup.py
- docs/setup-guide.md
- tests/test_generic_tmux.py
- docs/FEATURES.md
- src/automator/adapters/generic.py
- tests/test_verify.py
- src/automator/verify.py
|
augment review |
🤖 Augment PR SummarySummary: Release 0.7.0 — a major reshaping of how the orchestrator drives development and review. Key changes:
Technical notes: The 🤖 Was this summary useful? React with 👍 or 👎 |
| **Gate modes:** `none` runs everything unattended; `per-epic` (default) pauses at epic boundaries; `per-story-spec-approval` pauses after each spec is written so you approve it before implementation is reviewed. | ||
|
|
||
| **Review:** `[review].enabled = false` drops the separate fresh-context review session; the dev pass instead runs `bmad-quick-dev`'s own internal triple-review (Blind Hunter / Edge Case Hunter / Acceptance Auditor) and finalizes the story straight to `done` — one session per story instead of two, verify commands still gating the commit. Governs deferred-work sweeps too. | ||
| **Review:** `[review].enabled = false` drops the separate fresh-context review session; the dev pass instead runs `bmad-quick-dev`'s own internal triple-review (Blind Hunter / Edge Case Hunter / Acceptance Auditor) and finalizes the story straight to `done` — one session per story instead of two, verify commands still gating the commit. Governs deferred-work sweeps too. When review is enabled, `[review].trigger` decides _when_ that separate pass runs: `recommended` (default) only when the `bmad-dev-auto` session flags `followup_review_recommended` — it already triple-reviews inline and recommends an independent pass only when its changes were significant; `always` runs it every story. The follow-up loop is bounded by `limits.max_review_cycles` (default 3), which caps oscillation. |
There was a problem hiding this comment.
This modified line still references bmad-quick-dev and the three-layer review ("Blind Hunter / Edge Case Hunter / Acceptance Auditor"), but the PR retires bmad-quick-dev in favor of bmad-dev-auto and drops the Acceptance Auditor layer — the rest of the PR (FEATURES.md, CHANGELOG, SKILL.md) consistently references two layers (Blind Hunter + Edge Case Hunter). Consider updating this sentence to match.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Reconcile docs with the 0.7.0 behavior change and close a real Windows portability bug surfaced by the automated reviews: - README: review fallback runs bmad-dev-auto's two-layer review (Blind Hunter / Edge Case Hunter), not bmad-quick-dev's triple-review. - setup SKILL.md: adapter-neutral "coding CLI sessions" (claude/codex/gemini). - setup-guide: installer copies three skill dirs, not five. - platform_util.pid_alive(): os.kill(pid, 0) is destructive on Windows (TerminateProcess); route the three unguarded liveness checks (runs.py x2, tui/data.py) through it, and extend test_portability_guard to police bare os.kill(_, 0) so it can't regress. - test_devcontract: cover the no-spec fallback-artifact (bmad-dev-auto-result-*) discovery path. Skipped as invalid: probe.py pipe_pane (already tolerant per the multiplexer contract); RUF043 nitpick (not selected, lint green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed the remaining CodeRabbit + Augment findings in 38380c5 (prior batch was 11411ff): Fixed
Skipped (invalid / no-op)
Full |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/automator/platform_util.py`:
- Around line 47-50: Reject non-positive PIDs before any process probe or
termination in pid_alive() and terminate_pid() within platform_util.py. Add an
early guard for pid <= 0 so os.kill() and any platform-specific termination path
are never reached with 0 or negative values, and make both methods fail fast for
invalid PIDs. Use the existing pid_alive, terminate_pid, and os.kill call sites
as the places to update.
In `@src/automator/runs.py`:
- Around line 47-49: The current write_pid() implementation stores only
os.getpid(), which can be reused by another process and cause engine_alive() and
stop_run() to act on the wrong process. Update the PID_FILE format to include a
unique engine identity token in addition to the numeric PID, and have
write_pid(), engine_alive(), and stop_run() read and verify that token before
treating the run as active or sending a stop signal. Use a stable per-run marker
such as a UUID or process start timestamp, and reject stale PID files whose
identity does not match the current engine instance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8934d8a8-e1c5-4e64-a15f-52a3955d5661
📒 Files selected for processing (9)
README.mddocs/setup-guide.mdsrc/automator/data/plugins/unity/unity_teardown.pysrc/automator/data/skills/bmad-auto-setup/SKILL.mdsrc/automator/platform_util.pysrc/automator/runs.pysrc/automator/tui/data.pytests/test_devcontract.pytests/test_portability_guard.py
✅ Files skipped from review due to trivial changes (2)
- docs/setup-guide.md
- README.md
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/test_devcontract.py
- tests/test_portability_guard.py
- src/automator/data/plugins/unity/unity_teardown.py
| """Record the engine process pid. Never deleted: a stale pid that | ||
| ``pid_alive`` reports as gone is the signal that a run was interrupted.""" | ||
| (run_dir / PID_FILE).write_text(str(os.getpid()), encoding="utf-8") |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check whether engine.pid has any companion identity/start-time/heartbeat validation.
rg -n -C3 'engine\.pid|PID_FILE|write_pid|read_pid|pid_alive|terminate_pid' src testsRepository: bmad-code-org/bmad-auto
Length of output: 20505
🏁 Script executed:
#!/bin/bash
# Verify if platform_util.pid_alive or terminate_pid has any token verification
cat src/automator/platform_util.pyRepository: bmad-code-org/bmad-auto
Length of output: 4165
🏁 Script executed:
#!/bin/bash
# Suggestion: Modify write_pid to store (pid + start_time) or UUID
# Modify read_pid to return the tuple/string
# Modify engine_alive/stop_run to verify the identity before acting
# This prevents false positives from PID reuse
# Example improvement in src/automator/runs.py:
# 1. On write, include `time.time()` or `uuid4()`.
# 2. On read, parse both.
# 3. Before `terminate_pid`, verify the running process's start time or check a mapped session.Repository: bmad-code-org/bmad-auto
Length of output: 161
Don't treat a bare PID as engine identity.
write_pid() currently stores only the numeric ID. Because operating systems recycle process IDs, a terminated engine's PID can be reassigned to an unrelated process. If this happens, engine_alive() will incorrectly report the run as active, and stop_run() may terminate a process that has nothing to do with the project.
Persist a unique marker (such as a random UUID, the process start time, or a combined tuple) inside engine.pid. Before relying on the engine_alive check or executing stop_run(), read and verify that this token still matches the current engine's identity rather than trusting the numeric ID alone.
See current snippet
def write_pid(run_dir: Path) -> None:
"""Record the engine process pid. Never deleted: a stale pid that
``pid_alive`` reports as gone is the signal that a run was interrupted."""
(run_dir / PID_FILE).write_text(str(os.getpid()), encoding="utf-8")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/automator/runs.py` around lines 47 - 49, The current write_pid()
implementation stores only os.getpid(), which can be reused by another process
and cause engine_alive() and stop_run() to act on the wrong process. Update the
PID_FILE format to include a unique engine identity token in addition to the
numeric PID, and have write_pid(), engine_alive(), and stop_run() read and
verify that token before treating the run as active or sending a stop signal.
Use a stable per-run marker such as a UUID or process start timestamp, and
reject stale PID files whose identity does not match the current engine
instance.
There was a problem hiding this comment.
Valid concern. The pid <= 0 aspect is fixed in 6492cdf. The full process-identity verification (PID-reuse defense) is a pre-existing, multi-file change that re-introduces an acked /proc read and changes read_pid's signature across engine_alive/stop_run/tui + migration tolerance — out of scope for this release PR. Tracked as #12.
There was a problem hiding this comment.
@pbean, thanks for the clarification. Agreed — the broader PID-reuse defense is a legitimate follow-up, but it sounds too cross-cutting for this release PR given the engine.pid format, read_pid() signature, migration tolerance, and TUI touchpoints. Good call to keep the narrower pid <= 0 fix here and track the full identity hardening separately in #12.
I won’t push for that larger change on this PR.
CodeRabbit (Major) on #11: a corrupt engine.pid read as 0 or negative would make os.kill target a process group — for 0, the orchestrator's own group — so terminate_pid could signal unintended processes. Guard pid <= 0 at the top of both pid_alive() (-> False) and terminate_pid() (-> no-op) before os.kill is reached. Adds tests/test_platform_util.py asserting os.kill is never called for non-positive pids. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Went through the latest CodeRabbit re-review (2 actionable, both Major, on the code the prior commit touched): Fixed — 6492cdf
Deferred — #12
Full |
…at the dev stage Copilot fires agentStop for every subagent turn too — with an empty transcriptPath and a tool-use session id, not the main session's turn-end. With dev decoupled to bmad-dev-auto (which implements via subagents), that premature Stop reached the dev stage, where 0 nudges made wait_for_completion declare an outright stall before the skill wrote its terminal spec (same root cause as the 0.6.4 review stall, now one stage earlier). Add a per-profile subagent_stop_without_transcript flag (true for copilot): a Stop carrying no transcript is treated as a subagent stop and skipped, so the main session's real turn-end drives completion — and usage tallying is restored, since that Stop carries the transcript. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the genuinely new fixes made during the #11 review cycle that hardened pre-existing behavior rather than just polishing the PR's own new code: guard non-positive PIDs in pid_alive/terminate_pid, and route the remaining os.kill(pid, 0) liveness checks through pid_alive (destructive on Windows). The Copilot subagent agentStop fix is already recorded. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Release 0.7.0
Minor release (breaking, pre-1.0): retires two bundled skills and reshapes how the orchestrator drives development.
Highlights
bmad-auto-devfork — the orchestrator now drives the upstreambmad-dev-autoskill unmodified as its sole dev primitive. No fork to keep in sync.bmad-dev-autoon thedonespec (a fresh step-04 review pass), not a separate skill.bmad-auto-reviewskill — its adversarial review is covered bybmad-dev-auto's inline step-04; the two hunters are now always-required base skills.bmad-auto initinstalls three bundled skills (bmad-auto-resolve,bmad-auto-sweep,bmad-auto-setup);bmad-dev-autois a hard upstream prerequisite.review.triggerrecommended/always) gated onfollowup_review_recommended.CLIProfile/HookSpecfield reference.blockedspec).Full details in CHANGELOG.md under
[0.7.0].Release mechanics
0.7.0across all files (sync_version.py --checkgreen).trunk check(full, no filter) → no issues.main, the Release workflow auto-creates thev0.7.0tag and GitHub release from the CHANGELOG.Post-review fixes
Addressed CodeRabbit's review (commit
11411ff): 12 of 13 inline findings fixed —_result_jsonwait-polling, sprint-status no-op guard, TUI liveness degradation, squash-commit HEAD restore,taskkillSystem32 resolution, macOScp --reflinkgating +non-linuxextra rename, plus doc reconciliations — each with targeted tests. One finding skipped (probe.pipe_paneis already contract-tolerant). See the resolution comment for the full breakdown.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
recommendedvsalways, bounded bylimits.max_review_cycles.bmad-dev-autoand refreshed the bundled skill set (nowbmad-auto-resolve,bmad-auto-sweep,bmad-auto-setup).Bug Fixes
result.jsonsynthesis, workflow contract enforcement, escalation replay/state handling, and sprint-status advancement.Documentation
Chores
0.7.0, updated marketplace/module metadata, and expanded.gitignorefor build output.