test: add deterministic smoke coverage#32
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAdds a complete deterministic smoke coverage system for the Story Automator. Changes include: an epic parser extension for broader header formats and ID-based selection; a new ChangesDeterministic Smoke Coverage System
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@augmentcode review |
🤖 Augment PR SummarySummary: This PR adds a comprehensive deterministic smoke/readiness testing suite for the Story Automator control plane and integrates fast gates into the default Changes:
Technical Notes: The production code changes are in 🤖 Was this summary useful? React with 👍 or 👎 |
|
@augmentcode review |
|
@augmentcode review |
|
✅ Action performedReview finished.
|
|
@augmentcode review |
|
@augmentcode review |
|
This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it. |
|
@coderabbitai review |
|
✅ Action performedReview finished.
|
|
augment review |
|
@coderabbitai review |
|
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/development.md (1)
14-19:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the
verifychecklist.The current bullets still document the older
test:python/pack:dry-run/test:smokeflow, butpackage.jsonnow runs version, package, CLI, helper-contract, mode, Python, and shell-smoke gates undernpm run verify.♻️ Proposed fix
- `npm run verify` expands to: - -- `npm run test:python` -- `npm run pack:dry-run` -- `npm run test:smoke` + `npm run verify` expands to: + +- `npm run test:python` +- `npm run version:check` +- `npm run pack:assert` +- `npm run test:cli` +- `npm run smoke:contracts` +- `npm run smoke:modes` +- `npm run test:smoke`🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/development.md` around lines 14 - 19, Update the bulleted list under the `npm run verify` documentation to reflect the actual gates that are currently executed in package.json. Replace the outdated entries (test:python, pack:dry-run, test:smoke) with the actual list of gates that are run: version, package, CLI, helper-contract, mode, Python, and shell-smoke gates. This ensures the documentation accurately represents what npm run verify performs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/plans/automator-deterministic-smoke-coverage/gate-map.md`:
- Line 14: In the gate-map.md file, row 14 for the Runner lifecycle matrix
currently states "tmux must be available because skipped tests fail this gate,"
which is inaccurate. Update this text to clarify that while tmux is preferred
for full test coverage, it is not required for the gate to pass. The
scripts/run-smoke-contracts.py script allows skipped tests when tmux is
unavailable and only fails on unexpected skips, so the gate will pass with zero
exit code even when tmux-dependent tests are skipped as expected. Revise the
wording to reflect that tmux availability affects test coverage completeness,
not gate success.
In `@docs/plans/automator-deterministic-smoke-coverage/handoff-log.md`:
- Around line 160-164: The Decisions And Assumptions section references the
non-existent `npm run pack:identity` command, which readers cannot run since it
does not exist in package.json. Remove the phrase "or an explicit `npm run
pack:identity`" from the bullet point about tarball identity binding, keeping
only the reference to the valid `npm run pack:assert` command that actually
exists in package.json.
In `@docs/plans/automator-deterministic-smoke-coverage/README.md`:
- Around line 88-92: The Clean Context Agent Protocol section does not include
gate-map.md in the list of required context files that phase agents must read
before starting work. Add gate-map.md to the list of files that every phase
agent must read as part of the protocol, placing it appropriately alongside the
other required context files like README.md, the assigned phase file, the
assigned phase TODO file, implementation-notes.md, and relevant earlier entries
in handoff-log.md. This ensures that gate ownership and pass-fail contract
information is not missed when starting work on a new phase.
- Around line 65-67: The Target repo line contains an absolute machine-specific
path `/Users/joon/.codex/worktrees/9b27/bmad-story-automator` that will not work
across different environments. Replace this absolute path with a relative
reference or environment-agnostic placeholder that describes where the target
repository should be located, ensuring the documentation remains valid
regardless of the local machine's directory structure.
In `@docs/versioning.md`:
- Line 151: The documentation correctly identifies that
`skills/bmad-story-automator/workflow.md` must be included in version bumps on
line 151, but the sample `git add` commands shown in the preview and stable
release sections omit this file. Update both the preview and stable `git add`
command examples in the versioning.md file to include
`skills/bmad-story-automator/workflow.md` alongside the other files being added,
ensuring consistency between the documented requirements and the example
commands provided to users.
In `@scripts/check-smoke-inputs.py`:
- Around line 18-27: The code accesses nested dictionary keys without validation
on the inputs dictionary and the gunz and bmad objects. If the smoke_inputs
payload is malformed or missing expected keys, uncaught KeyError or TypeError
exceptions will be raised, bypassing the controlled error handling. Wrap the
nested key accesses in a try-except block or validate that all required keys
exist before dereferencing them, then ensure any exceptions during this
validation are caught and reported through the same controlled failure path as
other smoke input determinism failures.
In `@scripts/run-smoke-dev-loop.py`:
- Around line 303-314: The _write_dev_log method attempts to write to a file
path that may not have its parent directory created in a fresh workspace. Before
calling write_text() to persist the log file, ensure the parent directory exists
by invoking .parent.mkdir(parents=True, exist_ok=True) on the target path object
to create any missing intermediate directories.
In `@scripts/smoke_prep/gunz.py`:
- Around line 11-30: The code at the gunz_dir.exists() check reuses an existing
clone without verifying that its origin URL still matches REPO_URL, which could
allow running smoke prep against an unintended repository if the cached
directory was manually modified. After the exists check but before reusing the
clone, add a validation step that runs a git command to retrieve the current
origin URL from the gunz_dir and compare it against REPO_URL; only proceed with
reusing the clone if the URLs match, otherwise handle the mismatch appropriately
(either delete and reclone, or raise an error).
In `@scripts/smoke_prep/inputs.py`:
- Around line 16-39: The subprocess.run() call for npm view is missing a timeout
parameter which can cause indefinite hangs if the npm registry stalls. Add a
timeout parameter set to 60 seconds to the subprocess.run() call, then wrap the
entire subprocess.run() statement in a try-except block to catch
subprocess.TimeoutExpired exceptions and convert them to a SmokeError with a
clear message indicating the npm command timed out.
In `@scripts/smoke_prep/package_contracts.py`:
- Around line 199-208: Add a timeout parameter to the subprocess.run() call
within the _npm_pack_json function to prevent npm pack from hanging
indefinitely. Set an appropriate timeout value (in seconds) as a parameter to
subprocess.run() alongside the existing parameters like check, stdout, and
stderr. This ensures that if npm pack stalls due to network issues or other
problems, the process will be interrupted with a subprocess.TimeoutExpired
exception rather than blocking indefinitely.
In `@scripts/smoke_prep/process.py`:
- Around line 32-49: The run() function has an unbounded subprocess.run() call
that can hang indefinitely on stalled git/npm/npx invocations. Add a timeout
parameter to the run() function signature with a default value of 900 seconds,
then pass this timeout parameter to the subprocess.run() call. Additionally,
wrap the subprocess.run() call in a try-except block to catch
subprocess.TimeoutExpired exceptions and convert them to SmokeError for
consistent failure handling across the codebase.
In `@scripts/smoke_prep/workspace.py`:
- Around line 29-31: The git check-ignore command in the subprocess.run call is
missing the `--` path delimiter before the path argument. Add `--` as a separate
list element before `relative.as_posix()` in the command list passed to
subprocess.run. This ensures that if the workspace path starts with `-`, it will
be correctly interpreted as a path argument rather than a git option.
In `@tests/test_smoke_script_contracts.py`:
- Around line 253-291: The test methods
test_dev_loop_story_slug_ignores_unfound_sprint_status_story and
test_dev_loop_story_slug_ignores_unfound_sprint_status_story_echo create
SmokeRunner and DevLoopSmokeRunner instances respectively, but never call
close() on them to clean up resources. Wrap the patched code block (the with
statement and assertions) in a try/finally block for each test method, and call
runner.close() in the finally block to ensure proper cleanup and prevent
resource leaks, similar to the pattern used in other runner-based tests in this
file.
- Line 18: Remove the sys.path.insert(0, str(SCRIPTS)) call from the
module-level scope (line 18). Instead, move this path insertion inside the
load_script_module function where it is actually needed, and consider using a
context manager or temporary patch to ensure the sys.path modification is scoped
appropriately and doesn't persist globally after the function call completes.
---
Outside diff comments:
In `@docs/development.md`:
- Around line 14-19: Update the bulleted list under the `npm run verify`
documentation to reflect the actual gates that are currently executed in
package.json. Replace the outdated entries (test:python, pack:dry-run,
test:smoke) with the actual list of gates that are run: version, package, CLI,
helper-contract, mode, Python, and shell-smoke gates. This ensures the
documentation accurately represents what npm run verify performs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c68c64ca-f3ad-4816-83db-e927ddeee140
📒 Files selected for processing (47)
.gitignoredocs/changelog/260602.mddocs/development.mddocs/plans/automator-deterministic-smoke-coverage/01-baseline-and-version-determinism.mddocs/plans/automator-deterministic-smoke-coverage/02-package-and-prepared-repo-contracts.mddocs/plans/automator-deterministic-smoke-coverage/03-runtime-helper-contract-smokes.mddocs/plans/automator-deterministic-smoke-coverage/04-create-dev-resume-validate-edit-coverage.mddocs/plans/automator-deterministic-smoke-coverage/05-automate-review-finish-retro-coverage.mddocs/plans/automator-deterministic-smoke-coverage/06-gate-integration-and-readiness-review.mddocs/plans/automator-deterministic-smoke-coverage/README.mddocs/plans/automator-deterministic-smoke-coverage/TODO.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-01.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-02.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-03.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-04.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-05.mddocs/plans/automator-deterministic-smoke-coverage/TODO/phase-06.mddocs/plans/automator-deterministic-smoke-coverage/coverage-baseline.mddocs/plans/automator-deterministic-smoke-coverage/gate-map.mddocs/plans/automator-deterministic-smoke-coverage/handoff-log.mddocs/plans/automator-deterministic-smoke-coverage/implementation-notes.mddocs/versioning.mdpackage.jsonscripts/assert-package-contracts.pyscripts/check-smoke-inputs.pyscripts/check-version-alignment.pyscripts/prepare-smoke-test.pyscripts/run-smoke-automator.pyscripts/run-smoke-contracts.pyscripts/run-smoke-dev-loop.pyscripts/run-smoke-finish-loop.pyscripts/run-smoke-modes.pyscripts/smoke_prep/__init__.pyscripts/smoke_prep/automator.pyscripts/smoke_prep/cli.pyscripts/smoke_prep/config.pyscripts/smoke_prep/gunz.pyscripts/smoke_prep/inputs.pyscripts/smoke_prep/package_contracts.pyscripts/smoke_prep/process.pyscripts/smoke_prep/report.pyscripts/smoke_prep/workspace.pyskills/bmad-story-automator/src/story_automator/core/epic_parser.pyskills/bmad-story-automator/workflow.mdtests/test_cli_contracts.pytests/test_runtime_helper_contracts.pytests/test_smoke_script_contracts.py
|
@coderabbitai review |
|
✅ Action performedFull review finished. |
|
@augmentcode review |
|
Sorry, Augment failed to review this pull request. |
Summary
Adds deterministic smoke coverage for the Story Automator control plane on top of
bma-d/e2e-tests.This PR hardens the package/install, helper-contract, mode, dev-loop, finish-loop, retrospective, wrapup, and host-isolation smoke surfaces so smoke readiness is backed by structured assertions instead of command-exit-only success.
Changes
docs/plans/automator-deterministic-smoke-coverage/.npm run verify.npm run smoke:deterministic-fullfor the explicit prepared-repo release smoke.### 1.1: ....Docs For Review
docs/plans/automator-deterministic-smoke-coverage/README.md: plan intent, boundaries, and release-blocking priorities.docs/plans/automator-deterministic-smoke-coverage/gate-map.md: command ownership, pass/fail signal, diagnostics, and reset/cache policy for each gate.docs/plans/automator-deterministic-smoke-coverage/implementation-notes.md: implementation decisions, tradeoffs, and remaining live/manual boundaries.docs/development.md: prepared.smoke/gunzsetup and local smoke usage.docs/changelog/260602.md: chronological implementation summary.Local Verification
Route:
T4 broad/source/unknown.Rationale: this branch changes package scripts, smoke runners, runtime helper contracts, parser behavior, tests, and release-readiness docs.
Passed:
git diff --check origin/bma-d/e2e-tests...HEADnpm run verifynpm run smoke:deterministic-fullpython3 -m venv <tmp>/venv<tmp>/venv/bin/python -m pip install --upgrade pip build<tmp>/venv/bin/python -m build skills/bmad-story-automator --outdir <tmp>/dist<tmp>/venv/bin/python -m pip install <tmp>/dist/*.whl<tmp>/venv/bin/python -m story_automator --help<tmp>/venv/bin/story-automator --helpPYTHONPATH=skills/bmad-story-automator/src python3 scripts/run-smoke-finish-loop.py --target-repo /Users/joon/.codex/worktrees/9b27/bmad-story-automatorunsafe commit repo outside smoke workspaceSmoke artifacts checked:
.smoke/PACKAGE_IDENTITY.json.smoke/INSTALLED_AUTOMATOR_MANIFEST.json.smoke/MODE_SMOKE_REPORT.json.smoke/FINISH_LOOP_SMOKE_REPORT.json.smoke/AUTOMATED_SMOKE_REPORT.md.smoke/AUTOMATED_DEV_LOOP_SMOKE_REPORT.mdKey asserted outcomes:
bmad-story-automator@1.15.0, 195 packed entries.claude/skillsSkipped by route:
Reviewer Guide
Suggested review order:
gate-map.mdfirst to understand which behavior each command proves.scripts/smoke_prep/package_contracts.pyandscripts/prepare-smoke-test.pyfor package/install identity guarantees.scripts/run-smoke-contracts.py,tests/test_runtime_helper_contracts.py, and related helper tests for fail-closed contract coverage.scripts/run-smoke-modes.pyfor create/resume/validate/edit assertions.scripts/run-smoke-finish-loop.pyfor commit isolation, review/finalize, retrospective skip semantics, wrapup, and unsafe-target rejection.package.jsongate wiring last.High-risk areas:
.smokereset and package install flowcommit-storysprint-status.yamlsource-of-truth fallback behaviorbmad-method@nexthandling as a recorded/asserted moving inputverifyfast while preserving explicit release smoke coverageSplit Decision
Kept as one PR. The docs, gate wiring, smoke runners, helper contracts, and tests all support one technical goal: deterministic Story Automator smoke readiness. Splitting would separate the proof scripts from the gate map and make review less coherent.
Known Boundaries
gunzcommit while recording/asserting the resolvedbmad-method@nextpackage identity.Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests