Add opt-in TEA workflow support to story automator#24
Conversation
|
Warning Review limit reached
More reviews will be available in 49 minutes and 14 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds opt-in TEA (Test Execution Automation): TEA prompt/schema assets, runtime-policy extensions (inline overrides, step pruning, sequence helpers), CLI commands for run-policy and workflow detection, markdown progress updater (state-progress), agent/task wiring from pinned policy, updated orchestration docs/templates, and broad tests and test helpers. ChangesTEA Workflow Opt-In Support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Follow-up changes pushed in response to review findings, kept as tight as possible:
Validation:
|
|
Final update before readying this PR: Latest fixes kept scope tight and addressed the remaining review findings:
Validation after the final fixes:
A final focused review pass returned |
🤖 Augment PR SummarySummary: Adds opt-in TEA workflow support to the story automator by selecting a per-run workflow track and snapshotting the resolved policy into the state document so execution follows a pinned sequence. Changes:
🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
skills/bmad-story-automator/steps-c/step-03-execute.md (1)
208-212: 💤 Low valueConsider redundant progress update for
create=done.Line 210 sets
create=doneagain, but this was already set at line 151 when the create step completed. While this defensive redundancy ensures consistency, it's unnecessary and could be simplified to only updatedev=done.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/bmad-story-automator/steps-c/step-03-execute.md` around lines 208 - 212, The progress update call to the state-progress helper is redundantly re-setting create=done; modify the invocation of "$scripts" orchestrator-helper state-progress (the command that passes --story "${story_id}" and --set flags) to remove the --set create=done flag and only include --set dev=done and --set status=in-progress so the create step isn’t updated twice.skills/bmad-story-automator/steps-c/step-02a-preflight-config.md (1)
73-92: ⚡ Quick winClarify
validate-create-storyselection behavior.Lines 78-82 state that
validate-create-storyis "advisory only and is not automated in v1", yet line 86 allows users to include it inselected_optional_steps. This could confuse users who select it expecting automated execution.Consider either:
- Removing it from the selectable options and documenting it as future-scoped
- Adding explicit guidance that selecting it records user intent but doesn't trigger automation in v1
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/bmad-story-automator/steps-c/step-02a-preflight-config.md` around lines 73 - 92, The wording around validate-create-story is ambiguous: update the TEA options block so that validate-create-story is either removed from selectable optional steps or clearly marked as non-automated intent; specifically, leave selected_optional_steps as the list of zero-or-more of retro and nfr only (remove validate-create-story from that enumerated selectable list) and add one sentence after the list explaining that selecting validate-create-story (if still referenced elsewhere) only records user intent and will not trigger any automated action in v1 — reference the symbols selected_optional_steps, validate-create-story, and workflow_track when making this clarification.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`:
- Around line 526-530: The --set parsing block in orchestrator.py can raise
ValueError when args[idx + 1] lacks an '='; before calling split("=", 1)
validate that "=" is present (e.g., if "=" not in args[idx + 1]) and handle it
by raising a clear ValueError or argparse-style error mentioning the bad
argument, or skip with a logged warning; then only call key, value = args[idx +
1].split("=", 1) and continue updating updates[_normalize_progress_key(key)] =
value and advancing idx. Ensure the fix targets the block handling args, idx,
updates and uses _normalize_progress_key to locate where to apply the
validation.
In `@skills/bmad-story-automator/src/story_automator/commands/state.py`:
- Around line 633-645: The branch currently gates TEA on asset/skill heuristics
even when an explicit override exists; change the condition so that when
explicit_policy and explicit_policy_valid (from _explicit_tea_policy_valid) are
true you accept the resolved explicit policy as source-of-truth for capability
(i.e., do not require assets_ok and not missing_skills to set
tea_capable/recommended_track). If you still need missing_assets/missing_skills
in the payload, compute them from the resolved explicit policy returned by
_explicit_tea_policy_valid (or mark them advisory) instead of using
_tea_assets_complete() and _tea_skill_availability() which only use hard-coded
roots/aliases; update the branch that currently checks explicit_policy &&
explicit_policy_valid && assets_ok && not missing_skills to rely on
explicit_policy_valid and use the resolved-policy-derived checks for advisory
fields.
---
Nitpick comments:
In `@skills/bmad-story-automator/steps-c/step-02a-preflight-config.md`:
- Around line 73-92: The wording around validate-create-story is ambiguous:
update the TEA options block so that validate-create-story is either removed
from selectable optional steps or clearly marked as non-automated intent;
specifically, leave selected_optional_steps as the list of zero-or-more of retro
and nfr only (remove validate-create-story from that enumerated selectable list)
and add one sentence after the list explaining that selecting
validate-create-story (if still referenced elsewhere) only records user intent
and will not trigger any automated action in v1 — reference the symbols
selected_optional_steps, validate-create-story, and workflow_track when making
this clarification.
In `@skills/bmad-story-automator/steps-c/step-03-execute.md`:
- Around line 208-212: The progress update call to the state-progress helper is
redundantly re-setting create=done; modify the invocation of "$scripts"
orchestrator-helper state-progress (the command that passes --story
"${story_id}" and --set flags) to remove the --set create=done flag and only
include --set dev=done and --set status=in-progress so the create step isn’t
updated twice.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0b9b8934-08dc-444c-b652-e94a885b71b8
📒 Files selected for processing (18)
skills/bmad-story-automator/data/tea-story-automator/parse/tea_step.jsonskills/bmad-story-automator/data/tea-story-automator/prompts/tea_step.mdskills/bmad-story-automator/src/story_automator/cli.pyskills/bmad-story-automator/src/story_automator/commands/orchestrator.pyskills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.pyskills/bmad-story-automator/src/story_automator/commands/state.pyskills/bmad-story-automator/src/story_automator/core/agent_config.pyskills/bmad-story-automator/src/story_automator/core/runtime_policy.pyskills/bmad-story-automator/steps-c/step-01b-continue.mdskills/bmad-story-automator/steps-c/step-02a-preflight-config.mdskills/bmad-story-automator/steps-c/step-03-execute.mdskills/bmad-story-automator/steps-c/step-03a-execute-review.mdskills/bmad-story-automator/steps-c/step-03b-execute-finish.mdskills/bmad-story-automator/templates/state-document.mdskills/bmad-story-automator/workflow.mdtests/test_orchestrator_parse.pytests/test_runtime_policy.pytests/test_state_policy_metadata.py
|
augment review |
|
Update since the last CodeRabbit-reviewed commit (
Validation on the current head (
Scope note:
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
|
augment review |
|
Latest follow-up for PR reviews is now on This round stayed narrowly scoped to three items:
Validation on this head:
All currently open review threads for this round have been replied to and resolved. |
|
augment review |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
Latest follow-up is now on This round stayed limited to two state-layer issues:
Validation on this head:
Replying on the two new threads now and resolving them. |
|
@coderabbitai review |
|
augment review |
✅ Actions performedReview triggered.
|
|
Addressed the latest unresolved review threads in Changes in this round:
Validation:
|
|
@coderabbitai review |
|
augment review |
✅ Actions performedReview triggered.
|
bma-d
left a comment
There was a problem hiding this comment.
Thanks for the contribution. The opt-in TEA workflow direction is useful, but I’m requesting changes because the current implementation centralizes too much policy logic in the state command path and leaves several execution/policy contracts inconsistent.
I left inline comments with concrete repros where the issue is executable. The two blocking themes are: keep the pinned workflow policy authoritative, and move TEA policy generation/detection out of broad command/helper files into a single owner with focused contract coverage.
| lowered = value.strip().lower() | ||
| if lowered in {"1", "true", "yes", "y", "on"}: | ||
| return True | ||
| if lowered in {"0", "false", "no", "n", "off"}: |
There was a problem hiding this comment.
[P1] This command module is now taking ownership of TEA policy selection and contract construction. The PR adds TEA asset-root probing (272-332), prompt/schema selection (335-345), skill alias resolution (347-368), full step contract generation (371-451), run-policy branching (454-542), explicit policy detection (571-612), and workflow recommendation (696-758) here. Static repro: state.py grows from 308 LOC on origin/main to 908 LOC on this PR. Please move TEA capability detection/policy selection/contract generation behind a core policy-selection module or data-backed policy path, and keep this command focused on state-doc IO/rendering.
| "nfr": ("bmad-testarch-nfr", "bmad-tea-testarch-nfr"), | ||
| } | ||
|
|
||
| STEP_DISPLAY_NAMES = { |
There was a problem hiding this comment.
[P2] These display labels duplicate policy-derived data. STEP_DISPLAY_NAMES hard-codes labels even though policy step contracts carry labels, while _build_run_policy() hard-codes TEA sequence generation and orchestrator.py adds another alias table for progress columns. Caveat: the runtime step allowlist can stay as schema validation, but labels/progress headers and generated sequence should come from one policy source instead of multiple Python tables.
| @@ -461,6 +468,692 @@ def test_build_state_doc_returns_json_on_policy_snapshot_failure(self) -> None: | |||
| self.assertFalse(payload["ok"]) | |||
| self.assertEqual(payload["error"], "policy_snapshot_failed") | |||
|
|
|||
There was a problem hiding this comment.
[P2] The TEA tests are being appended to an already broad mixed test file. Static repro: this file grows from 555 LOC on origin/main to 1370 LOC on the current PR head. The same file now mixes state policy metadata, orchestrator helper behavior, tmux command rendering, agent config, TEA detection/run-policy/progress, and TEA fixtures. Please split TEA policy/detection/progress tests and reusable TEA fixtures into focused files.
| ### A.1 ATDD | ||
| *Run only if the pinned policy sequence includes `atdd`* | ||
|
|
||
| Use the same spawn/monitor/parse pattern as other session-exit steps: |
There was a problem hiding this comment.
[P2] The ATDD path bypasses the per-task agent/model resolver used by the adjacent create/dev paths. create resolves with resolve_agent_for_task and conditionally passes --model, but this new ATDD block spawns/builds with $current_agent directly and never resolves atdd or applies the primary model. The TEA quality loop in step-03a has the same shape. Please route every policy-defined TEA step through the same resolver/model path as the standard steps.
|
|
||
|
|
||
| def _render_markdown_row(cells: list[str]) -> str: | ||
| return "| " + " | ".join(cells) + " |" |
There was a problem hiding this comment.
[P2] orchestrator-helper now owns markdown progress-table parsing and mutation. _state_progress finds the table by line.startswith("| Story "), splits cells with line.split("|"), finds the story row by markdown prefix, rewrites the row, and writes the state file. That duplicates state document schema ownership outside the state-doc layer. Please keep this CLI as a thin entry point and move progress-table parse/update/render behind the canonical state/state-document layer.
| @@ -140,9 +195,11 @@ config_json=$(jq -n \ | |||
| --arg currentStep "preflight" \ | |||
| --arg aiCommand "$agent_cmd" \ | |||
| --arg customInstructions "$custom_instructions" \ | |||
There was a problem hiding this comment.
[P2] This --argjson requires selected_optional_steps to already be valid JSON, but the TEA collection instructions only say “zero or more of retro, nfr”. Repro: selected_optional_steps='retro nfr'; jq -n --argjson selectedOptionalSteps "$selected_optional_steps" ... fails with jq: invalid JSON text passed to --argjson. This works only if the executor happens to store a JSON array like ["retro"]; please make that explicit or delegate config JSON construction to a helper.
| For the TEA track, state clearly: | ||
| - **Mandatory automated TEA core:** `atdd`, `test_automate`, `test_review`, `trace` | ||
| - **Optional automated TEA add-on:** `nfr` | ||
| - **Optional epic-level add-on:** `retro` |
There was a problem hiding this comment.
[P1] This introduces retro as an optional TEA add-on, but the finish path still triggers retro solely from epic completion (step-03b-execute-finish.md: trigger_retro=true then spawn retro) and does not gate that block on the pinned workflow sequence containing retro. The trigger block existed before, but this PR creates the optional policy/selection contract without honoring it. A TEA run that did not select retro can still spawn the retrospective. Please either gate the finish step on the policy sequence or remove the optional retro path and make it explicitly always-on.
| config = json.loads(config_json) | ||
| except json.JSONDecodeError: | ||
| write_json({"ok": False, "error": "invalid_config_json"}) | ||
| return 1 |
There was a problem hiding this comment.
[P2] build-run-policy can return ok:true for a TEA run even when mandatory TEA core skills/assets are missing. Repro on the current PR: with no TEA skills installed, cmd_build_run_policy(["--config-json", "{"workflowTrack":"tea"}"]) returns ok:true with create, atdd, dev, test_automate, test_review, trace, review and bmad-testarch-* contracts. The later state-doc snapshot path then fails on missing required skill assets. Please validate the generated override through the same policy resolution path before returning success, or check mandatory TEA capability in _build_run_policy.
| ```bash | ||
| session=$("$scripts" tmux-wrapper spawn atdd {epic} {story_id} \ | ||
| --agent "$current_agent" \ | ||
| --command "$("$scripts" tmux-wrapper build-cmd atdd {story_id} --agent "$current_agent" --state-file "$state_file")") |
There was a problem hiding this comment.
[P2] This parses the ATDD result into parsed, but the following branch reads next_action without ever assigning it from parsed. Repro: parsed='{ "next_action":"proceed" }'; [ "$next_action" = proceed ] does not proceed (and fails under set -u). The standard dev path below correctly does next_action=$(echo "$parsed" | jq -r '.next_action'). Please bind next_action (and any other normalized fields) before branching, ideally through one shared session-exit helper/pattern used by all TEA steps.
| track = str(config.get("workflowTrack") or "standard").strip().lower() | ||
| if track not in {"standard", "tea"}: | ||
| track = "standard" | ||
| selected = set(_normalize_option_list(config.get("selectedOptionalSteps"))) |
There was a problem hiding this comment.
[P2] A valid explicit project TEA policy is detected, but state creation still generates a fresh TEA policyOverride for any workflowTrack == "tea". Because cmd_build_state_doc() snapshots with this inline override and load_effective_policy() merges inline override after the project override, generated policy wins. Repro: an explicit project sequence create, atdd, dev, trace, review with custom atdd label snapshots as generated create, atdd, dev, test_automate, test_review, trace, review, and the label becomes atdd. Please separate track selection from policy generation and snapshot explicit TEA project policy as-is unless a caller supplies an explicit per-run override.
|
Follow-up on top of What changed:
Validation on this head:
|
|
@coderabbitai review |
|
augment review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py (1)
590-594: ⚡ Quick winPersist the updated state file atomically.
Line 591 rewrites the full orchestration document in place. If this helper is interrupted after truncation, the state file can be left partial and future resume/parse steps will fail. Please use
atomic_writehere, as you already do in_marker.♻️ Proposed fix
try: - Path(state_file).write_text("\n".join(lines) + "\n", encoding="utf-8") + atomic_write(state_file, "\n".join(lines) + "\n") except OSError: print_json({"ok": False, "error": "state_file_unwritable"}) return 1🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/bmad-story-automator/src/story_automator/commands/orchestrator.py` around lines 590 - 594, The current code writes the orchestration state in place via Path(state_file).write_text(...) which can leave a truncated/partial file on interruption; replace that write with the existing atomic_write helper used by _marker so the new content is written to a temp file and atomically moved into place (e.g. call atomic_write(state_file, "\n".join(lines) + "\n", encoding="utf-8")), keep the same OSError handling to print_json and return 1, and ensure you import or reuse the same atomic_write symbol used in _marker so behavior matches the rest of the module.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@skills/bmad-story-automator/src/story_automator/commands/state.py`:
- Around line 698-699: The code currently sets teaDetected based on
explicit_override_present (from _explicit_policy_path) which makes a
standard-only story-automator.policy.json flip TEA UX; update the teaDetected
computation so it only uses TEA-specific evidence from the explicit policy
payload returned by _explicit_policy_payload(project_root) (e.g., inspect the
payload for TEA steps/flags and/or explicit_override_error TEA-related info) and
ignore plain presence of an override file; locate uses of
explicit_override_present and _explicit_policy_payload (also at the other
occurrence around the block at the previously noted lines) and change the
boolean used for teaDetected to a TEA-only predicate derived from the payload
instead of the file-existence flag.
---
Nitpick comments:
In `@skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`:
- Around line 590-594: The current code writes the orchestration state in place
via Path(state_file).write_text(...) which can leave a truncated/partial file on
interruption; replace that write with the existing atomic_write helper used by
_marker so the new content is written to a temp file and atomically moved into
place (e.g. call atomic_write(state_file, "\n".join(lines) + "\n",
encoding="utf-8")), keep the same OSError handling to print_json and return 1,
and ensure you import or reuse the same atomic_write symbol used in _marker so
behavior matches the rest of the module.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b999e135-a366-46f1-9584-1c662c5b6572
📒 Files selected for processing (6)
skills/bmad-story-automator/src/story_automator/commands/orchestrator.pyskills/bmad-story-automator/src/story_automator/commands/state.pyskills/bmad-story-automator/src/story_automator/core/runtime_policy.pytests/test_agent_config_model.pytests/test_runtime_policy.pytests/test_state_policy_metadata.py
| explicit_override_present = _explicit_policy_path(project_root).is_file() | ||
| _, explicit_override_error = _explicit_policy_payload(project_root) |
There was a problem hiding this comment.
Keep teaDetected scoped to TEA signals.
A standard-only story-automator.policy.json now flips teaDetected to true because this payload keys off explicit_override_present, not TEA-specific evidence. That can light up TEA preflight UX even when the override contains no TEA steps.
Suggested fix
- "teaDetected": explicit_policy or explicit_override_present or bool(signals),
+ "teaDetected": explicit_policy or bool(signals),Also applies to: 749-750
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@skills/bmad-story-automator/src/story_automator/commands/state.py` around
lines 698 - 699, The code currently sets teaDetected based on
explicit_override_present (from _explicit_policy_path) which makes a
standard-only story-automator.policy.json flip TEA UX; update the teaDetected
computation so it only uses TEA-specific evidence from the explicit policy
payload returned by _explicit_policy_payload(project_root) (e.g., inspect the
payload for TEA steps/flags and/or explicit_override_error TEA-related info) and
ignore plain presence of an override file; locate uses of
explicit_override_present and _explicit_policy_payload (also at the other
occurrence around the block at the previously noted lines) and change the
boolean used for teaDetected to a TEA-only predicate derived from the payload
instead of the file-existence flag.
| override = _read_json(override_path) if override_path.is_file() else {} | ||
| policy = _deep_merge(bundled, override) | ||
| try: | ||
| override = _read_json(override_path) if override_path.is_file() else {} |
There was a problem hiding this comment.
skills/bmad-story-automator/src/story_automator/core/runtime_policy.py:44 — override_path.is_file() can raise OSError/PermissionError (e.g., unreadable _bmad/bmm), which would bypass the PolicyError wrapping and can crash callers expecting structured policy_invalid JSON. Consider guarding the is_file() probe similarly to _read_json(...) so unreadable stat failures also become PolicyErrors.
Other locations where this applies: skills/bmad-story-automator/src/story_automator/commands/state.py:577, skills/bmad-story-automator/src/story_automator/commands/state.py:698
Severity: medium
Other Locations
skills/bmad-story-automator/src/story_automator/commands/state.py:577skills/bmad-story-automator/src/story_automator/commands/state.py:698
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| elif arg == "--config-json" and idx + 1 < len(args): | ||
| config_json = args[idx + 1] | ||
| if config_file and file_exists(config_file): | ||
| config_json = read_text(config_file) |
There was a problem hiding this comment.
skills/bmad-story-automator/src/story_automator/commands/state.py:554 — cmd_build_run_policy() reads --config-file with read_text(config_file) without handling OSError, so an existing-but-unreadable config will likely raise and print a traceback instead of returning structured JSON. Consider catching read/stat failures and returning a specific config_file_unreadable-style error payload.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| print_json({"ok": False, "error": "invalid_set_argument", "argument": raw_update}) | ||
| return 1 | ||
| key, value = raw_update.split("=", 1) | ||
| updates[_normalize_progress_key(key)] = value |
There was a problem hiding this comment.
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py:539 — _state_progress() writes value directly into a Markdown table cell; if it contains | or newlines it can corrupt the progress table and break future parsing/updates. Consider validating/sanitizing --set values to a safe single-line, no-pipe subset before updating the row.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
|
Closing since worked on in #28 |
Summary
Scope
Explicitly out of scope
checkpoint-previewintegration into the automator runVerification
PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_runtime_policy tests.test_state_policy_metadata tests.test_orchestrator_parsebash scripts/smoke-test.shSummary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests