bmad-code-org · bma-d · May 21, 2026
@@ -30,6 +30,10 @@ flowchart TD
 
 The generated agents file is a runtime artifact, not just display text.
 
+Generated complexity and agents-plan JSON is validated at the file boundary
+before use. Malformed payloads return `structuredIssues` with field paths such
+as `stories[0].complexity.level` or `stories[0].tasks.dev`.
+
 ## Child-Session Command Build
 
 The helper CLI generates step-specific commands with `tmux-wrapper build-cmd`.
@@ -116,6 +120,10 @@ Important distinctions:
 - `stuck` means no valid progress signal within the allowed window
 - `incomplete` is a review-specific result, not a generic session state
 
+`monitor-session --json` may include `structuredIssues` when malformed persisted
+runner state affects the result. CSV status helpers keep the documented columns
+unchanged.
+
 ## Review Verification
 
 Review sessions add extra verification:

@@ -38,6 +38,18 @@ Use these during preflight to keep story selection and complexity scoring determ
 
 Use these to create, inspect, and validate orchestration state.
 
+`validate-state` preserves the legacy response fields:
+
+- `ok`
+- `structure`
+- `issues`
+
+It also adds `structuredIssues` and `issueCount` for field-specific diagnostics. Consumers should prefer `structuredIssues` when present and keep `issues` as the legacy fallback.
+
+## Diagnostic Events
+
+Command stdout stays backward-compatible. Set `STORY_AUTOMATOR_DIAGNOSTICS_FILE=/path/to/events.jsonl` to opt in to structured diagnostic events. The helper appends one redacted JSON object per line for orchestration-stage parse results, state transitions, monitor-session lifecycle results, and policy load failures.
+
 ## tmux Commands
 
 - `tmux-wrapper spawn`
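Review note on the Diagnostic Events hunk above: a minimal sketch of the opt-in append behavior, assuming the helper writes one compact JSON object per line. Only `STORY_AUTOMATOR_DIAGNOSTICS_FILE` comes from the docs; the function names `append_diagnostic_event` and `read_diagnostic_events` and the event keys are hypothetical, and redaction is assumed to have happened before the call.

```python
import json
import os
from typing import Any, Mapping, Optional


def append_diagnostic_event(
    event: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Append one JSON object per line to the opt-in diagnostics file.

    Hypothetical helper: returns False and writes nothing when the
    variable is unset, so stdout and default behavior stay untouched.
    """
    env = os.environ if env is None else env
    path = env.get("STORY_AUTOMATOR_DIAGNOSTICS_FILE", "")
    if not path:
        return False
    # Compact separators keep each event on a single line (JSONL).
    line = json.dumps(dict(event), separators=(",", ":"), sort_keys=True)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return True


def read_diagnostic_events(path: str) -> list[dict[str, Any]]:
    """Read the JSONL file back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Passing `env` explicitly keeps the sketch testable without mutating the process environment; the real helper may read `os.environ` directly.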
@@ -71,6 +83,8 @@ Critical rule:
 
 These commands are the orchestration control plane.
 
+`orchestrator-helper state-update <file> --set status=<value>` validates status transitions before writing. Invalid transitions return `ok:false`, `error:"invalid_status_transition"`, `currentStatus`, `attemptedStatus`, `allowedTransitions`, legacy `issues`, and `structuredIssues`. Non-status updates keep the existing `ok` and `updated` response shape.
+
 ## Agent Config Commands
 
 - `agent-config list`
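Review note on the `validate-state` and `state-update` hunks above: a sketch of how a consumer might prefer `structuredIssues` and fall back to legacy `issues`. The function name `issue_messages` is hypothetical, and the serialized keys `field` and `message` are assumed to match the `DiagnosticIssue` field names; treating an empty `structuredIssues` list as absent is also an assumption.

```python
from typing import Any


def issue_messages(result: dict[str, Any]) -> list[str]:
    """Return display strings, preferring structuredIssues over legacy issues."""
    structured = result.get("structuredIssues")
    if isinstance(structured, list) and structured:
        messages = []
        for issue in structured:
            # Assumed keys; a missing key degrades to an empty string.
            field = issue.get("field", "")
            message = issue.get("message", "")
            messages.append(f"{field}: {message}" if field else message)
        return messages
    # Legacy fallback: older helpers only return a list of strings.
    legacy = result.get("issues")
    return [str(item) for item in legacy] if isinstance(legacy, list) else []
```

Because the fallback path only needs `issues`, the same consumer should keep working against helper versions that predate `structuredIssues`.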

@@ -107,6 +107,10 @@ sequenceDiagram
 
 The helper CLI exists so the skill does not need to do everything through raw shell parsing or manual markdown edits.
 
+For observability, helper failures preserve their legacy result fields and add
+`structuredIssues` where a field-specific diagnostic is available. Parse failure
+payloads keep `status` and `reason`; successful parse payloads stay unchanged.
+
 ## Why The State Document Matters
 
 The state document is the control plane for the run.

@@ -0,0 +1,56 @@
+# Phase 00 - Baseline And Plan Reconciliation
+
+## Clean Context Start
+
+Before doing this phase, read [README.md](./README.md), [TODO.md](./TODO.md), [implementation-notes.md](./implementation-notes.md), [handoff-log.md](./handoff-log.md), and relevant prior handoff entries. Treat the handoff log as next-agent continuity context. Treat implementation notes as the user-facing record of decisions and tradeoffs.
+
+## Goal
+
+Establish a reproducible baseline and confirm the Oracle feedback has been incorporated. This phase is not a blocking external-review phase; Oracle feedback is already available and applied to this packet.
+
+## Inputs
+
+- GitHub issue `bmad-code-org/bmad-automator#5`
+- Current branch `bma-d/e2e-tests`
+- Oracle feedback recorded in [implementation-notes.md](./implementation-notes.md)
+- Critical source paths listed in [README.md](./README.md)
+
+## Implementation Steps
+
+1. Confirm working tree, branch, and HEAD:
+   ```bash
+   git status --short --branch
+   git rev-parse --short HEAD
+   ```
+2. Run baseline Python tests:
+   ```bash
+   PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest discover -s tests
+   ```
+3. Verify CLI import/help baseline:
+   ```bash
+   PYTHONPATH=skills/bmad-story-automator/src python3 -m story_automator --help
+   ```
+4. Optionally run `npm run verify` if baseline time is acceptable. Otherwise defer it to Phase 06.
+5. Record baseline results and any blockers in [handoff-log.md](./handoff-log.md).
+
+## Verification
+
+```bash
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest discover -s tests
+PYTHONPATH=skills/bmad-story-automator/src python3 -m story_automator --help
+```
+
+## Exit Criteria
+
+- Baseline status is recorded.
+- Revised phase order is confirmed.
+- Any blocked command has an exact error and next action.
+- Phase 01 can start without waiting for Oracle.
+
+## Implementation Notes Requirements
+
+Keep [implementation-notes.md](./implementation-notes.md) current while implementing. Record any baseline surprises, command substitutions, or changes to phase scope.
+
+## Handoff Requirements
+
+Append a Phase 00 entry to [handoff-log.md](./handoff-log.md) with commands run, results, current SHA, blockers, and the next recommended command for Phase 01.
@@ -0,0 +1,61 @@
+# Phase 01 - Diagnostics Contract
+
+## Clean Context Start
+
+Before doing this phase, read [README.md](./README.md), [TODO.md](./TODO.md), [implementation-notes.md](./implementation-notes.md), [handoff-log.md](./handoff-log.md), and the Phase 00 handoff. Treat the handoff log as next-agent continuity context. Treat implementation notes as the user-facing record of decisions and tradeoffs.
+
+## Goal
+
+Add reusable diagnostics objects and serialization helpers without changing command behavior.
+
+## Inputs
+
+- `skills/bmad-story-automator/src/story_automator/core/runtime_policy.py`
+- `skills/bmad-story-automator/src/story_automator/core/utils.py`
+- Existing tests in `tests/`
+- Oracle feedback in [implementation-notes.md](./implementation-notes.md)
+
+## Implementation Steps
+
+1. Add `skills/bmad-story-automator/src/story_automator/core/diagnostics.py`.
+2. Define `DiagnosticIssue` with first-class fields:
+   - `type`
+   - `field`
+   - `expected`
+   - `actual`
+   - `message`
+   - `recovery`
+   - `code`
+   - `severity`
+   - `source`
+3. Define `DiagnosticEvent` for structured observability context, but do not emit standalone event lines to stdout by default.
+4. Add serialization helpers:
+   - `serialize_issue(issue) -> dict`
+   - `serialize_issues(issues) -> list[dict]`
+   - `legacy_issue_message(issue) -> str`
+   - `issues_from_exception(exc, source, field="")`
+5. Add `redact_actual(value)` for long strings, absolute paths, env-like keys, nested dict/list payloads, and other oversized or sensitive values.
+6. Add `tests/test_diagnostics.py`.
+7. Do not touch command outputs yet.
+
+## Verification
+
+```bash
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_diagnostics
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest discover -s tests
+```
+
+## Exit Criteria
+
+- Diagnostics serialize to compact JSON-compatible dictionaries.
+- Redaction behavior is tested.
+- No CLI output shape changes.
+- `severity` and `source` are present from day one.
+
+## Implementation Notes Requirements
+
+Keep [implementation-notes.md](./implementation-notes.md) current while implementing. Record field-name decisions, redaction tradeoffs, event-output decisions, and compatibility constraints.
+
+## Handoff Requirements
+
+Append a Phase 01 entry to [handoff-log.md](./handoff-log.md) with files changed, tests run, exact diagnostics shape, compatibility notes, blockers, and the next recommended command for Phase 02.
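Review note on Phase 01 steps 2, 4, and 5: one possible shape for the diagnostics module, as a sketch only. The field names and helper names come from the plan; the dataclass layout, the defaults (including `severity="error"`), the redaction placeholders, and the 80-character limit are assumptions. Env-like key redaction is left out to keep the sketch short.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class DiagnosticIssue:
    type: str
    field: str
    expected: str
    actual: Any
    message: str
    recovery: str = ""
    code: str = ""
    severity: str = "error"  # assumed default
    source: str = ""


def redact_actual(value: Any, max_len: int = 80) -> Any:
    """Shrink or mask values that are oversized or likely sensitive."""
    if isinstance(value, str):
        if value.startswith("/"):
            return "<path>"  # absolute paths are masked entirely
        if len(value) > max_len:
            return value[:max_len] + "...<truncated>"
        return value
    if isinstance(value, (dict, list)):
        # Nested payloads are summarized instead of echoed back.
        return f"<{type(value).__name__} len={len(value)}>"
    return value


def serialize_issue(issue: DiagnosticIssue) -> dict[str, Any]:
    """Convert an issue to a JSON-compatible dict with a redacted `actual`."""
    data = asdict(issue)
    data["actual"] = redact_actual(issue.actual)
    return data


def serialize_issues(issues: list[DiagnosticIssue]) -> list[dict[str, Any]]:
    return [serialize_issue(issue) for issue in issues]


def legacy_issue_message(issue: DiagnosticIssue) -> str:
    """Render the string form kept in the legacy `issues` list."""
    return f"{issue.field}: {issue.message}" if issue.field else issue.message
```

Redacting inside `serialize_issue` rather than at construction time keeps the raw value available to in-process callers while ensuring nothing unredacted reaches JSON output.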
@@ -0,0 +1,79 @@
+# Phase 02 - State Validation And Transitions
+
+## Clean Context Start
+
+Before doing this phase, read [README.md](./README.md), [TODO.md](./TODO.md), [implementation-notes.md](./implementation-notes.md), [handoff-log.md](./handoff-log.md), and prior phase handoff entries. Treat the handoff log as next-agent continuity context. Treat implementation notes as the user-facing record of decisions and tradeoffs.
+
+## Goal
+
+Fix the most visible docs/runtime mismatch by adding field-specific state diagnostics, and guard orchestration status updates against invalid transitions.
+
+## Inputs
+
+- `skills/bmad-story-automator/src/story_automator/core/diagnostics.py`
+- `skills/bmad-story-automator/src/story_automator/commands/state.py`
+- `skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`
+- `skills/bmad-story-automator/src/story_automator/core/frontmatter.py`
+- `skills/bmad-story-automator/templates/state-document.md`
+- `skills/bmad-story-automator/steps-v/step-v-01-check.md`
+- `docs/state-and-resume.md`
+- `docs/cli-reference.md`
+- `tests/test_state_policy_metadata.py`
+- `tests/test_replacement_unicode.py`
+
+## Implementation Steps
+
+1. Add `skills/bmad-story-automator/src/story_automator/core/state_validation.py`.
+2. Validate state frontmatter fields with structured issues:
+   - `epic`
+   - `epicName`
+   - `storyRange`
+   - `status`
+   - `lastUpdated`
+   - runtime command config through `aiCommand` or usable `agentConfig`
+   - policy snapshot metadata
+3. Preserve `validate-state` compatibility:
+   - keep `ok`
+   - keep `structure`
+   - keep `issues: list[str]`
+   - add `structuredIssues: list[object]`
+   - add `issueCount`
+4. Add `ALLOWED_STATUS_TRANSITIONS`:
+   ```python
+   ALLOWED_STATUS_TRANSITIONS = {
+       "INITIALIZING": {"INITIALIZING", "READY", "ABORTED"},
+       "READY": {"READY", "IN_PROGRESS", "PAUSED", "ABORTED"},
+       "IN_PROGRESS": {"IN_PROGRESS", "PAUSED", "EXECUTION_COMPLETE", "COMPLETE", "ABORTED"},
+       "PAUSED": {"PAUSED", "IN_PROGRESS", "ABORTED"},
+       "EXECUTION_COMPLETE": {"EXECUTION_COMPLETE", "COMPLETE", "ABORTED"},
+       "COMPLETE": {"COMPLETE"},
+       "ABORTED": {"ABORTED"},
+   }
+   ```
+5. Update `orchestrator-helper state-update` so `status=<value>` changes are checked before writing.
+6. Invalid transitions must return `ok: false`, `error: "invalid_status_transition"`, `currentStatus`, `attemptedStatus`, `allowedTransitions`, legacy `issues`, and `structuredIssues`.
+7. Update `steps-v/step-v-01-check.md` to read `.structuredIssues[]?` first and fall back to legacy `.issues[]?` strings.
+8. Update `docs/state-and-resume.md` and `docs/cli-reference.md` for additive diagnostics and transition rules.
+9. Add `tests/test_state_validation.py` for focused state validation and transition coverage. Existing state tests may also be extended, but this phase must create the focused module because verification depends on it.
+
+## Verification
+
+```bash
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_state_policy_metadata tests.test_replacement_unicode
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_state_validation
+```
+
+## Exit Criteria
+
+- `validate-state` returns field-specific diagnostics without replacing legacy string issues.
+- The docs/runtime mismatch in the shape of state validation issues is resolved.
+- `state-update` blocks invalid status regressions with actionable diagnostics.
+- Legacy states remain valid where intended.
+
+## Implementation Notes Requirements
+
+Keep [implementation-notes.md](./implementation-notes.md) current while implementing. Record the exact compatibility choice for `issues` versus `structuredIssues`, the transition table, and any allowed compatibility compromises such as `IN_PROGRESS -> COMPLETE`.
+
+## Handoff Requirements
+
+Append a Phase 02 entry to [handoff-log.md](./handoff-log.md) with files changed, tests run, transition table, docs changes, blockers, and the next recommended command for Phase 03.
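Review note on Phase 02 steps 4 to 6: a sketch of the transition guard built on the table from step 4. The response keys come from step 6; the function name `check_status_transition`, the message wording, the structured issue keys, and the choice to treat an unknown current status as having no allowed transitions are assumptions.

```python
from typing import Any, Optional

ALLOWED_STATUS_TRANSITIONS = {
    "INITIALIZING": {"INITIALIZING", "READY", "ABORTED"},
    "READY": {"READY", "IN_PROGRESS", "PAUSED", "ABORTED"},
    "IN_PROGRESS": {"IN_PROGRESS", "PAUSED", "EXECUTION_COMPLETE", "COMPLETE", "ABORTED"},
    "PAUSED": {"PAUSED", "IN_PROGRESS", "ABORTED"},
    "EXECUTION_COMPLETE": {"EXECUTION_COMPLETE", "COMPLETE", "ABORTED"},
    "COMPLETE": {"COMPLETE"},
    "ABORTED": {"ABORTED"},
}


def check_status_transition(current: str, attempted: str) -> Optional[dict[str, Any]]:
    """Return None when the transition is allowed, else an error payload."""
    # Assumption: an unknown current status allows no transitions.
    allowed = ALLOWED_STATUS_TRANSITIONS.get(current, set())
    if attempted in allowed:
        return None
    message = f"status transition {current} -> {attempted} is not allowed"
    return {
        "ok": False,
        "error": "invalid_status_transition",
        "currentStatus": current,
        "attemptedStatus": attempted,
        "allowedTransitions": sorted(allowed),  # sorted for stable JSON output
        "issues": [message],
        "structuredIssues": [
            {
                "field": "status",
                "expected": ", ".join(sorted(allowed)),
                "actual": attempted,
                "message": message,
            }
        ],
    }
```

Running the check before any write means a rejected update leaves the state file untouched, which is what "validates status transitions before writing" implies.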
@@ -0,0 +1,59 @@
+# Phase 03 - Parser And Contract Boundaries
+
+## Clean Context Start
+
+Before doing this phase, read [README.md](./README.md), [TODO.md](./TODO.md), [implementation-notes.md](./implementation-notes.md), [handoff-log.md](./handoff-log.md), and prior phase handoff entries. Treat the handoff log as next-agent continuity context. Treat implementation notes as the user-facing record of decisions and tradeoffs.
+
+## Goal
+
+Make LLM parse failures and verifier contract failures field-specific while keeping existing parse contracts and successful output unchanged.
+
+## Inputs
+
+- `skills/bmad-story-automator/src/story_automator/core/diagnostics.py`
+- `skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`
+- `skills/bmad-story-automator/src/story_automator/core/success_verifiers.py`
+- `skills/bmad-story-automator/src/story_automator/core/review_verify.py`
+- `skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`
+- `skills/bmad-story-automator/src/story_automator/commands/tmux.py`
+- `skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py`
+- `skills/bmad-story-automator/data/parse/*.json`
+- `skills/bmad-story-automator-review/contract.json`
+- `tests/test_orchestrator_parse.py`
+- `tests/test_success_verifiers.py`
+
+## Implementation Steps
+
+1. Add `skills/bmad-story-automator/src/story_automator/core/parse_contracts.py`.
+2. Move parse schema/payload validation out of command code.
+3. Replace boolean schema checks with diagnostics for:
+   - missing required key
+   - wrong nested type
+   - invalid enum
+   - empty string
+   - invalid `path or null`
+4. Preserve parse success output exactly as-is. Do not add diagnostics or events to valid parsed payloads.
+5. On parse failure, preserve `status: "error"` and legacy `reason`, and add `structuredIssues`.
+6. Wrap success verifier contract failures into structured issues at command boundaries where safe.
+7. Add or update tests for field paths such as `issues_found.critical`.
+
+## Verification
+
+```bash
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_orchestrator_parse tests.test_success_verifiers
+```
+
+## Exit Criteria
+
+- Parser boundary reports specific field-level diagnostics.
+- Existing parse success payloads are unchanged.
+- Legacy failure `reason` values remain available.
+- Verifier contract failures expose structured diagnostics where command outputs already carry errors.
+
+## Implementation Notes Requirements
+
+Keep [implementation-notes.md](./implementation-notes.md) current while implementing. Record any compatibility choice around legacy `reason` values, whether events are returned in failure JSON, and parse schema expressiveness limits.
+
+## Handoff Requirements
+
+Append a Phase 03 entry to [handoff-log.md](./handoff-log.md) with files changed, tests run, schema issue examples, compatibility notes, blockers, and the next recommended command for Phase 04.
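Review note on Phase 03 steps 3 and 5: a sketch of replacing a boolean schema check with per-field diagnostics. The real schema format in `data/parse/*.json` is not shown in this diff, so the `rules` mapping (dotted path to `{"type": ...}` or `{"enum": [...]}`), the issue type names, and the function names are all hypothetical. Only `status: "error"`, `reason`, and `structuredIssues` come from the plan.

```python
from typing import Any

_MISSING = object()


def _lookup(payload: Any, dotted: str) -> Any:
    """Walk a dotted path such as `issues_found.critical`."""
    current = payload
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def _has_type(value: Any, expected: type) -> bool:
    # bool is a subclass of int in Python, so reject it for non-bool fields.
    if isinstance(value, bool) and expected is not bool:
        return False
    return isinstance(value, expected)


def _expected(rule: dict[str, Any]) -> str:
    if "enum" in rule:
        return "one of " + ", ".join(rule["enum"])
    if "type" in rule:
        return rule["type"].__name__
    return "present"


def validate_parse_payload(payload: Any, rules: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one structured issue per failing field; empty list means valid."""
    issues = []
    for field, rule in rules.items():
        value = _lookup(payload, field)
        if value is _MISSING:
            kind = "missing_required_key"
        elif "enum" in rule and value not in rule["enum"]:
            kind = "invalid_enum"
        elif "type" in rule and not _has_type(value, rule["type"]):
            kind = "wrong_type"
        elif value == "":
            kind = "empty_string"
        else:
            continue
        actual = None if value is _MISSING else value
        issues.append({"type": kind, "field": field, "expected": _expected(rule),
                       "actual": actual, "message": f"{field}: {kind}"})
    return issues


def parse_failure(reason: str, issues: list[dict[str, Any]]) -> dict[str, Any]:
    """Keep the legacy failure keys and add structuredIssues alongside them."""
    return {"status": "error", "reason": reason, "structuredIssues": issues}
```

Valid payloads never pass through `parse_failure`, so the success output stays exactly as it was.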
@@ -0,0 +1,64 @@
+# Phase 04 - Agent Complexity And Story Boundaries
+
+## Clean Context Start
+
+Before doing this phase, read [README.md](./README.md), [TODO.md](./TODO.md), [implementation-notes.md](./implementation-notes.md), [handoff-log.md](./handoff-log.md), and prior phase handoff entries. Treat the handoff log as next-agent continuity context. Treat implementation notes as the user-facing record of decisions and tradeoffs.
+
+## Goal
+
+Stop raw agent-plan and complexity JSON from failing late inside command handlers, and strengthen story/epic parse seams without touching tmux/session runtime behavior.
+
+## Inputs
+
+- `skills/bmad-story-automator/src/story_automator/core/diagnostics.py`
+- `skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py`
+- `skills/bmad-story-automator/src/story_automator/core/agent_config.py`
+- `skills/bmad-story-automator/src/story_automator/core/epic_parser.py`
+- `skills/bmad-story-automator/src/story_automator/core/story_keys.py`
+- `skills/bmad-story-automator/src/story_automator/core/sprint.py`
+- `tests/test_retro_agent.py`
+- `tests/test_runtime_layout.py`
+
+## Implementation Steps
+
+1. Add `skills/bmad-story-automator/src/story_automator/core/agent_plan.py`.
+2. Move duplicated agent config/plan behavior from `commands/orchestrator_epic_agents.py` toward core helpers.
+3. Implement validators:
+   - `validate_complexity_payload(payload) -> list[DiagnosticIssue]`
+   - `validate_agents_plan_payload(payload) -> list[DiagnosticIssue]`
+   - `load_complexity_payload(path) -> tuple[payload, issues]`
+   - `load_agents_plan(path) -> tuple[payload, issues]`
+4. Validation rules:
+   - root must be an object
+   - `stories` must be an array
+   - each story needs string `storyId`
+   - `complexity.level` normalizes to `low`, `medium`, or `high`
+   - task selections cover `create`, `dev`, `auto`, and `review`
+   - each task selection has string `primary`
+   - `fallback` may be `false` or a string and must normalize as the current code does
+   - unknown fields are allowed unless harmful
+5. Keep `StoryKey` and `SprintStatus` mostly unchanged; they are already useful typed seams.
+6. Optionally add small dataclasses/helpers in `epic_parser.py` if they preserve current returned JSON shape.
+7. Add `tests/test_agent_plan.py` for focused complexity and agents-plan payload coverage. Existing agent config tests may also be extended, but this phase must create the focused module because verification depends on it.
+
+## Verification
+
+```bash
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_retro_agent tests.test_runtime_layout
+PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_agent_plan
+```
+
+## Exit Criteria
+
+- Agent plan and complexity file boundaries fail with field-specific diagnostics.
+- Existing fallback normalization and retro override behavior remain unchanged.
+- Story/epic parse improvements preserve current CLI JSON shape.
+- Tmux/session runtime work is left for Phase 05.
+
+## Implementation Notes Requirements
+
+Keep [implementation-notes.md](./implementation-notes.md) current while implementing. Record module-boundary decisions, any accepted unknown fields, and remaining loose payloads.
+
+## Handoff Requirements
+
+Append a Phase 04 entry to [handoff-log.md](./handoff-log.md) with files changed, tests run, remaining loose payloads, compatibility risks, blockers, and the next recommended command for Phase 05.
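Review note on Phase 04 steps 3 and 4: a sketch of `validate_complexity_payload` that produces field paths such as `stories[1].complexity.level`. For brevity it returns plain dicts rather than `DiagnosticIssue` objects, uses an empty `field` for the root, and assumes level normalization means trimming whitespace and lowercasing; the current code may normalize differently.

```python
from typing import Any

LEVELS = {"low", "medium", "high"}


def _issue(field: str, expected: str, actual: Any) -> dict[str, Any]:
    # repr is truncated so an oversized payload is not echoed back in full.
    return {"field": field, "expected": expected, "actual": repr(actual)[:40],
            "message": f"{field or 'root'} must be {expected}"}


def validate_complexity_payload(payload: Any) -> list[dict[str, Any]]:
    """Validate the complexity file shape; an empty list means valid."""
    if not isinstance(payload, dict):
        return [_issue("", "an object", payload)]
    stories = payload.get("stories")
    if not isinstance(stories, list):
        return [_issue("stories", "an array", stories)]
    issues = []
    for index, story in enumerate(stories):
        prefix = f"stories[{index}]"
        if not isinstance(story, dict):
            issues.append(_issue(prefix, "an object", story))
            continue
        if not isinstance(story.get("storyId"), str):
            issues.append(_issue(f"{prefix}.storyId", "a string", story.get("storyId")))
        complexity = story.get("complexity")
        level = complexity.get("level") if isinstance(complexity, dict) else None
        # Assumed normalization: trim and lowercase before comparing.
        if not isinstance(level, str) or level.strip().lower() not in LEVELS:
            issues.append(_issue(f"{prefix}.complexity.level", "low, medium, or high", level))
    return issues
```

Unknown keys are ignored rather than rejected, in line with the rule that unknown fields are allowed unless harmful.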