fix(mutation-boundary): diff committed subtask against its parent (#162) #171

Merged
azalio merged 1 commit into main from fix/162-mutation-boundary-post-commit-false-progress on Jun 11, 2026

@azalio azalio commented Jun 11, 2026

Summary

Fixes #162: validate_step 2.4 false-progress reject on every committed subtask in /map-efficient.

validate_step 2.4 auto-runs validate_mutation_boundary, which diffs the working tree against the subtask's declared affected_files. But the documented per-subtask close order is:

commit → record_subtask_result --commit-sha <SHA> → validate_step 2.4

record_subtask_result --commit-sha advances last_subtask_commit_sha to the subtask's own commit. By the time validate_step 2.4 runs, the working tree is clean and git diff <own-commit> is empty → actual=[] while expected is non-empty → the false-progress guard fires:

False-progress (mutation-boundary): MONITOR is closing ST-001 but NO files changed, though its contract declares affected_files=[...]

The operator then had to call validate_step 2.4 a second time to get past the once-per-subtask nudge. Per the issue, this happened on every subtask of a 25-subtask plan.
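
The failure mode above can be reproduced with a toy model. This is only an illustrative sketch: `changed_since` and `is_false_progress` are hypothetical stand-ins (not the validator's real functions), and the history is a plain dict rather than a git repository.

```python
from typing import Dict, List


def changed_since(history: Dict[str, List[str]], base: str, dirty: List[str]) -> List[str]:
    """Toy `git diff <base>`: files touched by commits after `base`, plus dirty files."""
    order = list(history)  # dicts preserve insertion order: oldest commit first
    later = order[order.index(base) + 1:]
    files = {path for sha in later for path in history[sha]}
    return sorted(files | set(dirty))


def is_false_progress(expected: List[str], actual: List[str]) -> bool:
    """Assumed shape of the guard: files were declared, but nothing changed."""
    return bool(expected) and not actual


expected = ["a.py"]  # the subtask's declared affected_files
history = {"c0": ["README.md"], "c1": ["a.py"]}  # c1 is the subtask's own commit

# Documented order: after commit + record, the base is c1 and the tree is clean.
actual_after_record = changed_since(history, base="c1", dirty=[])
print(actual_after_record, is_false_progress(expected, actual_after_record))  # [] True

# Diffing against c1's parent (c0) surfaces the committed work instead.
actual_vs_parent = changed_since(history, base="c0", dirty=[])
print(actual_vs_parent, is_false_progress(expected, actual_vs_parent))  # ['a.py'] False
```

In this model the guard fires only because the base ref already includes the subtask's work, not because the subtask did nothing.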

Fix

Implements the issue's suggested fix #1 (diff the recorded subtask commit against its parent). Extracted a shared _resolve_subtask_diff_base helper that:

  • resolves the base ref from last_subtask_commit_sha → HEAD → None (fresh repo), as before;
  • re-bases onto the commit's parent (<sha>^) when the auto-resolved base equals the commit recorded for this subtask, so the committed work shows up in the diff;
  • probes the parent first, so a root commit (no parent) safely keeps the commit itself.
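
The three rules above can be sketched roughly as follows. This is a minimal sketch, not the actual `_resolve_subtask_diff_base`: the function name, its parameters, and the injected `has_parent` probe are assumptions made for illustration (in a real repository the probe might be something like `git rev-parse --verify <sha>^`).

```python
from typing import Callable, Optional


def resolve_subtask_diff_base(
    last_subtask_commit_sha: Optional[str],
    head_sha: Optional[str],
    recorded_commit_sha: Optional[str],
    has_parent: Callable[[str], bool],
) -> Optional[str]:
    """Hypothetical stand-in for the helper described in this PR."""
    # Rule 1: auto-resolve as before: last recorded SHA, else HEAD, else None.
    base = last_subtask_commit_sha or head_sha
    if base is None:
        return None  # fresh repo: nothing to diff against
    # Rule 2: re-base only when the base is THIS subtask's own commit.
    if recorded_commit_sha is not None and base == recorded_commit_sha:
        # Rule 3: probe the parent first; a root commit keeps itself as base.
        if has_parent(base):
            return f"{base}^"
    return base


# Toy parent map standing in for git: c1 is a root commit, c2 sits on top of it.
parents = {"c1": None, "c2": "c1"}


def probe(sha: str) -> bool:
    return parents.get(sha) is not None


print(resolve_subtask_diff_base("c1", "c2", None, probe))  # c1   (before the subtask commit)
print(resolve_subtask_diff_base("c2", "c2", "c2", probe))  # c2^  (after commit + record)
print(resolve_subtask_diff_base("c1", "c1", "c1", probe))  # c1   (root commit, no parent)
print(resolve_subtask_diff_base(None, None, None, probe))  # None (fresh repo)
```

In the first two calls the base resolves to the same underlying commit (`c1` and `c2^`), which is what makes the check independent of whether it runs before or after the per-subtask commit.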

The same base-ref resolution — and the same latent bug — lived in _current_subtask_changed_files (used by cross-subtask regression detection); it's now routed through the same helper, fixing the bug class everywhere (per the "fix every instance" rule).

Net effect: the boundary check is now commit-order-agnostic — it computes the same actual whether run before or after the per-subtask commit.

Tests

  • unit (positive): committed subtask → actual == ['a.py'], status clean, base_ref == '<sha>^'
  • unit (negative guard): no recorded commit + clean tree → actual == [] so genuine false-progress still fires
  • orchestrator (e2e): the documented commit → record → validate_step 2.4 flow passes on the first call; progress_feedback_subtasks is not consumed

Negative-proofed: disabling the re-base makes the orchestrator test reproduce the exact #162 message; a re-render then restores green.

Validation

  • make check green: ruff ✓, mypy ✓, pyright 0/0/0, hook lint ✓, 2285 passed / 3 skipped
  • make check-render ✓ (generated trees match templates_src — change made in the .jinja single source and re-rendered)

Closes #162

🤖 Generated with Claude Code

`validate_step 2.4` auto-runs `validate_mutation_boundary`, which compared the
working tree against the subtask's declared `affected_files`. In the documented
per-subtask close order — commit → `record_subtask_result --commit-sha` →
`validate_step 2.4` — the working tree is clean and `last_subtask_commit_sha`
already points at the subtask's OWN commit. The diff against that commit is
therefore empty, so `actual=[]` while `expected` is non-empty, tripping the
false-progress guard with "MONITOR is closing ST-XXX but NO files changed" on
every committed subtask and forcing a redundant second `validate_step 2.4` call.

Fix: extract `_resolve_subtask_diff_base`, which re-bases onto the subtask
commit's parent when the auto-resolved base equals the commit recorded for THIS
subtask, so the committed work shows up as `actual`. The parent is probed first
so a root commit safely keeps the commit itself. The same base-ref resolution
(and the same latent bug) is shared by `_current_subtask_changed_files`, now
also routed through the helper. The check is now commit-order-agnostic: it
yields the same `actual` whether run before or after the per-subtask commit.

Tests:
- unit: committed subtask -> `actual==['a.py']`, base_ref re-based to `<sha>^`
- unit (negative guard): no recorded commit + clean tree -> `actual==[]` so
  genuine false-progress still fires
- orchestrator: documented commit→record→validate flow passes 2.4 on the FIRST
  call (no redundant second call), `progress_feedback_subtasks` untouched
Negative-proofed: disabling the re-base reproduces the exact #162 message.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@azalio azalio merged commit 5fe4cde into main Jun 11, 2026
6 checks passed
@azalio azalio deleted the fix/162-mutation-boundary-post-commit-false-progress branch June 11, 2026 15:20

Development

Successfully merging this pull request may close these issues.

validate_step 2.4 false-progress reject when per-subtask commit precedes validation (docs vs validator contradiction)
