# RLCR: Round 0 scope dumping — implementer completes minimal subset and defers everything else #58
## Context
In an RLCR session spanning Rounds 0–3 (cancelled by the user during Round 3), the implementer completed only ~10% of the plan scope in Round 0, explicitly marking all 9 remaining major items as "deferred to next round." The reviewer then spent an entire review cycle confirming that almost nothing had been done: a structural waste of review bandwidth. The session also surfaced a tension between software-verifiable and hardware-verifiable deliverables.
## Observations
- **"Scope dumping" in Round 0:** The implementer completed only baseline cleanup (the easiest ~10% of plan scope), then explicitly tagged all 9 remaining major deliverables as "deferred to next round." The reviewer accurately identified "9 unjustified deferrals" but had to spend a full review cycle arriving at that conclusion. This pattern wastes the most expensive resource in RLCR: reviewer time.
- **Reviewer depth increased progressively (a healthy pattern):** Round 0 focused on scope alignment; Round 1 deepened to evidence–version matching (discovering that log artifacts pre-dated the current code commit); Round 2 deepened further to implementation correctness through code reading (finding two specific logic defects). This progressive deepening is a healthy feedback loop.
- **Evidence attribution inflation:** The implementer consistently extrapolated partial evidence into full acceptance claims. The reviewer corrected this every round, pointing out that 10 stress tests ≠ the plan-required 50 and that a single smoke-test pass ≠ full matrix completion. The gap between claimed and verified completion persisted across rounds.
- **Physical vs. software verification tension:** The plan mixed pure-software verification with hardware-interaction verification. The implementer efficiently completed the software-side work but consistently lacked sufficient evidence for physical verification (missing screenshots, incomplete serial logs). The RLCR methodology does not currently distinguish these verification modes.
- **Reviewer action plans were highly executable:** Every review result included a numbered mandatory execution plan specifying what to change, what to test, and which command to run. This significantly reduced the implementer's interpretation overhead in subsequent rounds.
## Suggested Improvements
| # | Suggestion | Mechanism |
|---|---|---|
| 1 | Enforce minimum Round 0 coverage | Require Round 0 to produce at least partial progress on every AC (even a failure record counts), rather than allowing cherry-picking the easiest subset. |
| 2 | Track "evidence precision" as a session metric | Add to reviewer template: count of "completion claims rejected" per round. If this exceeds a threshold (e.g., 3 per round), flag implementer optimism bias for the session. |
| 3 | Pre-loop environment feasibility check for hardware-dependent plans | Before starting RLCR, verify that the current environment can satisfy all verification paths. Avoid entering the loop only to discover hardware blockers. |
| 4 | Distinguish software-verifiable and hardware-verifiable ACs in plans | Plans with mixed verification modes should tag each AC accordingly. Hardware-verifiable ACs may need different convergence criteria or human-in-the-loop steps. |
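Suggestions 1, 2, and 4 are mechanical enough to be enforced by session tooling. The sketch below shows one way that could look; it is illustrative only, and every name in it (`AcceptanceCriterion`, `RoundMetrics`, `round0_coverage_ok`, and so on) is hypothetical rather than part of any existing RLCR implementation:

```python
from dataclasses import dataclass
from enum import Enum

# All names below are hypothetical; this is a sketch of the suggested
# mechanisms, not an existing RLCR API.

class VerificationMode(Enum):
    """Suggestion 4: tag each AC with its verification mode."""
    SOFTWARE = "software"
    HARDWARE = "hardware"

@dataclass
class AcceptanceCriterion:
    name: str
    mode: VerificationMode
    touched_in_round0: bool = False  # any progress counts, even a failure record

@dataclass
class RoundMetrics:
    """Suggestion 2: per-round count of completion claims the reviewer rejected."""
    claims_rejected: int = 0

# Example threshold from the table above ("e.g., 3 per round").
OPTIMISM_THRESHOLD = 3

def round0_coverage_ok(acs: list[AcceptanceCriterion]) -> bool:
    """Suggestion 1: Round 0 must show at least partial progress on every AC."""
    return all(ac.touched_in_round0 for ac in acs)

def optimism_bias_flagged(rounds: list[RoundMetrics]) -> bool:
    """Flag the session if any round's rejected-claim count exceeds the threshold."""
    return any(r.claims_rejected > OPTIMISM_THRESHOLD for r in rounds)
```

A pre-loop feasibility check (suggestion 3) would then simply refuse to start the loop if any `HARDWARE`-tagged AC has no verification path in the current environment.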
## Quantitative Summary
| Metric | Value |
|---|---|
| Total rounds | 3.5 (Round 3 started, review incomplete) |
| Exit reason | User cancelled |
| AC count | 9 (with sub-items) |
| Round 0 scope completion | ~10% |
| Completion rate at exit | ~33% (3/9 ACs with traceable evidence) |
| Round 0 unjustified deferrals | 9 |
| Reviewer false positive rate | 0% |