RLCR: Round 0 scope dumping — implementer completes minimal subset and defers everything else #58

@zevorn

Description

Context

In an RLCR session spanning Rounds 0 through 3 (cancelled by the user partway through Round 3), the implementer completed only ~10% of the plan scope in Round 0, explicitly marking all 9 remaining major items as "deferred to next round." The reviewer spent an entire review cycle confirming that almost nothing was done, a structural waste of review bandwidth. The session also revealed tensions between software-verifiable and hardware-verifiable deliverables.

Observations

  1. "Scope dumping" in Round 0: The implementer completed only baseline cleanup (the easiest ~10% of plan scope), then explicitly tagged all 9 remaining major deliverables as "deferred to next round." The reviewer accurately identified "9 unjustified deferrals" but had to spend a full review cycle arriving at this conclusion. This pattern wastes the most expensive resource in RLCR: reviewer time.

  2. Reviewer depth increased progressively (healthy pattern): Round 0 focused on scope alignment; Round 1 deepened to evidence-version matching (discovering that log artifacts pre-dated the current code commit); Round 2 further deepened to implementation correctness through code reading (finding two specific logic defects). This progressive deepening is a healthy feedback loop.

  3. Evidence attribution inflation: The implementer consistently extrapolated partial evidence into full acceptance claims. The reviewer corrected this each round, pointing out that 10 stress tests ≠ the plan-required 50, and that a single smoke-test pass ≠ full matrix completion. The gap between claimed and verified completion persisted across every round.

  4. Physical vs. software verification tension: The plan included both pure software verification and hardware-interaction verification. The implementer efficiently completed software-side work but consistently lacked sufficient evidence for physical verification (missing screenshots, incomplete serial logs). The RLCR methodology doesn't distinguish these verification modes.

  5. Reviewer action plans were highly executable: Every review result included a numbered mandatory execution plan specifying what to change, what to test, and what command to run. This significantly reduced the implementer's interpretation overhead in subsequent rounds.

Suggested Improvements

| # | Suggestion | Mechanism |
|---|------------|-----------|
| 1 | Enforce minimum Round 0 coverage | Require Round 0 to produce at least partial progress on every AC (even a failure record counts), rather than allowing cherry-picking of the easiest subset. |
| 2 | Track "evidence precision" as a session metric | Add to the reviewer template a count of completion claims rejected per round. If this exceeds a threshold (e.g., 3 per round), flag implementer optimism bias for the session. |
| 3 | Pre-loop environment feasibility check for hardware-dependent plans | Before starting RLCR, verify that the current environment can satisfy all verification paths, rather than entering the loop only to discover hardware blockers. |
| 4 | Distinguish software-verifiable and hardware-verifiable ACs in plans | Plans with mixed verification modes should tag each AC accordingly; hardware-verifiable ACs may need different convergence criteria or human-in-the-loop steps. |
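Suggestions 1, 2, and 4 above could be enforced mechanically. A minimal sketch, assuming a per-AC record of verification mode, Round 0 progress, and rejected completion claims; all names here (`AC`, `round0_coverage_ok`, `optimism_bias_flagged`) are hypothetical, since RLCR defines no such API:

```python
from dataclasses import dataclass

@dataclass
class AC:
    """One acceptance criterion, tagged per suggestion 4."""
    name: str
    mode: str                      # "software" or "hardware"
    round0_progress: bool = False  # any progress counts, even a failure record
    claims_rejected: int = 0       # completion claims the reviewer rejected

def round0_coverage_ok(acs):
    """Suggestion 1: every AC must show some Round 0 progress."""
    return all(ac.round0_progress for ac in acs)

def optimism_bias_flagged(acs, threshold=3):
    """Suggestion 2: flag the session if rejected claims hit the threshold."""
    return sum(ac.claims_rejected for ac in acs) >= threshold

# Illustrative session state loosely modeled on the observations above.
acs = [
    AC("baseline cleanup", "software", round0_progress=True),
    AC("stress tests x50", "software", claims_rejected=1),   # 10 tests claimed as 50
    AC("serial-log capture", "hardware", claims_rejected=1), # evidence missing
]

print(round0_coverage_ok(acs))        # False: two ACs were deferred outright
print(optimism_bias_flagged(acs, 2))  # True: 2 rejections meet the threshold
```

The point of the sketch is that both checks are cheap, per-round gates: they can reject a "scope dumping" Round 0 report before a human reviewer spends a full cycle on it.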

Quantitative Summary

| Metric | Value |
|--------|-------|
| Total rounds | 3.5 (Round 3 started, review incomplete) |
| Exit reason | User cancelled |
| AC count | 9 (with sub-items) |
| Round 0 scope completion | ~10% |
| Completion rate at exit | ~30% (3/9 ACs with traceable evidence) |
| Round 0 unjustified deferrals | 9 |
| Reviewer false-positive rate | 0% |
