Source
me-fuji session 150 (#1170 partial fix in PR #1173).
Observation
When fixing a specific in-the-wild bug, all indirect green signals can lie:
- Unit tests pass (synthetic geometry was correctly handled by the new code)
- Full test suite passes (no regression)
- `--check` flags pass (no committed artifacts changed)
But running the actual processing pipeline on the targeted input showed byte-identical output to pre-fix — the new code never fired on the case it was supposed to fix.
In the session this happened on the af-75 stopped freq30 MTF chart. The post-DP V-crossing detector had a clean unit-test pass but never fired on the real chart because the DP output didn't have the shape the detector required. Caught by running `py -m mtfdigitizer.diagnose ttartisan-af-75mm-f2-0` and comparing numbers to the committed digitization-log values.
Proposed template change
Add to `base/workflow/ai-workflow.md` Lessons Learned section, or `base/core/testing.md` near the testability/verification rules:
Verify the fix fires on real data. When a fix is meant to close a specific in-the-wild case, run the actual processing pipeline on that input and compare numbers to the committed/expected output. Unit tests on synthetic inputs and CI-staleness checks can both pass while the fix never executes on the targeted case. `./run-pipeline.sh | diff - ` is the test that matters — it confirms code reached the bug site AND took the new branch.
Generalizes S149's "pixel-level chart probe is the truth" — both point at the same antipattern of trusting indirect green signals over direct measurement.
Alternatives considered
- Add as a MUST in `testing.md` — too strong; some fixes don't have a singular "real case" to point at.
- Stack as an inline note inside the existing `Verify agent calculations against the system` section in `ai-workflow.md` — could work; that section is about agent math errors, this is about a different failure mode (fix never fires) but same root pattern (trust direct measurement over inference).
Preference: standalone bullet in Lessons Learned. Concrete enough to remember, general enough to apply.
Source
me-fuji session 150 (#1170 partial fix in PR #1173).
Observation
When fixing a specific in-the-wild bug, all indirect green signals can lie:
But running the actual processing pipeline on the targeted input showed byte-identical output to pre-fix — the new code never fired on the case it was supposed to fix.
In the session this happened on the af-75 stopped freq30 MTF chart. The post-DP V-crossing detector had a clean unit-test pass but never fired on the real chart because the DP output didn't have the shape the detector required. Caught by running `py -m mtfdigitizer.diagnose ttartisan-af-75mm-f2-0` and comparing numbers to the committed digitization-log values.
Proposed template change
Add to `base/workflow/ai-workflow.md` Lessons Learned section, or `base/core/testing.md` near the testability/verification rules:
Generalizes S149's "pixel-level chart probe is the truth" — both point at the same antipattern of trusting indirect green signals over direct measurement.
Alternatives considered
Preference: standalone bullet in Lessons Learned. Concrete enough to remember, general enough to apply.