This issue closes the roadmap's outcome loop and is the capstone of this milestone.
The committed 2026-06-11 Suite C baseline records exact-lookup MRR collapsing 1.0 → 0.75 → 0.13 → 0.00 across the 100/1k/10k/100k corpus ladder, with true targets buried at ranks 9/10/102/498 under unmarked near-duplicates. The baseline explicitly excluded dedup ("Dedup was NOT run before measurement").
Task: run recall dedup --execute over the seeded benchmark corpora, re-run Suite C, and diff against benchmarks/results/2026-06-11T09-36-53-suite-C.jsonl. The diff is dedup's efficacy report — and the empirical evidence the parked entity-keying gate (#49) is waiting on.
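A minimal sketch of the diff step, assuming each line of a Suite C results file is a JSON object carrying `corpus_size` and `mrr` fields. Those field names and the helper names (`parse_mrr`, `diff_mrr`, `load`) are hypothetical and not taken from the actual results schema; adjust them to whatever the committed baseline file contains.

```python
import json


def parse_mrr(lines):
    """Return {corpus_size: mrr} from JSONL lines (field names are assumed)."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        result[record["corpus_size"]] = record["mrr"]
    return result


def diff_mrr(baseline, after):
    """Compare two {corpus_size: mrr} maps.

    Returns (rows, missing):
      rows    - (size, baseline_mrr, after_mrr, delta) for sizes in both runs
      missing - sizes present in only one run, reported instead of zero-filled
    """
    shared = sorted(baseline.keys() & after.keys())
    rows = [
        (size, baseline[size], after[size], after[size] - baseline[size])
        for size in shared
    ]
    missing = sorted(baseline.keys() ^ after.keys())
    return rows, missing


def load(path):
    """Read a results file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_mrr(handle)
```

Sizes that appear in only one run are returned separately rather than defaulted to 0.0, so a missing measurement cannot be mistaken for a real MRR of zero.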
Blocked by: #70 (silent-zero hardening — REQUIRED before any baseline re-record) and #63 (cross-run safety fix should land first so measured behavior is final behavior).
Methodology note: keep baseline-first discipline — record honest numbers, no invented thresholds; document the dedup invocation in the run manifest so the comparison is reproducible.
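One way the dedup invocation could be captured in the run manifest is sketched below. The manifest keys and the `build_manifest` helper are assumptions for illustration only; the project's real manifest format may differ.

```python
import json
from datetime import datetime, timezone


def build_manifest(dedup_argv, baseline_path, corpus_sizes):
    """Assemble a manifest dict describing how the re-run was produced.

    All key names here are hypothetical placeholders.
    """
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "dedup_invocation": list(dedup_argv),  # exact argv, for reproducibility
        "baseline": baseline_path,             # file the new run is diffed against
        "corpus_sizes": list(corpus_sizes),
    }


manifest = build_manifest(
    ["recall", "dedup", "--execute"],
    "benchmarks/results/2026-06-11T09-36-53-suite-C.jsonl",
    [100, 1_000, 10_000, 100_000],
)
manifest_json = json.dumps(manifest, indent=2)
```

Storing the argv as a list rather than a shell string avoids quoting ambiguity when someone replays the run.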