benchmarks: re-run Suite C post-dedup and diff against the 2026-06-11 baseline #78

**The roadmap's outcome loop, capstone of this milestone.**

The committed 2026-06-11 Suite C baseline records exact-lookup MRR (mean reciprocal rank) collapsing 1.0 → 0.75 → 0.13 → 0.00 across the 100/1k/10k/100k corpus ladder, with true targets buried at ranks 9/10/102/498 under unmarked near-duplicates. The baseline explicitly excluded dedup ("Dedup was NOT run before measurement").
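For readers unfamiliar with the metric, here is a minimal sketch of the standard MRR definition. It is not Suite C's own implementation, which this issue does not show. The function name and the choice to score a missed target as 0 are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Standard MRR: the mean of 1/rank over all queries.

    `ranks` holds the 1-based rank of the true target for each query.
    None means the target was not retrieved at all; scoring that as 0
    is an assumption of this sketch, not a documented Suite C rule.
    """
    if not ranks:
        raise ValueError("need at least one query to compute MRR")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue  # miss contributes 0 to the sum
        if rank < 1:
            raise ValueError(f"ranks are 1-based, got {rank}")
        total += 1.0 / rank
    return total / len(ranks)
```

Under this definition, a true target at rank 498 contributes 1/498 ≈ 0.002 to the sum, which rounds to 0.00 at two decimal places.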

**Task:** run `recall dedup --execute` over the seeded benchmark corpora, re-run Suite C, and diff against `benchmarks/results/2026-06-11T09-36-53-suite-C.jsonl`. The diff is dedup's efficacy report — and the empirical evidence the parked entity-keying gate (#49) is waiting on.
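The diff step could be sketched roughly as below. The schema of the Suite C JSONL is not shown in this issue, so the field names `corpus_size` and `mrr` are hypothetical placeholders, as are the function names. The sketch raises on missing fields or mismatched corpus sizes rather than defaulting to zero, in the spirit of the silent-zero hardening this issue is blocked on.

```python
import json
from typing import Dict, Iterable, Tuple


def load_mrr(lines: Iterable[str],
             size_field: str = "corpus_size",   # hypothetical field name
             metric_field: str = "mrr") -> Dict[int, float]:  # hypothetical
    """Map corpus size -> MRR from JSONL lines, failing loudly on gaps."""
    result: Dict[int, float] = {}
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        if size_field not in row or metric_field not in row:
            raise KeyError(f"line {lineno}: missing {size_field!r} or {metric_field!r}")
        result[int(row[size_field])] = float(row[metric_field])
    return result


def diff_mrr(baseline: Dict[int, float],
             post: Dict[int, float]) -> Dict[int, Tuple[float, float, float]]:
    """Return {corpus_size: (baseline, post_dedup, delta)} for each size."""
    if baseline.keys() != post.keys():
        raise ValueError(
            f"corpus sizes differ: {sorted(baseline)} vs {sorted(post)}")
    return {size: (baseline[size], post[size], post[size] - baseline[size])
            for size in sorted(baseline)}
```

To use it, open `benchmarks/results/2026-06-11T09-36-53-suite-C.jsonl` and the post-dedup results file, pass each file object to `load_mrr`, and hand the two mappings to `diff_mrr`.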

**Blocked by:** #70 (silent-zero hardening — REQUIRED before any baseline re-record) and #63 (cross-run safety fix should land first so measured behavior is final behavior).

**Methodology note:** keep baseline-first discipline: record honest numbers, set no invented thresholds, and document the dedup invocation in the run manifest so the comparison is reproducible.
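One way to record the dedup invocation is sketched below. The run manifest's real format is not shown in this issue, so `build_manifest` and every field name here are hypothetical. Only the command `recall dedup --execute` and the baseline path come from the task description above.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_manifest(dedup_command: List[str],
                   baseline_path: str,
                   suite: str = "C") -> Dict[str, Any]:
    """Build a manifest entry recording how the post-dedup run was produced.

    All keys are illustrative; adapt them to the project's actual
    manifest schema.
    """
    return {
        "suite": suite,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "dedup_ran": True,
        # Exact argv, so the run can be repeated verbatim.
        "dedup_command": list(dedup_command),
        "compared_against": baseline_path,
    }


manifest = build_manifest(
    ["recall", "dedup", "--execute"],
    "benchmarks/results/2026-06-11T09-36-53-suite-C.jsonl",
)
print(json.dumps(manifest, indent=2))
```

Storing the argv as a list rather than a shell string avoids quoting ambiguity if arguments are added to the invocation later.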
