benchmarks: Suite C silent-zero hardening — required before any baseline re-record #70

@edheltzel

Source: PR #64 review (the substantive non-blocking finding). Flagged as REQUIRED before any baseline re-record — including the post-dedup re-run that closes the roadmap's outcome loop.

The Suite C harness never consults getLastSearchErrors() and no test asserts a nonzero score anywhere, so a broken harness (or a search path silently erroring) records an all-zero "baseline" indistinguishable from genuine retrieval collapse.

Fix sketch: after each query batch, check getLastSearchErrors() and fail the run loudly on swallowed errors; add one canary assertion that the smallest corpus produces a nonzero aggregate score. Then re-run Suite C post-dedup and diff against the committed 2026-06-11 baseline — that diff is the dedup efficacy report.
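A minimal sketch of the two checks described above, not the actual harness code. Only the name `getLastSearchErrors()` comes from this issue; its return type (a list of messages, empty when nothing was swallowed), the `SearchBackend` interface, `runBatch`, and `assertCanaryNonzero` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical backend shape. Only the name getLastSearchErrors() appears in
// the issue; the signatures below are assumptions made for this sketch.
interface SearchBackend {
  search(query: string): number[];   // assumed: relevance scores per result
  getLastSearchErrors(): string[];   // assumed: empty when nothing was swallowed
}

// Runs one query batch and returns its summed score. Throws if the backend
// reports swallowed errors, so a broken search path cannot record zeros.
function runBatch(backend: SearchBackend, queries: string[]): number {
  let total = 0;
  for (const query of queries) {
    for (const score of backend.search(query)) {
      total += score;
    }
  }
  const errors = backend.getLastSearchErrors();
  if (errors.length > 0) {
    throw new Error(
      `Suite C batch swallowed ${errors.length} search error(s): ${errors.join("; ")}`
    );
  }
  return total;
}

// Canary: the smallest corpus must produce a nonzero aggregate score.
// The negated comparison also rejects NaN, not only zero.
function assertCanaryNonzero(aggregateScore: number): void {
  if (!(aggregateScore > 0)) {
    throw new Error(
      `Canary failed: smallest-corpus aggregate score is ${aggregateScore}`
    );
  }
}
```

Throwing rather than logging is deliberate here: a run that aborts cannot be committed as a baseline by accident. Whether the real `getLastSearchErrors()` accumulates across a batch or resets per query would decide whether the check belongs after the batch (as sketched) or inside the loop.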

Labels: enhancement (New feature or request), needs-triage (Maintainer needs to evaluate this issue)
