Problem

`normal`, `flag`, and `ignore` in `projects.json` are the calibration layer that makes general prompts project-specific. But:
- There's no tooling to validate whether these fields are still accurate
- There's no visibility into which `ignore` patterns are actively firing vs. obsolete
- There's no signal when `flag` criteria produce only noise, or nothing at all
A project can silently drift into one of two bad states:
- Over-flagging: `flag` criteria are too broad → the scan files non-actionable issues every run → the issue queue fills with noise → backpressure kicks in → legitimate issues stop being filed
- Under-flagging: `flag` criteria are too narrow or `ignore` patterns are too aggressive → real problems go undetected → the scan appears to be working but produces no output
Neither state is detectable without manually inspecting the issue queue and comparing it to the codebase.
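For concreteness, a hypothetical `projects.json` entry with this calibration layer might look like the following. The project name and the field contents are purely illustrative; only the `normal`/`flag`/`ignore` field names come from the issue above.

```json
{
  "example-project": {
    "normal": ["TODO comments in test fixtures"],
    "flag": ["unhandled promise rejections in request handlers"],
    "ignore": ["vendored code under third_party/"]
  }
}
```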
What's needed
Calibration audit tooling:
- Ignore pattern hit rates: Track how often each `ignore` entry suppresses a finding across runs. An ignore pattern that never fires is dead weight; one that fires 100% of the time may indicate the underlying condition is now always present and should be addressed.
- Flag criteria yield rates: Track what fraction of scan runs produce findings that pass triage and become posted issues. A yield rate of 0% over 4+ weeks suggests `flag` criteria are miscalibrated or the project genuinely has no issues in this category.
- Staleness warnings: If a scan type produces zero posted issues for N consecutive runs, surface a warning (log line, comment on a calibration issue, or a dedicated `scan:calibration` label on a filed issue).
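A minimal sketch of how the three audits above could be computed from per-run records. The `ScanRun` shape and function names are hypothetical illustrations, not an existing schema in this project:

```python
from dataclasses import dataclass

@dataclass
class ScanRun:
    scan_type: str
    ignore_hits: dict   # ignore pattern -> number of findings it suppressed this run
    findings: int       # findings produced before triage
    posted: int         # findings that passed triage and became posted issues

def ignore_hit_rates(runs):
    """Fraction of runs in which each ignore pattern fired at least once."""
    totals = {}
    for run in runs:
        for pattern, hits in run.ignore_hits.items():
            fired, seen = totals.get(pattern, (0, 0))
            totals[pattern] = (fired + (1 if hits else 0), seen + 1)
    return {p: fired / seen for p, (fired, seen) in totals.items()}

def yield_rate(runs):
    """Fraction of runs that produced at least one posted issue."""
    if not runs:
        return 0.0
    return sum(1 for r in runs if r.posted > 0) / len(runs)

def staleness_warning(runs, n):
    """True if the last n runs all posted zero issues (candidates for a warning)."""
    recent = runs[-n:]
    return len(recent) == n and all(r.posted == 0 for r in recent)
```

A pattern with a hit rate of 0.0 is dead weight; a rate of 1.0 flags an always-present condition, matching the thresholds described above.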
Related
The retrospective scan (`prompts/agency/history/scans.md`) already analyzes cross-run patterns. Calibration health could be an output of the retrospective rather than a separate tool.
Definition of Done

- `calibration_warnings[]` field
- `scope:config` with the calibration audit as the body

Out of Scope