Add 'Size code thresholds from in-codebase distributions' to base/core/quality.md

Pattern from wuseria S147 (#1122) — small but valuable addition to quality conventions.

## Pattern

When introducing a numeric threshold to a filter (cutoff, bound, tolerance), measure the actual distribution of the filtered quantity in the working data set (p99, p95, max-legitimate) before picking the threshold. Comment the data source in the constant's docstring so future maintainers can re-validate the threshold against fresh data without re-running the analysis from scratch.

The alternative — picking thresholds from first principles or "feels right" — produces thresholds that mask future bugs by being too loose or break edge cases by being too tight, with no record of how the number was chosen.

## Concrete example

wuseria/me-fuji#1122 added `_MAX_RUN_LENGTH = 20` to filter vertical chrome out of ridge candidates. The threshold was sized from the reference-set column-run histogram:

```
median=2 px, p95=3 px, p99=13 px, max legitimate ~17 px
chrome runs measured: 50 px and 265 px
```

20 admits any real curve ridge (slack above p99=13) but drops chrome (the smallest chrome run is 2.5× the threshold). Calibration validated: only one chart changed, by losing a previously out-of-tolerance reading.

The docstring on the constant records the data source so a future maintainer can rerun the histogram against new reference data and re-validate:

```python
# Real MTF curve ridges measured across the reference set fit within ~3 px
# at p95 and ~13 px at p99; runs >20 px are chrome.
_MAX_RUN_LENGTH: int = 20
```

## Why this belongs upstream

`templates/base/core/quality.md` §"Magic numbers and magic strings must be named constants" is one half — the other half is *how* the number gets picked. The named-constant rule prevents scattered literals; the distribution-sized rule prevents arbitrary thresholds masquerading as principled ones.

## Proposed wording

Add to `templates/base/core/quality.md` under the existing "Magic numbers" item or as a sibling item:

> - When introducing a numeric threshold to a filter, cutoff, or bound, size it from the actual distribution of the relevant quantity in the working data set (p95, p99, max-legitimate) — not from first principles or intuition. Document the data source in the constant's docstring so a future maintainer can rerun the analysis against fresh data and re-validate the threshold.

## Related to upstream-flag in #468

Companion to #468 ("probe before trusting issue body's mechanism"). Both came from the same session's mtfdigitizer work; both are about grounding decisions in measured runtime data instead of inferred premises.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add 'Size code thresholds from in-codebase distributions' to base/core/quality.md #469

Pattern

Concrete example

Why this belongs upstream

Proposed wording

Related to upstream-flag in #468

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Add 'Size code thresholds from in-codebase distributions' to base/core/quality.md #469

Description

Pattern

Concrete example

Why this belongs upstream

Proposed wording

Related to upstream-flag in #468

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions