The CRAP score tier system in crates/cli/src/health/scoring.rs and crates/cli/src/health_types/finding.rs has three structural rough edges that show up as user-visible inconsistency in finding outputs.
1. Tier precedence is implicit in control flow
scoring.rs:1352-1404 resolves which coverage tier applies to a given function:
- Line 1352: Angular template inheritance context present ->
EstimatedComponentInherited.
- Line 1372: Istanbul data loaded -> try Istanbul match; if function unmatched, fall through silently to estimated tiers.
- Line 1393: else ->
StaticEstimated.
There is no explicit enum CoverageSource priority order documented in code. A reader has to follow the control flow to understand which tier wins when.
The more concrete user-impact: when Istanbul is loaded but only matches 80% of functions, the remaining 20% silently fall back to estimated tiers. The CRAP-finding action selection in report::json::build_health_finding_actions then emits different action verbs (add-tests for estimated tiers, increase-coverage for partial/high) for functions in the same file. The user sees mixed advice and cannot tell from the output that the coverage source shifted mid-file.
2. SECONDARY_REFACTOR_BAND is a magic constant
crates/cli/src/health_types/finding.rs:48:
const SECONDARY_REFACTOR_BAND: u16 = 5;
Used at finding.rs:267-273 to trigger a secondary refactor-function action on CRAP-only findings whose cyclomatic is within 5 points of the max threshold AND whose cognitive complexity is at least half the cognitive threshold. The rationale (suppress false positives on flat type-tag dispatchers per .claude/rules/cli-crate.md) lives only in CLAUDE.md, not next to the constant.
Three issues:
- No unit test asserts the band heuristic's behavior on a fixture.
- Not user-configurable, so teams with stricter standards ("extract if within 3 of threshold") cannot adjust.
- Lives in
health_types rather than config/health.rs alongside other thresholds.
3. Coverage tier is invisible in the human-output sections users actually read
crates/cli/src/report/human/health.rs:1102 already emits a tier-aware hint:
CRAP estimated from export references (85% direct, 40% indirect, 0% untested). Use --coverage for exact scores.
But this string is only rendered in the --file-scores section. The two sections that surface CRAP values most prominently when running bare fallow (the default combined pipeline) carry zero tier context:
- The orientation header at
crates/cli/src/report/human/health.rs:635 lists LOC, cyclomatic, MI, churn, unused deps but never says "no coverage data; CRAP scores are estimated."
- The
● High complexity functions section header at crates/cli/src/report/human/health.rs:843 and its per-finding CRAP value rendering at line 917 ({crap:>5.1} CRAP) prints values like 650.0 CRAP with no indication of whether the score came from real Istanbul coverage or the estimated model.
The on-disk artifact already knows: coverage_source is on every finding (scoring.rs:443 sets it via CoverageTier::from_pct(estimated_coverage)), and the report-level summary.coverage_model records the source. The signal is there; the human renderer just does not surface it where users are looking.
Fix surface options:
- (a) one-line hint above
● High complexity functions when coverage_model == Estimated: "CRAP scores estimated from export references. Pass --coverage <path> for exact scores."
- (b) per-finding suffix on the CRAP line:
650.0 CRAP (estimated) / 12.0 CRAP (Istanbul) / 47.0 CRAP (Istanbul partial)
- (c) extend the orientation header
■ Metrics: line at health.rs:635 with a coverage: estimated / coverage: Istanbul (854/1043 matched) segment when complexity is in the pipeline.
(a) is cheapest and high-signal; (b) is most informative per-finding; (c) is the most discoverable but also the most space-constrained.
The same gap shows up via MCP and JSON: consumers reading findings[].crap cannot route advice without also reading findings[].coverage_source. Documenting the precedence (Part 1) and surfacing it in human output (this Part 3) reinforce each other.
Scope
Part A: explicit tier precedence
-
Replace the implicit if/else chain with an explicit enum CoverageSource { TemplateInherited, IstanbulHigh, IstanbulPartial, IstanbulMissing, StaticEstimated } and a single resolver function that returns the source on From<...>.
-
Per file, track whether all functions resolved to the same tier or diverged. Emit a top-level finding metadata field coverage_source_consistency: "uniform" | "mixed" so downstream consumers (dashboards, AI agents) can detect the silent shift and surface a hint.
-
Document the precedence in the resolver function's doc-comment with a one-line "why" per tier.
Part B: configurable threshold
- Move
SECONDARY_REFACTOR_BAND to crates/config/src/config/health.rs as a documented field with a sensible default.
- Add a unit test that constructs a finding at the band boundary and asserts the secondary action fires.
- Expose via
.fallowrc.json so users can tune; default unchanged.
Part C: surface the coverage tier in human output
-
Add a single-line tier note above the ● High complexity functions section in crates/cli/src/report/human/health.rs whenever report.summary.coverage_model is Estimated (and a less prominent variant for Istanbul so consistency is signalled both ways).
-
Optionally extend the orientation header at health.rs:635 with a coverage: estimated | Istanbul (m/t matched) segment when the complexity sub-pipeline ran.
-
Surface the Part-A coverage_source_consistency value in the same hint when mixed, so users see "CRAP scores: Istanbul where matched, estimated for 21% of functions" instead of silent divergence within one file.
-
Snapshot tests for the new line against fixtures with and without --coverage.
-
Rewrite the existing tier-aware hint at crates/cli/src/report/human/health.rs:1102. Today it reads "CRAP estimated from export references (85% direct, 40% indirect, 0% untested). Use --coverage for exact scores.". Users who follow that hint type either fallow --coverage (gets unexpected argument --coverage plus a confused tip: a similar argument exists: '--tolerance') or fallow health --coverage (gets a value is required for '--coverage <PATH>' but none was supplied). The hint must spell out the subcommand context AND the required path argument:
Run `fallow health --coverage <coverage-final.json>` for exact scores.
Same edit applies to any docs page mirroring the hint (e.g. https://docs.fallow.tools/explanations/health#file-health-scores); the prose should match the CLI invocation users need to type.
-
Add a custom clap UnknownArgument handler for top-level --coverage. Clap's default ranking suggests --tolerance because it is the alphabetically nearest top-level long arg, which is irrelevant. The handler should detect --coverage at the top level (where neither the coverage subcommand nor any --coverage flag exists) and route to a targeted hint:
error: --coverage is a flag on the `health` subcommand, not a top-level flag.
tip: run `fallow health --coverage <coverage-final.json>` for exact CRAP scores, or `fallow coverage setup` to configure the runtime-coverage sidecar.
Two confounders justify the custom handler: coverage is also a subcommand (fallow coverage), and --coverage <PATH> is health-scoped. The handler disambiguates both cases in one message instead of letting clap fall back to distance-ranking against --tolerance.
Why this matters
- Mixed-tier action verbs in the same file erode user trust in fallow's output. Either uniformly use one source per file (with a fallback hint), or expose the divergence as metadata.
- Magic constants without tests + without config knobs are the canonical example of "numerology" that project memory
project_eslint_overlap_not_disqualifier warns against.
- A first-time fallow user looking at
650.0 CRAP on a complex function has no way to know that number is based on "I have no idea how well this is tested." That is precisely the kind of opaque metric that erodes trust; Part C closes the gap by making the input source visible at the point of consumption, not hidden in --file-scores.
Out of scope
- Reworking the CRAP formula itself. Savoia & Evans (2007) is the literature anchor; fallow inherits it intentionally.
- Coverage tier watermarks (
HIGH_COVERAGE_WATERMARK = 70% at scores.rs:186). Worth surfacing as configurable separately if requested by users.
Verification
- New unit test on tier precedence.
- New unit test on configurable band.
- New snapshot tests on the human-output tier hint (no-coverage and Istanbul cases).
- Existing CRAP scoring tests continue to pass.
- Manually inspect a fixture with mixed Istanbul match: confirm the new metadata field surfaces AND the human-output hint mentions the divergence.
The CRAP score tier system in
crates/cli/src/health/scoring.rsandcrates/cli/src/health_types/finding.rshas three structural rough edges that show up as user-visible inconsistency in finding outputs.1. Tier precedence is implicit in control flow
scoring.rs:1352-1404resolves which coverage tier applies to a given function:EstimatedComponentInherited.StaticEstimated.There is no explicit
enum CoverageSourcepriority order documented in code. A reader has to follow the control flow to understand which tier wins when.The more concrete user-impact: when Istanbul is loaded but only matches 80% of functions, the remaining 20% silently fall back to estimated tiers. The CRAP-finding action selection in
report::json::build_health_finding_actionsthen emits different action verbs (add-testsfor estimated tiers,increase-coveragefor partial/high) for functions in the same file. The user sees mixed advice and cannot tell from the output that the coverage source shifted mid-file.2. SECONDARY_REFACTOR_BAND is a magic constant
crates/cli/src/health_types/finding.rs:48:Used at
finding.rs:267-273to trigger a secondaryrefactor-functionaction on CRAP-only findings whose cyclomatic is within 5 points of the max threshold AND whose cognitive complexity is at least half the cognitive threshold. The rationale (suppress false positives on flat type-tag dispatchers per.claude/rules/cli-crate.md) lives only in CLAUDE.md, not next to the constant.Three issues:
health_typesrather thanconfig/health.rsalongside other thresholds.3. Coverage tier is invisible in the human-output sections users actually read
crates/cli/src/report/human/health.rs:1102already emits a tier-aware hint:But this string is only rendered in the
--file-scoressection. The two sections that surface CRAP values most prominently when running barefallow(the default combined pipeline) carry zero tier context:crates/cli/src/report/human/health.rs:635lists LOC, cyclomatic, MI, churn, unused deps but never says "no coverage data; CRAP scores are estimated."● High complexity functionssection header atcrates/cli/src/report/human/health.rs:843and its per-finding CRAP value rendering at line 917 ({crap:>5.1} CRAP) prints values like650.0 CRAPwith no indication of whether the score came from real Istanbul coverage or the estimated model.The on-disk artifact already knows:
coverage_sourceis on every finding (scoring.rs:443sets it viaCoverageTier::from_pct(estimated_coverage)), and the report-levelsummary.coverage_modelrecords the source. The signal is there; the human renderer just does not surface it where users are looking.Fix surface options:
● High complexity functionswhencoverage_model == Estimated: "CRAP scores estimated from export references. Pass--coverage <path>for exact scores."650.0 CRAP (estimated)/12.0 CRAP (Istanbul)/47.0 CRAP (Istanbul partial)■ Metrics:line athealth.rs:635with acoverage: estimated/coverage: Istanbul (854/1043 matched)segment when complexity is in the pipeline.(a) is cheapest and high-signal; (b) is most informative per-finding; (c) is the most discoverable but also the most space-constrained.
The same gap shows up via MCP and JSON: consumers reading
findings[].crapcannot route advice without also readingfindings[].coverage_source. Documenting the precedence (Part 1) and surfacing it in human output (this Part 3) reinforce each other.Scope
Part A: explicit tier precedence
Replace the implicit if/else chain with an explicit
enum CoverageSource { TemplateInherited, IstanbulHigh, IstanbulPartial, IstanbulMissing, StaticEstimated }and a single resolver function that returns the source onFrom<...>.Per file, track whether all functions resolved to the same tier or diverged. Emit a top-level finding metadata field
coverage_source_consistency: "uniform" | "mixed"so downstream consumers (dashboards, AI agents) can detect the silent shift and surface a hint.Document the precedence in the resolver function's doc-comment with a one-line "why" per tier.
Part B: configurable threshold
SECONDARY_REFACTOR_BANDtocrates/config/src/config/health.rsas a documented field with a sensible default..fallowrc.jsonso users can tune; default unchanged.Part C: surface the coverage tier in human output
Add a single-line tier note above the
● High complexity functionssection incrates/cli/src/report/human/health.rswheneverreport.summary.coverage_modelisEstimated(and a less prominent variant forIstanbulso consistency is signalled both ways).Optionally extend the orientation header at
health.rs:635with acoverage: estimated | Istanbul (m/t matched)segment when the complexity sub-pipeline ran.Surface the Part-A
coverage_source_consistencyvalue in the same hint whenmixed, so users see "CRAP scores: Istanbul where matched, estimated for 21% of functions" instead of silent divergence within one file.Snapshot tests for the new line against fixtures with and without
--coverage.Rewrite the existing tier-aware hint at
crates/cli/src/report/human/health.rs:1102. Today it reads"CRAP estimated from export references (85% direct, 40% indirect, 0% untested). Use --coverage for exact scores.". Users who follow that hint type eitherfallow --coverage(getsunexpected argument --coverageplus a confusedtip: a similar argument exists: '--tolerance') orfallow health --coverage(getsa value is required for '--coverage <PATH>' but none was supplied). The hint must spell out the subcommand context AND the required path argument:Same edit applies to any docs page mirroring the hint (e.g.
https://docs.fallow.tools/explanations/health#file-health-scores); the prose should match the CLI invocation users need to type.Add a custom clap
UnknownArgumenthandler for top-level--coverage. Clap's default ranking suggests--tolerancebecause it is the alphabetically nearest top-level long arg, which is irrelevant. The handler should detect--coverageat the top level (where neither thecoveragesubcommand nor any--coverageflag exists) and route to a targeted hint:Two confounders justify the custom handler:
coverageis also a subcommand (fallow coverage), and--coverage <PATH>ishealth-scoped. The handler disambiguates both cases in one message instead of letting clap fall back to distance-ranking against--tolerance.Why this matters
project_eslint_overlap_not_disqualifierwarns against.650.0 CRAPon a complex function has no way to know that number is based on "I have no idea how well this is tested." That is precisely the kind of opaque metric that erodes trust; Part C closes the gap by making the input source visible at the point of consumption, not hidden in--file-scores.Out of scope
HIGH_COVERAGE_WATERMARK = 70%atscores.rs:186). Worth surfacing as configurable separately if requested by users.Verification