fix: silence false-positive --explain hints and preview selected FQNs at lifecycle level #29

Merged
vedanthvdev merged 1 commit into master from fix/hint-noise-and-lifecycle-fqn-tail
Apr 22, 2026

@vedanthvdev
Owner

Summary

Batch-3 of the post-v1.9.17 sanity-test fixes. Two user-visible
changes, one latent-bug cleanup that falls out of the refactor.

1. --explain hint gating

Running v1.9.17 across nine representative MR shapes on the
security-service pilot showed the line
"Hint: outOfScopeTestDirs is configured but no file in the diff matched any configured pattern"
firing on five of them, including a markdown-only MR whose diff
contained nothing a source-tree matcher could have matched.
That 5-of-9 false-positive rate trains reviewers to ignore the hint,
which is exactly the anti-pattern it was added in v1.9.14 to prevent.

The hint is now gated on Situation.DISCOVERY_SUCCESS or
Situation.DISCOVERY_EMPTY — the only two branches where a
correctly-configured out-of-scope pattern could have changed the
outcome. The other four situations suppress the hint:

| Situation              | Why it can't be silent                                                |
|------------------------|-----------------------------------------------------------------------|
| EMPTY_DIFF             | No files to match                                                     |
| ALL_FILES_IGNORED      | Ignore wins before out-of-scope ever looks                            |
| ALL_FILES_OUT_OF_SCOPE | Bucket is non-empty (guard was already there, gate makes it explicit) |
| UNMAPPED_FILE          | File missed every pattern, including out-of-scope                     |

Each negative case has an explicit test so the gate can't drift.
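
For readers skimming the gate, here is a minimal sketch of the rule described above. It is not the plugin's actual code: only the six Situation names come from this PR, while the class name HintGateSketch, the method shouldEmitHint, and its parameters are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

final class HintGateSketch {

    /** The six situations named in this PR; the real enum's shape is assumed. */
    enum Situation {
        EMPTY_DIFF,
        ALL_FILES_IGNORED,
        ALL_FILES_OUT_OF_SCOPE,
        UNMAPPED_FILE,
        DISCOVERY_SUCCESS,
        DISCOVERY_EMPTY
    }

    // The only two branches where a correctly configured out-of-scope
    // pattern could have changed the outcome.
    private static final Set<Situation> HINT_ELIGIBLE =
            EnumSet.of(Situation.DISCOVERY_SUCCESS, Situation.DISCOVERY_EMPTY);

    /**
     * Hypothetical gate: emit the hint only when patterns are configured,
     * no file in the diff matched them, and the situation is eligible.
     */
    static boolean shouldEmitHint(Situation situation,
                                  boolean patternsConfigured,
                                  int outOfScopeMatches) {
        return patternsConfigured
                && outOfScopeMatches == 0
                && HINT_ELIGIBLE.contains(situation);
    }

    public static void main(String[] args) {
        // Print the gate's decision for every situation with patterns
        // configured and zero matches.
        for (Situation s : Situation.values()) {
            System.out.println(s + " -> " + shouldEmitHint(s, true, 0));
        }
    }
}
```

Using an EnumSet allow-list rather than four separate negative checks means a Situation added later stays suppressed until someone opts it in, which is the same drift protection the negative tests aim for.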

2. Lifecycle FQN preview for SELECTED dispatches

Before v1.9.18, the task printed only the module summary at lifecycle
level, e.g. "application:test (47 test classes)", and demoted every FQN
to info. A reviewer scrolling the default CI log could see the dispatch
size but could not tell which tests ran without either re-running with
--info (slow) or opening the JUnit report after the fact.

renderLifecycleDispatchPreview is a new pure helper that emits the
first 5 FQNs per module at lifecycle level with a
… and N more (use --info for full list) tail on larger dispatches.
The 5-class cap keeps total lifecycle output well under the 4 MiB
GitHub Actions step cap that forced the info-level demotion in the
first place. Info-level per-FQN logging is retained for --info
users.

Example lifecycle output (17-class dispatch):

Running 17 affected test classes across 1 module(s):
  application:test (17 test classes)
    com.example.AuthenticatorTest
    com.example.TokenValidatorTest
    com.example.SessionStoreTest
    com.example.PasswordResetTest
    com.example.MfaChallengeTest
    … and 12 more (use --info for full list)
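
A rough sketch of how a pure preview helper could produce the output above. The name renderLifecycleDispatchPreview and the 5-FQN cap come from this PR; the signature, the per-module shape, the indentation widths, and the behavior for an empty list (summary line only) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DispatchPreviewSketch {

    // Cap taken from the PR description: first 5 FQNs per module.
    static final int PREVIEW_LIMIT = 5;

    /**
     * Assumed signature: renders one module's summary line, up to
     * PREVIEW_LIMIT FQNs, and a tail line when more FQNs remain.
     * Pure function: no logger, no Gradle types.
     */
    static List<String> renderLifecycleDispatchPreview(String taskPath, List<String> fqns) {
        List<String> lines = new ArrayList<>();
        String noun = fqns.size() == 1 ? "test class" : "test classes";
        lines.add("  " + taskPath + " (" + fqns.size() + " " + noun + ")");

        int shown = Math.min(PREVIEW_LIMIT, fqns.size());
        for (String fqn : fqns.subList(0, shown)) {
            lines.add("    " + fqn);
        }

        int remaining = fqns.size() - shown;
        if (remaining > 0) {
            // \u2026 is the horizontal ellipsis used in the tail line.
            lines.add("    \u2026 and " + remaining + " more (use --info for full list)");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Build a 17-class dispatch with placeholder names and print its preview.
        List<String> fqns = new ArrayList<>();
        for (int i = 1; i <= 17; i++) {
            fqns.add("com.example.Sample" + i + "Test");
        }
        renderLifecycleDispatchPreview("application:test", fqns).forEach(System.out::println);
    }
}
```

Returning lines instead of logging them is what lets the format be pinned in a plain unit test without a Gradle runtime.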

3. Header arithmetic fix + silent-fallback closure

Hoisting FQN validation out of the per-module log loop to feed the
lifecycle preview also fixed two latent issues:

  • Header count inconsistency. The Running N affected test classes
    header used the pre-validation count while per-module counts were
    post-validation. Skipped FQNs now surface in a
    (K malformed FQN skipped — see WARN above) suffix.
  • Silent full-module fallback. A module whose entire discovered
    FQN set failed isValidFqn used to dispatch its bare taskPath
    with no --tests filter, silently escalating to a full module
    suite — the exact safety regression isValidFqn was added to
    prevent. Modules with zero valid FQNs are now dropped from dispatch,
    and a cross-all-modules zero-survivors dispatch throws
    GradleException instead of invoking Gradle with an empty task
    list. CHANGELOG calls both behavior deltas out for adopters.
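
The hoisted validation can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's code: the regex behind isValidFqn is a hypothetical stand-in for the real check, ValidatedDispatch and validate are invented names, and IllegalStateException stands in for GradleException so the sketch compiles without Gradle on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class DispatchValidationSketch {

    // Hypothetical stand-in for the plugin's isValidFqn:
    // dot-separated Java identifiers, nothing else.
    private static final Pattern FQN =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*(\\.[A-Za-z_$][A-Za-z0-9_$]*)*");

    static boolean isValidFqn(String candidate) {
        return candidate != null && FQN.matcher(candidate).matches();
    }

    /** Surviving FQNs per task path, plus counts for the header line. */
    static final class ValidatedDispatch {
        final Map<String, List<String>> fqnsByTaskPath;
        final int validCount;
        final int skippedCount;

        ValidatedDispatch(Map<String, List<String>> fqnsByTaskPath, int validCount, int skippedCount) {
            this.fqnsByTaskPath = fqnsByTaskPath;
            this.validCount = validCount;
            this.skippedCount = skippedCount;
        }
    }

    /** Validates once, up front, so the header and per-module counts agree. */
    static ValidatedDispatch validate(Map<String, List<String>> discovered) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        int valid = 0;
        int skipped = 0;
        for (Map.Entry<String, List<String>> entry : discovered.entrySet()) {
            List<String> survivors = new ArrayList<>();
            for (String fqn : entry.getValue()) {
                if (isValidFqn(fqn)) {
                    survivors.add(fqn);
                } else {
                    skipped++;
                }
            }
            // A module with zero valid FQNs is dropped rather than dispatched
            // as a bare task path, which would run its full suite.
            if (!survivors.isEmpty()) {
                kept.put(entry.getKey(), survivors);
                valid += survivors.size();
            }
        }
        if (kept.isEmpty()) {
            // Stand-in for GradleException: refuse to dispatch an empty task list.
            throw new IllegalStateException("No valid test FQNs survived validation");
        }
        return new ValidatedDispatch(kept, valid, skipped);
    }
}
```

With validation done in one place, the header can report validCount and mention skippedCount in its suffix, while each module's line reports the size of its surviving list.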

Test plan

  • ./gradlew check — green
  • 5 new negative hint-gate tests in AffectedTestTaskExplainFormatTest (one per suppressed situation, plus an explicit ALL_FILES_OUT_OF_SCOPE pin)
  • 1 new positive hint-gate test covering the DISCOVERY_EMPTY full-suite escalation path
  • 5 new dispatch-preview tests (empty, singleton, exactly-at-limit, oversized, 200-FQN stress) covering format, pluralization, ordering, and tail
  • Sanity-retested on security-service against 9 MR shapes; the hint is now silent on the 4 previously false-positive cases and still fires on the silent-misconfig scenario it was designed to catch; the preview renders with the expected tail
  • Code-reviewer sweep (must-fix: none; should-fix: added empty-validation GradleException guard + CHANGELOG bullet for the silent-fallback behavior change; nits: FQN pluralization paper-cut fixed)

…in hints and preview selected FQNs at lifecycle level

Two fixes from the v1.9.17 sanity-test pass on the security-service
pilot, plus an upstream refactor that falls out of tightening the
second one.

The --explain "Hint: outOfScopeTestDirs is configured but no files
matched" line fired on 5 of 9 representative MR shapes — including
markdown-only diffs and empty diffs where no source-tree matcher could
have bitten. The false-positive rate trained reviewers to ignore the
line, which defeated its diagnostic purpose. The hint now requires the
resolved Situation to be DISCOVERY_SUCCESS or DISCOVERY_EMPTY (the only
two branches where a correctly-configured out-of-scope pattern could
have changed the outcome). The four suppressed situations —
EMPTY_DIFF, ALL_FILES_IGNORED, ALL_FILES_OUT_OF_SCOPE, UNMAPPED_FILE —
each have an explicit negative test so the gate can't silently drift.

SELECTED dispatches previously printed only the module summary at
lifecycle level and demoted every FQN to info, so a reviewer scrolling
the default CI log could see the dispatch size but not which tests
ran. The new renderLifecycleDispatchPreview helper surfaces the first
five FQNs per module with a "… and N more (use --info for full list)"
tail on larger dispatches, keeping the default log diagnosable without
breaching the 4 MiB GitHub Actions step cap that forced the info-level
demotion in the first place. The helper is a pure function with its
own unit-test file so the exact format stays pinned without a Gradle
runtime.

Hoisting FQN validation out of the per-module log loop to feed the
lifecycle preview also fixed a latent header inconsistency — the
"Running N affected test classes" line used the pre-validation count
while the per-module counts underneath it were post-validation — and
closed a pre-existing silent-fallback: a module whose entire
discovered FQN set failed isValidFqn used to dispatch its bare
taskPath with no --tests filter, silently escalating to a full module
suite. Modules with zero valid FQNs are now dropped from dispatch, and
a dispatch where zero FQNs survive validation across ALL modules
throws GradleException instead of invoking Gradle with an empty task
list, whose behaviour is environment-dependent. Both changes are documented in
the CHANGELOG so adopters relying on the accidental fallback can
migrate to explicit runAllIfNoMatches / Action.FULL_SUITE config.
vedanthvdev merged commit fbb646f into master on Apr 22, 2026
1 check passed