From 2a4e1e7aa68e7b311f4c85c50fb69fb9b23df357 Mon Sep 17 00:00:00 2001 From: vedanthvasudev Date: Wed, 22 Apr 2026 22:58:24 +0100 Subject: [PATCH] fix/discovery-incomplete-and-gradlew-timeout: surface parse failures as DISCOVERY_INCOMPLETE and deadline nested gradlew invocations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the two deferred findings from the v1.9.21 reliability batch. Before this release, a single mid-refactor syntax error in a scanned Java file would silently shrink the test selection: `JavaParsers.parseOrWarn` returned null, the dispatch map was built from whatever happened to parse, and the engine reported `DISCOVERY_SUCCESS` — exactly the scenario where the merge-gate safety net needs to pay out. `ProjectIndex` now counts parse failures at the cache boundary (de-duplicated across strategies) and `AffectedTestsEngine` routes through a new `Situation.DISCOVERY_INCOMPLETE` before the old empty/success branches. Mode defaults keep the existing safety-vs-speed asymmetry: `CI`/`STRICT` escalate to `FULL_SUITE`, `LOCAL` stays on `SELECTED` because the developer already saw the WARN in their terminal. The new `onDiscoveryIncomplete` DSL knob lets operators override per-project, and a matching `EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE` feeds `--explain` so a full-suite run from a parse failure is distinguishable from one triggered by an empty diff or an unmapped file. The second half of the release deadlines the nested `./gradlew :module:test` invocation. Before this release, that child was launched via `ExecOperations` with no timeout surface at all — a hung test kept the CI worker busy until the outer CI-job deadline killed everything, which typically surfaces as "pipeline failed with no useful logs" hours later. The new `affectedTests.gradlewTimeoutSeconds` knob defaults to `0` (preserves the wait-forever behaviour for zero-config upgrades) and, when set, branches onto a `ProcessBuilder` watchdog that runs a polite `destroy()` → 10s grace → `destroyForcibly()` → 5s reap ladder and fails the build with a message that names the setting. Because `./gradlew` connects to a shared daemon JVM, killing only the wrapper would leave the actually-hung test JVM running as a grandchild; the watchdog snapshots `ProcessHandle.descendants()` before signalling and reaps every still-live descendant on both legs of the ladder — the initial cut only reaped on the forcible leg, which failed the regression test that now pins the behaviour because a SIGTERM-responsive wrapper (plain `sh`, most test runners) exits on `destroy()` and leaves its children re-parented to pid 1. Outer-build cancellation propagates through the same path and re-asserts the interrupt. Validation lives at the builder gate (rejects negative values, pins the zero default) and at runtime in a new `AffectedTestTaskTimeoutTest` that drives the watchdog against a hung `sleep 60` child and verifies the grandchild PID via `ProcessHandle` after the exception surfaces. The PR also addresses the tier-2 findings from the code-reviewer sweep on the initial cut: `situationOrder()` and the matrix-order regression test now include `DISCOVERY_INCOMPLETE` in the correct evaluation position, `describeSkipReason` branches on `Action` so `DISCOVERY_INCOMPLETE + SELECTED + empty` no longer renders the contradictory "SELECTED … onDiscoveryIncomplete=SKIPPED" summary, `Situation` javadoc is refreshed from "six" to "seven" situations with the new evaluation order, `DISCOVERY_INCOMPLETE` is explicitly unwired from the `runAllIfNoMatches` legacy boolean so `CI` mode cannot silently skip on parse failures, and the `AffectedTestsEngine.run()` tail no longer duplicates the result-construction block or drops the `=== Result` log header. --- CHANGELOG.md | 107 +++++++ README.md | 22 ++ .../core/AffectedTestsEngine.java | 66 +++- .../core/config/AffectedTestsConfig.java | 101 +++++- .../affectedtests/core/config/Situation.java | 40 ++- .../core/discovery/ProjectIndex.java | 36 +++ .../core/AffectedTestsEngineTest.java | 164 ++++++++++ .../core/config/AffectedTestsConfigTest.java | 110 +++++++ .../core/discovery/ProjectIndexTest.java | 57 ++++ .../gradle/AffectedTestTask.java | 294 +++++++++++++++++- .../gradle/AffectedTestsExtension.java | 33 ++ .../gradle/AffectedTestsPlugin.java | 2 + .../AffectedTestTaskExplainFormatTest.java | 7 + .../gradle/AffectedTestTaskLogFormatTest.java | 59 +++- .../gradle/AffectedTestTaskTimeoutTest.java | 203 ++++++++++++ .../gradle/AffectedTestsPluginTest.java | 43 +++ 16 files changed, 1317 insertions(+), 27 deletions(-) create mode 100644 affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskTimeoutTest.java diff --git a/CHANGELOG.md b/CHANGELOG.md index 18ed31e..bae6e58 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,113 @@ adheres to [Semantic Versioning](https://semver.org/). ## [Unreleased] +### Added — discovery-incomplete signal + nested `gradlew` timeout (v1.9.22) + +The two deferred findings from the v1.9.21 reliability batch: both close +long-tail silent-failure modes that could leave an operator convinced a +merge-gate run was green when it was in fact half-blind or stuck. + +- **`Situation.DISCOVERY_INCOMPLETE` now fires whenever any scanned Java + file fails to parse.** Before this release, `JavaParsers.parseOrWarn` + emitted a `WARN` and returned `null`, and the strategy call site + silently continued — the dispatch map was built from whatever happened + to parse. For a branch with one mid-refactor syntax error that meant + `DISCOVERY_SUCCESS` with a quietly-shrunk selection, which is exactly + the scenario where the merge-gate safety net needs to pay out. + `ProjectIndex#compilationUnit` now increments a per-run + `parseFailureCount` at the cache boundary (de-duplicated across + strategies — a file that fails to parse once counts once, no matter + how many of naming / usage / implementation / transitive ask for it). + `AffectedTestsEngine#run` consults that count after discovery and + routes through the new situation before the old `DISCOVERY_EMPTY` / + `DISCOVERY_SUCCESS` branches. The mode-default resolution matches the + existing asymmetry — `CI` and `STRICT` escalate to `FULL_SUITE` + (safety wins over speed on merge-gate runs), `LOCAL` stays on + `SELECTED` (iteration speed wins for the developer actively editing + the broken file, who already has the `WARN` in their terminal). The + new knob is configurable via the Gradle DSL: + ```groovy + affectedTests { + onDiscoveryIncomplete = "FULL_SUITE" // or "SELECTED" / "SKIPPED" + } + ``` + A matching `EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE` feeds + `--explain` and the `affectedTest` task's skip-reason / escalation + copy so operators can tell at a glance whether a full-suite run was + triggered by an empty diff, an unmapped file, or a parse failure. + Regression coverage: + - `ProjectIndexTest.parseFailureCountIncrementsOnUnparseableFile` + pins the cache-boundary counting contract and the de-duplication + behaviour (three calls on the same broken file still count as + one failure). + - `AffectedTestsEngineTest.discoveryIncompleteEscalatesToFullSuiteInCI` + and `discoveryIncompleteStillSelectsInLocalMode` pin the + CI-vs-LOCAL asymmetry. + - `discoveryIncompleteRespectsExplicitOnDiscoveryIncompleteOverride` + pins the explicit-wins-over-mode tier. + - Three `AffectedTestsConfigTest` cases pin the mode defaults for + `CI`, `LOCAL`, and `STRICT` so a future refactor of the resolver + can't silently bump zero-config users. +- **New `affectedTests.gradlewTimeoutSeconds` knob deadlines the nested + `./gradlew` invocation.** The task spawns a child `./gradlew :module:test + --tests …` for each affected module. Before this release the child + was launched via Gradle's `ExecOperations`, which has no timeout + surface at all — a hung test kept the CI worker busy until the outer + CI-system deadline killed the whole job, which typically surfaces as + "pipeline failed with no useful logs" hours later. The new knob is a + wall-clock deadline in seconds; `0` (the default) preserves the + pre-v1.9.22 "wait forever" behaviour for zero-config callers so the + upgrade is safe. Positive values branch the task onto a `ProcessBuilder` + watchdog path: the child still streams its output to the operator's + terminal via `inheritIO()`, and on timeout the task runs a polite + `destroy()` on the wrapper → 10-second grace → `destroyForcibly()` + on both the wrapper and every snapshotted descendant (shared Gradle + daemons, test JVMs) → 5-second reap ladder before failing the build + with a message that names the setting so the operator knows what to + raise. Because `./gradlew` connects to a shared daemon JVM by + default, killing only the wrapper would leave the actually-hung + test JVM running as a grandchild; the watchdog snapshots + `ProcessHandle#descendants()` *before* signalling (once the wrapper + exits, re-parented daemons become unreachable from + `process.descendants()`) and forcibly destroys every live descendant + after the wrapper is reaped, regardless of whether the wrapper + itself exited gracefully on `SIGTERM` or had to be `destroyForcibly` + escalated. This matters because a SIGTERM-responsive wrapper + (plain `sh`, most test runners) exits on `destroy()` but leaves its + children re-parented to pid 1 — those grandchildren are the real + hung workload and would keep holding the CI worker hostage if the + descendants-kill only ran on the forcible leg. Operators who need a + hard cutoff with no daemon reuse can pass `--no-daemon` at the + outer build level. + Outer-build cancellation (Ctrl-C, Gradle daemon shutdown) propagates + via the same snapshot-then-destroyForcibly path and re-asserts the + interrupt. The trade-off for opting in: + the watchdog path uses `inheritIO` instead of `ExecOperations`, so + Develocity build-scan stream capture of the child's output is not + available on timed runs — callers who rely on scan ingestion of the + nested output should leave the knob at `0` and enforce a deadline at + the CI-job level instead. Validation lives at the builder gate: + `AffectedTestsConfigTest.gradlewTimeoutRejectsNegativeValues` fails + when a user sets `-1` (typoed "no timeout"), and + `gradlewTimeoutDefaultsToZero` pins the zero-config default so any + future refactor that flips the default breaks loudly instead of + retroactively deadline-killing every long-running CI on upgrade. + Runtime coverage lives in `AffectedTestTaskTimeoutTest` + (POSIX-only): `hungChildIsKilledWithinLadderBudget` pins the ladder + budget, `sigtermSwallowingChildEscalatesToForcibleKill` exercises + the forcible leg against a `trap '' TERM; sleep 60` wrapper, + `successfulChildReturnsExitCodeBeforeTimeout` pins both the happy + path and non-zero exit propagation, and + `descendantsAreTerminatedNotOnlyTheWrapper` writes the grandchild's + PID to a pidfile and asserts via `ProcessHandle.of(pid).isAlive()` + that the re-parented `sleep` is reaped — the exact grandchild- + survival bug the descendants-kill loop closes. The last test was + the one that surfaced the "only the forcible leg kills descendants" + regression in the first cut of the fix: if the wrapper exited + gracefully on SIGTERM the descendants loop never ran and the + orphaned sleep survived. The fix now reaps descendants on both + legs. + ### Fixed — tier-1 reliability / safety batch (v1.9.21) The v1.9.20 batch closed every tier-1 *correctness* bug surfaced by the diff --git a/README.md b/README.md index 95f34e5..64dc4a1 100644 --- a/README.md +++ b/README.md @@ -222,6 +222,27 @@ affectedTests { onAllFilesOutOfScope = "skipped" onUnmappedFile = "full_suite" // the key MR-safety knob onDiscoveryEmpty = "full_suite" // belt-and-braces for CI + // Fires when any scanned Java file failed to parse (malformed source, + // half-committed refactor, encoding glitch). Mode defaults: CI / STRICT + // escalate to full_suite (selection is known to be under-reported so + // safety wins); LOCAL stays on selected (the developer sees the WARN + // and values iteration speed on a branch they're actively editing). + onDiscoveryIncomplete = "full_suite" + + // ---------------- Child-process deadline (v1.9.22) ---------------- + + // Wall-clock timeout in seconds for the nested `./gradlew :module:test` + // invocation. 0 (the default) disables the watchdog and matches + // pre-v1.9.22 behaviour — the task waits forever. Positive values + // spawn the child under a ProcessBuilder watchdog: on timeout the + // task runs destroy() → 10s grace → destroyForcibly() → 5s reap and + // then fails the build so a hung test never holds the CI worker + // hostage. Note: the watchdog path uses inheritIO for the child's + // stdio, so Develocity build-scan stream capture of the nested + // output is not available when a timeout is set — leave at 0 and + // enforce the deadline at the CI-job level if you rely on scan + // ingestion of child-process output. + gradlewTimeoutSeconds = 1800 // 30 min; use 3600 for suites with integration tests // ---------------- Discovery tuning ---------------- @@ -309,6 +330,7 @@ Every row below shows the situation the engine resolved, and the action applied | Any YAML / Gradle / Liquibase / `.java` outside configured dirs | `UNMAPPED_FILE` | `FULL_SUITE` (via `onUnmappedFile = "full_suite"`) | `onUnmappedFile = "selected"` | | No changed files at all | `EMPTY_DIFF` | `SKIPPED` | `onEmptyDiff = "full_suite"` / `mode = strict` | | Mapping succeeds but discovery returns zero tests | `DISCOVERY_EMPTY` | `SKIPPED` — or `FULL_SUITE` if `mode=ci`/`strict` | `onDiscoveryEmpty` / `mode` | +| At least one scanned `.java` file failed to parse (malformed / half-committed / encoding glitch) | `DISCOVERY_INCOMPLETE` (takes precedence over both `DISCOVERY_EMPTY` and `DISCOVERY_SUCCESS`) | `SELECTED` — or `FULL_SUITE` if `mode=ci`/`strict` | `onDiscoveryIncomplete` / `mode` | | Mixed diff: Java + unmapped file | `UNMAPPED_FILE` (takes precedence) | `FULL_SUITE` | `onUnmappedFile` — set to `"selected"` to fall through to discovery | | `baseRef` not resolvable | `FAILED` | Hard error (prevents silent test skipping in CI) | — | | Not a git work tree / JGit I/O error | `FAILED` | Hard error | — | diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java b/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java index 3772a22..cc5c52a 100644 --- a/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java +++ b/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java @@ -91,7 +91,17 @@ public enum EscalationReason { * action for {@link Situation#ALL_FILES_OUT_OF_SCOPE} resolved to * {@link Action#FULL_SUITE}. v2-only. */ - RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE + RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE, + /** + * One or more scanned Java files failed to parse (malformed + * source, or a language level beyond {@link io.affectedtests.core.discovery.JavaParsers#LANGUAGE_LEVEL}), + * so discovery may have under-reported its selection. The action + * for {@link Situation#DISCOVERY_INCOMPLETE} resolved to + * {@link Action#FULL_SUITE}, typically via {@link io.affectedtests.core.config.Mode#CI} + * or {@link io.affectedtests.core.config.Mode#STRICT} defaults + * or an explicit {@code runAllIfNoMatches=true}. v2-only. + */ + RUN_ALL_ON_DISCOVERY_INCOMPLETE } /** @@ -320,14 +330,57 @@ public AffectedTestsResult run() { } } + // Parse failures during index / strategy walks mean the Usage / + // Implementation / Transitive tiers may have silently dropped + // tests that actually do reference the changed production code. + // Surface that as its own situation so callers can pick a policy + // (escalate to FULL_SUITE on CI, keep the partial selection on + // LOCAL) instead of routing through DISCOVERY_EMPTY / + // DISCOVERY_SUCCESS as if the selection were trustworthy. The + // count is de-duplicated inside ProjectIndex so a single + // unparseable file is not multi-counted across strategies. + int parseFailures = index.parseFailureCount(); + if (parseFailures > 0) { + Action action = config.actionFor(Situation.DISCOVERY_INCOMPLETE); + log.warn("Affected Tests: discovery observed {} unparseable Java file(s); " + + "selection may be incomplete. Action: {}. See WARN lines above " + + "for the specific files that failed to parse.", + parseFailures, action); + if (action == Action.FULL_SUITE || action == Action.SKIPPED) { + return emptyResult(Situation.DISCOVERY_INCOMPLETE, action, changedFiles, + mapping.productionClasses(), mapping.testClasses(), buckets); + } + // SELECTED: honour the situation but fall through to the + // shared empty/selected tail below. The tail now picks the + // returned situation based on parseFailures so both the + // happy path and this one share one log line ({@code "===" + // Result: N affected test classes (SITUATION)}) and one + // record construction, instead of two near-identical copies + // that will drift. + } + + Situation finalSituation = parseFailures > 0 + ? Situation.DISCOVERY_INCOMPLETE + : Situation.DISCOVERY_SUCCESS; + if (allTestsToRun.isEmpty()) { - Action action = config.actionFor(Situation.DISCOVERY_EMPTY); - log.info("Discovery produced no affected tests. Action: {}.", action); - return emptyResult(Situation.DISCOVERY_EMPTY, action, changedFiles, + // Shared empty tail. For parseFailures == 0 we report + // DISCOVERY_EMPTY (the pre-v1.9.22 branch); for + // parseFailures > 0 + SELECTED we report DISCOVERY_INCOMPLETE + // so --explain, the skip-reason phrase, and the CI log + // grep all see a single truthful situation. + Situation emptySituation = parseFailures > 0 + ? Situation.DISCOVERY_INCOMPLETE + : Situation.DISCOVERY_EMPTY; + Action action = config.actionFor(emptySituation); + log.info("Discovery produced no affected tests. Situation: {}, Action: {}.", + emptySituation, action); + return emptyResult(emptySituation, action, changedFiles, mapping.productionClasses(), mapping.testClasses(), buckets); } - log.info("=== Result: {} affected test classes ===", allTestsToRun.size()); + log.info("=== Result: {} affected test classes ({}) ===", + allTestsToRun.size(), finalSituation); // FQNs are derived from diff filenames by the discovery strategies // and have not yet been through the AffectedTestTask#isValidFqn // gate; sanitise here so an attacker-planted filename like @@ -344,7 +397,7 @@ public AffectedTestsResult run() { buckets, false, false, - Situation.DISCOVERY_SUCCESS, + finalSituation, Action.SELECTED, EscalationReason.NONE ); @@ -411,6 +464,7 @@ private static EscalationReason legacyReason(Situation situation, Action action) return switch (situation) { case EMPTY_DIFF -> EscalationReason.RUN_ALL_ON_EMPTY_CHANGESET; case DISCOVERY_EMPTY -> EscalationReason.RUN_ALL_IF_NO_MATCHES; + case DISCOVERY_INCOMPLETE -> EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE; case UNMAPPED_FILE -> EscalationReason.RUN_ALL_ON_NON_JAVA_CHANGE; case ALL_FILES_IGNORED -> EscalationReason.RUN_ALL_ON_ALL_FILES_IGNORED; case ALL_FILES_OUT_OF_SCOPE -> EscalationReason.RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE; diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java b/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java index a18c242..2478bb2 100644 --- a/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java +++ b/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java @@ -63,6 +63,7 @@ public final class AffectedTestsConfig { private final List outOfScopeSourceDirs; private final boolean includeImplementationTests; private final List implementationNaming; + private final long gradlewTimeoutSeconds; private final Mode mode; private final Mode effectiveMode; private final Map situationActions; @@ -80,6 +81,11 @@ private AffectedTestsConfig(Builder builder) { this.testDirs = List.copyOf(builder.testDirs); this.includeImplementationTests = builder.includeImplementationTests; this.implementationNaming = List.copyOf(builder.implementationNaming); + // Zero = no timeout, preserving the pre-v1.9.22 "wait forever" + // behaviour for zero-config callers. Negative values are + // rejected at the builder gate below; see + // {@link Builder#gradlewTimeoutSeconds(long)}. + this.gradlewTimeoutSeconds = builder.gradlewTimeoutSeconds; // Resolve the paths-to-ignore list. Precedence matches the doc: // explicit ignorePaths() > explicit excludePaths() > builder default. @@ -224,6 +230,21 @@ private static ResolvedActions resolveSituationActions(Builder b, Mode effective b.onUnmappedFile, legacyNonJava, effectiveMode, Action.FULL_SUITE); resolveInto(actions, sources, Situation.DISCOVERY_EMPTY, b.onDiscoveryEmpty, legacyNoMatches, effectiveMode, Action.SKIPPED); + // No legacy boolean maps to DISCOVERY_INCOMPLETE — the situation + // did not exist pre-v1.9.22, so there is nothing to translate + // and there is no pre-v2 behaviour to preserve. Wiring the + // {@code runAllIfNoMatches} shim in here would silently fight + // the CI / STRICT safety-net guarantee documented in the + // {@link Situation#DISCOVERY_INCOMPLETE} javadoc: a user who + // set {@code mode=CI} plus {@code runAllIfNoMatches=false} + // would get SKIPPED on parse failures, which is the opposite + // of what the doc promises. The hard-coded fallback is + // {@link Action#SELECTED}, matching the LOCAL-mode default + // and mirroring the {@link Situation#ALL_FILES_OUT_OF_SCOPE} + // wiring (another v2-only situation that passes {@code null} + // for the legacy argument). + resolveInto(actions, sources, Situation.DISCOVERY_INCOMPLETE, + b.onDiscoveryIncomplete, null, effectiveMode, Action.SELECTED); // DISCOVERY_SUCCESS is definitionally SELECTED — there is no // other sensible outcome when discovery returns tests. Making // it configurable would let users set "discovery ran, found @@ -273,16 +294,27 @@ private static Action defaultFor(Situation s, Mode effectiveMode) { case LOCAL -> switch (s) { case EMPTY_DIFF, ALL_FILES_IGNORED, ALL_FILES_OUT_OF_SCOPE, DISCOVERY_EMPTY -> Action.SKIPPED; case UNMAPPED_FILE -> Action.FULL_SUITE; - case DISCOVERY_SUCCESS -> Action.SELECTED; + // LOCAL is the dev-iteration profile: surfacing the WARN + // at parse time is enough feedback, and forcing a full + // suite on every parse hiccup would make the wrapper + // unusable while mid-edit. The dev can re-run after + // fixing the offending file. + case DISCOVERY_INCOMPLETE, DISCOVERY_SUCCESS -> Action.SELECTED; }; case CI -> switch (s) { case EMPTY_DIFF, ALL_FILES_IGNORED, ALL_FILES_OUT_OF_SCOPE -> Action.SKIPPED; - case UNMAPPED_FILE, DISCOVERY_EMPTY -> Action.FULL_SUITE; + // Parse failures on a CI merge-gate are the exact situation + // the safety net exists for: we cannot prove coverage, so + // default to running everything. Operators who know their + // trees have transient parse noise can override with + // {@code onDiscoveryIncomplete = "selected"}. + case UNMAPPED_FILE, DISCOVERY_EMPTY, DISCOVERY_INCOMPLETE -> Action.FULL_SUITE; case DISCOVERY_SUCCESS -> Action.SELECTED; }; case STRICT -> switch (s) { case ALL_FILES_OUT_OF_SCOPE -> Action.SKIPPED; - case EMPTY_DIFF, ALL_FILES_IGNORED, UNMAPPED_FILE, DISCOVERY_EMPTY -> Action.FULL_SUITE; + case EMPTY_DIFF, ALL_FILES_IGNORED, UNMAPPED_FILE, DISCOVERY_EMPTY, + DISCOVERY_INCOMPLETE -> Action.FULL_SUITE; case DISCOVERY_SUCCESS -> Action.SELECTED; }; case AUTO -> throw new IllegalStateException( @@ -314,6 +346,33 @@ private static Action defaultFor(Situation s, Mode effectiveMode) { public List sourceDirs() { return sourceDirs; } public List testDirs() { return testDirs; } + /** + * Hard wall-clock timeout applied to the nested {@code ./gradlew} + * invocation that runs the affected / full test suite. Zero means + * "no timeout" (the pre-v1.9.22 default — wait for the child to + * finish no matter how long it takes); any positive value is the + * deadline in seconds, after which the Gradle task destroys the + * child process tree and fails with a clear error. + * + *

Motivating class of bug: CI workers pinned for hours on a + * hung JVM or a stuck test — usually a deadlocked custom test + * harness, an exhausted Docker fixture, or a JDK agent that + * mis-responds to {@code SIGTERM}. Without a timeout the plugin + * has no way to surface the stall, so the only feedback is a + * pipeline that times out at the CI runner level with no mapping + * back to which test got stuck. + * + *

Recommended values: {@code 1800} (30 min) for merge-gate + * unit-test runs, {@code 3600} (1 hour) for suites that include + * integration tests, {@code 0} for local runs where the operator + * wants to attach a debugger and has infinite patience. Must be + * {@code >= 0}; negative values are rejected at build time. + * + * @return the wall-clock timeout in seconds, or {@code 0} when + * disabled + */ + public long gradlewTimeoutSeconds() { return gradlewTimeoutSeconds; } + /** * Glob patterns for files that must not influence test selection at all. * A diff consisting entirely of ignored paths routes through @@ -515,6 +574,11 @@ public static final class Builder { // silently dropped tests for every "Default"-prefixed impl in the // wild on zero-config installs. private List implementationNaming = List.of("Impl", "Default"); + // 0 = no timeout (matches pre-v1.9.22 behaviour, wait forever). + // Positive values are wall-clock seconds before the nested + // ./gradlew child process is destroyed. Validated at the + // setter; negative values throw IllegalArgumentException. + private long gradlewTimeoutSeconds = 0L; private Mode mode; private Action onEmptyDiff; @@ -522,6 +586,7 @@ public static final class Builder { private Action onAllFilesOutOfScope; private Action onUnmappedFile; private Action onDiscoveryEmpty; + private Action onDiscoveryIncomplete; public Builder baseRef(String baseRef) { if (baseRef == null || baseRef.isBlank()) { @@ -693,6 +758,32 @@ public Builder implementationNaming(List v) { return this; } + /** + * Sets the wall-clock deadline for the nested {@code ./gradlew} + * invocation. {@code 0} disables the timeout (pre-v1.9.22 + * default — wait indefinitely). Any positive value is seconds + * to wait before the Gradle task destroys the child process and + * fails with a clear error. + * + *

Negative values are rejected — the single pre-computed + * policy decision here is "there is no such thing as a negative + * deadline". Clamping to zero would hide a misconfigured + * Groovy/Kotlin DSL expression; throwing forces the user to see + * it at build-config time. + * + * @param v the timeout in seconds; must be {@code >= 0} + * @return this builder + * @throws IllegalArgumentException if {@code v} is negative + */ + public Builder gradlewTimeoutSeconds(long v) { + if (v < 0) { + throw new IllegalArgumentException( + "gradlewTimeoutSeconds must be >= 0 (0 disables the timeout); got " + v); + } + this.gradlewTimeoutSeconds = v; + return this; + } + public Builder mode(Mode v) { this.mode = Objects.requireNonNull(v, "mode must not be null"); return this; @@ -717,6 +808,10 @@ public Builder onDiscoveryEmpty(Action v) { this.onDiscoveryEmpty = Objects.requireNonNull(v, "onDiscoveryEmpty must not be null"); return this; } + public Builder onDiscoveryIncomplete(Action v) { + this.onDiscoveryIncomplete = Objects.requireNonNull(v, "onDiscoveryIncomplete must not be null"); + return this; + } public AffectedTestsConfig build() { return new AffectedTestsConfig(this); diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/config/Situation.java b/affected-tests-core/src/main/java/io/affectedtests/core/config/Situation.java index b839e8b..8dfab7d 100644 --- a/affected-tests-core/src/main/java/io/affectedtests/core/config/Situation.java +++ b/affected-tests-core/src/main/java/io/affectedtests/core/config/Situation.java @@ -6,13 +6,14 @@ * {@link Action} via the per-situation config in * {@link AffectedTestsConfig}. * - *

The six situations are deliberately exhaustive and mutually exclusive, + *

The seven situations are deliberately exhaustive and mutually exclusive, * so a human reading the {@code --explain} output never has to reconstruct * which of several overlapping "safety net" branches the engine actually * took. The priority in {@link io.affectedtests.core.AffectedTestsEngine} * evaluates them in a fixed order — empty diff, all-ignored, - * all-out-of-scope, unmapped, discovery — so any diff maps to a single - * situation even when it nominally touches several buckets. + * all-out-of-scope, unmapped, discovery-incomplete, discovery-empty, + * discovery-success — so any diff maps to a single situation even when it + * nominally touches several buckets. * *

Note: {@link io.affectedtests.core.mapping.PathToClassMapper} already * routes a file to at most one of its buckets (ignore, out-of-scope, @@ -58,6 +59,39 @@ public enum Situation { */ DISCOVERY_EMPTY, + /** + * Discovery ran but at least one scanned Java file failed to parse, so + * the selection (empty or non-empty) is known to be under-reported. The + * engine routes here in preference to {@link #DISCOVERY_EMPTY} / + * {@link #DISCOVERY_SUCCESS} whenever + * {@link io.affectedtests.core.discovery.ProjectIndex#parseFailureCount()} + * is non-zero. + * + *

Motivating class of bug: a test file using a Java language level + * newer than {@link io.affectedtests.core.discovery.JavaParsers} knows + * about (or genuinely malformed source) produces {@code null} from + * {@code JavaParsers.parseOrWarn} and silently drops out of usage / + * transitive discovery. Before v1.9.22 the only surfacing was a WARN + * at parse time; the engine still routed through {@code DISCOVERY_EMPTY} + * or {@code DISCOVERY_SUCCESS}, and a caller grepping the + * {@code Situation} could not tell "nothing matched" from "matching + * failed because we couldn't read half the files". That asymmetry is + * especially dangerous on merge-gate runs: the caller is optimising + * test runtime, the parse failure silently under-selects, and the + * lost coverage only shows up on main. + * + *

Defaults in the {@link AffectedTestsConfig} resolver are + * conservative on purpose: {@link Mode#CI} and {@link Mode#STRICT} + * escalate to {@link Action#FULL_SUITE}, {@link Mode#LOCAL} stays on + * {@link Action#SELECTED} (dev sees the WARN and decides), and the + * legacy {@code runAllIfNoMatches=true} shim also escalates — it is + * the closest existing boolean to "I don't trust a no-signal result". + * The hardcoded pre-v2 default is {@link Action#SELECTED} so + * zero-config callers upgrading from v1.9.21 do not experience a + * behaviour change. + */ + DISCOVERY_INCOMPLETE, + /** Discovery ran and produced a non-empty test selection. */ DISCOVERY_SUCCESS } diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ProjectIndex.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ProjectIndex.java index 4095514..f6a5a73 100644 --- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ProjectIndex.java +++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ProjectIndex.java @@ -33,6 +33,18 @@ public final class ProjectIndex { private final Map cuCache = new HashMap<>(); private final JavaParser parser = JavaParsers.newParser(); + // Count of distinct files that {@link JavaParsers#parseOrWarn} returned + // null for. The engine consults this after discovery to decide whether + // to route through {@link io.affectedtests.core.config.Situation#DISCOVERY_INCOMPLETE}: + // a parse failure silently drops the affected file from Usage / + // Implementation / Transitive strategies, and before v1.9.22 the only + // signal was a WARN at parse time — the engine itself couldn't tell + // a clean empty selection apart from "we couldn't read half the + // tests". Counting at the index boundary (not at each strategy) + // de-duplicates across strategies — the shared cache means one file + // parses once per run regardless of how many strategies consult it. + private int parseFailureCount = 0; + private ProjectIndex(List sourceFiles, List testFiles, Map testFqnToPath, Set sourceFqns) { this.sourceFiles = sourceFiles; @@ -156,6 +168,30 @@ public CompilationUnit compilationUnit(Path file) { } CompilationUnit cu = JavaParsers.parseOrWarn(parser, file, "index"); cuCache.put(file, cu); + if (cu == null) { + // First-time miss for this path: a WARN has already been + // emitted by parseOrWarn. Counting here (not at each + // strategy call site) naturally de-duplicates — the cached + // null return on subsequent calls for the same file does + // not re-increment, so a file that fails to parse once is + // counted once no matter how many strategies consult it. + parseFailureCount++; + } return cu; } + + /** + * Number of distinct scanned files for which {@link #compilationUnit(Path)} + * could not produce an AST during this engine run. Non-zero means the + * Usage / Implementation / Transitive strategies may have under- + * reported their tier: the engine consumes this to route through + * {@link io.affectedtests.core.config.Situation#DISCOVERY_INCOMPLETE} + * so the configured action (SELECTED / FULL_SUITE / SKIPPED) decides + * the outcome instead of the pre-v1.9.22 silent-drop behaviour. + * + * @return the de-duplicated count of files that failed to parse + */ + public int parseFailureCount() { + return parseFailureCount; + } } diff --git a/affected-tests-core/src/test/java/io/affectedtests/core/AffectedTestsEngineTest.java b/affected-tests-core/src/test/java/io/affectedtests/core/AffectedTestsEngineTest.java index a0f2d3d..d2e5371 100644 --- a/affected-tests-core/src/test/java/io/affectedtests/core/AffectedTestsEngineTest.java +++ b/affected-tests-core/src/test/java/io/affectedtests/core/AffectedTestsEngineTest.java @@ -568,4 +568,168 @@ void multiModuleTestPathsArePreservedInResult() throws Exception { actual.toAbsolutePath().normalize()); } } + + @Test + void discoveryIncompleteEscalatesToFullSuiteInCI() throws Exception { + // Regression for B6-#9: when any Java file in the on-disk index + // fails to parse, CI mode must stop pretending discovery is + // complete. Before the fix parseOrWarn swallowed the failure + // and the engine routed through DISCOVERY_EMPTY / DISCOVERY_SUCCESS + // based on whatever survived. That produced silent + // under-selection on any partially-migrated branch or + // half-committed syntax error — exactly the case where you + // most need the full suite. Defaults in CI mode must escalate + // to FULL_SUITE and tag the escalation reason so the Gradle + // task can print a useful explanation. + try (Git git = initRepoWithInitialCommit()) { + String base = git.log().call().iterator().next().getName(); + + Path prodDir = tempDir.resolve("src/main/java/com/example"); + Files.createDirectories(prodDir); + Files.writeString(prodDir.resolve("FooService.java"), + "package com.example;\npublic class FooService {}"); + + Path testDir = tempDir.resolve("src/test/java/com/example"); + Files.createDirectories(testDir); + Files.writeString(testDir.resolve("FooServiceTest.java"), + "package com.example;\npublic class FooServiceTest {}"); + // Unparseable test sibling — UsageStrategy will try to + // parse every test file to look for references to the + // changed production FQN and ProjectIndex#compilationUnit + // will record the failure at the cache boundary. This + // mirrors the real-world case we're hardening against: + // a half-committed / mid-refactor test file that JavaParser + // can't make sense of, silently dropped from the scan + // before v1.9.22. + Files.writeString(testDir.resolve("BrokenTest.java"), + "package com.example; public class BrokenTest {"); + + git.add().addFilepattern(".").call(); + git.commit().setMessage("add FooService + broken test sibling").call(); + + AffectedTestsConfig config = AffectedTestsConfig.builder() + .baseRef(base) + .mode(Mode.CI) + .includeUncommitted(false) + .includeStaged(false) + .strategies(Set.of("usage")) + .transitiveDepth(0) + .build(); + + AffectedTestsEngine engine = new AffectedTestsEngine(config, tempDir); + AffectedTestsEngine.AffectedTestsResult result = engine.run(); + + assertEquals(Situation.DISCOVERY_INCOMPLETE, result.situation(), + "Unparseable Java file must route the engine through DISCOVERY_INCOMPLETE"); + assertEquals(Action.FULL_SUITE, result.action(), + "CI mode default for DISCOVERY_INCOMPLETE must be FULL_SUITE (safety net)"); + assertTrue(result.runAll(), + "FULL_SUITE action must set runAll so the Gradle task triggers the whole suite"); + assertEquals(EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE, + result.escalationReason(), + "Escalation reason must expose that the cause was parse failures, not generic no-match"); + } + } + + @Test + void discoveryIncompleteStillSelectsInLocalMode() throws Exception { + // Regression for B6-#9 (local-mode half): developers don't + // want to pay a full-suite tax for a syntax error they're + // actively editing. LOCAL mode must surface the same + // DISCOVERY_INCOMPLETE situation so --explain can warn, but + // keep the filtered selection instead of escalating. This + // pins the asymmetry between CI (safety-first) and LOCAL + // (iteration-speed-first) defaults. + try (Git git = initRepoWithInitialCommit()) { + String base = git.log().call().iterator().next().getName(); + + Path prodDir = tempDir.resolve("src/main/java/com/example"); + Files.createDirectories(prodDir); + Files.writeString(prodDir.resolve("FooService.java"), + "package com.example;\npublic class FooService {}"); + + Path testDir = tempDir.resolve("src/test/java/com/example"); + Files.createDirectories(testDir); + Files.writeString(testDir.resolve("FooServiceTest.java"), + "package com.example;\npublic class FooServiceTest {}"); + Files.writeString(testDir.resolve("BrokenTest.java"), + "package com.example; public class BrokenTest {"); + + git.add().addFilepattern(".").call(); + git.commit().setMessage("add FooService + broken test sibling").call(); + + AffectedTestsConfig config = AffectedTestsConfig.builder() + .baseRef(base) + .mode(Mode.LOCAL) + .includeUncommitted(false) + .includeStaged(false) + // Need both — usage forces a parse of the broken + // file so we hit DISCOVERY_INCOMPLETE, naming then + // supplies FooServiceTest so we can also assert + // the selection survives parse failure. + .strategies(Set.of("usage", "naming")) + .transitiveDepth(0) + .build(); + + AffectedTestsEngine engine = new AffectedTestsEngine(config, tempDir); + AffectedTestsEngine.AffectedTestsResult result = engine.run(); + + assertEquals(Situation.DISCOVERY_INCOMPLETE, result.situation(), + "LOCAL mode must still observe DISCOVERY_INCOMPLETE so --explain / logs stay truthful"); + assertEquals(Action.SELECTED, result.action(), + "LOCAL mode default for DISCOVERY_INCOMPLETE is SELECTED — iteration speed wins"); + assertFalse(result.runAll(), + "SELECTED must not trigger a full suite run even with parse failures"); + assertTrue(result.testClassFqns().contains("com.example.FooServiceTest"), + "Naming-based tests still get discovered from the files that did parse"); + } + } + + @Test + void discoveryIncompleteRespectsExplicitOnDiscoveryIncompleteOverride() throws Exception { + // Regression for B6-#9 config wiring: the new + // onDiscoveryIncomplete builder knob must override the + // mode-based default. Without this test a future refactor + // that flips the switch inside defaultFor(...) could silently + // ignore explicit user configuration. Here we force CI mode + // (default FULL_SUITE) but explicitly ask for SKIPPED, which + // is the moral equivalent of "I know parse failures exist, + // don't run anything, I'll deal with it." + try (Git git = initRepoWithInitialCommit()) { + String base = git.log().call().iterator().next().getName(); + + Path prodDir = tempDir.resolve("src/main/java/com/example"); + Files.createDirectories(prodDir); + Files.writeString(prodDir.resolve("FooService.java"), + "package com.example;\npublic class FooService {}"); + + Path testDir = tempDir.resolve("src/test/java/com/example"); + Files.createDirectories(testDir); + Files.writeString(testDir.resolve("BrokenTest.java"), + "package com.example; public class BrokenTest {"); + + git.add().addFilepattern(".").call(); + git.commit().setMessage("add FooService + broken test sibling").call(); + + AffectedTestsConfig config = AffectedTestsConfig.builder() + .baseRef(base) + .mode(Mode.CI) + .onDiscoveryIncomplete(Action.SKIPPED) + .includeUncommitted(false) + .includeStaged(false) + .strategies(Set.of("usage")) + .transitiveDepth(0) + .build(); + + AffectedTestsEngine engine = new AffectedTestsEngine(config, tempDir); + AffectedTestsEngine.AffectedTestsResult result = engine.run(); + + assertEquals(Situation.DISCOVERY_INCOMPLETE, result.situation()); + assertEquals(Action.SKIPPED, result.action(), + "Explicit onDiscoveryIncomplete must win over mode default in CI"); + assertFalse(result.runAll()); + assertTrue(result.testClassFqns().isEmpty(), + "SKIPPED must yield no selected tests"); + } + } } diff --git a/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java b/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java index 631e729..2d49495 100644 --- a/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java +++ b/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java @@ -152,6 +152,116 @@ void modeStrictEscalatesEverythingAmbiguous() { assertEquals(Action.SKIPPED, config.actionFor(Situation.ALL_FILES_OUT_OF_SCOPE)); } + @Test + void modeCiEscalatesDiscoveryIncompleteByDefault() { + // Regression for B6-#9 resolver wiring: the new situation has + // to map to the mode-specific safety net, not silently default + // to the all-pass legacy value. CI (merge-gate) must escalate + // a parse-failure run to FULL_SUITE because selection is known + // to be under-reported; anything else is a silent safety loss. + AffectedTestsConfig config = AffectedTestsConfig.builder() + .mode(Mode.CI) + .build(); + assertEquals(Action.FULL_SUITE, config.actionFor(Situation.DISCOVERY_INCOMPLETE)); + } + + @Test + void modeLocalKeepsDiscoveryIncompleteSelectedByDefault() { + // Mirror of modeCi test — LOCAL prefers iteration speed over + // safety so defaults to SELECTED (warn-and-continue). If a + // future refactor flips this to FULL_SUITE, every dev's + // `./gradlew affectedTest` on a branch with one syntax error + // suddenly pays a full-suite tax — the regression is painful + // and quiet without this pin. + AffectedTestsConfig config = AffectedTestsConfig.builder() + .mode(Mode.LOCAL) + .build(); + assertEquals(Action.SELECTED, config.actionFor(Situation.DISCOVERY_INCOMPLETE)); + } + + @Test + void modeStrictEscalatesDiscoveryIncompleteToFullSuite() { + // STRICT must never leave DISCOVERY_INCOMPLETE at SELECTED — + // the whole point of STRICT is "any doubt means full suite". + AffectedTestsConfig config = AffectedTestsConfig.builder() + .mode(Mode.STRICT) + .build(); + assertEquals(Action.FULL_SUITE, config.actionFor(Situation.DISCOVERY_INCOMPLETE)); + } + + @Test + void explicitOnDiscoveryIncompleteWinsOverModeDefault() { + // Same contract as every other on* setter: explicit wins. + // Regression catches a refactor that forgets to route + // DISCOVERY_INCOMPLETE through the explicit-tier resolver + // and falls back to mode default. + AffectedTestsConfig config = AffectedTestsConfig.builder() + .mode(Mode.CI) + .onDiscoveryIncomplete(Action.SKIPPED) + .build(); + assertEquals(Action.SKIPPED, config.actionFor(Situation.DISCOVERY_INCOMPLETE)); + } + + @Test + void legacyRunAllIfNoMatchesDoesNotOverrideDiscoveryIncomplete() { + // Regression for B6-#9 legacy-boolean wiring: DISCOVERY_INCOMPLETE + // is a v1.9.22-only situation, so wiring the pre-v2 + // runAllIfNoMatches shim into it would let a CI caller with + // runAllIfNoMatches=false silently opt out of the parse- + // failure safety net that the Situation javadoc and README + // promise. Since the legacy tier wins over mode defaults in + // {@link AffectedTestsConfig#resolveInto}, this test pins + // that the legacy argument is deliberately NOT wired for + // DISCOVERY_INCOMPLETE — CI + runAllIfNoMatches=false must + // still give FULL_SUITE on parse failures, the same as CI + // without the legacy flag. + AffectedTestsConfig config = AffectedTestsConfig.builder() + .mode(Mode.CI) + .runAllIfNoMatches(false) + .build(); + assertEquals(Action.FULL_SUITE, config.actionFor(Situation.DISCOVERY_INCOMPLETE), + "runAllIfNoMatches=false must NOT drag DISCOVERY_INCOMPLETE down to SKIPPED — " + + "this is a v2-only situation with no pre-v2 behaviour to preserve, " + + "so CI mode's safety-net FULL_SUITE must survive"); + assertEquals(ActionSource.MODE_DEFAULT, + config.actionSourceFor(Situation.DISCOVERY_INCOMPLETE), + "Source must report MODE_DEFAULT — the legacy boolean has no say"); + } + + @Test + void gradlewTimeoutDefaultsToZero() { + // Zero means "no deadline" — the pre-v1.9.22 behaviour. + // Defaulting to any positive value would retroactively fail + // every long-running CI on the upgrade, which is why the + // builder defaults to 0 and the Gradle task branches on + // `> 0` before spawning a watchdog. + AffectedTestsConfig config = AffectedTestsConfig.builder().build(); + assertEquals(0L, config.gradlewTimeoutSeconds(), + "Default must remain 0 (no timeout) — any other default is a silent breaking change"); + } + + @Test + void gradlewTimeoutAcceptsPositiveValues() { + AffectedTestsConfig config = AffectedTestsConfig.builder() + .gradlewTimeoutSeconds(1800L) + .build(); + assertEquals(1800L, config.gradlewTimeoutSeconds()); + } + + @Test + void gradlewTimeoutRejectsNegativeValues() { + // Negative durations are nonsensical and in practice always + // mean a user typo (e.g. -1 meaning "no timeout"). Rejecting + // at builder time surfaces the typo immediately instead of + // letting it reach ProcessBuilder where the Long gets + // interpreted as "wait forever" or worse silently overflows + // the TimeUnit conversion. + IllegalArgumentException e = assertThrows(IllegalArgumentException.class, + () -> AffectedTestsConfig.builder().gradlewTimeoutSeconds(-1L).build()); + assertTrue(e.getMessage().toLowerCase().contains("timeout"), + "Error must name the offending setting so the user can find it; got: " + e.getMessage()); + } + @Test void runAllOnNonJavaChangeFalseTranslatesToSelected() { // Pre-v2 callers that opted out of the safety net expected diff --git a/affected-tests-core/src/test/java/io/affectedtests/core/discovery/ProjectIndexTest.java b/affected-tests-core/src/test/java/io/affectedtests/core/discovery/ProjectIndexTest.java index d3617ed..96cec84 100644 --- a/affected-tests-core/src/test/java/io/affectedtests/core/discovery/ProjectIndexTest.java +++ b/affected-tests-core/src/test/java/io/affectedtests/core/discovery/ProjectIndexTest.java @@ -84,6 +84,63 @@ void literalAndGlobFormsAgreeOnTheSameOutcome() throws Exception { "Literal and glob forms of the same logical rule must produce the same index"); } + @Test + void parseFailureCountStartsAtZero() throws Exception { + // Baseline for the B6-#9 regression: a brand-new index with + // no compilationUnit() calls must report zero parse failures, + // otherwise the downstream DISCOVERY_INCOMPLETE routing would + // fire on every run and blow up {@link io.affectedtests.core.AffectedTestsEngine} + // discovery with spurious escalations. + writeJava(projectDir.resolve("src/main/java/com/example/Foo.java"), + "package com.example; public class Foo {}"); + writeJava(projectDir.resolve("src/test/java/com/example/FooTest.java"), + "package com.example; public class FooTest {}"); + + AffectedTestsConfig config = AffectedTestsConfig.builder().mode(Mode.CI).build(); + ProjectIndex index = ProjectIndex.build(projectDir, config); + + assertEquals(0, index.parseFailureCount(), + "A fresh index that never touched compilationUnit() must report zero failures"); + } + + @Test + void parseFailureCountIncrementsOnUnparseableFile() throws Exception { + // Regression for B6-#9: a scanned Java file that JavaParser + // cannot parse must bump parseFailureCount so the engine can + // route through Situation.DISCOVERY_INCOMPLETE instead of + // silently under-reporting discovery. Before the fix the + // cache hid every null return behind containsKey, so the + // engine had no way to tell a clean empty selection apart + // from "parse failed, we saw nothing". + Path prod = projectDir.resolve("src/main/java/com/example/Foo.java"); + writeJava(prod, "package com.example; public class Foo {}"); + // Malformed: missing closing brace, no class declaration. + Path broken = projectDir.resolve("src/main/java/com/example/Broken.java"); + writeJava(broken, "package com.example; public class Broken {"); + + AffectedTestsConfig config = AffectedTestsConfig.builder().mode(Mode.CI).build(); + ProjectIndex index = ProjectIndex.build(projectDir, config); + + // Warm the cache for both files — the working one should + // return a non-null CU, the broken one should return null. + assertNotNull(index.compilationUnit(prod), + "Valid source must still parse — otherwise the counter is counting false positives"); + assertNull(index.compilationUnit(broken), + "Malformed source must still return null (the current parseOrWarn contract)"); + + assertEquals(1, index.parseFailureCount(), + "Broken file must be counted exactly once — de-dup means a repeat call does not re-count"); + + // Hit the cache again to make sure we did not double-count: + // the whole point of counting at the cache boundary is that + // N strategies can ask for the same file and we still report + // a single distinct failure. + index.compilationUnit(broken); + index.compilationUnit(broken); + assertEquals(1, index.parseFailureCount(), + "Repeated compilationUnit() calls on the same file must not inflate the count"); + } + private static void writeJava(Path target, String body) throws Exception { Files.createDirectories(target.getParent()); Files.writeString(target, body); diff --git a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestTask.java b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestTask.java index 0702a42..953b0ac 100644 --- a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestTask.java +++ b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestTask.java @@ -280,6 +280,37 @@ public AffectedTestTask() { @org.gradle.api.tasks.Optional public abstract Property getOnDiscoveryEmpty(); + /** + * Action to take when discovery ran but one or more scanned Java files + * failed to parse (see {@link Situation#DISCOVERY_INCOMPLETE}). One of + * {@code "selected"}, {@code "full_suite"}, {@code "skipped"}. Unset + * falls through to the {@link Mode} default (CI and STRICT escalate + * to {@code full_suite}; LOCAL keeps the partial selection). + * + * @return the on-discovery-incomplete property + */ + @Input + @org.gradle.api.tasks.Optional + public abstract Property getOnDiscoveryIncomplete(); + + /** + * Wall-clock timeout (in seconds) for the nested {@code ./gradlew} + * invocation that runs the affected / full test suite. {@code 0} + * disables the timeout (pre-v1.9.22 default: wait indefinitely). + * Positive values deadline the child: the task destroys the + * process tree after the interval and fails the build. + * + *

Kept {@link org.gradle.api.tasks.Optional @Optional} so that + * zero-config callers skip the timeout entirely — opting in is + * explicit. Must be {@code >= 0}; the core config builder rejects + * negative values at build-config time. + * + * @return the gradlew timeout property in seconds + */ + @Input + @org.gradle.api.tasks.Optional + public abstract Property getGradlewTimeoutSeconds(); + /** * When {@code true}, the task prints the full decision trace * (buckets, situation, action, action source) and exits without @@ -394,7 +425,8 @@ public void runAffectedTests() { executeTests(projectDir, result.testClassFqns(), result.testFqnToPath(), - result.runAll()); + result.runAll(), + config.gradlewTimeoutSeconds()); } /** @@ -451,6 +483,12 @@ private AffectedTestsConfig buildConfig() { if (getOnDiscoveryEmpty().isPresent()) { builder.onDiscoveryEmpty(parseAction(getOnDiscoveryEmpty().get(), "onDiscoveryEmpty")); } + if (getOnDiscoveryIncomplete().isPresent()) { + builder.onDiscoveryIncomplete(parseAction(getOnDiscoveryIncomplete().get(), "onDiscoveryIncomplete")); + } + if (getGradlewTimeoutSeconds().isPresent()) { + builder.gradlewTimeoutSeconds(getGradlewTimeoutSeconds().get()); + } return builder.build(); } @@ -483,7 +521,8 @@ static Action parseAction(String raw, String property) { private void executeTests(Path projectDir, Set testFqns, Map fqnToPath, - boolean runAll) { + boolean runAll, + long gradlewTimeoutSeconds) { String gradleCommand = resolveGradleCommand(projectDir); List args = new ArrayList<>(); @@ -602,14 +641,215 @@ private void executeTests(Path projectDir, args.add("-x"); args.add("compileJava"); + int exitCode = (gradlewTimeoutSeconds > 0) + ? runWithTimeout(args, projectDir, gradlewTimeoutSeconds) + : runWithoutTimeout(args, projectDir); + + if (exitCode != 0) { + throw new GradleException("Test execution failed with exit code " + exitCode); + } + } + + /** + * Pre-v1.9.22 dispatch path: hand the argv to Gradle's + * {@link ExecOperations} and wait for the child to finish with no + * deadline. Kept as the default because {@link ExecOperations} + * integrates with the outer Gradle build's log capture and build- + * scan infrastructure in ways {@link ProcessBuilder} cannot match + * from a plugin. + */ + private int runWithoutTimeout(List args, Path projectDir) { ExecResult execResult = getExecOperations().exec(spec -> { spec.commandLine(args); spec.workingDir(projectDir.toFile()); spec.setIgnoreExitValue(true); }); + return execResult.getExitValue(); + } + + /** + * Watchdog dispatch path used when + * {@code affectedTests.gradlewTimeoutSeconds} is set to a positive + * value. Spawns the child via {@link ProcessBuilder} (so we can + * hold a {@link Process} handle and call {@link Process#destroyForcibly()}), + * {@code inheritIO()}'s stdin/stdout/stderr so the user still sees + * the nested Gradle output in real time, and enforces a wall-clock + * deadline. + * + *

Termination ladder on timeout: + *

    + *
  1. {@link Process#destroy()} — polite SIGTERM; most JVMs react + * to this within a second or two, which gives the test + * runner a chance to flush coverage reports / screenshots + * instead of leaving them half-written.
  2. + *
  3. Wait {@value #TIMEOUT_GRACEFUL_SHUTDOWN_SECONDS}s for the + * polite stop to land.
  4. + *
  5. Fall back to {@link Process#destroyForcibly()} — SIGKILL — + * so a child that swallowed the SIGTERM cannot keep holding + * the CI worker hostage.
  6. + *
  7. Give the kernel another + * {@value #TIMEOUT_FORCIBLE_SHUTDOWN_SECONDS}s to reap the + * process and fail the build either way.
  8. + *
+ * + *

Process-tree semantics: {@code ./gradlew} almost always + * connects to a shared Gradle daemon JVM, so the wrapper process + * is the parent of the daemon only transiently; the test JVM + * that's actually hung lives as a grandchild (shared daemon) or + * a sibling (daemon reused across invocations). Killing only the + * wrapper would therefore leave the hung workload running and + * defeat the whole point of this knob. {@link #shutdownChild} + * snapshots {@link ProcessHandle#descendants()} before the first + * signal and {@code destroyForcibly}-es every still-live + * descendant on the forcible leg. Operators who want a true + * "no grace period, no daemon reuse" deadline can pass + * {@code --no-daemon} at the outer build level. + * + *

Interrupt handling: if the outer Gradle build is cancelled + * (Ctrl-C, Gradle daemon shutdown), the watchdog propagates the + * cancellation by {@code destroyForcibly}-ing the wrapper and + * every snapshotted descendant, then re-asserts the interrupt + * on the caller thread. + */ + // Package-private so AffectedTestTaskTimeoutTest can drive the + // watchdog directly against a controllable child process. The + // outer `executeTests` caller is still the only production call + // site; exposing at package level is the smallest surface that + // lets the regression test pin the kill-ladder behaviour without + // rebuilding a DefaultTask harness. + int runWithTimeout(List args, Path projectDir, long timeoutSeconds) { + ProcessBuilder pb = new ProcessBuilder(args); + pb.directory(projectDir.toFile()); + // Inherit stdin/stdout/stderr so the child's output (Gradle's + // test task output, test report progress, etc.) streams to + // the operator's terminal exactly as it did on the + // ExecOperations path. We trade build-scan stream capture + // for the ability to kill the child on timeout; opting in to + // the timeout is opting in to that trade. + pb.inheritIO(); + Process process; + try { + process = pb.start(); + } catch (java.io.IOException e) { + throw new GradleException( + "Affected Tests: failed to spawn nested Gradle invocation for timeout dispatch: " + + LogSanitizer.sanitize(e.getMessage()), + e); + } + try { + boolean finished = process.waitFor(timeoutSeconds, java.util.concurrent.TimeUnit.SECONDS); + if (finished) { + return process.exitValue(); + } + getLogger().error( + "Affected Tests: nested Gradle invocation exceeded configured timeout of {}s; " + + "terminating the child process tree.", + timeoutSeconds); + shutdownChild(process); + throw new GradleException( + "Affected Tests: nested Gradle invocation exceeded the configured timeout of " + + timeoutSeconds + "s (affectedTests.gradlewTimeoutSeconds). " + + "The child process has been terminated. Raise the timeout or " + + "investigate a hung test — see the output above for progress."); + } catch (InterruptedException e) { + // Outer build was cancelled or the daemon is shutting down. + // Kill the whole tree immediately so we do not leave + // orphaned Gradle daemons / test JVMs on the runner, then + // re-assert the interrupt so Gradle's own shutdown path + // can continue. Snapshotting descendants before + // destroyForcibly on the wrapper mirrors shutdownChild — + // once the wrapper exits, re-parented daemons become + // unreachable from process.descendants(). + List descendants = process.descendants() + .collect(Collectors.toList()); + process.destroyForcibly(); + for (ProcessHandle child : descendants) { + if (child.isAlive()) { + child.destroyForcibly(); + } + } + Thread.currentThread().interrupt(); + throw new GradleException( + "Affected Tests: nested Gradle invocation interrupted before completion.", e); + } + } - if (execResult.getExitValue() != 0) { - throw new GradleException("Test execution failed with exit code " + execResult.getExitValue()); + private static final int TIMEOUT_GRACEFUL_SHUTDOWN_SECONDS = 10; + private static final int TIMEOUT_FORCIBLE_SHUTDOWN_SECONDS = 5; + + /** + * Shuts down the child process tree: sends {@link Process#destroy()} + * to the wrapper first, waits for a grace period, then escalates + * to {@link Process#destroyForcibly()} on the wrapper and every + * still-live descendant. + * + *

Killing only the wrapper is not enough because + * {@code ./gradlew} connects to a shared Gradle daemon JVM that + * outlives the wrapper. On the timeout path the daemon is running + * a test JVM — that's the actually-hung process that kept the + * runner busy — and if we only reap the wrapper the daemon / test + * JVM keep holding the runner hostage, which defeats the entire + * point of the watchdog. We therefore snapshot + * {@link ProcessHandle#descendants()} before the first signal + * (so any daemon / test-JVM grandchildren are captured while + * still reachable via the wrapper's process tree) and + * {@code destroyForcibly} each descendant on the forcible leg. + * + *

The graceful leg still targets only the wrapper so SIGTERM- + * aware test runners get their normal shutdown hook window to + * flush coverage reports. Descendants are then reaped + * {@code destroyForcibly} regardless of whether the wrapper + * exited gracefully or had to be SIGKILLed, because a SIGTERM- + * responsive wrapper (plain {@code sh}, most test runners) + * exits on its own but leaves its children re-parented to + * pid 1 — those grandchildren are the real hung workload and + * would keep holding the CI worker hostage otherwise. Operators + * who need "hard cutoff, no grace" semantics can still run the + * outer build with {@code --no-daemon}, which re-parents the + * test JVM as a direct child of the wrapper and makes the polite + * leg already effective against the daemon too. + */ + private void shutdownChild(Process process) { + // Snapshot the descendant tree before we signal — once the + // wrapper exits the grandchildren may re-parent to pid 1 and + // {@link ProcessHandle#descendants()} on the wrapper will + // return empty, leaving orphaned Gradle daemons alive. + List descendants = process.descendants() + .collect(Collectors.toList()); + process.destroy(); + try { + if (!process.waitFor(TIMEOUT_GRACEFUL_SHUTDOWN_SECONDS, + java.util.concurrent.TimeUnit.SECONDS)) { + process.destroyForcibly(); + // Best-effort wait — if SIGKILL itself does not reap + // the child within this window something is deeply + // wrong with the host, but we still want to return + // to the caller so the build can fail cleanly rather + // than hang here forever. + process.waitFor(TIMEOUT_FORCIBLE_SHUTDOWN_SECONDS, + java.util.concurrent.TimeUnit.SECONDS); + } + // Reap any still-live descendants regardless of whether + // the wrapper exited gracefully or had to be SIGKILLed. + // A SIGTERM-responsive wrapper (plain `sh`, most test + // runners) exits on destroy() but leaves its children + // (daemon / test JVM) re-parented to pid 1 — those + // grandchildren are the real hung workload and would + // keep holding the CI worker hostage if we returned + // here without reaping them. + for (ProcessHandle child : descendants) { + if (child.isAlive()) { + child.destroyForcibly(); + } + } + } catch (InterruptedException e) { + process.destroyForcibly(); + for (ProcessHandle child : descendants) { + if (child.isAlive()) { + child.destroyForcibly(); + } + } + Thread.currentThread().interrupt(); } } @@ -789,6 +1029,8 @@ static String describeEscalation(EscalationReason reason) { "onAllFilesIgnored=FULL_SUITE — every changed file matched ignorePaths"; case RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE -> "onAllFilesOutOfScope=FULL_SUITE — every changed file sat under out-of-scope dirs"; + case RUN_ALL_ON_DISCOVERY_INCOMPLETE -> + "onDiscoveryIncomplete=FULL_SUITE — discovery observed unparseable Java files, selection may be incomplete"; case NONE -> throw new IllegalStateException( "describeEscalation must not be called for EscalationReason.NONE; " + "the engine should only produce NONE on non-runAll results"); @@ -939,9 +1181,11 @@ static List renderExplainTrace(AffectedTestsConfig config, AffectedTests * package-private so the explain-format tests can pin the exact * conditions without spinning up Gradle. * - *

Gate: only {@link Situation#DISCOVERY_SUCCESS} and - * {@link Situation#DISCOVERY_EMPTY} can have been influenced by an - * out-of-scope pattern. The other four situations reach their + *

Gate: only {@link Situation#DISCOVERY_SUCCESS}, + * {@link Situation#DISCOVERY_EMPTY}, and {@link Situation#DISCOVERY_INCOMPLETE} + * can have been influenced by an out-of-scope pattern — all three + * run the mapper over the full diff and only then route post- + * discovery. The other four situations reach their * outcome for reasons the out-of-scope bucket cannot change: * {@link Situation#EMPTY_DIFF} (no files to match), * {@link Situation#ALL_FILES_IGNORED} (ignore wins before @@ -963,7 +1207,8 @@ static void appendOutOfScopeHint(List lines, AffectedTestsResult result) { Situation situation = result.situation(); if (situation != Situation.DISCOVERY_SUCCESS - && situation != Situation.DISCOVERY_EMPTY) { + && situation != Situation.DISCOVERY_EMPTY + && situation != Situation.DISCOVERY_INCOMPLETE) { return; } if (result.changedFiles().isEmpty()) { @@ -1022,11 +1267,20 @@ private static void appendSample(List lines, String label, Set f * referencing another doc. */ private static List situationOrder() { + // Mirrors the engine's evaluation order in + // {@link io.affectedtests.core.AffectedTestsEngine#run()}: + // DISCOVERY_INCOMPLETE short-circuits before DISCOVERY_EMPTY / + // DISCOVERY_SUCCESS whenever parseFailureCount > 0. The + // --explain trace has to list the situations in the same + // order an operator reads the engine logs, otherwise the + // matrix reads as a lie when a parse-failure run elides + // DISCOVERY_EMPTY. return List.of( Situation.EMPTY_DIFF, Situation.ALL_FILES_IGNORED, Situation.ALL_FILES_OUT_OF_SCOPE, Situation.UNMAPPED_FILE, + Situation.DISCOVERY_INCOMPLETE, Situation.DISCOVERY_EMPTY, Situation.DISCOVERY_SUCCESS ); @@ -1085,7 +1339,7 @@ static LogLine renderSummary(AffectedTestsResult result) { prefix + "{} changed file(s); {}.", new Object[] { result.changedFiles().size(), - describeSkipReason(result.situation()) + describeSkipReason(result.situation(), result.action()) }); } return new LogLine( @@ -1100,16 +1354,31 @@ static LogLine renderSummary(AffectedTestsResult result) { /** * Renders a {@link Situation} as a short human-readable phrase * suitable for the summary line's {@code SKIPPED} branch. Only the - * five "ambiguous" situations can legitimately resolve to + * ambiguous situations can legitimately resolve to * {@link Action#SKIPPED}; {@link Situation#DISCOVERY_SUCCESS} is * rejected because the engine never skips when it found tests. * + *

Takes the resolved {@link Action} because one situation + * ({@link Situation#DISCOVERY_INCOMPLETE}) can legitimately reach + * the "skipped" rendering path with {@code SELECTED} as well as + * {@code SKIPPED}. The engine routes + * {@code DISCOVERY_INCOMPLETE + SELECTED} with an empty selection + * through the skipped-summary branch in the task summary (the + * `skipped` flag is set by {@link io.affectedtests.core.AffectedTestsEngine#emptyResult} + * whenever a non-SUCCESS situation winds up with zero tests), + * so a situation-only rendering would print the {@code SKIPPED} + * literal for a run the user actually configured as {@code SELECTED}. + * Branching on the action keeps the phrase truthful without + * requiring operators to reconcile two different action labels + * in the same summary line. + * *

Package-private so {@code AffectedTestTaskLogFormatTest} can * pin the exact wording — mirrors {@link #describeEscalation} so * the two halves of the summary log share one vocabulary. */ - static String describeSkipReason(Situation situation) { + static String describeSkipReason(Situation situation, Action action) { Objects.requireNonNull(situation, "situation"); + Objects.requireNonNull(action, "action"); return switch (situation) { case EMPTY_DIFF -> "no changed files detected"; @@ -1121,6 +1390,9 @@ static String describeSkipReason(Situation situation) { "onUnmappedFile=SKIPPED — non-Java or unmapped file in diff"; case DISCOVERY_EMPTY -> "no affected tests discovered"; + case DISCOVERY_INCOMPLETE -> action == Action.SELECTED + ? "onDiscoveryIncomplete=SELECTED — no affected tests matched the parsed files" + : "onDiscoveryIncomplete=SKIPPED — discovery observed unparseable files"; case DISCOVERY_SUCCESS -> throw new IllegalStateException( "describeSkipReason must not be called for DISCOVERY_SUCCESS; " + "the engine only produces that situation on a non-empty " diff --git a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsExtension.java b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsExtension.java index bf17120..17f348c 100644 --- a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsExtension.java +++ b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsExtension.java @@ -257,4 +257,37 @@ public abstract class AffectedTestsExtension { * @return the on-discovery-empty property */ public abstract Property getOnDiscoveryEmpty(); + + /** + * Action to take when discovery ran but at least one scanned Java file + * failed to parse — see {@code Situation.DISCOVERY_INCOMPLETE} on the + * core module for the full rationale. One of {@code "selected"}, + * {@code "full_suite"}, {@code "skipped"}. + * + *

Unset falls through to the {@link #getMode() mode} default + * (CI and STRICT escalate to {@code full_suite}; LOCAL keeps the + * partial selection). The legacy {@code runAllIfNoMatches=true} shim + * also maps to {@code full_suite} here because it is the closest + * existing signal for "I don't trust a silent no-signal result". + * + * @return the on-discovery-incomplete property + */ + public abstract Property getOnDiscoveryIncomplete(); + + /** + * Wall-clock timeout in seconds for the nested {@code ./gradlew} + * invocation that executes the affected / full test suite. + * {@code 0} disables the timeout (pre-v1.9.22 default — wait + * indefinitely). Positive values deadline the child process: the + * plugin destroys the process tree after the interval and fails + * the build with a clear error. + * + *

Recommended values: {@code 1800} (30 min) for merge-gate + * unit-test runs, {@code 3600} (1 hour) for suites that include + * integration tests. Must be {@code >= 0}; the core config + * builder rejects negative values at build-config time. + * + * @return the gradlew timeout property in seconds + */ + public abstract Property getGradlewTimeoutSeconds(); } diff --git a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsPlugin.java b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsPlugin.java index ff68238..02c8c6d 100644 --- a/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsPlugin.java +++ b/affected-tests-gradle/src/main/java/io/affectedtests/gradle/AffectedTestsPlugin.java @@ -85,6 +85,8 @@ public void apply(Project project) { task.getOnAllFilesOutOfScope().set(extension.getOnAllFilesOutOfScope()); task.getOnUnmappedFile().set(extension.getOnUnmappedFile()); task.getOnDiscoveryEmpty().set(extension.getOnDiscoveryEmpty()); + task.getOnDiscoveryIncomplete().set(extension.getOnDiscoveryIncomplete()); + task.getGradlewTimeoutSeconds().set(extension.getGradlewTimeoutSeconds()); task.getRootDir().set(rootDir); task.getSubprojectPaths().set(project.provider(() -> collectSubprojectPaths(rootProject))); diff --git a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskExplainFormatTest.java b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskExplainFormatTest.java index 0b48cd8..c5c843b 100644 --- a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskExplainFormatTest.java +++ b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskExplainFormatTest.java @@ -234,6 +234,13 @@ void matrixIncludesEveryNonSuccessSituationInStableOrder() { Situation.ALL_FILES_IGNORED, Situation.ALL_FILES_OUT_OF_SCOPE, Situation.UNMAPPED_FILE, + // DISCOVERY_INCOMPLETE must appear before + // DISCOVERY_EMPTY / DISCOVERY_SUCCESS because the + // engine evaluates parse-failure escalation first + // (see AffectedTestsEngine#run). Without this ordering + // pin the --explain matrix would silently drift to any + // row position and stop mirroring the engine. + Situation.DISCOVERY_INCOMPLETE, Situation.DISCOVERY_EMPTY, Situation.DISCOVERY_SUCCESS); int previous = -1; diff --git a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskLogFormatTest.java b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskLogFormatTest.java index f47b5c8..0a20343 100644 --- a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskLogFormatTest.java +++ b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskLogFormatTest.java @@ -13,6 +13,7 @@ import java.util.Set; import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertThrows; import static org.junit.jupiter.api.Assertions.assertTrue; @@ -115,7 +116,12 @@ void skippedBranchNamesReasonPhraseForEachSituation() { Situation.ALL_FILES_IGNORED, "every changed file matched ignorePaths", Situation.ALL_FILES_OUT_OF_SCOPE, "every changed file sat under out-of-scope dirs", Situation.UNMAPPED_FILE, "non-Java or unmapped file in diff", - Situation.DISCOVERY_EMPTY, "no affected tests discovered"); + Situation.DISCOVERY_EMPTY, "no affected tests discovered", + // DISCOVERY_INCOMPLETE + SKIPPED phrase — the action- + // SELECTED variant has its own distinct rendering and + // its own regression test below, so this map locks + // only the SKIPPED branch. + Situation.DISCOVERY_INCOMPLETE, "discovery observed unparseable files"); expected.forEach((situation, phrase) -> { AffectedTestsResult skipped = new AffectedTestsResult( @@ -141,15 +147,60 @@ void describeSkipReasonRejectsDiscoverySuccess() { // returned tests we never skip. The helper must fail loudly so // such a drift cannot surface a placeholder phrase in CI. assertThrows(IllegalStateException.class, - () -> AffectedTestTask.describeSkipReason(Situation.DISCOVERY_SUCCESS), - "describeSkipReason(DISCOVERY_SUCCESS) must throw — that combination is an " + () -> AffectedTestTask.describeSkipReason(Situation.DISCOVERY_SUCCESS, Action.SKIPPED), + "describeSkipReason(DISCOVERY_SUCCESS, *) must throw — that combination is an " + "engine contract violation, not a log-formatting concern"); } @Test void describeSkipReasonRejectsNull() { assertThrows(NullPointerException.class, - () -> AffectedTestTask.describeSkipReason(null)); + () -> AffectedTestTask.describeSkipReason(null, Action.SKIPPED)); + assertThrows(NullPointerException.class, + () -> AffectedTestTask.describeSkipReason(Situation.EMPTY_DIFF, null)); + } + + @Test + void discoveryIncompleteSelectedBranchDoesNotClaimSkipped() { + // Regression for the B6-#9 summary-line bug: DISCOVERY_INCOMPLETE + // with action=SELECTED and an empty selection (the LOCAL-mode + // default) winds up on the skipped-summary branch via + // emptyResult's skipped-flag rule, but the summary prefix + // still correctly reads `SELECTED (DISCOVERY_INCOMPLETE)`. + // Before the action-aware rendering, the reason phrase said + // `onDiscoveryIncomplete=SKIPPED — ...`, producing a line that + // contained both `SELECTED` and `SKIPPED` and contradicted + // itself. Pinning the truthful variant keeps renderer and + // resolver honest. + AffectedTestsResult result = new AffectedTestsResult( + Set.of(), Map.of(), + Set.of("src/main/java/com/example/Foo.java"), + Set.of(), Set.of(), + Buckets.empty(), + false, true, + Situation.DISCOVERY_INCOMPLETE, Action.SELECTED, + EscalationReason.NONE); + String summary = render(AffectedTestTask.renderSummary(result)); + + assertTrue(summary.contains("SELECTED (DISCOVERY_INCOMPLETE)"), + "Prefix must name the resolved action, not the skipped flag"); + assertFalse(summary.contains("SKIPPED"), + "Summary must not contain the SKIPPED literal when action=SELECTED; got: " + summary); + assertTrue(summary.contains("no affected tests matched the parsed files"), + "Reason phrase must describe the SELECTED+empty outcome honestly; got: " + summary); + } + + @Test + void describeSkipReasonBranchesOnActionForDiscoveryIncomplete() { + // Unit-level pin for the helper directly — protects against a + // refactor that loses the action branch inside describeSkipReason + // even if the summary renderer swallows the difference. + assertEquals( + "onDiscoveryIncomplete=SKIPPED — discovery observed unparseable files", + AffectedTestTask.describeSkipReason(Situation.DISCOVERY_INCOMPLETE, Action.SKIPPED)); + assertEquals( + "onDiscoveryIncomplete=SELECTED — no affected tests matched the parsed files", + AffectedTestTask.describeSkipReason(Situation.DISCOVERY_INCOMPLETE, Action.SELECTED)); } @Test diff --git a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskTimeoutTest.java b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskTimeoutTest.java new file mode 100644 index 0000000..92f7785 --- /dev/null +++ b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestTaskTimeoutTest.java @@ -0,0 +1,203 @@ +package io.affectedtests.gradle; + +import org.gradle.api.GradleException; +import org.gradle.api.Project; +import org.gradle.testfixtures.ProjectBuilder; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.condition.DisabledOnOs; +import org.junit.jupiter.api.condition.OS; +import org.junit.jupiter.api.io.TempDir; + +import java.nio.file.Path; +import java.util.List; + +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +/** + * Regression tests for {@link AffectedTestTask#runWithTimeout} — the + * B6-#11 watchdog path. Without this suite, the entire timeout surface + * (branch selection, kill ladder, child-tree teardown) would only be + * verified by eyeballing CI logs the day a hung test ships. + * + *

The tests drive {@code runWithTimeout} against a deliberately + * hung {@code sh -c "sleep 60"} child. The watchdog's wall-clock + * budget is 2 seconds; on a timeout the test then asserts: + *

    + *
  • a {@link GradleException} is thrown with a message that names + * the setting the operator has to tune,
  • + *
  • the child PID is no longer alive within the ladder's budget + * (timeout + grace + forcible = 2 + 10 + 5 = 17 seconds, plus + * a generous headroom for slow CI).
  • + *
+ * + *

Skipped on Windows because the spawn helpers (`sh`, `sleep`) are + * POSIX-specific. The plugin's overall surface already assumes a + * POSIX-like runner for `./gradlew` wrapper invocation, so skipping + * here doesn't lose coverage that exists elsewhere in the suite. + */ +@DisabledOnOs(OS.WINDOWS) +class AffectedTestTaskTimeoutTest { + + @TempDir + Path tempDir; + + private AffectedTestTask taskFor(Project project) { + project.getPlugins().apply("io.github.vedanthvdev.affectedtests"); + return (AffectedTestTask) project.getTasks().findByName("affectedTest"); + } + + @Test + void hungChildIsKilledWithinLadderBudget() throws Exception { + // A 60-second `sleep` represents the "hung test JVM" failure + // mode the watchdog exists to protect against. The watchdog + // budget is 2 seconds; if the ladder works the child must be + // terminated well before the 60-second sleep would naturally + // exit. Pre-fix, a future refactor that swapped + // destroy()/destroyForcibly() for waitFor-with-no-destroy + // would leave the child alive and this test would fail with + // a still-running PID. + Project project = ProjectBuilder.builder().build(); + AffectedTestTask task = taskFor(project); + + List args = List.of("sh", "-c", "sleep 60"); + + long start = System.nanoTime(); + GradleException thrown = assertThrows(GradleException.class, + () -> task.runWithTimeout(args, tempDir, 2L), + "Watchdog must translate a missed deadline into a GradleException — " + + "otherwise CI sees a green exit code for a killed build"); + long elapsedMs = (System.nanoTime() - start) / 1_000_000L; + + assertTrue(thrown.getMessage().contains("exceeded the configured timeout"), + "Message must state the actual failure mode so operators can find the right knob; got: " + + thrown.getMessage()); + assertTrue(thrown.getMessage().contains("affectedTests.gradlewTimeoutSeconds"), + "Message must name the DSL knob so operators know what to raise; got: " + + thrown.getMessage()); + + // Ladder budget: 2s timeout + 10s graceful + 5s forcible = 17s. + // Allow 30s headroom for slow CI runners. If we spend longer + // than that, something in the ladder is waiting forever — + // precisely the regression this test exists to catch. + assertTrue(elapsedMs < 30_000L, + "Watchdog must return within the ladder budget (2+10+5s) plus headroom; took " + + elapsedMs + "ms"); + } + + @Test + void sigtermSwallowingChildEscalatesToForcibleKill() throws Exception { + // A wrapper that swallows SIGTERM is the exact failure mode + // the forcible leg of the ladder exists for. Graceful + // destroy() does nothing, waitFor(graceful) times out, and + // the watchdog has to escalate to destroyForcibly() to + // actually reap the process. A refactor that dropped the + // forcible leg (e.g. "just call destroy() and trust the + // runtime") would make this test hang until the ladder's + // outer wait returns, pushing elapsed time close to 60s and + // flipping the assertion red. + Project project = ProjectBuilder.builder().build(); + AffectedTestTask task = taskFor(project); + + long before = System.nanoTime(); + assertThrows(GradleException.class, + () -> task.runWithTimeout( + List.of("sh", "-c", "trap '' TERM; sleep 60"), + tempDir, 2L)); + long elapsedMs = (System.nanoTime() - before) / 1_000_000L; + + assertTrue(elapsedMs < 30_000L, + "Forcible leg must reap a SIGTERM-swallowing wrapper within ladder budget; took " + + elapsedMs + "ms (this regresses to ~60s if destroyForcibly is dropped " + + "from shutdownChild)"); + } + + @Test + void successfulChildReturnsExitCodeBeforeTimeout() throws Exception { + // The happy path: child finishes well before the deadline. + // Pre-fix, a refactor that accidentally inverted the + // `finished` branch (throwing on success, returning on + // timeout) would ship a plugin that always fails green + // builds. We pin both the exit code and the no-throw + // behaviour here. + Project project = ProjectBuilder.builder().build(); + AffectedTestTask task = taskFor(project); + + int exit = task.runWithTimeout(List.of("sh", "-c", "exit 0"), tempDir, 5L); + assertFalse(exit != 0, + "Watchdog must propagate exit code from a child that finishes before the deadline"); + + int exit17 = task.runWithTimeout(List.of("sh", "-c", "exit 17"), tempDir, 5L); + assertTrue(exit17 == 17, + "Watchdog must propagate non-zero exit codes verbatim so the outer " + + "Gradle invocation can surface the real test failure; got " + exit17); + } + + @Test + void descendantsAreTerminatedNotOnlyTheWrapper() throws Exception { + // The core fix for the grandchild-survival bug: if + // shutdownChild only destroys the wrapper and not its + // descendants, any shared Gradle daemon / test JVM lives on + // as an orphan (re-parented to pid 1) and defeats the + // watchdog's whole reason for existing. + // + // To actually prove the grandchild is dead — not just that + // runWithTimeout returned in time — we write the grandchild's + // PID to a pidfile from inside the spawned shell. After + // runWithTimeout surfaces its GradleException, we read the + // pidfile and assert the PID is no longer alive. + // + // Without the descendants-kill loop, the grandchild survives + // the wrapper's destroyForcibly() and ProcessHandle.of(pid) + // shows isAlive()==true, failing this assertion. + Project project = ProjectBuilder.builder().build(); + AffectedTestTask task = taskFor(project); + + Path pidFile = tempDir.resolve("grandchild.pid"); + + // Outer sh spawns an inner sh that writes its own PID to the + // pidfile and execs `sleep 60` (keeping the same PID so the + // pidfile is accurate). `wait` in the outer sh keeps the + // Java-side Process alive until the grandchild dies or the + // watchdog reaps the tree. + String script = + "sh -c 'echo $$ > \"" + pidFile.toAbsolutePath() + "\"; exec sleep 60' & wait"; + long before = System.nanoTime(); + assertThrows(GradleException.class, + () -> task.runWithTimeout( + List.of("sh", "-c", script), tempDir, 2L)); + long elapsedMs = (System.nanoTime() - before) / 1_000_000L; + + assertTrue(java.nio.file.Files.exists(pidFile), + "Test harness invariant: grandchild must have written its PID before the " + + "watchdog fired; missing pidfile means the spawn never reached exec"); + + long grandchildPid = Long.parseLong( + java.nio.file.Files.readString(pidFile).trim()); + + // Give the OS a short beat to reap the forcibly-killed + // grandchild before we sample ProcessHandle — on busy CI + // the destroyForcibly signal is in flight when we check. + Thread.sleep(500); + + boolean grandchildAlive = ProcessHandle.of(grandchildPid) + .map(ProcessHandle::isAlive) + .orElse(false); + + if (grandchildAlive) { + // Clean up so the test doesn't leak a 60-second sleeper + // on CI before failing. + ProcessHandle.of(grandchildPid).ifPresent(ProcessHandle::destroyForcibly); + } + assertFalse(grandchildAlive, + "Descendant PID " + grandchildPid + " survived the watchdog — " + + "shutdownChild reaped the wrapper but left the grandchild orphaned. " + + "This is the exact bug the snapshot-then-destroyForcibly loop closes."); + + // Sanity: we also expect the whole ladder to finish well + // under the 2+10+5 budget plus headroom. + assertTrue(elapsedMs < 30_000L, + "Watchdog must complete within ladder budget; took " + elapsedMs + "ms"); + } +} diff --git a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestsPluginTest.java b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestsPluginTest.java index 8610c65..5d0d7e6 100644 --- a/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestsPluginTest.java +++ b/affected-tests-gradle/src/test/java/io/affectedtests/gradle/AffectedTestsPluginTest.java @@ -102,4 +102,47 @@ void taskInputsAreWiredFromExtension() { assertFalse(task.getRunAllOnNonJavaChange().get(), "Task input must reflect the extension's runAllOnNonJavaChange override"); } + + @Test + void extensionExposesOnDiscoveryIncompleteAndGradlewTimeout() { + // Regression for B6-#9 + B6-#11 DSL wiring: missing setters + // in AffectedTestsExtension would compile-break the build but + // a typo between the extension and the task ( `getOnDiscoveryIncomplete` + // vs `getOnDiscoveryIncompleteAction`) would only show up at + // runtime when a user configured the DSL. Pin the names and + // round-trip the values so that refactor doesn't silently + // drop the wiring. + Project project = ProjectBuilder.builder().build(); + project.getPlugins().apply("io.github.vedanthvdev.affectedtests"); + + AffectedTestsExtension ext = project.getExtensions() + .getByType(AffectedTestsExtension.class); + ext.getOnDiscoveryIncomplete().set("SKIPPED"); + ext.getGradlewTimeoutSeconds().set(1800L); + + AffectedTestTask task = (AffectedTestTask) project.getTasks().findByName("affectedTest"); + assertNotNull(task); + assertEquals("SKIPPED", task.getOnDiscoveryIncomplete().get(), + "Task must expose the extension's onDiscoveryIncomplete setting byte-for-byte"); + assertEquals(1800L, task.getGradlewTimeoutSeconds().get(), + "Task must expose the extension's gradlewTimeoutSeconds setting byte-for-byte"); + } + + @Test + void gradlewTimeoutAndDiscoveryIncompleteAreUnconfiguredByDefault() { + // Defaults must stay absent — presence-based defaulting is how + // the core config builder distinguishes "user didn't say" + // (fall through to mode defaults) from "user set 0" (explicit + // no-timeout). Installing a convention would collapse those + // two cases and silently change mode-default behaviour. + Project project = ProjectBuilder.builder().build(); + project.getPlugins().apply("io.github.vedanthvdev.affectedtests"); + + AffectedTestsExtension ext = project.getExtensions() + .getByType(AffectedTestsExtension.class); + assertFalse(ext.getOnDiscoveryIncomplete().isPresent(), + "onDiscoveryIncomplete must have no convention so the core resolver picks the mode default"); + assertFalse(ext.getGradlewTimeoutSeconds().isPresent(), + "gradlewTimeoutSeconds must have no convention so the task branch between watchdog/exec paths stays correct"); + } }