Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
107 changes: 107 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,113 @@ adheres to [Semantic Versioning](https://semver.org/).

## [Unreleased]

### Added — discovery-incomplete signal + nested `gradlew` timeout (v1.9.22)

The two deferred findings from the v1.9.21 reliability batch: both close
long-tail silent-failure modes that could leave an operator convinced a
merge-gate run was green when it was in fact half-blind or stuck.

- **`Situation.DISCOVERY_INCOMPLETE` now fires whenever any scanned Java
file fails to parse.** Before this release, `JavaParsers.parseOrWarn`
emitted a `WARN` and returned `null`, and the strategy call site
silently continued — the dispatch map was built from whatever happened
to parse. For a branch with one mid-refactor syntax error that meant
`DISCOVERY_SUCCESS` with a quietly-shrunk selection, which is exactly
the scenario where the merge-gate safety net needs to pay out.
`ProjectIndex#compilationUnit` now increments a per-run
`parseFailureCount` at the cache boundary (de-duplicated across
strategies — a file that fails to parse once counts once, no matter
how many of naming / usage / implementation / transitive ask for it).
`AffectedTestsEngine#run` consults that count after discovery and
routes through the new situation before the old `DISCOVERY_EMPTY` /
`DISCOVERY_SUCCESS` branches. The mode-default resolution matches the
existing asymmetry — `CI` and `STRICT` escalate to `FULL_SUITE`
(safety wins over speed on merge-gate runs), `LOCAL` stays on
`SELECTED` (iteration speed wins for the developer actively editing
the broken file, who already has the `WARN` in their terminal). The
new knob is configurable via the Gradle DSL:
```groovy
affectedTests {
onDiscoveryIncomplete = "FULL_SUITE" // or "SELECTED" / "SKIPPED"
}
```
A matching `EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE` feeds
`--explain` and the `affectedTest` task's skip-reason / escalation
copy so operators can tell at a glance whether a full-suite run was
triggered by an empty diff, an unmapped file, or a parse failure.
Regression coverage:
- `ProjectIndexTest.parseFailureCountIncrementsOnUnparseableFile`
pins the cache-boundary counting contract and the de-duplication
behaviour (three calls on the same broken file still count as
one failure).
- `AffectedTestsEngineTest.discoveryIncompleteEscalatesToFullSuiteInCI`
and `discoveryIncompleteStillSelectsInLocalMode` pin the
CI-vs-LOCAL asymmetry.
- `discoveryIncompleteRespectsExplicitOnDiscoveryIncompleteOverride`
pins the explicit-wins-over-mode tier.
- Three `AffectedTestsConfigTest` cases pin the mode defaults for
`CI`, `LOCAL`, and `STRICT` so a future refactor of the resolver
can't silently bump zero-config users.
- **New `affectedTests.gradlewTimeoutSeconds` knob deadlines the nested
`./gradlew` invocation.** The task spawns a child `./gradlew :module:test
--tests …` for each affected module. Before this release the child
was launched via Gradle's `ExecOperations`, which has no timeout
surface at all — a hung test kept the CI worker busy until the outer
CI-system deadline killed the whole job, which typically surfaces as
"pipeline failed with no useful logs" hours later. The new knob is a
wall-clock deadline in seconds; `0` (the default) preserves the
pre-v1.9.22 "wait forever" behaviour for zero-config callers so the
upgrade is safe. Positive values branch the task onto a `ProcessBuilder`
watchdog path: the child still streams its output to the operator's
terminal via `inheritIO()`, and on timeout the task runs a polite
`destroy()` on the wrapper → 10-second grace → `destroyForcibly()`
on both the wrapper and every snapshotted descendant (shared Gradle
daemons, test JVMs) → 5-second reap ladder before failing the build
with a message that names the setting so the operator knows what to
raise. Because `./gradlew` connects to a shared daemon JVM by
default, killing only the wrapper would leave the actually-hung
test JVM running as a grandchild; the watchdog snapshots
`ProcessHandle#descendants()` *before* signalling (once the wrapper
exits, re-parented daemons become unreachable from
`process.descendants()`) and forcibly destroys every live descendant
after the wrapper is reaped, regardless of whether the wrapper
itself exited gracefully on `SIGTERM` or had to be `destroyForcibly`
escalated. This matters because a SIGTERM-responsive wrapper
(plain `sh`, most test runners) exits on `destroy()` but leaves its
children re-parented to pid 1 — those grandchildren are the real
hung workload and would keep holding the CI worker hostage if the
descendants-kill only ran on the forcible leg. Operators who need a
hard cutoff with no daemon reuse can pass `--no-daemon` at the
outer build level.
Outer-build cancellation (Ctrl-C, Gradle daemon shutdown) propagates
via the same snapshot-then-destroyForcibly path and re-asserts the
interrupt. The trade-off for opting in:
the watchdog path uses `inheritIO` instead of `ExecOperations`, so
Develocity build-scan stream capture of the child's output is not
available on timed runs — callers who rely on scan ingestion of the
nested output should leave the knob at `0` and enforce a deadline at
the CI-job level instead. Validation lives at the builder gate:
`AffectedTestsConfigTest.gradlewTimeoutRejectsNegativeValues` fails
when a user sets `-1` (typoed "no timeout"), and
`gradlewTimeoutDefaultsToZero` pins the zero-config default so any
future refactor that flips the default breaks loudly instead of
retroactively deadline-killing every long-running CI on upgrade.
Runtime coverage lives in `AffectedTestTaskTimeoutTest`
(POSIX-only): `hungChildIsKilledWithinLadderBudget` pins the ladder
budget, `sigtermSwallowingChildEscalatesToForcibleKill` exercises
the forcible leg against a `trap '' TERM; sleep 60` wrapper,
`successfulChildReturnsExitCodeBeforeTimeout` pins both the happy
path and non-zero exit propagation, and
`descendantsAreTerminatedNotOnlyTheWrapper` writes the grandchild's
PID to a pidfile and asserts via `ProcessHandle.of(pid).isAlive()`
that the re-parented `sleep` is reaped — the exact grandchild-
survival bug the descendants-kill loop closes. The last test was
the one that surfaced the "only the forcible leg kills descendants"
regression in the first cut of the fix: if the wrapper exited
gracefully on SIGTERM the descendants loop never ran and the
orphaned sleep survived. The fix now reaps descendants on both
legs.

### Fixed — tier-1 reliability / safety batch (v1.9.21)

The v1.9.20 batch closed every tier-1 *correctness* bug surfaced by the
Expand Down
22 changes: 22 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,27 @@ affectedTests {
onAllFilesOutOfScope = "skipped"
onUnmappedFile = "full_suite" // the key MR-safety knob
onDiscoveryEmpty = "full_suite" // belt-and-braces for CI
// Fires when any scanned Java file failed to parse (malformed source,
// half-committed refactor, encoding glitch). Mode defaults: CI / STRICT
// escalate to full_suite (selection is known to be under-reported so
// safety wins); LOCAL stays on selected (the developer sees the WARN
// and values iteration speed on a branch they're actively editing).
onDiscoveryIncomplete = "full_suite"

// ---------------- Child-process deadline (v1.9.22) ----------------

// Wall-clock timeout in seconds for the nested `./gradlew :module:test`
// invocation. 0 (the default) disables the watchdog and matches
// pre-v1.9.22 behaviour — the task waits forever. Positive values
// spawn the child under a ProcessBuilder watchdog: on timeout the
// task runs destroy() → 10s grace → destroyForcibly() → 5s reap and
// then fails the build so a hung test never holds the CI worker
// hostage. Note: the watchdog path uses inheritIO for the child's
// stdio, so Develocity build-scan stream capture of the nested
// output is not available when a timeout is set — leave at 0 and
// enforce the deadline at the CI-job level if you rely on scan
// ingestion of child-process output.
gradlewTimeoutSeconds = 1800 // 30 min; use 3600 for suites with integration tests

// ---------------- Discovery tuning ----------------

Expand Down Expand Up @@ -309,6 +330,7 @@ Every row below shows the situation the engine resolved, and the action applied
| Any YAML / Gradle / Liquibase / `.java` outside configured dirs | `UNMAPPED_FILE` | `FULL_SUITE` (via `onUnmappedFile = "full_suite"`) | `onUnmappedFile = "selected"` |
| No changed files at all | `EMPTY_DIFF` | `SKIPPED` | `onEmptyDiff = "full_suite"` / `mode = strict` |
| Mapping succeeds but discovery returns zero tests | `DISCOVERY_EMPTY` | `SKIPPED` — or `FULL_SUITE` if `mode=ci`/`strict` | `onDiscoveryEmpty` / `mode` |
| At least one scanned `.java` file failed to parse (malformed / half-committed / encoding glitch) | `DISCOVERY_INCOMPLETE` (takes precedence over both `DISCOVERY_EMPTY` and `DISCOVERY_SUCCESS`) | `SELECTED` — or `FULL_SUITE` if `mode=ci`/`strict` | `onDiscoveryIncomplete` / `mode` |
| Mixed diff: Java + unmapped file | `UNMAPPED_FILE` (takes precedence) | `FULL_SUITE` | `onUnmappedFile` — set to `"selected"` to fall through to discovery |
| `baseRef` not resolvable | `FAILED` | Hard error (prevents silent test skipping in CI) | — |
| Not a git work tree / JGit I/O error | `FAILED` | Hard error | — |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,17 @@ public enum EscalationReason {
* action for {@link Situation#ALL_FILES_OUT_OF_SCOPE} resolved to
* {@link Action#FULL_SUITE}. v2-only.
*/
RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE
RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE,
/**
* One or more scanned Java files failed to parse (malformed
* source, or a language level beyond {@link io.affectedtests.core.discovery.JavaParsers#LANGUAGE_LEVEL}),
* so discovery may have under-reported its selection. The action
* for {@link Situation#DISCOVERY_INCOMPLETE} resolved to
* {@link Action#FULL_SUITE}, typically via {@link io.affectedtests.core.config.Mode#CI}
* or {@link io.affectedtests.core.config.Mode#STRICT} defaults
* or an explicit {@code runAllIfNoMatches=true}. v2-only.
*/
RUN_ALL_ON_DISCOVERY_INCOMPLETE
}

/**
Expand Down Expand Up @@ -320,14 +330,57 @@ public AffectedTestsResult run() {
}
}

// Parse failures during index / strategy walks mean the Usage /
// Implementation / Transitive tiers may have silently dropped
// tests that actually do reference the changed production code.
// Surface that as its own situation so callers can pick a policy
// (escalate to FULL_SUITE on CI, keep the partial selection on
// LOCAL) instead of routing through DISCOVERY_EMPTY /
// DISCOVERY_SUCCESS as if the selection were trustworthy. The
// count is de-duplicated inside ProjectIndex so a single
// unparseable file is not multi-counted across strategies.
int parseFailures = index.parseFailureCount();
if (parseFailures > 0) {
Action action = config.actionFor(Situation.DISCOVERY_INCOMPLETE);
log.warn("Affected Tests: discovery observed {} unparseable Java file(s); "
+ "selection may be incomplete. Action: {}. See WARN lines above "
+ "for the specific files that failed to parse.",
parseFailures, action);
if (action == Action.FULL_SUITE || action == Action.SKIPPED) {
return emptyResult(Situation.DISCOVERY_INCOMPLETE, action, changedFiles,
mapping.productionClasses(), mapping.testClasses(), buckets);
}
// SELECTED: honour the situation but fall through to the
// shared empty/selected tail below. The tail now picks the
// returned situation based on parseFailures so both the
// happy path and this one share one log line ({@code "==="
// Result: N affected test classes (SITUATION)}) and one
// record construction, instead of two near-identical copies
// that will drift.
}

Situation finalSituation = parseFailures > 0
? Situation.DISCOVERY_INCOMPLETE
: Situation.DISCOVERY_SUCCESS;

if (allTestsToRun.isEmpty()) {
Action action = config.actionFor(Situation.DISCOVERY_EMPTY);
log.info("Discovery produced no affected tests. Action: {}.", action);
return emptyResult(Situation.DISCOVERY_EMPTY, action, changedFiles,
// Shared empty tail. For parseFailures == 0 we report
// DISCOVERY_EMPTY (the pre-v1.9.22 branch); for
// parseFailures > 0 + SELECTED we report DISCOVERY_INCOMPLETE
// so --explain, the skip-reason phrase, and the CI log
// grep all see a single truthful situation.
Situation emptySituation = parseFailures > 0
? Situation.DISCOVERY_INCOMPLETE
: Situation.DISCOVERY_EMPTY;
Action action = config.actionFor(emptySituation);
log.info("Discovery produced no affected tests. Situation: {}, Action: {}.",
emptySituation, action);
return emptyResult(emptySituation, action, changedFiles,
mapping.productionClasses(), mapping.testClasses(), buckets);
}

log.info("=== Result: {} affected test classes ===", allTestsToRun.size());
log.info("=== Result: {} affected test classes ({}) ===",
allTestsToRun.size(), finalSituation);
// FQNs are derived from diff filenames by the discovery strategies
// and have not yet been through the AffectedTestTask#isValidFqn
// gate; sanitise here so an attacker-planted filename like
Expand All @@ -344,7 +397,7 @@ public AffectedTestsResult run() {
buckets,
false,
false,
Situation.DISCOVERY_SUCCESS,
finalSituation,
Action.SELECTED,
EscalationReason.NONE
);
Expand Down Expand Up @@ -411,6 +464,7 @@ private static EscalationReason legacyReason(Situation situation, Action action)
return switch (situation) {
case EMPTY_DIFF -> EscalationReason.RUN_ALL_ON_EMPTY_CHANGESET;
case DISCOVERY_EMPTY -> EscalationReason.RUN_ALL_IF_NO_MATCHES;
case DISCOVERY_INCOMPLETE -> EscalationReason.RUN_ALL_ON_DISCOVERY_INCOMPLETE;
case UNMAPPED_FILE -> EscalationReason.RUN_ALL_ON_NON_JAVA_CHANGE;
case ALL_FILES_IGNORED -> EscalationReason.RUN_ALL_ON_ALL_FILES_IGNORED;
case ALL_FILES_OUT_OF_SCOPE -> EscalationReason.RUN_ALL_ON_ALL_FILES_OUT_OF_SCOPE;
Expand Down
Loading