Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
105 changes: 105 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,111 @@ adheres to [Semantic Versioning](https://semver.org/).

## [Unreleased]

### Fixed — tier-1 reliability / safety batch (v1.9.21)

The v1.9.20 batch closed every tier-1 *correctness* bug surfaced by the
multi-reviewer sweep. v1.9.21 closes the tier-1 *reliability and
safety* cluster from the same review — the class of issues that don't
silently drop tests but do weaken the merge-gate contract and the
hardening work shipped in v1.9.19.

- **`SourceFileScanner` now rejects file-level symlinks during the
walk.** v1.9.19 hardened the *directory* path via
`stayInsideProjectRoot` + `toRealPath`, which closed
`src/main/java -> /` as an MR-planted symlink. The per-file leg of
the same attack — an attacker committing
`src/main/java/com/x/Leak.java -> /etc/passwd` — slid past that
guard because `Files.walkFileTree` still called `visitFile` on the
symlink and the visitor didn't check `attrs.isSymbolicLink()`. The
two file-walk visitors in the scanner (`collectJavaFiles`,
`walkFqnsUnder`) now skip any entry flagged as a symlink
unconditionally; a regular `.java` file whose real path
legitimately lives inside the project tree is just a regular file,
not a symlink. Regression lives in `SourceFileScannerTest`
(`rejectsFileLevelSymlinkEscapingProjectRoot`,
`scansTestFqnsIgnoresFileLevelSymlinks`) — both fail when the
production guard is reverted, covering both the
`collectTestFiles`/`collectSourceFiles` path and the
`scanTestFqns` path so the two halves of discovery can never
diverge on what counts as a valid source file.
- **Per-subtree `IOException`s no longer truncate the whole walk.**
The default `SimpleFileVisitor.visitFileFailed` rethrows, and the
scanner's outer `IOException` catch swallowed the remaining walk
silently — a single unreadable nested module (permission denied,
file-deleted-under-us race, filesystem loop) took every readable
sibling down with it. All three walk sites (`collectJavaFiles`,
`walkFqnsUnder`, `findAllMatchingDirs`) now override
`visitFileFailed` to log the unreadable entry at WARN via
`LogSanitizer` and continue. Default-visible surfacing of the
skipped entry is the load-bearing change: operators can now see
when a scan ran incomplete, where previously they'd silently
under-select tests and have no way to correlate it with CI config
drift. No regression test — inducing a portable per-entry
`IOException` requires either root-evadable `chmod 000` semantics
(fails on CI runners running as root) or heavyweight filesystem
mocking; the fix is evident in the diff and backed by the 11
existing walk tests in `SourceFileScannerTest`.
- **`AffectedTestsConfig.Builder#isAcceptableBaseRef` now rejects
control characters at the entry gate.** The rev-expression pattern
used `@\{[^}]+\}` for the reflog segment — `[^}]` happily matched
newline, ESC, CSI, and DEL. A `baseRef` sourced from an
attacker-controlled CI env var like
`master@{1\n\u001b[2JAffected Tests: SELECTED\u001b[m}` previously
passed validation and then flowed unsanitised into
`log.info("Base ref: {}", …)` in `AffectedTestsEngine` and into
three `IllegalStateException` messages in `GitChangeDetector`. An
attacker who could poison `CI_BASE_REF` (or an equivalent) could
forge plugin-branded status lines into CI output and defeat
grep-based merge gates. The fix is a single blanket
`containsControlChars` check before any of the downstream matchers
run — C0 (`0x00..0x1F`), DEL (`0x7F`), and C1 (`0x80..0x9F`) are
all rejected, closing every regex path at once without depending
on getting each hand-crafted character class right. Regression
lives in `AffectedTestsConfigTest.baseRefRejectsControlCharsInReflogSuffix`,
which covers the newline-in-reflog shape, bare ESC/DEL, and CRLF
anywhere in the input; each case fails when the entry-gate check
is removed.
- **`LogSanitizer` coverage extended to every remaining diff-sourced
log site.** v1.9.19 introduced the sanitizer and applied it at
four sites; the multi-reviewer review found ~28 additional sites
passing raw filenames, FQNs, or exception messages straight to the
logger — most at DEBUG, but several at default-visible WARN (and
two inside `IllegalStateException` messages that Gradle renders
verbatim into the build log). An attacker-crafted filename or a
poisoned `JavaParser` diagnostic carrying `\u001b[2J…\u001b[m`
could still forge ops-visible log lines. All remaining sites now
route through `LogSanitizer.sanitize`:
`JavaParsers.parseOrWarn` (WARN + DEBUG),
`ImplementationStrategy.supertypesOf` (WARN) and its per-depth
match trace (DEBUG), `PathToClassMapper`'s eight classification
DEBUG lines, `UsageStrategy`'s seven tier-match DEBUG lines,
`NamingConventionStrategy`'s match DEBUG line,
`GitChangeDetector`'s per-file changed-entry DEBUG line and the
`IOException` / `GitAPIException` wrappers it throws (because
JGit exception messages can carry attacker-influenced pack-file
paths and ref segments), `AffectedTestsEngine`'s FQN-has-no-file
DEBUG line, `SourceFileScanner.findAllMatchingDirs`'s trailing
DEBUG summary (brought into parity with its WARN path), and the
echo of the offending value inside the `IllegalArgumentException`
thrown by `AffectedTestsConfig.Builder#baseRef` when validation
fails (otherwise the reject branch would reopen the same surface
`containsControlChars` closes on the accept branch). Even DEBUG
sites are sanitised because operators who bump the log level to
chase a false-positive selection are exactly the audience an
attacker would target — the log-forgery contract has to hold at
every visibility level, not just WARN/INFO. Regressions:
`LogSanitizerTest` covers the escape table;
`baseRefValidationErrorDoesNotEchoRawControlChars` pins the
newly-sanitised exception-message path.

Two related findings from the same review are intentionally deferred
to batch 6 (v1.9.22) so each PR stays scoped: #9 (`ProjectIndex`
parse failures signal discovery-incomplete to the engine so the
safety net can escalate explicitly) and #11 (configurable timeout on
the nested `./gradlew` invocation so a hung child test cannot pin
the outer build indefinitely). Both are observable-behaviour changes
and deserve their own CHANGELOG entries.

### Fixed — post-v1.9.19 multi-reviewer correctness pass (v1.9.20)

A second, deeper multi-agent review of the post-v1.9.19 tree surfaced a
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -311,7 +311,12 @@ public AffectedTestsResult run() {
allTestsToRun.add(fqn);
fqnToPath.put(fqn, file);
} else {
log.debug("Skipping FQN with no matching test file on disk: {}", fqn);
// The FQN reached this log site via the discovery
// strategies, which derive names from diff paths and
// from the scanned source tree — both attacker-
// influenced on a merge-gate run.
log.debug("Skipping FQN with no matching test file on disk: {}",
LogSanitizer.sanitize(fqn));
}
}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
package io.affectedtests.core.config;

import io.affectedtests.core.util.LogSanitizer;

import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
Expand Down Expand Up @@ -526,9 +528,16 @@ public Builder baseRef(String baseRef) {
throw new IllegalArgumentException("baseRef must not be null or blank");
}
if (!isAcceptableBaseRef(baseRef)) {
// The rejected value is echoed back in the exception message, which
// Gradle renders verbatim into the build log (and often into build-
// scan HTML). Sanitising here closes the same log-forgery surface
// that containsControlChars closes on the accept path — without it,
// an attacker-poisoned CI_BASE_REF would still get its forged
// status line printed, just via the *reject* branch instead of the
// accept branch. See the javadoc on isAcceptableBaseRef.
throw new IllegalArgumentException(
"baseRef is not a valid git ref, SHA, or short form: '"
+ baseRef + "' — expected something like "
+ LogSanitizer.sanitize(baseRef) + "' — expected something like "
+ "'origin/master', 'HEAD~1', or a 7-40 char hex SHA");
}
this.baseRef = baseRef;
Expand All @@ -549,13 +558,35 @@ public Builder baseRef(String baseRef) {
* {@link org.eclipse.jgit.lib.Repository#isValidRefName(String)}),
* short/long SHAs, and {@code HEAD~N}/{@code ^N}/{@code @{N}}
* rev-expressions. Rejects path-traversal shapes by construction
* since JGit's refname validator already forbids {@code ..}, a
* leading {@code /}, and control characters — no hand-rolled
* string sniffing required. Keeping this decision inside the
* builder means every call site (CLI, Gradle plugin, programmatic
* users) gets the same contract without duplicating the rule.
* since JGit's refname validator already forbids {@code ..} and
* a leading {@code /}. The {@link #SHORT_SHA} short-circuit at
* the top bypasses {@code isValidRefName}, but is safe by
* construction: the hex-only character class excludes {@code .}
* and {@code /}, so path-traversal shapes cannot reach the SHA
* path.
*
* <p>Control characters ({@code \n}, {@code \r}, ESC, CSI, DEL,
* the whole C0 and C1 ranges) are rejected at the entry gate
* before any of the downstream matchers run. This is load-
* bearing for the {@link #REV_EXPR} path: its
* {@code @\{[^}]+\}} segment matches newline, ESC, and CSI
* verbatim, so a {@code baseRef} sourced from an attacker-
* controlled CI environment variable like
* {@code master@\{1\n\u001b[2JAffected Tests: SELECTED\u001b[m\}}
* previously passed validation and then flowed straight into
* {@code log.info("Base ref: {}", …)} in
* {@code AffectedTestsEngine} and into three
* {@code IllegalStateException} messages in
* {@code GitChangeDetector} — a log-forgery surface that let
* an attacker fabricate plugin-branded status lines in CI
* output. Rejecting at the gate closes every regex path at
* once without depending on getting each hand-crafted
* character class right.
*/
private static boolean isAcceptableBaseRef(String baseRef) {
if (containsControlChars(baseRef)) {
return false;
}
if (SHORT_SHA.matcher(baseRef).matches()) {
return true;
}
Expand All @@ -581,6 +612,27 @@ private static boolean isAcceptableBaseRef(String baseRef) {
}
return false;
}

/**
* Returns {@code true} if {@code value} contains any C0 control
* character ({@code 0x00..0x1F}), DEL ({@code 0x7F}), or C1
* control character ({@code 0x80..0x9F}). Kept in the builder
* rather than reaching for {@link io.affectedtests.core.util.LogSanitizer}
* because this is a validation gate, not a logging concern —
* the string must be rejected here before it ever reaches a
* logger, and keeping the check local makes the
* {@link #isAcceptableBaseRef(String)} contract auditable
* without cross-package hops.
*/
private static boolean containsControlChars(String value) {
for (int i = 0; i < value.length(); i++) {
char c = value.charAt(i);
if (c < 0x20 || c == 0x7F || (c >= 0x80 && c <= 0x9F)) {
return true;
}
}
return false;
}
public Builder includeUncommitted(boolean v) { this.includeUncommitted = v; return this; }
public Builder includeStaged(boolean v) { this.includeStaged = v; return this; }
public Builder runAllIfNoMatches(boolean v) { this.runAllIfNoMatches = v; return this; }
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
import com.github.javaparser.ast.body.TypeDeclaration;
import com.github.javaparser.ast.type.ClassOrInterfaceType;
import io.affectedtests.core.config.AffectedTestsConfig;
import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Expand Down Expand Up @@ -196,9 +197,16 @@ private Set<String> findImplementations(Set<String> changedClasses,
for (ClassOrInterfaceType supertype : supertypesOf(decl)) {
if (targetsBySimpleName.containsKey(supertype.getNameAsString())) {
newImpls.add(implFqn);
// {@code implFqn} and {@code supertype.getNameAsString()}
// both originate from the scanned source tree,
// which on a merge-gate run is attacker-
// influenced. DEBUG-level, but sanitised for
// forgery-resistance when ops drop the level.
log.debug("[impl] depth {}: {} ({}) extends/implements {}",
depth + 1, implFqn, decl.getClass().getSimpleName(),
supertype.getNameAsString());
depth + 1,
LogSanitizer.sanitize(implFqn),
decl.getClass().getSimpleName(),
LogSanitizer.sanitize(supertype.getNameAsString()));
break;
}
}
Expand Down Expand Up @@ -289,9 +297,16 @@ private static List<ClassOrInterfaceType> supertypesOf(TypeDeclaration<?> decl)
// known supertypes, which is the correct conservative
// fallback (cannot become a false impl match; at worst we
// underselect for a single declaration).
// {@code decl.getNameAsString()} originates from an
// attacker-controllable source file. This WARN surfaces at
// the default log level, so route through LogSanitizer to
// keep the log-forgery contract from v1.9.19 honest.
// {@code decl.getClass().getSimpleName()} is a JavaParser
// internal constant and does not need sanitising.
log.warn("Affected Tests: [impl] unknown TypeDeclaration subtype {} for {} — "
+ "supertype edges cannot be derived, skipping",
decl.getClass().getSimpleName(), decl.getNameAsString());
decl.getClass().getSimpleName(),
LogSanitizer.sanitize(decl.getNameAsString()));
}
return result;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
import com.github.javaparser.ParserConfiguration;
import com.github.javaparser.ParserConfiguration.LanguageLevel;
import com.github.javaparser.ast.CompilationUnit;
import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Expand Down Expand Up @@ -92,15 +93,32 @@ static CompilationUnit parseOrWarn(JavaParser parser, Path file, String label) {
String firstProblem = result.getProblems().isEmpty()
? "no diagnostics"
: result.getProblems().get(0).getMessage();
// Both {@code file} (attacker-committable filename from
// the scanned source tree) and {@code firstProblem} (parser
// diagnostic that can embed a source-code snippet sourced
// from an attacker-controlled file) flow into a default-
// visible WARN line. Sanitise both so an attacker who
// plants a file with a {@code \n} + fake authoritative
// status line in the name or content cannot forge CI log
// output. Label is an internal constant, not sanitised.
log.warn("Affected Tests: [{}] failed to parse {} at language level {}: {}",
label, file, LANGUAGE_LEVEL, firstProblem);
label,
LogSanitizer.sanitize(String.valueOf(file)),
LANGUAGE_LEVEL,
LogSanitizer.sanitize(firstProblem));
return null;
} catch (Exception e) {
// Preserve the pre-v1.9.20 behaviour of degrading the
// exception path to DEBUG (I/O races, file-deleted-under-
// JGit, etc. are noisy on CI) while still surfacing the
// much-more-common isSuccessful==false branch above.
log.debug("Affected Tests: [{}] error parsing {}: {}", label, file, e.getMessage());
// Sanitise the same way as the WARN branch so that if an
// operator bumps the level to DEBUG to investigate, the
// log cannot be retroactively forged.
log.debug("Affected Tests: [{}] error parsing {}: {}",
label,
LogSanitizer.sanitize(String.valueOf(file)),
LogSanitizer.sanitize(e.getMessage()));
return null;
}
}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package io.affectedtests.core.discovery;

import io.affectedtests.core.config.AffectedTestsConfig;
import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Expand Down Expand Up @@ -60,7 +61,12 @@ private Set<String> matchTests(Set<String> changedProductionClasses, Set<String>
for (var entry : expectedTestNames.entrySet()) {
if (entry.getValue().contains(testSimpleName)) {
discoveredTests.add(testFqn);
log.debug("Naming match: {} → {}", entry.getKey(), testFqn);
// Both keys (changed production FQNs from the diff)
// and test FQNs (from the scanned source tree) are
// attacker-influenced on a merge-gate run.
log.debug("Naming match: {} → {}",
LogSanitizer.sanitize(entry.getKey()),
LogSanitizer.sanitize(testFqn));
}
}
}
Expand Down
Loading