diff --git a/CHANGELOG.md b/CHANGELOG.md
index edc0f39..18ed31e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,111 @@ adheres to [Semantic Versioning](https://semver.org/).
## [Unreleased]
+### Fixed — tier-1 reliability / safety batch (v1.9.21)
+
+The v1.9.20 batch closed every tier-1 *correctness* bug surfaced by the
+multi-reviewer sweep. v1.9.21 closes the tier-1 *reliability and
+safety* cluster from the same review — the class of issues that don't
+silently drop tests but do weaken the merge-gate contract and the
+hardening work shipped in v1.9.19.
+
+- **`SourceFileScanner` now rejects file-level symlinks during the
+ walk.** v1.9.19 hardened the *directory* path via
+ `stayInsideProjectRoot` + `toRealPath`, which closed
+ `src/main/java -> /` as an MR-planted symlink. The per-file leg of
+ the same attack — an attacker committing
+ `src/main/java/com/x/Leak.java -> /etc/passwd` — slid past that
+ guard because `Files.walkFileTree` still called `visitFile` on the
+ symlink and the visitor didn't check `attrs.isSymbolicLink()`. The
+ two file-walk visitors in the scanner (`collectJavaFiles`,
+ `walkFqnsUnder`) now skip any entry flagged as a symlink
+ unconditionally; a regular `.java` file whose real path
+ legitimately lives inside the project tree is just a regular file,
+ not a symlink. Regression lives in `SourceFileScannerTest`
+ (`rejectsFileLevelSymlinkEscapingProjectRoot`,
+ `scansTestFqnsIgnoresFileLevelSymlinks`) — both fail when the
+ production guard is reverted, covering both the
+ `collectTestFiles`/`collectSourceFiles` path and the
+ `scanTestFqns` path so the two halves of discovery can never
+ diverge on what counts as a valid source file.
+- **Per-subtree `IOException`s no longer truncate the whole walk.**
+ The default `SimpleFileVisitor.visitFileFailed` rethrows, and the
+ scanner's outer `IOException` catch swallowed the remaining walk
+ silently — a single unreadable nested module (permission denied,
+ file-deleted-under-us race, filesystem loop) took every readable
+ sibling down with it. All three walk sites (`collectJavaFiles`,
+ `walkFqnsUnder`, `findAllMatchingDirs`) now override
+ `visitFileFailed` to log the unreadable entry at WARN via
+ `LogSanitizer` and continue. Default-visible surfacing of the
+ skipped entry is the load-bearing change: operators can now see
+ when a scan ran incomplete, where previously they'd silently
+ under-select tests and have no way to correlate it with CI config
+ drift. No regression test — inducing a portable per-entry
+ `IOException` requires either root-evadable `chmod 000` semantics
+ (fails on CI runners running as root) or heavyweight filesystem
+ mocking; the fix is evident in the diff and backed by the 11
+ existing walk tests in `SourceFileScannerTest`.
+- **`AffectedTestsConfig.Builder#isAcceptableBaseRef` now rejects
+ control characters at the entry gate.** The rev-expression pattern
+ used `@\{[^}]+\}` for the reflog segment — `[^}]` happily matched
+ newline, ESC, CSI, and DEL. A `baseRef` sourced from an
+ attacker-controlled CI env var like
+ `master@{1\n\u001b[2JAffected Tests: SELECTED\u001b[m}` previously
+ passed validation and then flowed unsanitised into
+ `log.info("Base ref: {}", …)` in `AffectedTestsEngine` and into
+ three `IllegalStateException` messages in `GitChangeDetector`. An
+ attacker who could poison `CI_BASE_REF` (or an equivalent) could
+ forge plugin-branded status lines into CI output and defeat
+ grep-based merge gates. The fix is a single blanket
+ `containsControlChars` check before any of the downstream matchers
+ run — C0 (`0x00..0x1F`), DEL (`0x7F`), and C1 (`0x80..0x9F`) are
+ all rejected, closing every regex path at once without depending
+ on getting each hand-crafted character class right. Regression
+ lives in `AffectedTestsConfigTest.baseRefRejectsControlCharsInReflogSuffix`,
+ which covers the newline-in-reflog shape, bare ESC/DEL, and CRLF
+ anywhere in the input; each case fails when the entry-gate check
+ is removed.
+- **`LogSanitizer` coverage extended to every remaining diff-sourced
+ log site.** v1.9.19 introduced the sanitizer and applied it at
+ four sites; the multi-reviewer review found ~28 additional sites
+ passing raw filenames, FQNs, or exception messages straight to the
+ logger — most at DEBUG, but several at default-visible WARN (and
+ two inside `IllegalStateException` messages that Gradle renders
+ verbatim into the build log). An attacker-crafted filename or a
+ poisoned `JavaParser` diagnostic carrying `\u001b[2J…\u001b[m`
+ could still forge ops-visible log lines. All remaining sites now
+ route through `LogSanitizer.sanitize`:
+ `JavaParsers.parseOrWarn` (WARN + DEBUG),
+ `ImplementationStrategy.supertypesOf` (WARN) and its per-depth
+ match trace (DEBUG), `PathToClassMapper`'s eight classification
+ DEBUG lines, `UsageStrategy`'s seven tier-match DEBUG lines,
+ `NamingConventionStrategy`'s match DEBUG line,
+ `GitChangeDetector`'s per-file changed-entry DEBUG line and the
+ `IOException` / `GitAPIException` wrappers it throws (because
+ JGit exception messages can carry attacker-influenced pack-file
+ paths and ref segments), `AffectedTestsEngine`'s FQN-has-no-file
+ DEBUG line, `SourceFileScanner.findAllMatchingDirs`'s trailing
+ DEBUG summary (brought into parity with its WARN path), and the
+ echo of the offending value inside the `IllegalArgumentException`
+ thrown by `AffectedTestsConfig.Builder#baseRef` when validation
+ fails (otherwise the reject branch would reopen the same surface
+ `containsControlChars` closes on the accept branch). Even DEBUG
+ sites are sanitised because operators who bump the log level to
+ chase a false-positive selection are exactly the audience an
+ attacker would target — the log-forgery contract has to hold at
+ every visibility level, not just WARN/INFO. Regressions:
+ `LogSanitizerTest` covers the escape table;
+ `baseRefValidationErrorDoesNotEchoRawControlChars` pins the
+ newly-sanitised exception-message path.
+
+Two related findings from the same review are intentionally deferred
+to batch 6 (v1.9.22) so each PR stays scoped: #9 (`ProjectIndex`
+parse failures signal discovery-incomplete to the engine so the
+safety net can escalate explicitly) and #11 (configurable timeout on
+the nested `./gradlew` invocation so a hung child test cannot pin
+the outer build indefinitely). Both are observable-behaviour changes
+and deserve their own CHANGELOG entries.
+
### Fixed — post-v1.9.19 multi-reviewer correctness pass (v1.9.20)
A second, deeper multi-agent review of the post-v1.9.19 tree surfaced a
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java b/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java
index 8b212b4..3772a22 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/AffectedTestsEngine.java
@@ -311,7 +311,12 @@ public AffectedTestsResult run() {
allTestsToRun.add(fqn);
fqnToPath.put(fqn, file);
} else {
- log.debug("Skipping FQN with no matching test file on disk: {}", fqn);
+ // The FQN reached this log site via the discovery
+ // strategies, which derive names from diff paths and
+ // from the scanned source tree — both attacker-
+ // influenced on a merge-gate run.
+ log.debug("Skipping FQN with no matching test file on disk: {}",
+ LogSanitizer.sanitize(fqn));
}
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java b/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java
index 887a3f7..a18c242 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/config/AffectedTestsConfig.java
@@ -1,5 +1,7 @@
package io.affectedtests.core.config;
+import io.affectedtests.core.util.LogSanitizer;
+
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
@@ -526,9 +528,16 @@ public Builder baseRef(String baseRef) {
throw new IllegalArgumentException("baseRef must not be null or blank");
}
if (!isAcceptableBaseRef(baseRef)) {
+ // The rejected value is echoed back in the exception message, which
+ // Gradle renders verbatim into the build log (and often into build-
+ // scan HTML). Sanitising here closes the same log-forgery surface
+ // that containsControlChars closes on the accept path — without it,
+ // an attacker-poisoned CI_BASE_REF would still get its forged
+ // status line printed, just via the *reject* branch instead of the
+ // accept branch. See the javadoc on isAcceptableBaseRef.
throw new IllegalArgumentException(
"baseRef is not a valid git ref, SHA, or short form: '"
- + baseRef + "' — expected something like "
+ + LogSanitizer.sanitize(baseRef) + "' — expected something like "
+ "'origin/master', 'HEAD~1', or a 7-40 char hex SHA");
}
this.baseRef = baseRef;
@@ -549,13 +558,35 @@ public Builder baseRef(String baseRef) {
* {@link org.eclipse.jgit.lib.Repository#isValidRefName(String)}),
* short/long SHAs, and {@code HEAD~N}/{@code ^N}/{@code @{N}}
* rev-expressions. Rejects path-traversal shapes by construction
- * since JGit's refname validator already forbids {@code ..}, a
- * leading {@code /}, and control characters — no hand-rolled
- * string sniffing required. Keeping this decision inside the
- * builder means every call site (CLI, Gradle plugin, programmatic
- * users) gets the same contract without duplicating the rule.
+ * since JGit's refname validator already forbids {@code ..} and
+ * a leading {@code /}. The {@link #SHORT_SHA} short-circuit at
+ * the top bypasses {@code isValidRefName}, but is safe by
+ * construction: the hex-only character class excludes {@code .}
+ * and {@code /}, so path-traversal shapes cannot reach the SHA
+ * path.
+ *
+ *
Control characters ({@code \n}, {@code \r}, ESC, CSI, DEL,
+ * the whole C0 and C1 ranges) are rejected at the entry gate
+ * before any of the downstream matchers run. This is load-
+ * bearing for the {@link #REV_EXPR} path: its
+ * {@code @\{[^}]+\}} segment matches newline, ESC, and CSI
+ * verbatim, so a {@code baseRef} sourced from an attacker-
+ * controlled CI environment variable like
+ * {@code master@\{1\n\u001b[2JAffected Tests: SELECTED\u001b[m\}}
+ * previously passed validation and then flowed straight into
+ * {@code log.info("Base ref: {}", …)} in
+ * {@code AffectedTestsEngine} and into three
+ * {@code IllegalStateException} messages in
+ * {@code GitChangeDetector} — a log-forgery surface that let
+ * an attacker fabricate plugin-branded status lines in CI
+ * output. Rejecting at the gate closes every regex path at
+ * once without depending on getting each hand-crafted
+ * character class right.
*/
private static boolean isAcceptableBaseRef(String baseRef) {
+ if (containsControlChars(baseRef)) {
+ return false;
+ }
if (SHORT_SHA.matcher(baseRef).matches()) {
return true;
}
@@ -581,6 +612,27 @@ private static boolean isAcceptableBaseRef(String baseRef) {
}
return false;
}
+
+ /**
+ * Returns {@code true} if {@code value} contains any C0 control
+ * character ({@code 0x00..0x1F}), DEL ({@code 0x7F}), or C1
+ * control character ({@code 0x80..0x9F}). Kept in the builder
+ * rather than reaching for {@link io.affectedtests.core.util.LogSanitizer}
+ * because this is a validation gate, not a logging concern —
+ * the string must be rejected here before it ever reaches a
+ * logger, and keeping the check local makes the
+ * {@link #isAcceptableBaseRef(String)} contract auditable
+ * without cross-package hops.
+ */
+ private static boolean containsControlChars(String value) {
+ for (int i = 0; i < value.length(); i++) {
+ char c = value.charAt(i);
+ if (c < 0x20 || c == 0x7F || (c >= 0x80 && c <= 0x9F)) {
+ return true;
+ }
+ }
+ return false;
+ }
public Builder includeUncommitted(boolean v) { this.includeUncommitted = v; return this; }
public Builder includeStaged(boolean v) { this.includeStaged = v; return this; }
public Builder runAllIfNoMatches(boolean v) { this.runAllIfNoMatches = v; return this; }
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ImplementationStrategy.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ImplementationStrategy.java
index b562ba9..4e1d015 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ImplementationStrategy.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/ImplementationStrategy.java
@@ -8,6 +8,7 @@
import com.github.javaparser.ast.body.TypeDeclaration;
import com.github.javaparser.ast.type.ClassOrInterfaceType;
import io.affectedtests.core.config.AffectedTestsConfig;
+import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -196,9 +197,16 @@ private Set findImplementations(Set changedClasses,
for (ClassOrInterfaceType supertype : supertypesOf(decl)) {
if (targetsBySimpleName.containsKey(supertype.getNameAsString())) {
newImpls.add(implFqn);
+ // {@code implFqn} and {@code supertype.getNameAsString()}
+ // both originate from the scanned source tree,
+ // which on a merge-gate run is attacker-
+ // influenced. DEBUG-level, but sanitised for
+ // forgery-resistance when ops drop the level.
log.debug("[impl] depth {}: {} ({}) extends/implements {}",
- depth + 1, implFqn, decl.getClass().getSimpleName(),
- supertype.getNameAsString());
+ depth + 1,
+ LogSanitizer.sanitize(implFqn),
+ decl.getClass().getSimpleName(),
+ LogSanitizer.sanitize(supertype.getNameAsString()));
break;
}
}
@@ -289,9 +297,16 @@ private static List supertypesOf(TypeDeclaration> decl)
// known supertypes, which is the correct conservative
// fallback (cannot become a false impl match; at worst we
// underselect for a single declaration).
+ // {@code decl.getNameAsString()} originates from an
+ // attacker-controllable source file. This WARN surfaces at
+ // the default log level, so route through LogSanitizer to
+ // keep the log-forgery contract from v1.9.19 honest.
+ // {@code decl.getClass().getSimpleName()} is a JavaParser
+ // internal constant and does not need sanitising.
log.warn("Affected Tests: [impl] unknown TypeDeclaration subtype {} for {} — "
+ "supertype edges cannot be derived, skipping",
- decl.getClass().getSimpleName(), decl.getNameAsString());
+ decl.getClass().getSimpleName(),
+ LogSanitizer.sanitize(decl.getNameAsString()));
}
return result;
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/JavaParsers.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/JavaParsers.java
index 43ba06c..76f1abb 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/JavaParsers.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/JavaParsers.java
@@ -5,6 +5,7 @@
import com.github.javaparser.ParserConfiguration;
import com.github.javaparser.ParserConfiguration.LanguageLevel;
import com.github.javaparser.ast.CompilationUnit;
+import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -92,15 +93,32 @@ static CompilationUnit parseOrWarn(JavaParser parser, Path file, String label) {
String firstProblem = result.getProblems().isEmpty()
? "no diagnostics"
: result.getProblems().get(0).getMessage();
+ // Both {@code file} (attacker-committable filename from
+ // the scanned source tree) and {@code firstProblem} (parser
+ // diagnostic that can embed a source-code snippet sourced
+ // from an attacker-controlled file) flow into a default-
+ // visible WARN line. Sanitise both so an attacker who
+ // plants a file with a {@code \n} + fake authoritative
+ // status line in the name or content cannot forge CI log
+ // output. Label is an internal constant, not sanitised.
log.warn("Affected Tests: [{}] failed to parse {} at language level {}: {}",
- label, file, LANGUAGE_LEVEL, firstProblem);
+ label,
+ LogSanitizer.sanitize(String.valueOf(file)),
+ LANGUAGE_LEVEL,
+ LogSanitizer.sanitize(firstProblem));
return null;
} catch (Exception e) {
// Preserve the pre-v1.9.20 behaviour of degrading the
// exception path to DEBUG (I/O races, file-deleted-under-
// JGit, etc. are noisy on CI) while still surfacing the
// much-more-common isSuccessful==false branch above.
- log.debug("Affected Tests: [{}] error parsing {}: {}", label, file, e.getMessage());
+ // Sanitise the same way as the WARN branch so that if an
+ // operator bumps the level to DEBUG to investigate, the
+ // log cannot be retroactively forged.
+ log.debug("Affected Tests: [{}] error parsing {}: {}",
+ label,
+ LogSanitizer.sanitize(String.valueOf(file)),
+ LogSanitizer.sanitize(e.getMessage()));
return null;
}
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/NamingConventionStrategy.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/NamingConventionStrategy.java
index 80c6868..53b588b 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/NamingConventionStrategy.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/NamingConventionStrategy.java
@@ -1,6 +1,7 @@
package io.affectedtests.core.discovery;
import io.affectedtests.core.config.AffectedTestsConfig;
+import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -60,7 +61,12 @@ private Set matchTests(Set changedProductionClasses, Set
for (var entry : expectedTestNames.entrySet()) {
if (entry.getValue().contains(testSimpleName)) {
discoveredTests.add(testFqn);
- log.debug("Naming match: {} → {}", entry.getKey(), testFqn);
+ // Both keys (changed production FQNs from the diff)
+ // and test FQNs (from the scanned source tree) are
+ // attacker-influenced on a merge-gate run.
+ log.debug("Naming match: {} → {}",
+ LogSanitizer.sanitize(entry.getKey()),
+ LogSanitizer.sanitize(testFqn));
}
}
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/SourceFileScanner.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/SourceFileScanner.java
index f01e420..ccbabf6 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/SourceFileScanner.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/SourceFileScanner.java
@@ -1,5 +1,6 @@
package io.affectedtests.core.discovery;
+import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -53,20 +54,65 @@ public static List collectJavaFiles(Path dir) {
/**
* Recursively collects all {@code .java} files under {@code dir},
* appending them to the provided list.
+ *
+ * Symlinks, at either the directory or file level, are never
+ * followed or added to the result. The directory-level
+ * containment check in {@link #findAllMatchingDirs} already
+ * rejects {@code src/main/java -> /} planted by an attacker MR,
+ * but a more subtle attack — committing
+ * {@code src/main/java/com/x/Leak.java -> /etc/passwd} — slid
+ * past that guard and the file would then be parsed by
+ * JavaParser, with its contents potentially surfaced in
+ * {@code --explain} samples or in parse-failure WARN output. The
+ * in-visitor {@link BasicFileAttributes#isSymbolicLink()} check
+ * closes that hole. This is deliberately stricter than the
+ * directory-level containment check: we reject every
+ * file-level symlink unconditionally, even intra-repo ones whose
+ * real path stays inside the project tree. The rationale is
+ * cost/benefit — a per-file {@code toRealPath} call on every
+ * {@code .java} entry would add a syscall to the hot path for
+ * negligible benefit, and the usability cost of dropping
+ * intra-repo file symlinks is low (Git doesn't preserve hard
+ * links across clone either, so tooling that relies on
+ * canonical-reference aliasing already has to handle the "regular
+ * file" case). Projects that currently commit file-level symlinks
+ * for tooling reasons will see them silently ignored by
+ * discovery.
+ *
+ *
A per-subtree {@link IOException} (permission denied on one
+ * nested directory) is logged and swallowed so the scan
+ * continues; before v1.9.21 the default {@link SimpleFileVisitor}
+ * rethrew and the outer catch truncated the walk silently — any
+ * readable siblings of the unreadable subtree vanished from
+ * discovery without any signal to the operator.
*/
public static void collectJavaFiles(Path dir, List result) {
try {
Files.walkFileTree(dir, new SimpleFileVisitor<>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
+ if (attrs.isSymbolicLink()) {
+ return FileVisitResult.CONTINUE;
+ }
if (file.toString().endsWith(".java")) {
result.add(file);
}
return FileVisitResult.CONTINUE;
}
+
+ @Override
+ public FileVisitResult visitFileFailed(Path file, IOException exc) {
+ log.warn("Affected Tests: skipping unreadable entry {} under {}: {}",
+ LogSanitizer.sanitize(String.valueOf(file)),
+ LogSanitizer.sanitize(String.valueOf(dir)),
+ LogSanitizer.sanitize(exc.getMessage()));
+ return FileVisitResult.CONTINUE;
+ }
});
} catch (IOException e) {
- log.warn("Error collecting Java files from {}: {}", dir, e.getMessage());
+ log.warn("Affected Tests: error collecting Java files from {}: {}",
+ LogSanitizer.sanitize(String.valueOf(dir)),
+ LogSanitizer.sanitize(e.getMessage()));
}
}
@@ -154,6 +200,15 @@ private static void walkFqnsUnder(Path sourceRoot, Map accumulator
Files.walkFileTree(sourceRoot, new SimpleFileVisitor<>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
+ // Same symlink + containment contract as
+ // collectJavaFiles: file symlinks are never walked
+ // because the real path can escape the project
+ // root (DoS + data-exfiltration surface when the
+ // plugin runs on merge-gate CI against an
+ // attacker-influenced branch).
+ if (attrs.isSymbolicLink()) {
+ return FileVisitResult.CONTINUE;
+ }
if (file.toString().endsWith(".java")) {
Path relative = sourceRoot.relativize(file);
String fqn = relative.toString()
@@ -166,9 +221,20 @@ public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
}
return FileVisitResult.CONTINUE;
}
+
+ @Override
+ public FileVisitResult visitFileFailed(Path file, IOException exc) {
+ log.warn("Affected Tests: skipping unreadable entry {} under {}: {}",
+ LogSanitizer.sanitize(String.valueOf(file)),
+ LogSanitizer.sanitize(String.valueOf(sourceRoot)),
+ LogSanitizer.sanitize(exc.getMessage()));
+ return FileVisitResult.CONTINUE;
+ }
});
} catch (IOException e) {
- log.warn("Error scanning directory {}: {}", sourceRoot, e.getMessage());
+ log.warn("Affected Tests: error scanning directory {}: {}",
+ LogSanitizer.sanitize(String.valueOf(sourceRoot)),
+ LogSanitizer.sanitize(e.getMessage()));
}
}
@@ -290,12 +356,38 @@ public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
return FileVisitResult.CONTINUE;
}
+
+ @Override
+ public FileVisitResult visitFileFailed(Path file, IOException exc) {
+ // Per-subtree permission / race errors must not
+ // truncate the entire module discovery walk. Pre-
+ // v1.9.21 the default SimpleFileVisitor rethrew
+ // here and the outer catch swallowed the whole
+ // remaining walk — any modules living next to an
+ // unreadable subtree were silently invisible.
+ log.warn("Affected Tests: skipping unreadable entry {} under {}: {}",
+ LogSanitizer.sanitize(String.valueOf(file)),
+ LogSanitizer.sanitize(String.valueOf(projectDir)),
+ LogSanitizer.sanitize(exc.getMessage()));
+ return FileVisitResult.CONTINUE;
+ }
});
} catch (IOException e) {
- log.warn("Error searching for '{}' under {}: {}", relativeDir, projectDir, e.getMessage());
+ log.warn("Affected Tests: error searching for '{}' under {}: {}",
+ LogSanitizer.sanitize(relativeDir),
+ LogSanitizer.sanitize(String.valueOf(projectDir)),
+ LogSanitizer.sanitize(e.getMessage()));
}
- log.debug("Found {} directories matching '{}' under {}", matches.size(), relativeDir, projectDir);
+ // relativeDir and projectDir are Gradle-config-sourced and trusted by
+ // our threat model, but sanitising at DEBUG too keeps the log-forgery
+ // contract uniform across every visibility level and matches the WARN
+ // path a few lines up. One less thing for a future contributor to get
+ // subtly wrong.
+ log.debug("Found {} directories matching '{}' under {}",
+ matches.size(),
+ LogSanitizer.sanitize(relativeDir),
+ LogSanitizer.sanitize(String.valueOf(projectDir)));
return matches;
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/UsageStrategy.java b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/UsageStrategy.java
index ca0e36f..8829f4e 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/discovery/UsageStrategy.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/discovery/UsageStrategy.java
@@ -10,6 +10,7 @@
import com.github.javaparser.ast.expr.ObjectCreationExpr;
import com.github.javaparser.ast.type.ClassOrInterfaceType;
import io.affectedtests.core.config.AffectedTestsConfig;
+import io.affectedtests.core.util.LogSanitizer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -92,7 +93,7 @@ private Set scanTestFiles(Set changedProductionClasses,
if (testReferencesChangedClass(cu, changedFqns, simpleNames, simpleNameToFqns)) {
discoveredTests.add(testFqn);
- log.debug("Usage match: {}", testFqn);
+ log.debug("Usage match: {}", LogSanitizer.sanitize(testFqn));
}
}
@@ -150,15 +151,26 @@ private boolean testReferencesChangedClass(CompilationUnit cu,
// `c.d.Outer` (PathToClassMapper is file-based, so it only
// surfaces the outer FQN for the nested class's change). Without
// this, a test that only uses the inner class is silently missed.
+ // All {@code changedFqn} and {@code imported} values in the
+ // Tier 1 / 1b / 2 / 3 blocks below are diff-derived and may
+ // legitimately carry odd but-valid characters that still need
+ // control-char sanitisation before they hit the logger — a
+ // malicious MR can craft an import line like
+ // {@code import com.evil.\u001b[m;}. Sanitisation is applied
+ // even at DEBUG because operators bumping level to chase a
+ // false-positive selection is exactly when forgery-resistance
+ // matters most.
for (String changedFqn : changedFqns) {
if (importedFqns.contains(changedFqn)) {
- log.debug(" Direct import match: {}", changedFqn);
+ log.debug(" Direct import match: {}", LogSanitizer.sanitize(changedFqn));
return true;
}
String innerPrefix = changedFqn + ".";
for (String imported : importedFqns) {
if (imported.startsWith(innerPrefix)) {
- log.debug(" Inner-class import match: {} <- {}", changedFqn, imported);
+ log.debug(" Inner-class import match: {} <- {}",
+ LogSanitizer.sanitize(changedFqn),
+ LogSanitizer.sanitize(imported));
return true;
}
}
@@ -185,14 +197,16 @@ private boolean testReferencesChangedClass(CompilationUnit cu,
// failed, silently dropping the consumer's coverage.
for (String changedFqn : changedFqns) {
if (wildcardPackages.contains(changedFqn)) {
- log.debug(" Wildcard class-member import match: {}", changedFqn);
+ log.debug(" Wildcard class-member import match: {}",
+ LogSanitizer.sanitize(changedFqn));
return true;
}
String pkg = SourceFileScanner.packageOf(changedFqn);
if (wildcardPackages.contains(pkg)) {
String simpleName = SourceFileScanner.simpleClassName(changedFqn);
if (typeNameAppearsInAst(cu, simpleName)) {
- log.debug(" Wildcard package import + type ref match: {}", changedFqn);
+ log.debug(" Wildcard package import + type ref match: {}",
+ LogSanitizer.sanitize(changedFqn));
return true;
}
}
@@ -207,7 +221,8 @@ private boolean testReferencesChangedClass(CompilationUnit cu,
if (testPackage.equals(changedPkg)) {
String simpleName = SourceFileScanner.simpleClassName(changedFqn);
if (typeNameAppearsInAst(cu, simpleName)) {
- log.debug(" Same-package type ref match: {}", changedFqn);
+ log.debug(" Same-package type ref match: {}",
+ LogSanitizer.sanitize(changedFqn));
return true;
}
}
@@ -231,7 +246,9 @@ private boolean testReferencesChangedClass(CompilationUnit cu,
if (scoped == null || !scoped.contains(".")) continue;
for (String changedFqn : changedFqns) {
if (scoped.equals(changedFqn) || scoped.startsWith(changedFqn + ".")) {
- log.debug(" Inline fully-qualified reference: {} -> {}", scoped, changedFqn);
+ log.debug(" Inline fully-qualified reference: {} -> {}",
+ LogSanitizer.sanitize(scoped),
+ LogSanitizer.sanitize(changedFqn));
return true;
}
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/git/GitChangeDetector.java b/affected-tests-core/src/main/java/io/affectedtests/core/git/GitChangeDetector.java
index 13f7f76..ad0d2f9 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/git/GitChangeDetector.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/git/GitChangeDetector.java
@@ -1,6 +1,7 @@
package io.affectedtests.core.git;
import io.affectedtests.core.config.AffectedTestsConfig;
+import io.affectedtests.core.util.LogSanitizer;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;
import org.eclipse.jgit.diff.DiffEntry;
@@ -103,17 +104,26 @@ public Set detectChangedFiles() {
+ "actions/checkout (GitHub) / GIT_DEPTH: 0 (GitLab).",
e);
} catch (IOException e) {
+ // JGit IOException messages can embed pack-file paths, ref names,
+ // or (on PackInvalidException / CorruptObjectException) raw bytes
+ // from the object database — all attacker-influenceable on an MR
+ // branch. Sanitise before embedding so the exception message,
+ // which Gradle renders verbatim into the build log, cannot carry
+ // a forged status line.
throw new IllegalStateException(
"Affected Tests: I/O error while reading git metadata at '" + projectDir + "': "
- + e.getMessage(), e);
+ + LogSanitizer.sanitize(e.getMessage()), e);
} catch (GitAPIException e) {
throw new IllegalStateException(
"Affected Tests: git diff against '" + config.baseRef() + "' failed: "
- + e.getMessage(), e);
+ + LogSanitizer.sanitize(e.getMessage()), e);
}
log.info("Detected {} changed files", changedFiles.size());
- changedFiles.forEach(f -> log.debug(" Changed: {}", f));
+ // File paths come straight from git diff entries on an
+ // attacker-controllable MR branch, so sanitise even at DEBUG
+ // — same rationale as every other diff-sourced log site.
+ changedFiles.forEach(f -> log.debug(" Changed: {}", LogSanitizer.sanitize(f)));
return changedFiles;
}
diff --git a/affected-tests-core/src/main/java/io/affectedtests/core/mapping/PathToClassMapper.java b/affected-tests-core/src/main/java/io/affectedtests/core/mapping/PathToClassMapper.java
index 34b18d0..074256d 100644
--- a/affected-tests-core/src/main/java/io/affectedtests/core/mapping/PathToClassMapper.java
+++ b/affected-tests-core/src/main/java/io/affectedtests/core/mapping/PathToClassMapper.java
@@ -140,9 +140,15 @@ public MappingResult mapChangedFiles(Set changedFiles) {
// {@code ignorePaths} entry is a contract that nothing about
// the file should influence the engine, including nudging it
// into the out-of-scope bucket.
+ // All {@code filePath} values below originate from the git
+ // diff against an attacker-controllable MR branch, so every
+ // log site routes through {@link LogSanitizer}. Even
+ // DEBUG-level sites are worth sanitising because operators
+ // bumping to DEBUG to investigate incidents is exactly when
+ // log-forgery escalation is most likely to matter.
if (isIgnored(filePath)) {
ignoredFiles.add(filePath);
- log.debug("Ignored by pattern: {}", filePath);
+ log.debug("Ignored by pattern: {}", LogSanitizer.sanitize(filePath));
continue;
}
@@ -154,12 +160,12 @@ public MappingResult mapChangedFiles(Set changedFiles) {
if (matchesAny(filePath, outOfScopeSourceMatchers)
|| matchesAny(filePath, outOfScopeTestMatchers)) {
outOfScopeFiles.add(filePath);
- log.debug("Out-of-scope: {}", filePath);
+ log.debug("Out-of-scope: {}", LogSanitizer.sanitize(filePath));
continue;
}
if (!filePath.endsWith(".java")) {
- log.debug("Non-Java file flagged as unmapped: {}", filePath);
+ log.debug("Non-Java file flagged as unmapped: {}", LogSanitizer.sanitize(filePath));
unmappedChangedFiles.add(filePath);
continue;
}
@@ -178,7 +184,9 @@ public MappingResult mapChangedFiles(Set changedFiles) {
// safety net decide (FULL_SUITE in CI mode by default).
String fileName = extractFileName(filePath);
if ("module-info.java".equals(fileName) || "package-info.java".equals(fileName)) {
- log.debug("Java marker file ({}) flagged as unmapped: {}", fileName, filePath);
+ log.debug("Java marker file ({}) flagged as unmapped: {}",
+ LogSanitizer.sanitize(fileName),
+ LogSanitizer.sanitize(filePath));
unmappedChangedFiles.add(filePath);
continue;
}
@@ -187,7 +195,9 @@ public MappingResult mapChangedFiles(Set changedFiles) {
if (testFqn != null) {
testClasses.add(testFqn);
changedTestFiles.add(filePath);
- log.debug("Mapped test file: {} → {}", filePath, testFqn);
+ log.debug("Mapped test file: {} → {}",
+ LogSanitizer.sanitize(filePath),
+ LogSanitizer.sanitize(testFqn));
continue;
}
@@ -195,13 +205,16 @@ public MappingResult mapChangedFiles(Set changedFiles) {
if (prodFqn != null) {
productionClasses.add(prodFqn);
changedProductionFiles.add(filePath);
- log.debug("Mapped production file: {} → {}", filePath, prodFqn);
+ log.debug("Mapped production file: {} → {}",
+ LogSanitizer.sanitize(filePath),
+ LogSanitizer.sanitize(prodFqn));
continue;
}
// A .java file outside the configured source/test dirs — still
// unmappable, still a potential safety escalation trigger.
- log.debug("Java file outside configured source/test dirs flagged as unmapped: {}", filePath);
+ log.debug("Java file outside configured source/test dirs flagged as unmapped: {}",
+ LogSanitizer.sanitize(filePath));
unmappedChangedFiles.add(filePath);
}
@@ -288,7 +301,9 @@ private boolean isIgnored(String filePath) {
// the engine routes it through the unmapped bucket and the
// safety net picks up the oddity, rather than blowing up
// the whole task with an unhandled InvalidPathException.
- log.debug("Ignore check skipped for unparseable path {}: {}", filePath, e.getMessage());
+ log.debug("Ignore check skipped for unparseable path {}: {}",
+ LogSanitizer.sanitize(filePath),
+ LogSanitizer.sanitize(e.getMessage()));
return false;
}
for (PathMatcher matcher : ignoreMatchers) {
diff --git a/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java b/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java
index b262488..631e729 100644
--- a/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java
+++ b/affected-tests-core/src/test/java/io/affectedtests/core/config/AffectedTestsConfigTest.java
@@ -242,6 +242,86 @@ void baseRefRejectsPathTraversal() {
AffectedTestsConfig.builder().baseRef("refs/../../../secrets").build());
}
+ @Test
+ void baseRefRejectsControlCharsInReflogSuffix() {
+ // Regression for v1.9.21 #14: the REV_EXPR pattern's reflog
+ // segment is {@code @\{[^}]+\}}, which happily matches
+ // newline / ESC / CSI / DEL. Pre-fix these inputs passed
+ // {@link AffectedTestsConfig.Builder#isAcceptableBaseRef} and
+ // then flowed unsanitised into
+ // log.info("Base ref: {}", config.baseRef())
+ // inside AffectedTestsEngine and into three IllegalStateException
+ // messages in GitChangeDetector — a log-forgery surface when
+ // baseRef came from an attacker-controlled CI env var.
+ //
+ // The fix is a blanket control-char rejection at the entry gate
+ // of isAcceptableBaseRef, so every downstream matcher (short
+ // SHA, refname, rev-expression) refuses tainted input.
+ String fakePluginStatus = "master@{1\n"
+ + "\u001b[2JAffected Tests: SELECTED (DISCOVERY_SUCCESS) — 0 tests\u001b[m}";
+ assertThrows(IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef(fakePluginStatus).build(),
+ "baseRef containing newline + ANSI inside @{…} must be rejected at "
+ + "validation time; otherwise an attacker could forge plugin-branded "
+ + "CI status lines by poisoning CI_BASE_REF");
+ assertThrows(IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef("master@{1\n}").build(),
+ "Bare newline in reflog segment must also be rejected");
+ assertThrows(IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef("master@{\u001b}").build(),
+ "Bare ESC in reflog segment must be rejected");
+ assertThrows(IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef("master@{\u007F}").build(),
+ "Bare DEL in reflog segment must be rejected");
+ assertThrows(IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef("master\r\noops").build(),
+ "CRLF anywhere in the baseRef must be rejected pre-regex");
+ }
+
+ @Test
+ void baseRefValidationErrorDoesNotEchoRawControlChars() {
+ // Regression for v1.9.21 review finding T2-1: the reject branch of
+ // isAcceptableBaseRef throws IllegalArgumentException whose message
+ // embeds the offending baseRef. Gradle renders that message verbatim
+ // into the build log (and into build-scan HTML), so echoing raw
+ // newline/ESC/CSI/DEL would reintroduce the exact log-forgery
+ // surface that #14 closed on the accept path — just via the
+ // validation-failure path instead of log.info("Base ref: …").
+ //
+ // LogSanitizer must wrap the echoed value so the thrown message
+ // contains no C0, C1, or DEL characters.
+ String tainted = "master@{1\n\u001b[2JAffected Tests: SELECTED\u001b[m}";
+ IllegalArgumentException thrown = assertThrows(
+ IllegalArgumentException.class,
+ () -> AffectedTestsConfig.builder().baseRef(tainted).build());
+ String message = thrown.getMessage();
+ assertNotNull(message, "exception must carry a diagnostic message");
+ // Each forbidden byte class gets its own assertion so a regression
+ // that partially re-opens the surface (say, closes newline but leaks
+ // ESC) still fails loudly with a pointed message.
+ assertFalse(message.indexOf('\n') >= 0,
+ "baseRef validation message must not echo raw newline: " + debugEscape(message));
+ assertFalse(message.indexOf('\r') >= 0,
+ "baseRef validation message must not echo raw CR: " + debugEscape(message));
+ assertFalse(message.indexOf('\u001b') >= 0,
+ "baseRef validation message must not echo raw ESC: " + debugEscape(message));
+ assertFalse(message.indexOf('\u007F') >= 0,
+ "baseRef validation message must not echo raw DEL: " + debugEscape(message));
+ }
+
+ private static String debugEscape(String s) {
+ StringBuilder out = new StringBuilder(s.length() + 8);
+ for (int i = 0; i < s.length(); i++) {
+ char c = s.charAt(i);
+ if (c < 0x20 || c == 0x7F || (c >= 0x80 && c <= 0x9F)) {
+ out.append(String.format("\\u%04X", (int) c));
+ } else {
+ out.append(c);
+ }
+ }
+ return out.toString();
+ }
+
@Test
void baseRefAllowsValidRefs() {
assertDoesNotThrow(() -> AffectedTestsConfig.builder().baseRef("origin/master").build());
diff --git a/affected-tests-core/src/test/java/io/affectedtests/core/discovery/SourceFileScannerTest.java b/affected-tests-core/src/test/java/io/affectedtests/core/discovery/SourceFileScannerTest.java
index 64534b2..64d552f 100644
--- a/affected-tests-core/src/test/java/io/affectedtests/core/discovery/SourceFileScannerTest.java
+++ b/affected-tests-core/src/test/java/io/affectedtests/core/discovery/SourceFileScannerTest.java
@@ -163,6 +163,71 @@ void acceptsSymlinkThatStaysInsideProjectRoot() throws IOException {
"Symlink resolving to a path under the project root must still be accepted");
}
+ @Test
+ void rejectsFileLevelSymlinkEscapingProjectRoot(@TempDir Path outside) throws IOException {
+ // Threat model (v1.9.19 closed the directory-level leg of this
+ // attack; v1.9.21 closes the per-file leg): even when
+ // `src/main/java` is a normal directory inside the project,
+ // an attacker MR can still commit
+ // `src/main/java/com/x/Leak.java -> /etc/passwd`
+ // and the pre-fix scanner would happily enumerate the file,
+ // hand it to JavaParser, and surface the filename in the
+ // discovery pipeline. `visitFile` now rejects symlinks
+ // unconditionally — a regular {@code .java} file whose real
+ // path legitimately lives inside the tree is just a regular
+ // file, not a symlink.
+ Path pkg = tempDir.resolve("src/main/java/com/x");
+ Files.createDirectories(pkg);
+ Files.writeString(pkg.resolve("Ok.java"), "package com.x;\nclass Ok {}");
+
+ Path attackTarget = outside.resolve("Leak.java");
+ Files.writeString(attackTarget, "package evil;\nclass Leak {}");
+
+ Path fileLink = pkg.resolve("Leak.java");
+ try {
+ Files.createSymbolicLink(fileLink, attackTarget);
+ } catch (UnsupportedOperationException | FileSystemException e) {
+ return;
+ }
+
+ List collected = SourceFileScanner.collectJavaFiles(
+ tempDir.resolve("src/main/java"));
+
+ assertEquals(1, collected.size(),
+ "Only the real Ok.java should be collected; the symlinked Leak.java "
+ + "must be skipped even if its real path is outside the project root");
+ assertTrue(collected.get(0).getFileName().toString().equals("Ok.java"));
+ }
+
+ @Test
+ void scansTestFqnsIgnoresFileLevelSymlinks(@TempDir Path outside) throws IOException {
+ // Same hardening contract as above, but for the path that
+ // feeds {@link ProjectIndex#testFqns()} — the two visitors
+ // must agree or an attacker could plant a symlinked file
+ // that is invisible to `collectTestFiles` but visible to
+ // `scanTestFqns` (or vice versa), producing inconsistent
+ // bucket counts that are hard to debug.
+ Path pkg = tempDir.resolve("src/test/java/com/x");
+ Files.createDirectories(pkg);
+ Files.writeString(pkg.resolve("OkTest.java"), "package com.x;\nclass OkTest {}");
+
+ Path attackTarget = outside.resolve("LeakTest.java");
+ Files.writeString(attackTarget, "package evil;\nclass LeakTest {}");
+ Path fileLink = pkg.resolve("LeakTest.java");
+ try {
+ Files.createSymbolicLink(fileLink, attackTarget);
+ } catch (UnsupportedOperationException | FileSystemException e) {
+ return;
+ }
+
+ var fqns = SourceFileScanner.scanTestFqns(tempDir, List.of("src/test/java"));
+
+ assertTrue(fqns.contains("com.x.OkTest"));
+ assertFalse(fqns.contains("com.x.LeakTest"),
+ "Symlinked file must not contribute to test FQN discovery");
+ assertEquals(1, fqns.size());
+ }
+
@Test
void scansTestFqnsFromDeeplyNestedModules() throws IOException {
Path depth1 = tempDir.resolve("api/src/test/java/com/example");