Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1628 +/- ##
=======================================
+ Coverage 90.4% 90.6% +0.1%
=======================================
Files 441 445 +4
Lines 38314 39495 +1181
Branches 2347 2407 +60
=======================================
+ Hits 34674 35815 +1141
- Misses 3159 3183 +24
- Partials 481 497 +16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR introduces MavenWithFallbackDetector, an experimental Maven detector that provides resilient dependency detection by combining Maven CLI execution with static pom.xml parsing fallback. The detector automatically falls back to static parsing when Maven CLI fails (e.g., due to authentication errors or missing CLI), ensuring components are still detected even in challenging build environments.
Changes:
- Added
MavenWithFallbackDetectorwith dual detection strategy (Maven CLI primary, static parsing fallback) and comprehensive telemetry - Registered new experiment configuration comparing the new detector against the standard
MvnCliComponentDetector - Added 13 comprehensive unit tests covering CLI scenarios, static parser edge cases, environment variable control, and nested pom.xml handling
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs |
Main detector implementation with Maven CLI detection, static XML parsing fallback, nested pom filtering, and telemetry tracking |
src/Microsoft.ComponentDetection.Orchestrator/Experiments/Configs/MavenWithFallbackExperiment.cs |
Experiment configuration to compare new detector against existing MvnCliComponentDetector |
src/Microsoft.ComponentDetection.Orchestrator/Extensions/ServiceCollectionExtensions.cs |
Registered new detector and experiment in DI container |
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs |
Comprehensive test suite with 13 tests covering all detection scenarios and edge cases |
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
|
👋 Hi! It looks like you modified some files in the
If none of the above scenarios apply, feel free to ignore this comment 🙂 |
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
...Microsoft.ComponentDetection.Orchestrator/Experiments/Configs/MavenWithFallbackExperiment.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
| public bool IsInExperimentGroup(IComponentDetector componentDetector) => componentDetector is MavenWithFallbackDetector; | ||
|
|
||
| /// <inheritdoc /> | ||
| public bool ShouldRecord(IComponentDetector componentDetector, int numComponents) => true; |
There was a problem hiding this comment.
only record if either MavenWithFallbackDetector or MvnCliComponentDetector have at least one component, otherwise you will be just flushing experiments without any meaningful data.
There was a problem hiding this comment.
Added numComponents > 0
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
| "Could not authenticate", | ||
| "Access denied", | ||
| "Forbidden", | ||
| "status code: 401", |
There was a problem hiding this comment.
Aren't this already covered by line 106 and 107?
There was a problem hiding this comment.
Trimmed it to
private static readonly string[] AuthErrorPatterns =
[
"401",
"403",
"Unauthorized",
"Access denied",
];
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
…tps://github.com/microsoft/component-detection into users/zhentan/maven-combined-detector-experiment
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs
Show resolved
Hide resolved
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs
Show resolved
Hide resolved
| public bool IsInExperimentGroup(IComponentDetector componentDetector) => componentDetector is MavenWithFallbackDetector; | ||
|
|
||
| /// <inheritdoc /> | ||
| public bool ShouldRecord(IComponentDetector componentDetector, int numComponents) => numComponents > 0; |
There was a problem hiding this comment.
See other experiments, just looking at components will cause to skip the whole experiment altogether, you should always return true, unless the new detector does not return any dependencies.
There was a problem hiding this comment.
updated with
public bool ShouldRecord(IComponentDetector componentDetector, int numComponents)
{
// Only record telemetry if a Maven detector found components,
// indicating Maven projects were detected and scanned.
if (componentDetector is MvnCliComponentDetector or MavenWithFallbackDetector)
{
return numComponents > 0;
}
return true;
}
Looking at the other experiments and the one I wrote for nuget, this should be correct.
| var mvnCliResults = this.ComponentStreamEnumerableFactory | ||
| .GetComponentStreams( | ||
| this.CurrentScanRequest.SourceDirectory, | ||
| [BcdeMvnDepsFileName], | ||
| this.CurrentScanRequest.DirectoryExclusionPredicate) | ||
| .Select(componentStream => | ||
| { | ||
| var depsDir = Path.GetDirectoryName(componentStream.Location); | ||
| successfulDirectories.Add(depsDir); | ||
|
|
||
| using var reader = new StreamReader(componentStream.Stream); | ||
| var content = reader.ReadToEnd(); | ||
| return new ProcessRequest | ||
| { | ||
| ComponentStream = new ComponentStream | ||
| { | ||
| Stream = new MemoryStream(Encoding.UTF8.GetBytes(content)), | ||
| Location = componentStream.Location, | ||
| Pattern = BcdeMvnDepsFileName, | ||
| }, | ||
| SingleFileComponentRecorder = this.ComponentRecorder.CreateSingleFileComponentRecorder( | ||
| Path.Combine(depsDir, MavenManifest)), | ||
| }; | ||
| }) | ||
| .ToList(); |
There was a problem hiding this comment.
Is this the preexisting behavior of MvnCLI? Seem risky to be enumerating streams within the same file.
| // Check if we should skip Maven CLI and use static parsing only | ||
| if (this.ShouldSkipMavenCli()) | ||
| { | ||
| return processRequests; | ||
| } | ||
|
|
||
| // Check if Maven CLI is available | ||
| if (!await this.TryInitializeMavenCliAsync()) | ||
| { | ||
| return processRequests; | ||
| } | ||
|
|
||
| // Run Maven CLI detection on all pom.xml files | ||
| var pomFileDirectories = await this.RunMavenCliDetectionAsync(processRequests, cancellationToken); | ||
|
|
||
| // Collect results and determine which files succeeded/failed | ||
| var (mvnCliResults, failedPomFiles) = this.CollectMvnCliResults(pomFileDirectories); | ||
|
|
||
| // Determine final detection method and return appropriate results | ||
| return this.DetermineDetectionOutcome(mvnCliResults, failedPomFiles, pomFileDirectories.Count); |
There was a problem hiding this comment.
Confirm this method is protected by the experimental coverage. IIRC it does not have the same guardrails as "OnFileFound", hence wrap all of this with a try catch and chain an additional cancellation token so the execution stops within 4-6 minutes. Assuming most repos complete the full scan within this window. That way we can see execution hit limits in our telemetry but no unexpected hangs for existing users.
| } | ||
|
|
||
| // Partial success - use MvnCli results for successful ones, static parsing for failed ones | ||
| this.LogWarning($"Maven CLI failed for {failedCount} pom.xml files. Using MvnCli results for successful files and falling back to static parsing for failed files."); |
There was a problem hiding this comment.
I haven't manually validated these changes, but have we validated that if a root pom.xml file succeeded with mvn cli, we are excluding child ones? It will be unnecessary work if run CLI or parse directly pom.xml on child pom.xml when the root one covered them. Although the manual parsing is faster, the slow one that significantly impacts perf is MvnCLI.
| return mvnCliResults.Concat(staticParsingResults).ToObservable(); | ||
| } | ||
|
|
||
| protected override async Task OnFileFoundAsync( |
There was a problem hiding this comment.
nit: I don't think there is the need to add the async/await here, we only need to have a Task output.
MavenWithFallbackDetector
Overview
The
MavenWithFallbackDetectoris an experimental Maven detector that combines Maven CLI detection with static pom.xml parsing fallback. It provides resilient Maven dependency detection even when the Maven CLI fails (e.g., due to authentication errors with private repositories).Key Features
mvn dependency:tree) for full transitive dependency resolutionDetection Flow
Core Components
1. Environment Variable Control
CD_MAVEN_DISABLE_CLI:
trueto bypass Maven CLI entirely and use only static parsingUseful when:
2. Separate Maven Local Repository
To avoid race conditions with
MvnCliComponentDetector(which also uses Maven CLI), this detector uses a temporary, isolated Maven local repository:Why? When both detectors run in parallel:
pom.xmlfiles simultaneously~/.m2/repositorycause file locking conflictsLifecycle:
maven-fallback-detector-<guid>)-Dmaven.repo.local=<tempdir>parameter3. Cancellation Token Propagation
Experimental detectors have a 4-minute timeout enforced by the orchestrator. The
MavenWithFallbackDetectorproperly propagates this cancellation token through the entire call chain:Key points:
Process.Kill()MvnCLIFileLevelTimeoutSecondsenvironment variableCreateLinkedTokenSourcecombines both timeouts, so whichever triggers first will cancel the operation4. RemoveNestedPomXmls
Filters pom.xml files to keep only root-level ones for Maven CLI processing.
Why? In multi-module Maven projects:
Running
mvn dependency:treeon the root pom automatically processes all child modules. Processing each nested pom separately would be redundant and slow.Algorithm:
5. Maven CLI Execution
Uses
IMavenCommandService.GenerateDependenciesFileAsync()to run:This generates a
bcde-fallback.mvndepsfile containing the full dependency tree.6. Failure Detection
After Maven CLI runs, the detector scans for
bcde-fallback.mvndepsfiles:7. Static Parsing Fallback
When Maven CLI fails, static parsing extracts dependencies directly from pom.xml:
Limitations of static parsing:
[1.0,2.0))${project.version})8. Nested Pom Restoration
When Maven CLI fails (partial or complete), nested pom.xml files that were filtered out by
RemoveNestedPomXmlsare restored for static parsing.Why? Maven CLI on the root pom would have processed nested modules. Since it failed, we need to parse each nested pom.xml individually.
Implementation:
GetAllPomFilesForStaticParsing()- Re-scans ALL pom.xml files (used for complete failure)GetStaticParsingRequestsForFailedFiles()- Re-scans pom.xml files only under failed directories (used for partial failure)9. Authentication Error Analysis
When Maven CLI fails, the detector analyzes error patterns:
If authentication errors are detected:
fallbackReason = MavenFallbackReason.AuthenticationFailureDetection Methods
MvnCliOnlyStaticParserOnlyMixedFallback Reasons
NoneMvnCliDisabledByUserCD_MAVEN_DISABLE_CLI=trueMavenCliNotAvailablemvncommand not found in PATHAuthenticationFailureOtherMvnCliFailureTelemetry
The detector records the following telemetry:
DetectionMethodMvnCliOnly,StaticParserOnly, orMixedFallbackReasonMvnCliComponentCountStaticParserComponentCountTotalComponentCountMavenCliAvailableOriginalPomFileCountFailedEndpointsVersion Property Resolution
Static parsing supports resolving version properties:
The resolver:
<properties>section in current documentUsage
As Part of Component Detection
The detector is registered as an experimental detector and runs automatically when Component Detection scans a directory containing
pom.xmlfiles.Parallel Execution with MvnCliComponentDetector
This detector can run in parallel with
MvnCliComponentDetectorwithout conflicts because:bcde-fallback.mvndepsvsbcde.mvndeps)This enables A/B testing and gradual migration without requiring environment variable switches.
Disabling Maven CLI (Static Parsing Only)
Set the environment variable to use only static parsing:
Experiment Configuration
The detector is paired with
MavenWithFallbackExperimentwhich compares results against the standardMvnCliComponentDetector:MvnCliComponentDetectorMavenWithFallbackDetectorEnable experiments via:
export CD_DETECTOR_EXPERIMENTS=trueUnit Tests
The following unit tests are implemented in
MavenWithFallbackDetectorTests.cs:Maven CLI Scenarios
WhenMavenCliNotAvailable_FallsBackToStaticParsing_AsyncWhenMavenCliNotAvailable_DetectsMultipleDependencies_AsyncWhenMavenCliSucceeds_UsesMvnCliResults_AsyncWhenMavenCliSucceeds_PreservesTransitiveDependencies_AsyncWhenMavenCliProducesNoOutput_FallsBackToStaticParsing_AsyncStatic Parser Edge Cases
StaticParser_IgnoresDependenciesWithoutVersion_AsyncStaticParser_IgnoresDependenciesWithVersionRanges_Async[1.0,2.0)) are skippedStaticParser_ResolvesPropertyVersions_Async${property}versions are resolved from<properties>StaticParser_IgnoresDependenciesWithUnresolvablePropertyVersions_AsyncEmpty/Missing File Scenarios
WhenNoPomXmlFiles_ReturnsSuccessWithNoComponents_AsyncWhenPomXmlHasNoDependencies_ReturnsSuccessWithNoComponents_AsyncEnvironment Variable Control
WhenDisableMvnCliTrue_UsesStaticParsing_AsyncCD_MAVEN_DISABLE_CLI=truebypasses Maven CLI entirely and uses static parsingWhenDisableMvnCliEnvVarIsFalse_UsesMvnCliNormally_AsyncCD_MAVEN_DISABLE_CLI=falseallows Maven CLI to run normallyWhenDisableMvnCliEnvVarNotSet_UsesMvnCliNormally_AsyncWhenDisableMvnCliEnvVarSetToInvalidValue_UsesMvnCliNormally_AsyncCancellation Token Propagation
The cancellation token from the orchestrator's 4-minute experimental detector timeout is propagated through the call chain:
OnPrepareDetectionAsyncreceivescancellationTokenparameter from the frameworkGenerateDependenciesFileAsyncreceives and passes the token to the CLI serviceMavenCommandServicelinks the token withCreateLinkedTokenSourcefor combined timeoutsCommandLineInvocationServiceregisterscancellationToken.Register(() => process.Kill())to kill MavenThis ensures that if the 4-minute timeout is reached, the Maven CLI process is immediately terminated.
Nested Pom.xml Handling
WhenMvnCliSucceeds_NestedPomXmlsAreFilteredOut_AsyncWhenMvnCliFailsCompletely_AllNestedPomXmlsAreRestoredForStaticParsing_AsyncWhenMvnCliPartiallyFails_NestedPomXmlsRestoredOnlyForFailedDirectories_AsyncError Analysis and Telemetry
WhenMvnCliFailsWithAuthError_LogsFailedEndpointAndSetsTelemetry_AsyncAuthenticationFailureWhenMvnCliFailsWithNonAuthError_SetsFallbackReasonToOther_AsyncOtherMvnCliFailurewithout extracting endpointsFile Structure