feat: PEP 691 registry client for PyPI trusted library recommendations #607
Reviewer's Guide
Introduces a PEP 691-based PyPI registry integration that enriches analysis reports with trusted-library recommendations for Python dependencies, wiring it into the existing Camel analysis flow, configuring it via properties, and covering it with unit and integration tests plus golden-file verification.
Sequence diagram for PEP 691 PyPI recommendation enrichment in analysis flow
sequenceDiagram
actor User
participant ExhortIntegration
participant CamelAnalyzeSbomRoute as analyzeSbom
participant Pep691Integration as enrichPypiRecommendations
participant CamelReportRoute as report
participant CamelPostProcessRoute as postProcess
participant Pep691LookupRoute as pep691Lookup
participant PEP691Registry as PyPIRegistry
User->>ExhortIntegration: POST analysis request
ExhortIntegration->>CamelAnalyzeSbomRoute: direct analyzeSbom
CamelAnalyzeSbomRoute-->>ExhortIntegration: AnalysisReport + DependencyTree
ExhortIntegration->>Pep691Integration: direct enrichPypiRecommendations
activate Pep691Integration
Pep691Integration->>Pep691Integration: enrichRecommendations(AnalysisReport, DependencyTree)
Pep691Integration->>Pep691LookupRoute: direct pep691Lookup (purl, sbomSha256)
activate Pep691LookupRoute
Pep691LookupRoute->>Pep691LookupRoute: processPep691Request
Pep691LookupRoute->>PEP691Registry: HTTP GET /{package}/ (Accept: application/vnd.pypi.simple.v1+json)
alt registry responds
PEP691Registry-->>Pep691LookupRoute: 200 Pep691Response JSON
else timeout or error
PEP691Registry--xPep691LookupRoute: failure/timeout
Pep691LookupRoute->>Pep691LookupRoute: circuit breaker fallback
Pep691LookupRoute-->>Pep691Integration: 504 with null body
end
Pep691LookupRoute-->>Pep691Integration: HTTP response
deactivate Pep691LookupRoute
Pep691Integration->>Pep691Integration: queryRegistryAndCompare
Pep691Integration->>Pep691Integration: update DependencyReport.recommendation and issues.remediation
Pep691Integration-->>ExhortIntegration: enriched AnalysisReport
deactivate Pep691Integration
ExhortIntegration->>CamelReportRoute: direct report
CamelReportRoute-->>ExhortIntegration: rendered report
ExhortIntegration->>CamelPostProcessRoute: direct postProcessAnalysisRequest
CamelPostProcessRoute-->>User: final response
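The lookup step in the diagram (HTTP GET on /{package}/ with the PEP 691 Accept header) can be sketched with the plain JDK HTTP client types. This is only an illustration of the request shape: the PR itself routes the call through the Camel HTTP component, and the class name, method name, and trailing-slash handling below are assumptions, not code from the PR.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper: builds (but does not send) a PEP 691 lookup request.
final class Pep691RequestSketch {
    // Media type for the JSON form of the Simple API, as shown in the diagram.
    static final String PEP691_ACCEPT = "application/vnd.pypi.simple.v1+json";

    static HttpRequest lookupRequest(String registryHost, String packageName, Duration timeout) {
        // Assumption: the configured host may or may not end with a slash.
        String base = registryHost.endsWith("/")
            ? registryHost.substring(0, registryHost.length() - 1)
            : registryHost;
        return HttpRequest.newBuilder(URI.create(base + "/" + packageName + "/"))
            .header("Accept", PEP691_ACCEPT)
            .timeout(timeout)
            .GET()
            .build();
    }
}
```

The sketch does not normalize the package name and has no circuit breaker; in the PR those concerns sit in the pep691Lookup route and its fallback.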
Updated class diagram for Pep691Integration and Pep691Response model
classDiagram
class Pep691Integration {
<<ApplicationScoped>>
- Logger LOGGER
- String PEP691_ACCEPT
- String PKG_PYPI_PREFIX
- String HASH_ALG_SHA256
- String PEP691_URL_PROPERTY
- String PEP691_PACKAGE_PROPERTY
- String registryHost
- String timeout
- ObjectMapper objectMapper
- ProducerTemplate producerTemplate
+ void configure()
- void processPep691Request(Exchange exchange)
- void handleLookupFallback(Exchange exchange)
+ void enrichRecommendations(Exchange exchange)
+ Optional~PackageRef~ queryRegistryAndCompare(String purlRef, String sbomSha256)
- boolean matchesVersion(String filename, String prefix)
}
class EndpointRouteBuilder {
}
class Pep691Response {
+ String name
+ List~FileInfo~ files
}
class FileInfo {
+ String filename
+ String url
+ Map~String, String~ hashes
}
class AnalysisReport {
}
class DependencyTree {
+ Map~String, Map~String, String~~ componentHashes()
+ Set~PackageRef~ getAll()
}
class DependencyReport {
+ PackageRef ref
+ PackageRef recommendation
}
class PackageRef {
+ String ref()
+ String name()
+ String version()
+ static PackageRefBuilder builder()
}
class PackageRefBuilder {
+ PackageRefBuilder purl(String purl)
+ PackageRef build()
}
class RemediationTrustedContent {
+ RemediationTrustedContent ref(PackageRef ref)
}
class Remediation {
+ Remediation trustedContent(RemediationTrustedContent trustedContent)
}
Pep691Integration ..|> EndpointRouteBuilder
Pep691Response o-- FileInfo
Pep691Integration --> Pep691Response : uses
Pep691Integration --> AnalysisReport : enrichRecommendations
Pep691Integration --> DependencyTree : enrichRecommendations
Pep691Integration --> DependencyReport : creates
Pep691Integration --> PackageRef : builds and compares
Pep691Integration --> PackageRefBuilder : uses builder
Pep691Integration --> RemediationTrustedContent : wraps recommendation
Pep691Integration --> Remediation : sets remediation
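As a rough illustration of the Pep691Response and FileInfo shapes in the class diagram, the following sketch models them as Java records. The diagram shows plain public fields, so the use of records and the sha256For helper are assumptions made for brevity, not the PR's actual model classes.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// One entry of the "files" array in a PEP 691 project response.
record FileInfo(String filename, String url, Map<String, String> hashes) {}

// Project-level response: the project name plus its distribution files.
record Pep691Response(String name, List<FileInfo> files) {
    // Hypothetical helper: SHA-256 digest for an exact filename, if the registry lists one.
    Optional<String> sha256For(String filename) {
        return files.stream()
            .filter(f -> f.filename().equals(filename))
            .map(f -> f.hashes().get("sha256"))
            .filter(h -> h != null)
            .findFirst();
    }
}
```

A caller would compare the returned digest with the SBOM hash to decide whether a recommendation is needed.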
Hey - I've found 1 issue, and left some high level feedback:
- The `matchesVersion` helper mixes normalized and original strings (`normalizedFilename`/`normalizedPrefix` vs `filename`/`prefix`), so the `charAt(prefix.length())` check can read the wrong position or even throw for some inputs; consider consistently using the normalized values for both the prefix check and the following-character check.
- When constructing the recommendation purl query string in `queryRegistryAndCompare`, `baseUrl` is interpolated directly into `?repository_url=` without URL-encoding, which can produce invalid purls if the registry host ever contains reserved characters; it would be safer to encode the value before concatenation.
- The logic that appends recommendation-only dependencies walks all providers/sources and then attaches the new `DependencyReport` to the first non-null source, which may be surprising if multiple providers/sources exist; if that is intentional, consider making the target provider/source explicit (e.g. by key) to avoid accidental attachment in multi-provider scenarios.
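The first point above can be illustrated with a minimal sketch in which both the prefix test and the following-character test operate on the normalized strings. The normalization rule (lowercase, collapse runs of `-` and `_`) and the accepted boundary characters are assumptions for illustration; the PR's actual `matchesVersion` may normalize differently.

```java
import java.util.Locale;

final class VersionMatchSketch {
    // Assumed normalization: lowercase and collapse runs of '-' or '_' into one '-'.
    // Note that this can change the string length, which is why indexes taken
    // from the original strings are unsafe.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[-_]+", "-");
    }

    static boolean matchesVersion(String filename, String prefix) {
        if (filename == null || prefix == null) {
            return false;
        }
        String f = normalize(filename);
        String p = normalize(prefix);
        // Guard the charAt call: the filename must be strictly longer than the prefix.
        if (!f.startsWith(p) || f.length() == p.length()) {
            return false;
        }
        // Index with the normalized prefix length, never the original one.
        char next = f.charAt(p.length());
        return next == '-' || next == '.';
    }
}
```

With these assumptions, `matchesVersion("My__Pkg-1.0-py3-none-any.whl", "my-pkg-1.0")` is accepted, whereas indexing the original filename at the original prefix length would have inspected the wrong character. Accepting `.` as a boundary is needed for sdist names such as `pkg-1.0.tar.gz`, but it also lets the prefix `pkg-1.0` match `pkg-1.0.1.tar.gz`, so a stricter version parse may be preferable.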
## Individual Comments
### Comment 1
<location path="src/main/java/io/github/guacsec/trustifyda/integration/registry/Pep691Integration.java" line_range="293-294" />
<code_context>
+ if (sbomSha256 != null && registrySha256.equalsIgnoreCase(sbomSha256)) {
+ return Optional.empty();
+ }
+ return Optional.of(
+ PackageRef.builder()
+ .purl(PKG_PYPI_PREFIX + name + "@" + version + "?repository_url=" + baseUrl)
+ .build());
</code_context>
<issue_to_address>
**issue:** Repository URL should be URL-encoded when embedded as a purl query parameter
`repository_url` is concatenated directly into the purl query string. If `baseUrl` contains reserved characters (e.g. `?`, `#`, spaces), the resulting purl may be invalid and break consumers. Please URL-encode `baseUrl` for use in the query string (or construct the purl via a query-parameter-aware utility) before passing it to `PackageRef.builder().purl(...)`.
</issue_to_address>
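A minimal sketch of the suggested fix, assuming `PKG_PYPI_PREFIX` is `pkg:pypi/` (its value is not shown in the excerpt) and using hypothetical helper names. `URLEncoder` performs form encoding, which turns a space into `+`, so the sketch rewrites `+` to `%20` to keep the qualifier percent-encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PurlQualifierSketch {
    // Assumed value of PKG_PYPI_PREFIX; not confirmed by the excerpt.
    static final String PKG_PYPI_PREFIX = "pkg:pypi/";

    // Percent-encode a qualifier value. A literal '+' in the input is already
    // emitted as %2B, so any '+' left in the output stands for a space.
    static String encodeQualifier(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    static String recommendationPurl(String name, String version, String baseUrl) {
        return PKG_PYPI_PREFIX + name + "@" + version
            + "?repository_url=" + encodeQualifier(baseUrl);
    }
}
```

The resulting string would then be handed to `PackageRef.builder().purl(...)` in place of the raw concatenation.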
Verification Report — TC-4335
Findings
Scope Containment: Commit
Commit Traceability: Commits
Test Quality: The
Sub-Tasks Created
This report was AI-generated by sdlc-workflow/verify-pr v0.8.2.
51a31ab to c31e4c8
…tions
Query the Red Hat Trusted Libraries Python registry using the PEP 691 JSON API to generate trusted library recommendations for pkg:pypi packages. When a package exists in the registry but its SHA-256 hash differs from the SBOM hash, a recommendation PURL with a repository_url qualifier is emitted. The feature is disabled when PYPI_REGISTRY_HOST is empty or unset.
Addresses TC-4335
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace java.net.http.HttpClient with Camel HTTP component following the existing patterns used by TrustifyIntegration and LicensesIntegration. Uses ProducerTemplate to invoke a dedicated pep691Lookup sub-route with circuit breaker, proper header management, and .toD() for dynamic URLs.
Also fixes two bugs:
- Recommend packages even when SBOM has no SHA-256 hash (null hash = always recommend since we can't verify the artifact is already trusted)
- Process all pypi packages from DependencyTree, not just those already in the vulnerability report (catches packages with no vulnerabilities)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
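The hash rule described in this commit (a matching hash means no recommendation; a missing SBOM hash always yields one) can be sketched as a small decision function. The class and method names are hypothetical, and treating a missing registry hash as "no recommendation" is an assumption not stated in the commit.

```java
import java.util.Optional;

final class RecommendationDecisionSketch {
    // Returns the candidate purl when a recommendation should be emitted.
    static Optional<String> recommend(String candidatePurl, String registrySha256, String sbomSha256) {
        // Assumption: no registry hash means there is nothing to recommend.
        if (registrySha256 == null) {
            return Optional.empty();
        }
        // Same artifact as the trusted one: nothing to recommend.
        if (sbomSha256 != null && registrySha256.equalsIgnoreCase(sbomSha256)) {
            return Optional.empty();
        }
        // Hash differs, or the SBOM carries no hash and the artifact cannot be verified.
        return Optional.of(candidatePurl);
    }
}
```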
…ichment
Extract the three-pass enrichment logic from Pep691Integration into a stateless RegistryEnrichmentService that accepts a BiFunction for the registry query strategy. This enables adding Maven Central, Go proxy, npm, etc. without duplicating enrichment code: each new ecosystem reuses the service with its own query function.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
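The strategy idea in this commit, a stateless service that takes the registry query as a `BiFunction`, might look roughly like the sketch below. It is a single-pass simplification over a map of purl to SBOM hash; the commit's three passes are not described here, and the class name, method name, and data shapes are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

final class RegistryEnrichmentSketch {
    // queryRegistry takes (purl, sbomSha256) and returns a recommended purl, if any.
    static Map<String, String> enrich(
            Map<String, String> sbomHashesByPurl,
            BiFunction<String, String, Optional<String>> queryRegistry) {
        Map<String, String> recommendations = new LinkedHashMap<>();
        // The service holds no state; everything ecosystem-specific lives in queryRegistry.
        sbomHashesByPurl.forEach((purl, sha256) ->
            queryRegistry.apply(purl, sha256)
                .ifPresent(recommended -> recommendations.put(purl, recommended)));
        return recommendations;
    }
}
```

Another ecosystem would pass its own query function and reuse `enrich` unchanged.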
… enrichment
Replace hardcoded enrichPypiRecommendations route with a CDI-based discovery pattern. TrustedLibrariesIntegration discovers all RegistryIntegration beans, runs enabled ones sequentially with exception isolation. Adding a new ecosystem (Maven, Go, npm) now requires only one new class, with zero changes to existing code.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
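"Runs enabled ones sequentially with exception isolation" can be illustrated without CDI by looping over a plain list. In the PR the beans are discovered by the container; the interface shape, the names, and the returned list of failures below are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's RegistryIntegration contract.
interface RegistryIntegrationSketch {
    String name();
    boolean isEnabled();
    void enrich(List<String> report);
}

final class TrustedLibrariesRunnerSketch {
    // Runs each enabled integration in order and returns the names of those that failed.
    static List<String> runAll(List<RegistryIntegrationSketch> integrations, List<String> report) {
        List<String> failed = new ArrayList<>();
        for (RegistryIntegrationSketch integration : integrations) {
            if (!integration.isEnabled()) {
                continue;
            }
            try {
                integration.enrich(report);
            } catch (RuntimeException e) {
                // Isolation: one failing registry must not stop the remaining ones.
                failed.add(integration.name());
            }
        }
        return failed;
    }
}
```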
Signed-off-by: Ruben Romero Montes <rromerom@redhat.com>
Wrap baseUrl with URLEncoder.encode() when building the PURL qualifier so that special characters (:/&=) are percent-encoded per the PURL spec. Add unit test verifying the encoding with a URL containing query params.
Implements TC-4372
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
repository_url qualifier values in PURLs per the PURL spec (TC-4372)
Jira
TC-4335
Test plan
🤖 Generated with Claude Code