feat: PEP 691 registry client for PyPI trusted library recommendations #607

Merged
ruromero merged 6 commits into guacsec:main from ruromero:TC-4335 on May 11, 2026

Collaborator

@ruromero ruromero commented May 8, 2026

Summary

  • Adds PEP 691 registry integration that queries a configurable PyPI-compatible registry to recommend trusted library versions
  • Uses Camel HTTP component with circuit breaker, following the same patterns as TrustifyIntegration and LicensesIntegration
  • Compares SHA-256 hashes from SBOM against registry artifacts: matching hash = skip (already trusted), missing/different hash = recommend
  • Two-pass enrichment: first enriches dependencies already in the vulnerability report, then catches non-vulnerable packages from the DependencyTree
  • URL-encodes repository_url qualifier values in PURLs per the PURL spec (TC-4372)
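
The hash comparison rule in the summary can be sketched as a small predicate. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the handling of a missing registry hash is a guess, since the summary does not describe that case.

```java
/** Illustrative decision helper; class and method names are hypothetical. */
final class HashDecision {

  /**
   * Returns true when a recommendation should be emitted: the SBOM hash is
   * missing, or it differs from the registry SHA-256 (case-insensitive compare).
   */
  static boolean shouldRecommend(String sbomSha256, String registrySha256) {
    if (registrySha256 == null) {
      // Assumption: without a registry hash there is nothing to compare against.
      return false;
    }
    return sbomSha256 == null || !registrySha256.equalsIgnoreCase(sbomSha256);
  }
}
```

A matching hash yields `false` (the artifact is already the trusted one), while a null or different SBOM hash yields `true`.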

Jira

TC-4335

Test plan

  • Unit tests for guard clauses (null host, non-pypi deps, existing recommendations, missing hashes)
  • Integration test with WireMock stubs for PEP 691 registry responses (requests + flask)
  • Golden file verification for pypi_report.json including recommendation-only dependencies
  • Unit test verifying URL-encoding of registry URLs with special characters in PURL qualifiers
  • Full test suite passes (241 tests, 0 failures)

🤖 Generated with Claude Code

sourcery-ai Bot commented May 8, 2026

Reviewer's Guide

Introduces a PEP 691-based PyPI registry integration that enriches analysis reports with trusted-library recommendations for Python dependencies, wiring it into the existing Camel analysis flow, configuring it via properties, and covering it with unit and integration tests plus golden-file verification.

Sequence diagram for PEP 691 PyPI recommendation enrichment in analysis flow

sequenceDiagram
  actor User
  participant ExhortIntegration
  participant CamelAnalyzeSbomRoute as analyzeSbom
  participant Pep691Integration as enrichPypiRecommendations
  participant CamelReportRoute as report
  participant CamelPostProcessRoute as postProcess
  participant Pep691LookupRoute as pep691Lookup
  participant PyPIRegistry as PEP691Registry

  User->>ExhortIntegration: POST analysis request
  ExhortIntegration->>CamelAnalyzeSbomRoute: direct analyzeSbom
  CamelAnalyzeSbomRoute-->>ExhortIntegration: AnalysisReport + DependencyTree

  ExhortIntegration->>Pep691Integration: direct enrichPypiRecommendations
  activate Pep691Integration
  Pep691Integration->>Pep691Integration: enrichRecommendations(AnalysisReport, DependencyTree)
  Pep691Integration->>Pep691LookupRoute: direct pep691Lookup (purl, sbomSha256)

  activate Pep691LookupRoute
  Pep691LookupRoute->>Pep691LookupRoute: processPep691Request
  Pep691LookupRoute->>PEP691Registry: HTTP GET /{package}/ (Accept: application/vnd.pypi.simple.v1+json)
  alt registry responds
    PEP691Registry-->>Pep691LookupRoute: 200 Pep691Response JSON
  else timeout or error
    PEP691Registry--xPep691LookupRoute: failure/timeout
    Pep691LookupRoute->>Pep691LookupRoute: circuit breaker fallback
    Pep691LookupRoute-->>Pep691Integration: 504 with null body
  end
  Pep691LookupRoute-->>Pep691Integration: HTTP response
  deactivate Pep691LookupRoute

  Pep691Integration->>Pep691Integration: queryRegistryAndCompare
  Pep691Integration->>Pep691Integration: update DependencyReport.recommendation and issues.remediation
  Pep691Integration-->>ExhortIntegration: enriched AnalysisReport
  deactivate Pep691Integration

  ExhortIntegration->>CamelReportRoute: direct report
  CamelReportRoute-->>ExhortIntegration: rendered report
  ExhortIntegration->>CamelPostProcessRoute: direct postProcessAnalysisRequest
  CamelPostProcessRoute-->>User: final response
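
For readers unfamiliar with PEP 691, the lookup step in the diagram boils down to a GET on `/{package}/` with a JSON `Accept` header. The sketch below only builds such a request with the JDK's `java.net.http` API to show the wire shape. It is not the PR's implementation, which uses the Camel HTTP component. The class name and the way the base URL is joined are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper that shows the shape of a PEP 691 lookup request. */
final class Pep691RequestSketch {
  static final String PEP691_ACCEPT = "application/vnd.pypi.simple.v1+json";

  static HttpRequest build(String baseUrl, String normalizedName, Duration timeout) {
    // Assumption: the base URL points at the registry's "simple" index root.
    String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
    return HttpRequest.newBuilder(URI.create(base + normalizedName + "/"))
        .header("Accept", PEP691_ACCEPT) // ask for the JSON form of the index
        .timeout(timeout)
        .GET()
        .build();
  }
}
```

Building the request sends nothing over the network, so the snippet can be inspected without a registry being available.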

Updated class diagram for Pep691Integration and Pep691Response model

classDiagram
  class Pep691Integration {
    <<ApplicationScoped>>
    - Logger LOGGER
    - String PEP691_ACCEPT
    - String PKG_PYPI_PREFIX
    - String HASH_ALG_SHA256
    - String PEP691_URL_PROPERTY
    - String PEP691_PACKAGE_PROPERTY
    - String registryHost
    - String timeout
    - ObjectMapper objectMapper
    - ProducerTemplate producerTemplate
    + void configure()
    - void processPep691Request(Exchange exchange)
    - void handleLookupFallback(Exchange exchange)
    + void enrichRecommendations(Exchange exchange)
    + Optional~PackageRef~ queryRegistryAndCompare(String purlRef, String sbomSha256)
    - boolean matchesVersion(String filename, String prefix)
  }

  class EndpointRouteBuilder {
  }

  class Pep691Response {
    + String name
    + List~FileInfo~ files
  }

  class FileInfo {
    + String filename
    + String url
    + Map~String, String~ hashes
  }

  class AnalysisReport {
  }

  class DependencyTree {
    + Map~String, Map~String, String~~ componentHashes()
    + Set~PackageRef~ getAll()
  }

  class DependencyReport {
    + PackageRef ref
    + PackageRef recommendation
  }

  class PackageRef {
    + String ref()
    + String name()
    + String version()
    + static PackageRefBuilder builder()
  }

  class PackageRefBuilder {
    + PackageRefBuilder purl(String purl)
    + PackageRef build()
  }

  class RemediationTrustedContent {
    + RemediationTrustedContent ref(PackageRef ref)
  }

  class Remediation {
    + Remediation trustedContent(RemediationTrustedContent trustedContent)
  }

  Pep691Integration ..|> EndpointRouteBuilder
  Pep691Response o-- FileInfo
  Pep691Integration --> Pep691Response : uses
  Pep691Integration --> AnalysisReport : enrichRecommendations
  Pep691Integration --> DependencyTree : enrichRecommendations
  Pep691Integration --> DependencyReport : creates
  Pep691Integration --> PackageRef : builds and compares
  Pep691Integration --> PackageRefBuilder : uses builder
  Pep691Integration --> RemediationTrustedContent : wraps recommendation
  Pep691Integration --> Remediation : sets remediation
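
The response model in the class diagram can be approximated with plain records. This is a simplified sketch: the real classes carry Jackson and Quarkus reflection annotations that are omitted here, the `sha256()` helper is hypothetical, and the `"sha256"` map key is assumed from the examples in PEP 691.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Simplified stand-in for the nested FileInfo model (no JSON annotations). */
record FileInfo(String filename, String url, Map<String, String> hashes) {

  /** Hypothetical convenience accessor; tolerates a missing hashes map. */
  Optional<String> sha256() {
    return Optional.ofNullable(hashes).map(h -> h.get("sha256"));
  }
}

/** Simplified stand-in for the top-level PEP 691 project response. */
record Pep691Response(String name, List<FileInfo> files) {}
```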

File-Level Changes

Change: Add PEP 691 registry client and enrichment route to compute trusted PyPI recommendations based on SBOM hashes and registry metadata.
Details:
  • Introduce Pep691Integration Camel route that enriches AnalysisReport objects and performs PEP 691 HTTP lookups with circuit breaker and timeouts.
  • Implement enrichment logic that first updates existing dependency reports (adding recommendations and trusted-content remediation) and then adds recommendation-only dependencies from the DependencyTree.
  • Implement queryRegistryAndCompare logic to normalize package names, call the registry, filter files by version, compare SHA-256 hashes, and construct recommendation purls with repository_url query parameters.
  • Add PEP 691 response model (Pep691Response and nested FileInfo) with Jackson and Quarkus reflection annotations for JSON deserialization and unknown-field tolerance.
  • Wire the new enrichPypiRecommendations direct route into the ExhortIntegration analysis pipeline after analyzeSbom and before report generation.
Files:
  • src/main/java/io/github/guacsec/trustifyda/integration/registry/Pep691Integration.java
  • src/main/java/io/github/guacsec/trustifyda/model/registry/Pep691Response.java
  • src/main/java/io/github/guacsec/trustifyda/integration/backend/ExhortIntegration.java

Change: Make the PEP 691 registry configurable and available in tests via application properties and WireMock setup.
Details:
  • Add api.pypi.registry.host and api.pypi.registry.timeout properties, with env-var based defaults, to application.properties.
  • Expose api.pypi.registry.host from WiremockExtension so tests can point the client at the mock server.
  • Normalize the mocked registry_url in golden JSON outputs to the production trusted-libraries endpoint via a helper in AbstractAnalysisTest.
Files:
  • src/main/resources/application.properties
  • src/test/java/io/github/guacsec/trustifyda/extensions/WiremockExtension.java
  • src/test/java/io/github/guacsec/trustifyda/integration/AbstractAnalysisTest.java

Change: Add unit and integration test coverage for the PEP 691 integration, including guard clauses and golden-file verification of recommendations.
Details:
  • Create Pep691IntegrationTest to validate guard behavior (empty/null host, non-AnalysisReport bodies, non-PyPI deps, missing SHA-256, and pre-existing recommendations).
  • Create Pep691AnalysisTest Quarkus integration test that stubs Trustify, deps.dev, and PEP 691 registry responses and asserts on the end-to-end /api/v5/analysis output via a golden pypi_report.json file.
  • Add Pep691ResponseTest to verify JSON deserialization behavior for various PEP 691 registry responses, including unknown fields and multiple files.
  • Introduce CycloneDX SBOM fixture with hashes and multiple WireMock JSON fixtures for deps.dev, PEP 691 responses, and expected reports.
Files:
  • src/test/java/io/github/guacsec/trustifyda/integration/registry/Pep691IntegrationTest.java
  • src/test/java/io/github/guacsec/trustifyda/integration/Pep691AnalysisTest.java
  • src/test/java/io/github/guacsec/trustifyda/model/registry/Pep691ResponseTest.java
  • src/test/resources/cyclonedx/pypi-sbom-with-hashes.json
  • src/test/resources/__files/depsdev/pypi_request.json
  • src/test/resources/__files/depsdev/pypi_response.json
  • src/test/resources/__files/pypi-registry/flask_response.json
  • src/test/resources/__files/pypi-registry/requests_response.json
  • src/test/resources/__files/reports/pypi_report.json
  • src/test/resources/__files/trustify/pypi_report.json
  • src/test/resources/__files/trustify/pypi_request.json
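
The guide mentions normalizing package names before calling the registry but does not state the rule. A plausible choice, assumed here, is the PEP 503 normalization (collapse runs of `-`, `_` and `.` into a single `-`, then lowercase). The class name below is hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper applying PEP 503 project-name normalization. */
final class PypiNames {
  // Runs of '-', '_' or '.' are treated as a single separator.
  private static final Pattern SEPARATORS = Pattern.compile("[-_.]+");

  static String normalize(String name) {
    return SEPARATORS.matcher(name).replaceAll("-").toLowerCase(Locale.ROOT);
  }
}
```

With this rule, `Flask_SQLAlchemy` and `flask.sqlalchemy` both map to `flask-sqlalchemy`, so they resolve to the same `/{package}/` path on the registry.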

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • The matchesVersion helper mixes normalized and original strings (normalizedFilename/normalizedPrefix vs filename/prefix), so the charAt(prefix.length()) check can read the wrong position or even throw for some inputs; consider consistently using the normalized values for both the prefix check and the following-character check.
  • When constructing the recommendation purl query string in queryRegistryAndCompare, baseUrl is interpolated directly into ?repository_url= without URL-encoding, which can produce invalid purls if the registry host ever contains reserved characters; it would be safer to encode the value before concatenation.
  • The logic that appends recommendation-only dependencies walks all providers/sources and then attaches the new DependencyReport to the first non-null source, which may be surprising if multiple providers/sources exist; if that is intentional, consider making the target provider/source explicit (e.g. by key) to avoid accidental attachment in multi-provider scenarios.
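
The first point above could be addressed along the following lines. This is a sketch, not the PR's code: the normalization rule (lowercase, underscores to hyphens) and the accepted suffixes are assumptions, chosen so that both the prefix check and the following-character check operate on the same normalized strings.

```java
import java.util.Locale;

/** Hypothetical rework of matchesVersion using normalized values throughout. */
final class VersionMatch {

  private static String normalize(String s) {
    // Assumption: filenames and prefixes differ only by case and '_' vs '-'.
    return s.toLowerCase(Locale.ROOT).replace('_', '-');
  }

  static boolean matchesVersion(String filename, String prefix) {
    String f = normalize(filename);
    String p = normalize(prefix);
    if (!f.startsWith(p) || f.length() == p.length()) {
      return false; // no match, or nothing follows the prefix to inspect
    }
    String rest = f.substring(p.length());
    // Wheels continue with "-<tags>.whl"; sdists end right after the version.
    return rest.startsWith("-") || rest.equals(".tar.gz") || rest.equals(".zip");
  }
}
```

Checking the remainder instead of a single `charAt` call avoids an out-of-bounds read when the filename equals the prefix, and it rejects longer versions such as `3.0.0rc1` or `3.0.0.post1` when the prefix ends in `3.0.0`.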
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `matchesVersion` helper mixes normalized and original strings (`normalizedFilename`/`normalizedPrefix` vs `filename`/`prefix`), so the `charAt(prefix.length())` check can read the wrong position or even throw for some inputs; consider consistently using the normalized values for both the prefix check and the following-character check.
- When constructing the recommendation purl query string in `queryRegistryAndCompare`, `baseUrl` is interpolated directly into `?repository_url=` without URL-encoding, which can produce invalid purls if the registry host ever contains reserved characters; it would be safer to encode the value before concatenation.
- The logic that appends recommendation-only dependencies walks all providers/sources and then attaches the new `DependencyReport` to the first non-null source, which may be surprising if multiple providers/sources exist; if that is intentional, consider making the target provider/source explicit (e.g. by key) to avoid accidental attachment in multi-provider scenarios.

## Individual Comments

### Comment 1
<location path="src/main/java/io/github/guacsec/trustifyda/integration/registry/Pep691Integration.java" line_range="293-294" />
<code_context>
+          if (sbomSha256 != null && registrySha256.equalsIgnoreCase(sbomSha256)) {
+            return Optional.empty();
+          }
+          return Optional.of(
+              PackageRef.builder()
+                  .purl(PKG_PYPI_PREFIX + name + "@" + version + "?repository_url=" + baseUrl)
+                  .build());
</code_context>
<issue_to_address>
**issue:** Repository URL should be URL-encoded when embedded as a purl query parameter

`repository_url` is concatenated directly into the purl query string. If `baseUrl` contains reserved characters (e.g. `?`, `#`, spaces), the resulting purl may be invalid and break consumers. Please URL-encode `baseUrl` for use in the query string (or construct the purl via a query-parameter-aware utility) before passing it to `PackageRef.builder().purl(...)`.
</issue_to_address>
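
One way to apply the suggested encoding is sketched below with `java.net.URLEncoder`. The helper names are hypothetical and the `pkg:pypi/` prefix is assumed to be the value of `PKG_PYPI_PREFIX`. Because `URLEncoder` performs form encoding and writes spaces as `+`, the sketch rewrites them to `%20` so the value is strictly percent-encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for building a purl with an encoded repository_url. */
final class PurlQualifiers {

  static String encodeQualifierValue(String value) {
    // A literal '+' in the input is encoded as %2B, so every '+' left in the
    // output stands for a space and can safely be rewritten to %20.
    return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
  }

  static String pypiPurl(String name, String version, String repositoryUrl) {
    return "pkg:pypi/" + name + "@" + version
        + "?repository_url=" + encodeQualifierValue(repositoryUrl);
  }
}
```

For a registry URL of `https://registry.example/simple?a=1&b=2` (a made-up host), the qualifier value becomes `https%3A%2F%2Fregistry.example%2Fsimple%3Fa%3D1%26b%3D2`, so the embedded `?`, `&` and `=` can no longer be confused with the purl's own qualifier syntax.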


Collaborator Author

ruromero commented May 9, 2026

Verification Report — TC-4335

| Check | Verdict | Details |
|---|---|---|
| Scope Containment | ⚠️ WARN | PR adds fix: allow start backend without any additional provider (commit f7ad8e48) which is not mentioned in the task description — functionally related but outside explicit scope |
| Diff Size | ✅ PASS | 1662 additions, 4 deletions across 23 files — within acceptable range for a new feature with tests and fixtures |
| Commit Traceability | ⚠️ WARN | Only the first commit references TC-4335 in its body; remaining 4 commits lack a task reference |
| Sensitive Patterns | ✅ PASS | No secrets, credentials, or sensitive data patterns detected |
| CI Status | ✅ PASS | All CI checks passing |
| Acceptance Criteria | ✅ PASS | All 7 acceptance criteria from the task description are satisfied by the implementation |
| Verification Commands | ➖ N/A | No verification commands specified in the task |
| Test Quality | ⚠️ WARN | Minor repetitive patterns in RegistryEnrichmentServiceTest.java (similar buildReportWithDep helper duplicated across test classes); test-level Javadoc missing on all new test classes |
| Test Change Classification | ➕ ADDITIVE | All test files are new additions — no existing test modifications |

Findings

Scope Containment: Commit f7ad8e48 (fix: allow start backend without any additional provider) addresses a startup failure when no vulnerability providers are configured. While functionally related to the new registry integration, it is a separate fix not mentioned in TC-4335's scope. Consider whether this should be a separate PR.

Commit Traceability: Commits 21f9a4e8, acf4fd3d, ff82fe05, and f7ad8e48 do not reference TC-4335 in their commit messages. Only 1a97b732 includes "Addresses TC-4335" in the body. Following Conventional Commits, each commit should reference the task.

Test Quality: The buildReportWithDep helper pattern is duplicated across Pep691IntegrationTest, RegistryEnrichmentServiceTest, and Pep691AnalysisTest. Consider extracting a shared test utility. New test classes lack class-level documentation.

Sub-Tasks Created

| Sub-Task | Review Comment | Classification |
|---|---|---|
| TC-4372 | URL-encode repository_url qualifier in PURL construction | Code change request |

This report was AI-generated by sdlc-workflow/verify-pr v0.8.2.

@ruromero ruromero force-pushed the TC-4335 branch 2 times, most recently from 51a31ab to c31e4c8 on May 9, 2026 19:24
Contributor

@a-oren a-oren left a comment

LGTM

ruromero and others added 6 commits May 11, 2026 08:37
…tions

Query the Red Hat Trusted Libraries Python registry using PEP 691 JSON API
to generate trusted library recommendations for pkg:pypi packages. When a
package exists in the registry but its SHA-256 hash differs from the SBOM
hash, a recommendation PURL with repository_url qualifier is emitted.

The feature is disabled when PYPI_REGISTRY_HOST is empty or unset.

Addresses TC-4335

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace java.net.http.HttpClient with Camel HTTP component following
the existing patterns used by TrustifyIntegration and LicensesIntegration.
Uses ProducerTemplate to invoke a dedicated pep691Lookup sub-route with
circuit breaker, proper header management, and .toD() for dynamic URLs.

Also fixes two bugs:
- Recommend packages even when SBOM has no SHA-256 hash (null hash =
  always recommend since we can't verify the artifact is already trusted)
- Process all pypi packages from DependencyTree, not just those already
  in the vulnerability report (catches packages with no vulnerabilities)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ichment

Extract the three-pass enrichment logic from Pep691Integration into a
stateless RegistryEnrichmentService that accepts a BiFunction for the
registry query strategy. This enables adding Maven Central, Go proxy,
npm, etc. without duplicating enrichment code — each new ecosystem
reuses the service with its own query function.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
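
The idea of passing the registry query as a `BiFunction` can be sketched as follows. Everything here is illustrative: the real RegistryEnrichmentService works on AnalysisReport and DependencyTree objects, while this sketch uses plain maps and strings, and the method name and signature are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

/** Hypothetical, map-based stand-in for a stateless enrichment service. */
final class EnrichmentSketch {

  /**
   * @param sbomHashes purl to SBOM SHA-256 (the value may be null when the SBOM has no hash)
   * @param purlPrefix ecosystem filter, for example "pkg:pypi/"
   * @param query registry strategy: (purl, sbomSha256) to recommended purl, or empty to skip
   * @return recommendations keyed by the original purl, in input order
   */
  static Map<String, String> recommend(
      Map<String, String> sbomHashes,
      String purlPrefix,
      BiFunction<String, String, Optional<String>> query) {
    Map<String, String> recommendations = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : sbomHashes.entrySet()) {
      if (!e.getKey().startsWith(purlPrefix)) {
        continue; // other ecosystems are left to their own query strategy
      }
      query.apply(e.getKey(), e.getValue())
          .ifPresent(rec -> recommendations.put(e.getKey(), rec));
    }
    return recommendations;
  }
}
```

Under this shape, supporting another ecosystem would mean supplying a different prefix and query function while the loop itself stays unchanged.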
… enrichment

Replace hardcoded enrichPypiRecommendations route with a CDI-based
discovery pattern. TrustedLibrariesIntegration discovers all
RegistryIntegration beans, runs enabled ones sequentially with
exception isolation. Adding a new ecosystem (Maven, Go, npm) now
requires only one new class — zero changes to existing code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ruben Romero Montes <rromerom@redhat.com>
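
The "run enabled integrations sequentially with exception isolation" behavior can be illustrated without CDI. In the sketch below the `RegistryIntegration` name comes from the commit message, but its methods, the runner class, and the use of a `List<String>` as a stand-in for the report are all assumptions. In the real code the beans would presumably be supplied by CDI discovery instead of being passed in by hand.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shape of a per-ecosystem registry integration. */
interface RegistryIntegration {
  String name();

  boolean isEnabled();

  void enrich(List<String> report); // List<String> stands in for the real report type
}

/** Hypothetical runner: one failing integration must not stop the others. */
final class TrustedLibrariesRunner {

  /** Runs every enabled integration in order and returns the names of those that failed. */
  static List<String> runAll(Iterable<RegistryIntegration> integrations, List<String> report) {
    List<String> failed = new ArrayList<>();
    for (RegistryIntegration integration : integrations) {
      if (!integration.isEnabled()) {
        continue;
      }
      try {
        integration.enrich(report);
      } catch (RuntimeException ex) {
        // Isolation: record the failure and carry on with the next integration.
        failed.add(integration.name());
      }
    }
    return failed;
  }
}
```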
Wrap baseUrl with URLEncoder.encode() when building the PURL qualifier
so that special characters (:/&=) are percent-encoded per the PURL spec.
Add unit test verifying the encoding with a URL containing query params.

Implements TC-4372

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ruromero ruromero merged commit c76ddd4 into guacsec:main May 11, 2026
2 checks passed
@ruromero ruromero deleted the TC-4335 branch May 11, 2026 06:48