Skip to content

sigstore: Cache verify_image results across policy evaluations#3136

Draft
arewm wants to merge 17 commits into
conforma:mainfrom
arewm:ec-1545/sigstore-verify-image-cache
Draft

sigstore: Cache verify_image results across policy evaluations#3136
arewm wants to merge 17 commits into
conforma:mainfrom
arewm:ec-1545/sigstore-verify-image-cache

Conversation

@arewm

@arewm arewm commented Feb 27, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add sync.Map + singleflight.Group cache for ec.sigstore.verify_image results keyed by ref|hash(opts)
  • OPA's Memoize only deduplicates within a single Eval() call; this cache persists across component evaluations
  • Prevents redundant signature verification when many components share task bundles (e.g., 100 components using git-clone)
  • Follows the caching pattern established by manifestCache/descriptorCache in internal/rego/oci/oci.go

Test plan

  • go build ./internal/rego/sigstore/ compiles cleanly
  • Verify caching behavior with multiple components sharing task bundles
  • Benchmark signature verification with/without cache on a multi-component snapshot

Ref: EC-1545

🤖 Generated with Claude Code

@arewm arewm force-pushed the ec-1545/sigstore-verify-image-cache branch from 3950ea8 to b0710ba Compare February 28, 2026 02:25
@github-actions github-actions Bot added size: L and removed size: S labels Feb 28, 2026
@arewm arewm force-pushed the ec-1545/sigstore-verify-image-cache branch from b0710ba to d029230 Compare February 28, 2026 02:30
@codecov

codecov Bot commented Feb 28, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
acceptance 53.47% <77.77%> (+0.03%) ⬆️
generative 16.76% <0.00%> (-0.03%) ⬇️
integration 27.62% <0.00%> (-0.04%) ⬇️
unit 69.12% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/rego/sigstore/sigstore.go 93.41% <100.00%> (+0.46%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@arewm arewm marked this pull request as ready for review February 28, 2026 02:51
@qodo-code-review

Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Cache verify_image results across policy evaluations

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add caching layer for verify_image results using sync.Map and singleflight.Group
• Cache key combines image reference and SHA256 hash of verification options
• Prevents redundant signature verification across multiple policy evaluations
• Add comprehensive test coverage for cache hit/miss scenarios
Diagram
flowchart LR
  A["sigstoreVerifyImage called"] --> B{"Cache hit?"}
  B -->|Yes| C["Return cached result"]
  B -->|No| D["Use singleflight dedup"]
  D --> E["doVerifyImage execution"]
  E --> F["Store result in cache"]
  F --> G["Return result"]
  C --> H["Result"]
  G --> H
Loading

Grey Divider

File Changes

1. internal/rego/sigstore/sigstore.go ✨ Enhancement +45/-1

Implement verify_image caching with singleflight deduplication

• Add sync.Map and singleflight.Group package imports for caching infrastructure
• Introduce global verifyImageCache and verifyImageFlight variables for cross-evaluation caching
• Refactor sigstoreVerifyImage to check cache first, then use singleflight for deduplication
• Extract core verification logic into new doVerifyImage function
• Add buildCacheKey function that combines ref and SHA256 hash of options
• Add ClearCaches utility function for testing purposes

internal/rego/sigstore/sigstore.go


2. internal/rego/sigstore/sigstore_test.go 🧪 Tests +86/-0

Add comprehensive cache behavior test coverage

• Add ClearCaches() call in existing TestSigstoreVerifyImage test setup
• Add new TestVerifyImageCache test function with four test cases
• Test cache hit when same ref and opts are used multiple times
• Test cache miss when different ref is used with same opts
• Test cache miss when different opts are used with same ref
• Test that ClearCaches forces re-verification of previously cached results

internal/rego/sigstore/sigstore_test.go


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented Feb 28, 2026

Copy link
Copy Markdown
Contributor

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Caches failed verifications 🐞 Bug ⛯ Reliability
Description
verify_image stores results in the global cache unconditionally, including failure results
(success=false) returned via signatureFailedResult. This can turn transient verification errors
into persistent failures for the lifetime of the process, preventing retries in later policy
evaluations/workers.
Code

internal/rego/sigstore/sigstore.go[R141-150]

+	resultIface, err, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
+		return doVerifyImage(bctx, logger, uri, optsTerm)
+	})
+	if err != nil {
+		return nil, err
+	}
+
+	result := resultIface.(*ast.Term)
+	verifyImageCache.Store(cacheKey, result)
+	return result, nil
Evidence
sigstoreVerifyImage stores whatever doVerifyImage returns into verifyImageCache without
checking if verification succeeded. doVerifyImage converts errors into a normal result term via
signatureFailedResult (not a Go error), so failures will be cached too. In contrast, other global
caches (OCI) only store successful results and return nil on errors, avoiding caching transient
failures.

internal/rego/sigstore/sigstore.go[131-150]
internal/rego/sigstore/sigstore.go[156-174]
internal/rego/sigstore/sigstore.go[350-379]
internal/rego/oci/oci.go[570-623]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ec.sigstore.verify_image` unconditionally caches the returned result term, including failure results. This can make transient verification failures persist across later policy evaluations within the same process.

### Issue Context
`doVerifyImage` converts errors into a normal `{success:false, errors:[...]}` term via `signatureFailedResult`, so the cache sees failures as valid values to store.

### Fix Focus Areas
- internal/rego/sigstore/sigstore.go[131-150]
- internal/rego/sigstore/sigstore.go[350-379]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Truncated opts hash key 🐞 Bug ⛨ Security
Description
Cache key uses only the first 8 bytes (64 bits) of SHA-256 over optsTerm.String(). While
collisions are unlikely in typical runs, a collision would cause results computed with different
verification options to be reused incorrectly.
Code

internal/rego/sigstore/sigstore.go[R180-184]

+func buildCacheKey(ref string, optsTerm *ast.Term) string {
+	h := sha256.Sum256([]byte(optsTerm.String()))
+	optsHash := hex.EncodeToString(h[:8]) // First 8 bytes is enough for dedup
+	return ref + "|" + optsHash
+}
Evidence
buildCacheKey intentionally truncates SHA-256 to 8 bytes before hex encoding, reducing the
collision resistance of the cache key and risking incorrect cache hits across different option sets.

internal/rego/sigstore/sigstore.go[180-184]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`buildCacheKey` truncates SHA-256 to 8 bytes, which increases the chance of key collisions and incorrect cache hits across different verification options.

### Issue Context
Even if collisions are rare in practice, a collision would reuse a verification result computed under different trust/verification settings.

### Fix Focus Areas
- internal/rego/sigstore/sigstore.go[180-184]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Error-return contract break 🐞 Bug ⛯ Reliability
Description
This file documents that rego builtins must not return Go errors, but the new wrapper returns `nil,
err when singleflight.Do` returns an error. This risks inconsistent behavior vs the rest of the
file’s patterns and may cause evaluation failures in unexpected error cases.
Code

internal/rego/sigstore/sigstore.go[R141-146]

+	resultIface, err, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
+		return doVerifyImage(bctx, logger, uri, optsTerm)
+	})
+	if err != nil {
+		return nil, err
+	}
Evidence
The file header explicitly states these builtins should not return errors. The new wrapper adds an
explicit error-return branch if singleflight returns an error, which conflicts with the documented
behavior and with the rest of the function’s approach of returning structured failure terms.

internal/rego/sigstore/sigstore.go[17-19]
internal/rego/sigstore/sigstore.go[141-146]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`sigstoreVerifyImage` now has a code path that returns a non-nil Go error, contradicting the file’s documented builtin contract.

### Issue Context
This codebase intentionally returns structured `{success:false, errors:[...]}` terms instead of Go errors for rego builtins.

### Fix Focus Areas
- internal/rego/sigstore/sigstore.go[17-19]
- internal/rego/sigstore/sigstore.go[141-150]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@arewm arewm force-pushed the ec-1545/sigstore-verify-image-cache branch from d029230 to c504a19 Compare February 28, 2026 02:55
Comment thread internal/rego/sigstore/sigstore.go Outdated
Comment on lines +141 to +150
resultIface, err, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
return doVerifyImage(bctx, logger, uri, optsTerm)
})
if err != nil {
return nil, err
}

result := resultIface.(*ast.Term)
verifyImageCache.Store(cacheKey, result)
return result, nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Caches failed verifications 🐞 Bug ⛯ Reliability

verify_image stores results in the global cache unconditionally, including failure results
(success=false) returned via signatureFailedResult. This can turn transient verification errors
into persistent failures for the lifetime of the process, preventing retries in later policy
evaluations/workers.
Agent Prompt
### Issue description
`ec.sigstore.verify_image` unconditionally caches the returned result term, including failure results. This can make transient verification failures persist across later policy evaluations within the same process.

### Issue Context
`doVerifyImage` converts errors into a normal `{success:false, errors:[...]}` term via `signatureFailedResult`, so the cache sees failures as valid values to store.

### Fix Focus Areas
- internal/rego/sigstore/sigstore.go[131-150]
- internal/rego/sigstore/sigstore.go[350-379]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Retries are exhausted before results reach the cache
  2. A genuine verification failure would cause policy failure regardless
  3. The misleading-error-pattern concern is real but rare, and the performance cost of not caching failures is a real tradeoff

I can update this caching behavior if desired to not cache failures. If we were to do this then a valid bad signature would result in a much slower evaluation.

With caching failures then we potentially get global signature failures for a transient issue on one task.

I'll need to Conforma team to help decide which option is more reasonable. I am leaning towards not caching failures though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question:

  1. Retries are exhausted before results reach the cache 2. A genuine verification failure would cause policy failure regardless 3. The misleading-error-pattern concern is real but rare, and the performance cost of not caching failures is a real tradeoff I can update this caching behavior if desired to not cache failures. If we were to do this then a valid bad signature would result in a much slower evaluation. With caching failures then we potentially get global signature failures for a transient issue on one task. I'll need to Conforma team to help decide which option is more reasonable. I am leaning towards not caching failures though.

Answer:

Add an instruction to make the agent treat caching/performance/reliability findings as “tradeoff discussions” unless it can prove a concrete bug, and to explicitly discuss both sides (transient failures vs. cost of retries) before recommending a change.

Suggested config:

[pr_code_suggestions]
extra_instructions = """\
When suggesting changes around caching, retries, or global state:
- Treat them as tradeoffs unless you can demonstrate a concrete incorrect behavior from the diff.
- Always include a short section: "Why this might be a bug" vs "Why this might be intentional".
- Explicitly call out performance impact (extra verification calls), reliability impact (transient failures persisting), and scope (process-wide vs request/worker).
- If the “right” behavior depends on team policy, ask a clarifying question and propose 2-3 options rather than a single directive fix.
- Avoid strong wording like "will prevent retries" unless the code clearly shows retries are possible later and the cache blocks them.
"""

If you also want fewer “debatable” suggestions overall, consider making suggestions more selective:

[pr_code_suggestions]
suggestions_depth = "selective"

Relevant Sources:

@joejstuart

Copy link
Copy Markdown
Contributor

Here's the performance stats I ran. We should try and look for improvements. If we can't, I think we'll have to send out communication on having to up memory requirements.

Current PR

Peak RSS (MB)

Comp w=1 w=35
1c 433.7MB
50c 711.3MB 5574.2MB
191c 5648.3MB

Wall Time (s)

Comp w=1 w=35
1c 33.5s
50c 1155.9s 103.0s
191c 363.6s

main branch

Peak RSS (MB)

Comp w=1 w=35
1c 358.2MB
50c 613.7MB 3136.7MB
191c 3988.8MB

Wall Time (s)

Comp w=1 w=35
1c 33.0s
50c 1005.4s 84.2s
191c 288.4s

@arewm

arewm commented Mar 2, 2026

Copy link
Copy Markdown
Contributor Author

@joejstuart , thanks for that analysis. Is this increase just if we are adding the signature verification? So users would not need to increase memory requirements unless they used that functionality?

If we were trying to keep ~200 components to finish evaluation in under 5 minutes, then we also fail that goal with the associated policy changes.

@joejstuart

Copy link
Copy Markdown
Contributor

@joejstuart , thanks for that analysis. Is this increase just if we are adding the signature verification? So users would not need to increase memory requirements unless they used that functionality?

If we were trying to keep ~200 components to finish evaluation in under 5 minutes, then we also fail that goal with the associated policy changes.

I tested it with the corresponding policy change. Should I test without it?

@arewm

arewm commented Mar 2, 2026

Copy link
Copy Markdown
Contributor Author

I think we needed the test with the policy change to understand the impact of that change. But it would be good to test it without as well to ensure that we don't have regression when that policy isn't being used.

@joejstuart

Copy link
Copy Markdown
Contributor

Without the policy change

Peak RSS (MB)

Comp w=1 w=35
1c 345.4MB
50c 621.9MB 3003.0MB
191c 4111.2MB

Wall Time (s)

Comp w=1 w=35
1c 29.3s
50c 928.7s 81.6s
191c 276.2s

@arewm

arewm commented Mar 3, 2026

Copy link
Copy Markdown
Contributor Author

Thanks, @joejstuart . These are just single runs or averages over multiple runs?

@arewm

arewm commented Mar 5, 2026

Copy link
Copy Markdown
Contributor Author

@joejstuart @simonbaird , would you be okay with this change along with conforma/policy#1680 ?

@joejstuart

joejstuart commented May 1, 2026

Copy link
Copy Markdown
Contributor

Now I'm not seeing much of a performance difference between this and main.

@simonbaird

Copy link
Copy Markdown
Member

/ok-to-test

@simonbaird simonbaird force-pushed the ec-1545/sigstore-verify-image-cache branch from c8a13b9 to c97787f Compare May 14, 2026 19:56
@coderabbitai

coderabbitai Bot commented May 14, 2026

Copy link
Copy Markdown

Caution

Review failed

An error occurred during the review process. Please try again later.

📝 Walkthrough

Walkthrough

The PR adds verification result caching to sigstore.VerifyImage builtin. A SHA-512 hash of the options term combined with the image reference forms the cache key. Concurrent requests for the same key are deduplicated via singleflight to avoid redundant verifications. The ClearCaches() function enables test isolation, and comprehensive tests validate cache behavior across reference changes, option variations, and cache clearing.

Changes

Sigstore verification caching

Layer / File(s) Summary
Cache infrastructure and exports
internal/rego/sigstore/sigstore.go
Imports for hashing (crypto/sha512, encoding/hex) and singleflight are added. Package-level state includes verifyImageCache (sync.Map) for results and verifyImageFlight (singleflight.Group) for deduplication. Exported ClearCaches() resets the cache for test isolation.
Cache key generation and builtin caching
internal/rego/sigstore/sigstore.go
buildCacheKey() helper hashes the options term with SHA-512 and appends to image ref. sigstoreVerifyImage() builtin is reworked to compute cache keys, return cached results on hit, use singleflight to deduplicate concurrent requests for the same key, and store newly computed results back into cache.
Test isolation and cache validation
internal/rego/sigstore/sigstore_test.go
Existing test clears cache before subtests to prevent result leakage. New TestVerifyImageCache() validates cache hits on repeated ref+options (single VerifyImageSignatures call), misses when ref or options differ, and forced re-verification after ClearCaches() via mock call counts.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: adding caching for verify_image results across policy evaluations.
Description check ✅ Passed The description is directly related to the changeset, explaining the caching mechanism, motivation, and test plan for the sigstore verification cache implementation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@simonbaird

Copy link
Copy Markdown
Member

Rebased to see if it gets closer to green.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/rego/sigstore/sigstore.go (1)

480-483: ⚡ Quick win

ClearCaches reassigning the sync.Map value is racy.

verifyImageCache is a package-level value (not a pointer); reassigning it while other goroutines may still be calling Load/Store on it is a data race. It happens to work in serial test runs, but go test can run tests from other packages or the same package in parallel, and the singleflight callback above also touches it from inner goroutines. Prefer clearing in place.

-func ClearCaches() {
-	verifyImageCache = sync.Map{}
-}
+func ClearCaches() {
+	verifyImageCache.Range(func(k, _ any) bool {
+		verifyImageCache.Delete(k)
+		return true
+	})
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/rego/sigstore/sigstore.go` around lines 480 - 483, ClearCaches
currently reassigns the package-level sync.Map verifyImageCache which is racy;
instead clear the map in place to avoid races: in ClearCaches iterate the
existing verifyImageCache (use verifyImageCache.Range) and delete each key
(verifyImageCache.Delete) so any concurrent Load/Store operations remain safe;
update the ClearCaches function to perform in-place deletion rather than
reassigning verifyImageCache.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/rego/sigstore/sigstore_test.go`:
- Around line 224-241: The newFakeContext helper misses a mock for
fake.FakeClient.HasBundles which doVerifyImage calls before
VerifyImageSignatures; update newFakeContext to add an expectation on c (the
*fake.FakeClient) for HasBundles (use mock.Anything for args) and return the
same boolean/error used in other tests (e.g., false, nil) so the HasBundles call
is explicitly verified before VerifyImageSignatures is exercised.

In `@internal/rego/sigstore/sigstore.go`:
- Around line 140-149: The current wrapper unconditionally stores the term
returned by doVerifyImage in verifyImageCache via verifyImageCache.Store,
causing transient verification failures to become permanently cached; change the
flow so only cacheable (deterministic/success) results are stored: have
doVerifyImage indicate cacheability (e.g., return a second bool or make the
ast.Term include a "success"/"cacheable" flag) and update the
verifyImageFlight.Do callback/result handling to inspect that flag and call
verifyImageCache.Store only when cacheable/successful; keep verifyImageFlight.Do
and verifyImageCache usage but skip storing non-cacheable failure terms so
transient errors are not persisted.

---

Nitpick comments:
In `@internal/rego/sigstore/sigstore.go`:
- Around line 480-483: ClearCaches currently reassigns the package-level
sync.Map verifyImageCache which is racy; instead clear the map in place to avoid
races: in ClearCaches iterate the existing verifyImageCache (use
verifyImageCache.Range) and delete each key (verifyImageCache.Delete) so any
concurrent Load/Store operations remain safe; update the ClearCaches function to
perform in-place deletion rather than reassigning verifyImageCache.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 2a574443-5227-40c2-9e2c-e9fffd56ad98

📥 Commits

Reviewing files that changed from the base of the PR and between 3743934 and c97787f.

📒 Files selected for processing (2)
  • internal/rego/sigstore/sigstore.go
  • internal/rego/sigstore/sigstore_test.go

Comment thread internal/rego/sigstore/sigstore_test.go
Comment on lines +140 to +149
// Use singleflight to deduplicate concurrent requests. doVerifyImage never returns a Go
// error — failures are embedded in the result term — so the error is intentionally discarded.
resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
return doVerifyImage(bctx, logger, uri, optsTerm)
})

result := resultIface.(*ast.Term)
verifyImageCache.Store(cacheKey, result)
return result, nil
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Transient verification failures are cached for the process lifetime.

doVerifyImage embeds all failures in the returned result term (per the file-level comment, the builtin never returns a Go error). The wrapper stores that result unconditionally in verifyImageCache, so any transient error — e.g. a registry/network failure surfacing from VerifyImageSignatures/VerifyImageAttestations, or a Rekor lookup hiccup — becomes a sticky “failed” verdict for ref|opts for the rest of the process. Subsequent components sharing the same pinned bundle will then deterministically report failure even though a fresh call would succeed. For long evaluations (the 191-component runs mentioned in the PR) this can convert a single flaky call into a wave of false negatives.

Consider only caching successful results, or distinguishing deterministic-input failures (new digest, opts parameter) from verification-call failures and not caching the latter.

🛡️ Sketch: only cache successes

Have doVerifyImage return (*ast.Term, bool /*cacheable*/, error) (or inspect result.Get("success")) and skip verifyImageCache.Store when the result represents a non-deterministic failure path.

-	resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
-		return doVerifyImage(bctx, logger, uri, optsTerm)
-	})
-
-	result := resultIface.(*ast.Term)
-	verifyImageCache.Store(cacheKey, result)
-	return result, nil
+	resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
+		return doVerifyImage(bctx, logger, uri, optsTerm)
+	})
+	result := resultIface.(*ast.Term)
+	if success, ok := result.Get(ast.StringTerm("success")).Value.(ast.Boolean); ok && bool(success) {
+		verifyImageCache.Store(cacheKey, result)
+	}
+	return result, nil
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Use singleflight to deduplicate concurrent requests. doVerifyImage never returns a Go
// error — failures are embedded in the result term — so the error is intentionally discarded.
resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
return doVerifyImage(bctx, logger, uri, optsTerm)
})
result := resultIface.(*ast.Term)
verifyImageCache.Store(cacheKey, result)
return result, nil
}
// Use singleflight to deduplicate concurrent requests. doVerifyImage never returns a Go
// error — failures are embedded in the result term — so the error is intentionally discarded.
resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
return doVerifyImage(bctx, logger, uri, optsTerm)
})
result := resultIface.(*ast.Term)
if success, ok := result.Get(ast.StringTerm("success")).Value.(ast.Boolean); ok && bool(success) {
verifyImageCache.Store(cacheKey, result)
}
return result, nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/rego/sigstore/sigstore.go` around lines 140 - 149, The current
wrapper unconditionally stores the term returned by doVerifyImage in
verifyImageCache via verifyImageCache.Store, causing transient verification
failures to become permanently cached; change the flow so only cacheable
(deterministic/success) results are stored: have doVerifyImage indicate
cacheability (e.g., return a second bool or make the ast.Term include a
"success"/"cacheable" flag) and update the verifyImageFlight.Do callback/result
handling to inspect that flag and call verifyImageCache.Store only when
cacheable/successful; keep verifyImageFlight.Do and verifyImageCache usage but
skip storing non-cacheable failure terms so transient errors are not persisted.

@simonbaird

Copy link
Copy Markdown
Member

/ok-to-test

@st3penta st3penta marked this pull request as draft June 10, 2026 12:42
@arewm

arewm commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

@st3penta , did you move this to draft because there is something conflicting or just because there hasn't been much review volume?

Are there any concerns with merging this?

@st3penta

Copy link
Copy Markdown
Contributor

hi @arewm, i moved it to draft because it seemed a bit stale, but talking with the team we agreed we should instead get it green and reviewed, and merge it

@st3penta st3penta marked this pull request as ready for review June 16, 2026 08:02

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.


result := resultIface.(*ast.Term)
verifyImageCache.Store(cacheKey, result)
return result, nil

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[high] logic-error

Error results (including transient failures) are cached permanently. When doVerifyImage encounters a transient error (network timeout, registry unavailability), it returns a failure *ast.Term with success: false. This result is unconditionally stored via verifyImageCache.Store(cacheKey, result). Subsequent calls with the same ref+opts return the cached failure without retrying. This contradicts the pattern in oci.go where .Store() is called only on the success path.

Suggested fix: Only cache successful results. Check the success field of the result term before storing, or restructure doVerifyImage to return nil on failure (matching the oci.go pattern) and only call verifyImageCache.Store when result is non-nil.

}

// Use singleflight to deduplicate concurrent requests. doVerifyImage never returns a Go
// error — failures are embedded in the result term — so the error is intentionally discarded.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[medium] error-handling-gap

The singleflight result and error are both discarded. While doVerifyImage currently never returns a Go error, its signature is (*ast.Term, error). If later modified to return an error, resultIface will be nil and the type assertion will panic. The oci.go pattern handles this with a nil check.

Suggested fix: Add a nil check on resultIface before the type assertion, or change doVerifyImage to return only *ast.Term (no error).

return cached.(*ast.Term), nil
}

// Use singleflight to deduplicate concurrent requests. doVerifyImage never returns a Go

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[low] authorization-bypass-stale-cache

The verify_image cache has no TTL, eviction, or invalidation. Unlike OCI caches (content-addressed immutable data), signature verification results can change due to key revocation or certificate expiry. In a short-lived CLI process this is unlikely to be exploitable.

Suggested fix: Document the assumption that the cache is scoped to a single CLI invocation, or add a TTL.

}
logger = logger.WithField("ref", string(uri))

// Build cache key from ref + opts hash

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] cache-key-missing-context

The cache key is constructed from ref and hash(opts) only, without including context identity (credentials). Consistent with oci.go pattern but worth documenting the single-invocation assumption.

The rpms-signature-scan task is in the main catalog now, so no need
to use the non-standard catalog. Also update the digest.
Just noticed it needed fixing while creating previous commit.
I used the bash script like this:

    curl -sL https://github.com/simonbaird/konflux-pipeline-patcher/raw/main/pipeline-patcher  | bash -s bump-task-refs

Note that MintMaker has been disabled recently, so I think the usual
automated PRs are not available.
Commit created like this:

  hack/go-mod-upgrade-helper github.com/containerd/containerd  --ignore-tidy-error
Needed for the next commit.
Commit created like this:

  hack/go-mod-upgrade-helper github.com/containerd/containerd/v2  --ignore-tidy-error
@simonbaird

Copy link
Copy Markdown
Member

Might need a rebase on #3357

simonbaird and others added 11 commits June 26, 2026 08:44
It has gone stale lately, but it was rebased/refreshed recent in
EC-1866. Make sure we're using the refreshed version.

The upstream version we're now using is based on v0.27.7.

Ref: https://redhat.atlassian.net/browse/EC-1866

Co-authored-by: Claude Code <noreply@anthropic.com>
I always include the ubi bump commit in a larger chore PR, so making
a fresh branch and pushing it is pretty much never what I want.
Fixes CI build failure where kubectl v0.34.2 expected
k8s.io/api/scheduling/v1alpha1 which was removed in v0.36.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This helped me test the dependency change fixes in the previous
commit.
This fixes some go mod tidy errors that we've been ignoring for a
while.

Older knative.dev/eventing versions import
knative.dev/pkg/{metrics,tracing/config} which have been removed
upstream. Fix by excluding v0.30.3 in tools and requiring minimum
v0.49.2 in acceptance (the first version that dropped the removed
imports).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…36.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
My hope is that this gets our acceptance tests, particularly the
task_validate_image feature, passing again.

See conforma/go-containerregistry#7
Add a sync.Map + singleflight.Group cache for ec.sigstore.verify_image
results keyed by ref + opts hash. OPA's Memoize only deduplicates within
a single Eval() call, but components are validated in separate Eval()
calls. Since bundles are pinned by digest, verification results are
stable and can be safely cached for the process lifetime.

This prevents redundant signature verification when many components
share the same task bundles (e.g., 100 components using git-clone).

Ref: EC-1545

Assisted-by: Claude Code (Sonnet 4.6)
Signed-off-by: arewm <arewm@users.noreply.github.com>
The truncated 8-byte SHA-256 is vulnerable to offline collision attacks
since cache key inputs are observable across invocations. SHA-512
provides adequate post-quantum collision resistance for this
security-sensitive cache key.

Assisted-by: Claude Code (Sonnet 4.6)
Signed-off-by: arewm <arewm@users.noreply.github.com>
Should fix the current unit test failure.

Co-authored-by: Claude Code <noreply@anthropic.com>
@simonbaird simonbaird force-pushed the ec-1545/sigstore-verify-image-cache branch from e0a1cdb to 975e565 Compare June 26, 2026 19:55
@simonbaird simonbaird marked this pull request as draft June 26, 2026 19:56
@simonbaird

Copy link
Copy Markdown
Member

I rebased on all the commits from #3357 to see if it goes green. Would probably want to rebase again on updated main once 3357 is merged.

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 26, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:59 PM UTC · Completed 8:12 PM UTC
Commit: 47d3320 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

resultIface, _, _ := verifyImageFlight.Do(cacheKey, func() (interface{}, error) {
return doVerifyImage(bctx, logger, uri, optsTerm)
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[high] logic-error

verifyImageCache.Store(cacheKey, result) is called outside the singleflight callback and unconditionally caches both success and failure results. Transient failures (network errors, DNS blips) are permanently cached for the process lifetime. The OCI caching pattern in internal/rego/oci/oci.go stores inside the singleflight callback and only on the success path.

Suggested fix: Move verifyImageCache.Store inside the singleflight callback and only store when result indicates success.

@@ -210,6 +211,92 @@ func TestSigstoreVerifyImage(t *testing.T) {
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[medium] test-inadequate

TestVerifyImageCache only covers the happy path. No test exists for when verification fails, which per the current implementation would be cached permanently.

Suggested fix: Add a test case where the mock returns an error on VerifyImageSignatures, then verify that a subsequent call retries rather than returning the cached failure.

logger.Debug("using cached image signature verification result")
return cached.(*ast.Term), nil
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[medium] fail-open

The process-level sync.Map cache stores signature verification results indefinitely with no TTL, no size bound, and no invalidation. ClearCaches() exists but is only called in test code. Stale cached success results would persist after key rotation or signature revocation.

Suggested fix: Ensure ClearCaches() is called between policy evaluation runs in production code, or scope the cache to a single evaluation run via context.

@fullsend-ai-review fullsend-ai-review Bot added go Pull requests that update Go code docker Pull requests that update Docker code dependencies Pull requests that update a dependency file labels Jun 26, 2026
@simonbaird

Copy link
Copy Markdown
Member

/ok-to-test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file docker Pull requests that update Docker code go Pull requests that update Go code size: XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants