NO-JIRA: feat(skills): add validate-pr-override-images skill#8616
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@enxebre: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a Claude skill doc, two shell scripts, a GitHub Actions workflow, and updated override entries. The validator parses a PR body for branch->PR mappings, extracts added cpoImage digests from the PR diff, deduplicates images by digest, and calls verify-pr-in-image.sh for each image/PR. The verifier reads an image's vcs-ref via skopeo, ensures the commit exists (fetching if needed), searches git history for merge/cherry-pick markers for the PR, and returns PASS/FAIL. CI runs the validator on PRs that modify overrides.yaml and fails the job if any verification fails. Sequence Diagram(s)sequenceDiagram
participant GitHubActions
participant Runner
participant Validator as validate-overrides.sh
participant GH as gh
participant Verifier as verify-pr-in-image.sh
participant Skopeo as skopeo
participant Registry as ContainerRegistry
participant Repo as GitRepo
GitHubActions->>Runner: trigger on PR modifying overrides.yaml
Runner->>Validator: run validate-overrides.sh --pr <N> (with GH_TOKEN)
Validator->>GH: gh pr view / gh pr diff (read body + diff)
Validator->>Verifier: verify-pr-in-image.sh <image> <pr>
Verifier->>Skopeo: skopeo inspect <image> (read vcs-ref)
Skopeo->>Registry: fetch image metadata
Verifier->>Repo: git cat-file -e <sha> (fetch --all if missing)
Verifier->>Repo: git log --oneline (search for PR markers)
Verifier->>Validator: PASS / FAIL
Validator->>Runner: aggregate results, exit code
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error)
✅ Passed checks (10 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: enxebre The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/validate-pr-override-images/SKILL.md:
- Around line 10-12: The markdown in
.claude/skills/validate-pr-override-images/SKILL.md uses unlabeled fenced code
blocks (for the example command "/validate-pr-override-images
<PR-URL-or-number>" and the example sections "Validate a PR by number" and
"Validate a PR by URL" including their outputs) which triggers MD040; update
each triple-backtick fence around those examples to include a language
identifier (use "text" for these CLI output blocks) so the fences read ```text
instead of ``` to satisfy linting. Ensure you change every occurrence around the
command and its example outputs (the blocks containing
"/validate-pr-override-images 8610", the example output lines starting with
"Image:" and "Overall: PASS", and the "/validate-pr-override-images
https://github.com/openshift/hypershift/pull/8610" block) so all unlabeled
fences are labeled consistently.
In @.claude/skills/validate-pr-override-images/verify-pr-in-image.sh:
- Around line 8-10: Add argument-count validation at the top of the script
before reading positional params: check $# and if fewer than 2 print a concise
usage message mentioning the script name and expected args (IMAGE PR [REPO]) and
exit non-zero; then safely assign IMAGE="$1", PR="$2", REPO="${3:-.}". This
prevents unbound-variable errors under set -u and gives a clear error when IMAGE
or PR are missing.
- Around line 15-20: COMMIT is set from podman inspect and Podman returns the
literal string "<no value>" when the label is missing, so update the validation
after the COMMIT assignment in verify-pr-in-image.sh to treat both empty string
and the literal "<no value>" as missing; change the condition that currently
checks [[ -z "$COMMIT" ]] to also check for "$COMMIT" == "<no value>" (or use a
pattern match) and exit with the error message if either case is true so later
logic does not attempt to use "<no value>" as a commit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 88b5c93f-05e2-4694-b912-363bc695c51f
📒 Files selected for processing (2)
.claude/skills/validate-pr-override-images/SKILL.md.claude/skills/validate-pr-override-images/verify-pr-in-image.sh
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8616 +/- ##
=======================================
Coverage 41.27% 41.27%
=======================================
Files 755 755
Lines 93446 93446
=======================================
Hits 38566 38566
Misses 52148 52148
Partials 2732 2732
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
8a55aeb to
9b3a02e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.claude/skills/validate-pr-override-images/verify-pr-in-image.sh (1)
34-34: Be aware of performance on repos with deep history.The
git log --oneline "$COMMIT"command searches the entire commit history from$COMMITback to the root. On repositories with thousands of commits, this can be slow. However, limiting the search (e.g., with--max-count) risks missing legitimate PR inclusions, so the current approach is the safest for correctness.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/skills/validate-pr-override-images/verify-pr-in-image.sh at line 34, The git log call that sets FOUND (FOUND=$(git -C "$REPO" log --oneline "$COMMIT" ...)) can be very slow on repos with deep history; change it to a two-step approach: first search a bounded recent window (e.g., use git -C "$REPO" log --oneline --max-count="${PR_SEARCH_LIMIT:-10000}" "$COMMIT") and if not found fall back to the full search only when necessary. Update the assignment to FOUND (and any related logic) in verify-pr-in-image.sh to implement the configurable --max-count fast path with a fallback to the original command so correctness is preserved but typical runs are much faster.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/validate-pr-override-images/verify-pr-in-image.sh:
- Around line 29-32: After fetching, re-check the commit existence using the
same git command to ensure the commit was found; if git -C "$REPO" cat-file -e
"$COMMIT" still fails, print a clear error like "Commit $COMMIT not found in any
remote for repo $REPO" and exit with non-zero status. Update the block that
currently uses git -C "$REPO" cat-file -e "$COMMIT" (and the subsequent git -C
"$REPO" fetch --all --quiet) to perform this post-fetch verification and handle
the failure path explicitly.
---
Nitpick comments:
In @.claude/skills/validate-pr-override-images/verify-pr-in-image.sh:
- Line 34: The git log call that sets FOUND (FOUND=$(git -C "$REPO" log
--oneline "$COMMIT" ...)) can be very slow on repos with deep history; change it
to a two-step approach: first search a bounded recent window (e.g., use git -C
"$REPO" log --oneline --max-count="${PR_SEARCH_LIMIT:-10000}" "$COMMIT") and if
not found fall back to the full search only when necessary. Update the
assignment to FOUND (and any related logic) in verify-pr-in-image.sh to
implement the configurable --max-count fast path with a fallback to the original
command so correctness is preserved but typical runs are much faster.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 9b0a8b55-2c5a-4b6e-83c4-a3869f85f320
📒 Files selected for processing (2)
.claude/skills/validate-pr-override-images/SKILL.md.claude/skills/validate-pr-override-images/verify-pr-in-image.sh
✅ Files skipped from review due to trivial changes (1)
- .claude/skills/validate-pr-override-images/SKILL.md
|
Hey @enxebre, thanks for putting this together — the idea of validating override images is solid and I've been thinking about this gap too. My concern is that as a Claude skill this relies on someone remembering to run it. If we forget, we're back to square one. Also the script has some fragilities: it needs podman running locally, depends on merge commit message format (squash merges or different cherry-pick conventions would break it), and git fetch --all can be slow. What if instead of a manual skill we make this a CI check that runs automatically on every PR touching overrides.yaml? We already have the pattern for this:
This way every override PR gets validated automatically — no manual step, no podman dependency, and it catches the most common mistakes (wrong digest, image doesn't exist, duplicate versions). WDYT? |
These are meant to be building blocks for CI automation. You could possibly do the second step deterministically as part of the script without llm, agreeing on a contractual consistent PR desc for overrides changes. |
I just pushed an absolutely untested commit so you can see the idea if doing that deterministically results too convoluted, we can still run the skill from GH actions, it's just a building block /hold |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/validate-cpo-overrides.yaml (1)
14-27:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftAdd SAST/SCA coverage in this workflow path.
This pipeline introduces security-sensitive parsing/validation logic but has no SAST/SCA step (or reusable security workflow invocation).
As per coding guidelines, ".github/workflows/**/*: ... SAST/SCA steps in pipeline".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/validate-cpo-overrides.yaml around lines 14 - 27, The workflow job validate-cpo-overrides lacks SAST/SCA scanning; add a security scan step or call the organization's reusable SAST/SCA workflow in the validate-cpo-overrides job so the validation script (.claude/skills/validate-pr-override-images/validate-overrides.sh) runs under security analysis; update the job to include either a reusable workflow invocation (e.g., call the repo's standard SAST/SCA workflow) or insert the approved scanner step(s) (SCA dependency scan and SAST/static-analysis) before or after the existing run step, ensuring GH_TOKEN or required secrets are set in env and the step uses the same runs-on and timeout settings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/validate-pr-override-images/validate-overrides.sh:
- Around line 39-55: The script increments FOUND_LINES and records BRANCH_PRS
even when the extracted pr_numbers is empty, allowing a silent pass; modify the
logic in the block that processes wants (and the similar block at the other
location) to validate that at least one PR ID was extracted before incrementing
FOUND_LINES and assigning BRANCH_PRS["$branch"], and if no IDs are found either
emit an error and exit non-zero or skip/continue so the script fails fast;
reference the variables and arrays pr_numbers, FOUND_LINES, BRANCH_PRS, wants
and the branch handling code to locate and update the conditional flow.
- Around line 77-93: The loop only captures version when the diff line is an
added line, so cpoImage changes that appear under an unchanged or context
version line get missed; modify the version-detection branch inside the
while-read loop in validate-overrides.sh to match version: lines regardless of
the leading diff symbol (e.g., accept lines starting with +, -, or space) and
set current_version from those matches (trim as needed), so subsequent cpoImage
matches (the image extraction branch and BRANCH_IMAGES population) will
correctly associate images with the last-seen version; keep using the same
variables BRANCH_IMAGES and current_version so the rest of the logic remains
unchanged.
---
Outside diff comments:
In @.github/workflows/validate-cpo-overrides.yaml:
- Around line 14-27: The workflow job validate-cpo-overrides lacks SAST/SCA
scanning; add a security scan step or call the organization's reusable SAST/SCA
workflow in the validate-cpo-overrides job so the validation script
(.claude/skills/validate-pr-override-images/validate-overrides.sh) runs under
security analysis; update the job to include either a reusable workflow
invocation (e.g., call the repo's standard SAST/SCA workflow) or insert the
approved scanner step(s) (SCA dependency scan and SAST/static-analysis) before
or after the existing run step, ensuring GH_TOKEN or required secrets are set in
env and the step uses the same runs-on and timeout settings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: ecfa5447-2ecb-422b-8d02-f7a853e2a9c5
📒 Files selected for processing (3)
.claude/skills/validate-pr-override-images/validate-overrides.sh.claude/skills/validate-pr-override-images/verify-pr-in-image.sh.github/workflows/validate-cpo-overrides.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/validate-pr-override-images/verify-pr-in-image.sh
| 5. Reporting a pass/fail summary per image and per PR | ||
|
|
||
| Prerequisites: | ||
| - `podman` must be running (`podman machine start` if needed) |
There was a problem hiding this comment.
Shouldn't it also be pointed to a valid pull secret? some overrides can be to actually released newer Z streams, for which skopeo would need the pull secret
jparrill
left a comment
There was a problem hiding this comment.
Dropped some more comments. Thanks!
| ``` | ||
|
|
||
| The script: | ||
| - Pulls the image via `podman` |
There was a problem hiding this comment.
I've tested skopeo command in local and it not perform a pull of the image (podman pull docker://xxxxx) to grab the details, so this needs to be adjusted.
There was a problem hiding this comment.
Right, the advantage of using skopeo is that it does NOT make the pull
| 5. Reporting a pass/fail summary per image and per PR | ||
|
|
||
| Prerequisites: | ||
| - `podman` must be running (`podman machine start` if needed) |
There was a problem hiding this comment.
I would say that this makes sense in darwin environments, but in a regular Linux env this is not necessary, only the service running if applies.
|
Actionable comments posted: 0 |
6d472cc to
da93754
Compare
07ec0e4 to
d9efdad
Compare
d9efdad to
93e2f60
Compare
Empty commit to retrigger the validate-cpo-overrides workflow after updating the PR description to claim PR openshift#8616 (this PR, which is not in the CPO release images) to verify the check fails correctly.
c7cfb5f to
2a9e3a3
Compare
|
/hold cancel |
…description Adds a GitHub Actions workflow that triggers when CPO override images are modified. It parses the PR description for a contract specifying which PRs each branch override should contain, then validates the images using skopeo and git history. Also adds a Claude Code skill wrapping the same validation script for local use.
2a9e3a3 to
921e551
Compare
|
Now I have the complete root cause. Let me produce the final report. Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThe validation workflow fails because the PR description's Root CauseThe failure is caused by a circular self-reference in the test data. Here is the exact causal chain:
In summary: The test data is self-referential and impossible to satisfy. The override images are real pre-existing builds that cannot contain PR #8616 (which introduces the validation tooling itself, not a CPO code fix). The PR claims these images contain PR #8616, but they don't and can't. Recommendations
Evidence
|
|
/hold cancel |
|
/lgtm |
|
/hold |
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
|
/hold cancel |
|
/verified by @enxebre |
|
@bryan-cox: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Summary
validate-overrides.sh— orchestrates the full validation: parses a structured contract from the PR description, extracts override images per branch from the diff, and validates each image contains the claimed PRs usinggit merge-base --is-ancestorverify-pr-in-image.sh— inspects a container image via skopeo, reads itsvcs-reflabel, and checks whether a given PR merge commit is an ancestor of the image's build commitSKILL.md— Claude Code skill wrapper that invokesvalidate-overrides.shoverrides.yamland runs the validation automaticallyDepends on #8627 to add skopeo and gh CLI to the ARC runner image.
PR description contract
PR authors modifying CPO overrides must include lines in the PR description like:
The workflow fails if no such lines are found.
Usage
Or run the script directly:
Test plan
verify-pr-in-image.shreturns PASS for a known-good image+PR combinationverify-pr-in-image.shreturns FAIL (exit 1) for a PR not in the imagevalidate-overrides.sh 8616parses contract from this PR description and validates images🤖 Generated with Claude Code