OCPBUGS-86774: Pin azure-cli to 2.72.0 in e2e Dockerfile#8638

Merged: openshift-merge-bot[bot] merged 1 commit into openshift:main from bryan-cox:fix-e2e-dockerfile-i686-exclude on May 29, 2026
Member

@bryan-cox bryan-cox commented May 29, 2026

Summary

  • Pin azure-cli to version 2.72.0 in Dockerfile.e2e to fix 100% e2e CI failure

Problem

After openshift/release#79773 switched CI RHEL 9 repos from mirror2.openshift.com (GA content) to cdn.redhat.com E4S/EUS endpoints, the hypershift-tests-amd64 Docker image build fails because:

  1. azure-cli >= 2.73.0 requires python3.12
  2. python3.12 is not available in E4S/EUS repos
  3. dnf install azure-cli picks the latest version (2.86.0), which cannot be installed

Error:

nothing provides python3.12 needed by azure-cli-2.86.0-1.el9.x86_64 from packages-microsoft-com-prod

Fix

Pin to azure-cli-2.72.0, the last version that depends on python3.9 (available in E4S). Version boundary verified from Microsoft's RHEL 9 repo metadata:

  • azure-cli <= 2.72.0 → requires python3.9
  • azure-cli >= 2.73.0 → requires python3.12
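
The version boundary above can be expressed as a small check. This is only an illustrative sketch in Python: the names `parse_version`, `required_python`, `installable`, and `AVAILABLE_IN_E4S` are hypothetical, the boundary values are the ones stated in this description, and the set of interpreters resolvable from the E4S/EUS repos is assumed to contain only python3.9 for the purposes of the example.

```python
# Illustrative sketch only: models the azure-cli -> Python dependency boundary
# described above. All names here are hypothetical.

# Assumption for this example: only python3.9 is resolvable from the CI repos.
AVAILABLE_IN_E4S = {"python3.9"}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.72.0' into (2, 72, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def required_python(azure_cli_version: str) -> str:
    """Return the Python package an azure-cli RPM needs, per the boundary above."""
    if parse_version(azure_cli_version) >= (2, 73, 0):
        return "python3.12"
    return "python3.9"


def installable(azure_cli_version: str) -> bool:
    """True if the required Python is in the (assumed) set of available packages."""
    return required_python(azure_cli_version) in AVAILABLE_IN_E4S


for candidate in ("2.72.0", "2.73.0", "2.86.0"):
    print(candidate, required_python(candidate), installable(candidate))
```

Under these assumptions, 2.72.0 is the newest version for which `installable` returns True, which is why the pin lands there rather than on the unversioned package name.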

Timeline

| Time (UTC) | Event |
|---|---|
| May 28 03:17 | Last clean e2e-aws pass |
| May 28 07:49 | openshift/release#79773 merged |
| May 28 08:04 | First DockerBuildFailed |
| May 29 05:38+ | 100% failure rate |

Test plan

  • Trigger /test e2e-aws to verify image builds successfully
  • Confirm e2e tests pass end-to-end

Fixes: OCPBUGS-86774

Summary by CodeRabbit

  • Chores
    • Improved the test environment Docker setup by pinning a CLI tool version in the final test image to ensure more reproducible and stable test builds and executions.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label ("Indicates that this PR references a valid Jira ticket of any type.") May 29, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label ("Indicates that a PR should not merge because it is a work in progress.") May 29, 2026
@openshift-ci-robot

@bryan-cox: This pull request explicitly references no jira issue.

Details

In response to this:

Summary

Context

The CI repo migration from mirror2 to CDN E4S/EUS repos (openshift/release#79773, merged May 28) broke the hypershift-tests Docker image build. The CDN repos include i686 (32-bit) package metadata whose dependencies (libpython3.12.so.1.0 for i686) cannot be satisfied by the E4S/EUS repos, causing dnf install azure-cli to fail with:

nothing provides libpython3.12.so.1.0 needed by python3.12-3.12.12-6.el9_8.i686 from rhel-9-codeready-builder-rpms

This has caused 100% failure rate on all e2e-aws CI since May 29 05:38 UTC.

Adding --exclude='*.i686' prevents dnf from considering 32-bit packages during dependency resolution. azure-cli is x86_64 only so this has no functional impact.

Test plan

  • e2e-aws job passes (image build succeeds with azure-cli installed)
  • Verify az --version works in the built image

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci Bot commented May 29, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@bryan-cox bryan-cox marked this pull request as ready for review May 29, 2026 15:14
@bryan-cox
Member Author

/test e2e-aws

@coderabbitai
Contributor

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

Walkthrough

The final stage of Dockerfile.e2e now installs a pinned azure-cli package version: the dnf install invocation was changed to install azure-cli-2.72.0 instead of the unpinned azure-cli package.

Suggested reviewers

  • muraee
  • Nirshal
🚥 Pre-merge checks | ✅ 11
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR modifies only Dockerfile.e2e (Docker configuration), not Ginkgo test files. Custom check for stable test names is not applicable to infrastructure/deployment configuration changes. |
| Test Structure And Quality | ✅ Passed | This PR only modifies Dockerfile.e2e (pinning azure-cli version), not Ginkgo test code. The custom check for test structure quality is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR only modifies Dockerfile.e2e (build config), not deployment manifests, operator code, or controllers. No topology-aware scheduling constraints are introduced. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | The PR modifies Dockerfile.e2e to pin azure-cli to 2.72.0 but does not add any new Ginkgo e2e tests. The check is only applicable when new Ginkgo tests are added. |
| No-Weak-Crypto | ✅ Passed | PR modifies Dockerfile.e2e to pin azure-cli version; contains no cryptographic algorithms, weak crypto, or secret comparisons. |
| Container-Privileges | ✅ Passed | PR only modifies Dockerfile.e2e to pin azure-cli version; introduces no privileged, hostPID, hostNetwork, hostIPC, SYS_ADMIN, or allowPrivilegeEscalation configurations in container/K8s manifests. |
| No-Sensitive-Data-In-Logs | ✅ Passed | The Dockerfile.e2e change pins azure-cli to 2.72.0 without logging sensitive data. RUN commands use public URLs and standard package management with no credential exposure. |
| Title check | ✅ Passed | The PR title accurately describes the main change: pinning azure-cli to version 2.72.0 in the e2e Dockerfile, which directly matches the file modification shown in the raw summary. |

@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label ("Indicates that a PR should not merge because it is a work in progress.") May 29, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from Nirshal and muraee May 29, 2026 15:15
@openshift-ci openshift-ci Bot added the approved label ("Indicates a PR has been approved by an approver from all required OWNERS files.") May 29, 2026
@bryan-cox bryan-cox changed the title from "NO-JIRA: Exclude i686 packages from azure-cli install in e2e Dockerfile" to "OCPBUGS-86774: Exclude i686 packages from azure-cli install in e2e Dockerfile" May 29, 2026
@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug label ("Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.") May 29, 2026
@openshift-ci-robot

@bryan-cox: This pull request references Jira Issue OCPBUGS-86774, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
Dockerfile.e2e (4)

10-12: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Build tools present in final image.

The final image intentionally reuses the builder base image to retain the Go toolchain (per the comment on line 10). This violates the container security principle of excluding build tools from the final runtime image, increasing the attack surface and image size.

Consider whether the go command is essential for e2e test execution. If possible, refactor to use a minimal runtime base image and copy only the required compiled binaries and runtime dependencies.

As per coding guidelines: "Multi-stage builds; no build tools in final image"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.e2e` around lines 10 - 12, The final image reuses the builder base
(registry.ci.openshift.org/...-golang-1.25-...) and therefore contains the go
toolchain; remove build tools from the runtime image by converting the
Dockerfile.e2e into a proper multi-stage build: keep the current image as the
build stage (reference to the existing FROM image) that compiles artifacts, then
add a lightweight runtime stage (e.g., scratch/distroless/ubi-minimal) that only
COPYs the compiled binaries and any needed runtime files; update ci-test-e2e.sh
(which expects the go command) to run tests against the compiled binary in the
runtime image or invoke go inside the build stage only, ensuring no go binary is
present in the final image.

1-1: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Base image does not comply with approved catalog.

The base image registry.ci.openshift.org/openshift/release:rhel-9-release-golang-1.25-openshift-4.23 is not UBI minimal or distroless from catalog.redhat.com, and uses a pinned version tag instead of a floating tag. As per coding guidelines, Red Hat base images should use floating tags (Red Hat manages updates), and containers should prefer UBI minimal or distroless images.

As per coding guidelines: "Base image: UBI minimal or distroless from catalog.redhat.com" and "Red Hat images: use floating tags (Red Hat manages updates)"

Also applies to: 12-12

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.e2e` at line 1, The Dockerfile uses a non-approved pinned base
image in the builder stage; replace the FROM image reference
`registry.ci.openshift.org/openshift/release:rhel-9-release-golang-1.25-openshift-4.23`
with an approved UBI minimal or distroless image from catalog.redhat.com and use
a floating tag (for example the appropriate UBI minimal Go builder image with a
floating tag like :latest or the recommended floating stream) while keeping the
builder stage name (AS builder) intact; update the single FROM line accordingly
so the build uses an approved, floating-tag Red Hat base image.

1-34: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Container runs as root, violating security requirements.

The Dockerfile does not specify a non-root USER directive, meaning the container will run as root by default. As per coding guidelines, containers must use a non-root user and never run as root.

🔒 Proposed fix to add non-root user
 COPY --from=builder /hypershift/hack/run-reqserving-e2e.sh /hypershift/hack/run-reqserving-e2e.sh

+RUN useradd -r -u 1001 -g 0 hypershift && \
+    chown -R 1001:0 /hypershift && \
+    chmod -R g=u /hypershift
+
 RUN rpm --import https://packages.microsoft.com/keys/microsoft.asc && \
     dnf install -y https://packages.microsoft.com/config/rhel/9/packages-microsoft-prod.rpm && \
     mv /etc/yum.repos.d/microsoft-prod.repo /etc/yum.repos.art/ci/ && \
     dnf install -y --exclude='*.i686' azure-cli && \
     dnf clean all
+
+USER 1001

As per coding guidelines: "USER non-root; never run as root"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.e2e` around lines 1 - 34, The image currently runs as root because
there's no USER specified; update the final stage (the second FROM block that
sets WORKDIR /hypershift and copies binaries like /hypershift/bin/test-e2e and
/hypershift/hack/ci-test-e2e.sh) to create a non-root user/group, ensure
ownership of runtime directories (/hypershift, /hypershift/bin,
/hypershift/hack) is changed to that user, and add a USER directive (e.g., a
dedicated uid/gid) before the image exits so the container runs unprivileged
instead of root; ensure the RUN step that installs packages still works in the
build but file ownership is corrected afterward so the non-root user can execute
the copied binaries and scripts.

1-34: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing HEALTHCHECK directive.

The Dockerfile does not define a HEALTHCHECK instruction. Health checks are essential for container orchestration platforms to determine container health and perform automatic recovery.

💚 Proposed fix to add HEALTHCHECK
 RUN rpm --import https://packages.microsoft.com/keys/microsoft.asc && \
     dnf install -y https://packages.microsoft.com/config/rhel/9/packages-microsoft-prod.rpm && \
     mv /etc/yum.repos.d/microsoft-prod.repo /etc/yum.repos.art/ci/ && \
     dnf install -y --exclude='*.i686' azure-cli && \
     dnf clean all
+
+HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+  CMD test -f /hypershift/bin/hypershift || exit 1

As per coding guidelines: "HEALTHCHECK defined"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.e2e` around lines 1 - 34, The Dockerfile lacks a HEALTHCHECK; add
a HEALTHCHECK instruction (e.g. using CMD-SHELL) after the final RUN block to
probe the container health by invoking a lightweight internal check such as
/hypershift/hack/ci-test-e2e.sh or a small command against
/hypershift/bin/hypershift, and include sensible options like --interval,
--timeout, --start-period and --retries so orchestration can detect and restart
unhealthy containers (place the HEALTHCHECK at the end of the Dockerfile,
referencing the existing /hypershift/hack/ci-test-e2e.sh or
/hypershift/bin/hypershift).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 4a8fcb19-e3c9-4901-81e7-495b6d7833e4

📥 Commits

Reviewing files that changed from the base of the PR and between 8b13140 and 6dc40a8.

📒 Files selected for processing (1)
  • Dockerfile.e2e

@codecov

codecov Bot commented May 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 40.68%. Comparing base (9b67f7b) to head (286c605).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8638      +/-   ##
==========================================
- Coverage   45.84%   40.68%   -5.17%     
==========================================
  Files         440      755     +315     
  Lines       52824    93363   +40539     
==========================================
+ Hits        24218    37985   +13767     
- Misses      26816    52645   +25829     
- Partials     1790     2733     +943     

see 315 files with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 34.70% <ø> (?) |
| cpo-hostedcontrolplane | 41.80% <ø> (ø) |
| cpo-other | 41.39% <ø> (ø) |
| hypershift-operator | 50.82% <ø> (ø) |
| other | 31.61% <ø> (?) |

azure-cli >= 2.73.0 requires python3.12, which is not available
in the E4S/EUS repos that CI now uses after openshift/release#79773
switched from mirror2.openshift.com (GA content) to cdn.redhat.com.

Pin to the last version (2.72.0) that depends on python3.9, which
is available in E4S.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bryan-cox bryan-cox force-pushed the fix-e2e-dockerfile-i686-exclude branch from 6dc40a8 to 286c605 Compare May 29, 2026 15:44
@bryan-cox
Member Author

/test e2e-aws

@cwbotbot

cwbotbot commented May 29, 2026

Test Results

e2e-aws

e2e-aks

@bryan-cox
Member Author

/test e2e-aks

@bryan-cox bryan-cox changed the title from "OCPBUGS-86774: Exclude i686 packages from azure-cli install in e2e Dockerfile" to "OCPBUGS-86774: Pin azure-cli to 2.72.0 in e2e Dockerfile" May 29, 2026
@bryan-cox
Member Author

/area ci-tooling

@openshift-ci openshift-ci Bot added the area/ci-tooling label ("Indicates the PR includes changes for CI or tooling") and removed the do-not-merge/needs-area label May 29, 2026
@bryan-cox
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/severity-important label ("Referenced Jira bug's severity is important for the branch this PR is targeting.") May 29, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-bug label ("Indicates that a referenced Jira bug is valid for the branch this PR is targeting.") and removed the jira/invalid-bug label ("Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.") May 29, 2026
@openshift-ci-robot

@bryan-cox: This pull request references Jira Issue OCPBUGS-86774, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

@bryan-cox
Member Author

/test e2e-aws

@csrwng
Contributor

csrwng commented May 29, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label ("Indicates that a PR is ready to be merged.") May 29, 2026
@openshift-merge-bot
Contributor

Second-stage tests were triggered manually. The pipeline can only be controlled manually until HEAD changes. Use the /pipeline required command to trigger the second stage.

@bryan-cox
Member Author

/verified by e2e

@bryan-cox
Member Author

/pipeline required

@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@openshift-ci-robot openshift-ci-robot added the verified label ("Signifies that the PR passed pre-merge verification criteria") May 29, 2026
@openshift-ci-robot

@bryan-cox: This PR has been marked as verified by e2e.

@bryan-cox
Member Author

/override codecov/project

@openshift-ci
Contributor

openshift-ci Bot commented May 29, 2026

@bryan-cox: Overrode contexts on behalf of bryan-cox: codecov/project

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 3db064b and 2 for PR HEAD 286c605 in total

@openshift-ci
Contributor

openshift-ci Bot commented May 29, 2026

@bryan-cox: all tests passed!

Full PR test history. Your PR dashboard.

@openshift-merge-bot openshift-merge-bot Bot merged commit 2a57a32 into openshift:main May 29, 2026
32 of 33 checks passed
@openshift-ci-robot

@bryan-cox: Jira Issue Verification Checks: Jira Issue OCPBUGS-86774
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-86774 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

@hypershift-jira-solve-ci

I now have the complete picture. Here is the analysis:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

codecov/project: 40.68% (-5.17%) compared to 9b67f7b

Summary

The codecov/project check failed because Codecov reported a -5.17% project coverage drop (from 45.84% → 40.68%). This is a false positive entirely caused by a stale baseline comparison. The PR itself only modifies Dockerfile.e2e (a non-Go, non-coverable file), changes zero lines of executable code, and Codecov itself confirms "All modified and coverable lines are covered by tests." The coverage drop is an artifact of Codecov comparing against a base commit (9b67f7b) that was 21 commits behind main, during which 315 files were added to the repository by other merges — inflating the denominator of uncovered lines that this PR's report inherits.

Root Cause

The root cause is a stale Codecov baseline, not any issue with this PR's changes.

How Codecov computes project coverage: Codecov compares the full-repo coverage report uploaded for the PR head commit (286c605) against the most recent coverage report it has for the base branch. In this case, the base report corresponds to commit 9b67f7b (the merge of PR #8633), which was 21 commits behind main at the time of comparison.

Why coverage appears to drop: Between 9b67f7b and the PR head, 315 new Go source files were added to the repository by other merged PRs. These new files contain ~40,539 new lines of code, of which only ~13,767 are covered by tests. This shifts the project-wide coverage ratio downward from 45.84% to 40.68%. However, none of these file additions are part of PR #8638 — the PR modifies only Dockerfile.e2e, changing a single dnf install line to pin azure-cli to version 2.72.0.
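
The percentages quoted here can be reproduced from the hit and line counts in the Codecov report. The following is a hedged illustration in Python, not Codecov's implementation: the helper name `floor_pct` is hypothetical, and rounding down to two decimals is an assumption chosen only because it matches the displayed figures.

```python
import math


def floor_pct(value: float) -> float:
    """Round a percentage down to two decimals (assumed display rule)."""
    return math.floor(value * 100) / 100


# Hits and total lines taken from the Codecov report on this PR.
base_hits, base_lines = 24218, 52824
head_hits, head_lines = 37985, 93363

base_cov = base_hits / base_lines * 100
head_cov = head_hits / head_lines * 100

# Coverage of only the lines that differ between the two reports.
delta_hits = head_hits - base_hits      # 13767
delta_lines = head_lines - base_lines   # 40539
delta_cov = delta_hits / delta_lines * 100

print(floor_pct(base_cov))             # 45.84
print(floor_pct(head_cov))             # 40.68
print(floor_pct(head_cov - base_cov))  # -5.17
print(round(delta_cov, 2))             # 33.96
```

The extra lines in the head report are covered at roughly 34%, well below the 45.84% base figure, and that alone accounts for the lower project total. The arithmetic says nothing about where those lines came from.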

Why the baseline was stale: Codecov's base report lags behind main when coverage uploads for intermediate commits are missing or delayed. The .codecov.yml has wait_for_ci: false, meaning Codecov doesn't wait for all CI jobs to finish before processing. If coverage-uploading CI jobs for intermediate merges didn't run or didn't upload successfully, the base report remains stuck at an older commit.

Threshold behavior: The .codecov.yml does not define an explicit coverage.status.project.default.threshold. Without one, Codecov uses its default behavior: fail the check if project coverage decreases by any amount. The -5.17% phantom drop triggers this default threshold.

Recommendations
  1. No action needed on this PR — the failure is a false positive. The PR was correctly merged, and the codecov/project check is not a required status check for this repository.

  2. Add an explicit threshold to .codecov.yml to prevent future false positives from stale baselines:

    coverage:
      status:
        project:
          default:
            threshold: 1%   # Allow up to 1% drop to tolerate stale baselines

  3. Consider setting informational: true on the project coverage check if the team does not want Codecov to block PRs:

    coverage:
      status:
        project:
          default:
            informational: true

  4. Ensure coverage uploads run on every merge to main — the 21-commit staleness suggests that coverage-uploading CI jobs may not be running consistently on post-merge commits.
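
How a threshold would interact with this particular drop can be checked in a few lines of Python. This is a simplified model of the behavior described above (any decrease fails unless it is within the threshold), not Codecov's actual logic, and `project_check_passes` is a hypothetical helper.

```python
def project_check_passes(base_cov: float, head_cov: float,
                         threshold: float = 0.0) -> bool:
    """Simplified model: pass if coverage dropped by no more than `threshold`
    percentage points. threshold=0.0 means any decrease fails."""
    drop = base_cov - head_cov
    return drop <= threshold


# Displayed figures from the Codecov report on this PR.
base, head = 45.84, 40.68

# Default behavior as described above: any decrease fails.
print(project_check_passes(base, head))
# A 1% threshold: the drop here is larger than 1 point, so it still fails.
print(project_check_passes(base, head, threshold=1.0))
# A small drop of about half a point is tolerated by a 1% threshold.
print(project_check_passes(45.84, 45.30, threshold=1.0))
```

Under this model, a 1% threshold on its own would not have turned the -5.17% result green. A threshold mainly helps with small fluctuations, so the informational setting in item 3 is the option that addresses a drop of this size.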

Evidence
| Evidence | Detail |
|---|---|
| PR changes | Single file: Dockerfile.e2e, pins azure-cli to version 2.72.0 (non-Go, non-coverable) |
| Codecov verdict on PR lines | "All modified and coverable lines are covered by tests" ✅ |
| Base commit | 9b67f7b (merge of PR #8633), 21 commits behind main |
| Head commit | 286c605, only 1 commit ahead of base, changing 1 file |
| Reported coverage delta | 45.84% → 40.68% = -5.17% |
| Files delta | 440 → 755 = +315 files (from other merges, not this PR) |
| Lines delta | 52,824 → 93,363 = +40,539 lines (from other merges, not this PR) |
| Codecov config threshold | None set; default "any decrease = failure" applies |
| wait_for_ci setting | false; contributes to stale base reports |
| PR merge status | Merged at 2026-05-29T23:42:10Z; failure was non-blocking |

@openshift-merge-robot
Contributor

Fix included in release 5.0.0-0.nightly-2026-05-30-072431

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/ci-tooling: Indicates the PR includes changes for CI or tooling
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
