Add openshift-e2e-aws-disconnected workflow #77922
Walkthrough
Adds CI artifacts to run disconnected AWS E2E: a new optional test and presubmit, pre- and post-provision step-registry chains, the openshift-e2e-aws-disconnected workflow, metadata, and OWNERS entries.
Sequence Diagram(s)
sequenceDiagram
participant Runner as Test Runner
participant Pre as Pre-Provision Chain
participant Bastion as Bastion Host
participant Mirror as Mirror Registry
participant Cluster as AWS Cluster
participant E2E as E2E Workflow
participant Post as Post-Deprovision Chain
Runner->>Pre: start ipi-aws-pre-disconnected
Pre->>Cluster: provision VPC/subnets, IAM, bots
Pre->>Bastion: provision bastion and mirror
Pre->>Mirror: mirror payload to bastion
Pre-->>Runner: cluster installed
Runner->>E2E: run openshift-e2e-aws-disconnected
E2E->>Bastion: use bastion for mirror/proxy/ssh
E2E->>Cluster: execute tests
E2E-->>Runner: tests complete
Runner->>Post: start ipi-aws-post-disconnected
Post->>Cluster: gather logs/console artifacts
Post->>Mirror: collect mirror registry content
Post->>Cluster: deprovision SGs/CFN/IAM
Post-->>Runner: teardown complete
|
/pj-rehearse |
|
@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ci-operator/step-registry/ipi/aws/pre/disconnected/ipi-aws-pre-disconnected-chain.yaml`:
- Around line 27-30: Remove the -x (xtrace) flag before sourcing the proxy
config to prevent secrets from being printed: modify the script where it
currently sets "set -exuo pipefail" so that xtrace is disabled (e.g., use "set
-euo pipefail" or temporarily turn off xtrace) prior to the conditional that
sources "${SHARED_DIR}/proxy-conf.sh", then re-enable xtrace afterwards if
needed; update the lines referencing set -exuo pipefail and the source
"${SHARED_DIR}/proxy-conf.sh" accordingly.
📒 Files selected for processing (11)
- ci-operator/config/openshift/cluster-capi-operator/openshift-cluster-capi-operator-main.yaml
- ci-operator/jobs/openshift/cluster-capi-operator/openshift-cluster-capi-operator-main-presubmits.yaml
- ci-operator/step-registry/ipi/aws/post/disconnected/OWNERS
- ci-operator/step-registry/ipi/aws/post/disconnected/ipi-aws-post-disconnected-chain.metadata.json
- ci-operator/step-registry/ipi/aws/post/disconnected/ipi-aws-post-disconnected-chain.yaml
- ci-operator/step-registry/ipi/aws/pre/disconnected/OWNERS
- ci-operator/step-registry/ipi/aws/pre/disconnected/ipi-aws-pre-disconnected-chain.metadata.json
- ci-operator/step-registry/ipi/aws/pre/disconnected/ipi-aws-pre-disconnected-chain.yaml
- ci-operator/step-registry/openshift/e2e/aws/disconnected/OWNERS
- ci-operator/step-registry/openshift/e2e/aws/disconnected/openshift-e2e-aws-disconnected-workflow.metadata.json
- ci-operator/step-registry/openshift/e2e/aws/disconnected/openshift-e2e-aws-disconnected-workflow.yaml
|
Despite reporting failure, the pj-rehearse seems to have been a success. The failure is an actual bug in capi-operator: it needs to import the additional trust bundle in order to trust the mirror registry. The environment came up. I verified manually (by logging in to it during the run) that it was disconnected. It also seems to have collected artifacts successfully. |
|
I have opened #77955 to separately address the timeout observed during cluster create. The timeout doesn't ultimately cause the cluster provisioning to fail; it just wastes resources. It is a latent issue. |
cb3cc31 to 8dc2ad5
8dc2ad5 to 53cdaa0
|
/pj-rehearse |
|
@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
damdo
left a comment
Not an expert in step registry but it looks reasonable to me.
Thanks
/lgtm
53cdaa0 to 1035a5e
|
/approve |
1035a5e to 0f86f46
|
/hold while I iterate on this in #78315. The approach is validated, but I've encountered a couple of things which we'll need to fix in the workflow anyway. |
0f86f46 to ffa2511
|
Getting a successful run of |
|
/hold Revision ffa2511 was retested 3 times: holding |
Add a new disconnected AWS workflow for component-level CI testing. The workflow creates an isolated VPC with private subnets and VPC endpoints, a bastion host providing mirror registry, egress proxy, and SSH jump host, then installs OpenShift using mirrored images and manual CCO credentials. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/test generated-config |
|
/hold cancel |
ffa2511 to 62fdfcb
|
No changes, just rebased on to a commit after the fixes to generated-config. |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: damdo, mdbooth, xueqzhan |
|
[REHEARSALNOTIFIER]
A total of 253 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. |
|
No need to re-run pj-rehearse as this is just a rebase. /pj-rehearse ack |
|
@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/test generated-config |
|
@mdbooth: The following tests failed, say
|
/retest-required |
* Add openshift-e2e-aws-disconnected workflow

  Add a new disconnected AWS workflow for component-level CI testing. The workflow creates an isolated VPC with private subnets and VPC endpoints, a bastion host providing mirror registry, egress proxy, and SSH jump host, then installs OpenShift using mirrored images and manual CCO credentials.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add CAPI credential format to aws-provision-cco-manual-users-static

* Add e2e-aws-capi-disconnected-techpreview to cluster-capi-operator

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a new disconnected AWS workflow for component-level CI testing.
The workflow creates an isolated VPC with private subnets and VPC
endpoints, a bastion host providing mirror registry, egress proxy,
and SSH jump host, then installs OpenShift using mirrored images
and manual CCO credentials.
Also adds an optional e2e-aws-capi-disconnected-techpreview job
to cluster-capi-operator.
The required change to the AWS credentials format has an associated docs bug: https://redhat.atlassian.net/browse/OCPBUGS-84570
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>