CNTRLPLANE-2207: Upgrade to CAPI 1.11 #7590
Skipping CI for Draft Pull Request. |
|
@clebs: This pull request references CNTRLPLANE-2207 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. |
|
/test e2e-aws-minimal verify |
AI Test Failure Analysis (generated by the hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6) |
|
/retest |
|
/test e2e-aws e2e-aks |
|
/test e2e-aws e2e-aks |
|
/test e2e-aws |
- Upgrade all CAPI modules to 1.11.
- Update changed import paths.
- Silence deprecation linter errors.
- Update make cluster-api goal.
CAPI 1.11 defaults to v1beta2 storage. Override to v1beta1 for HyperShift compatibility. Signed-off-by: Borja Clemente <bclement@redhat.com>
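As a rough illustration of what the storage override amounts to, here is a minimal Go sketch that moves the storage flag between entries of a CRD version list. The `crdVersion` type and `setStorageVersion` helper are hypothetical stand-ins invented for this example; they are not the PR's code, and the PR may well apply the override by patching the generated CRD manifests instead.

```go
package main

import "fmt"

// crdVersion is a hypothetical, trimmed-down stand-in for one entry of a
// CRD's spec.versions list. It is not the real apiextensions type.
type crdVersion struct {
	Name    string
	Served  bool
	Storage bool
}

// setStorageVersion marks exactly one version as the storage version.
// It returns an error, without modifying anything, if the name is unknown.
func setStorageVersion(versions []crdVersion, name string) error {
	found := false
	for _, v := range versions {
		if v.Name == name {
			found = true
			break
		}
	}
	if !found {
		return fmt.Errorf("version %q is not defined on this CRD", name)
	}
	for i := range versions {
		// A CRD must have exactly one storage version, so clear all others.
		versions[i].Storage = versions[i].Name == name
	}
	return nil
}

func main() {
	// Assumed starting point: v1beta2 is the storage version by default.
	versions := []crdVersion{
		{Name: "v1beta1", Served: true},
		{Name: "v1beta2", Served: true, Storage: true},
	}
	if err := setStorageVersion(versions, "v1beta1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, v := range versions {
		fmt.Printf("%s served=%t storage=%t\n", v.Name, v.Served, v.Storage)
	}
}
```

Both versions stay served in this sketch; only the storage flag moves, which is why conversion between v1beta1 and v1beta2 still has to work.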
Remove the temporary hardcoded CAPI image overrides now that HyperShift supports CAPI 1.11. Signed-off-by: Borja Clemente <bclement@redhat.com>
For conversion to work, the CAPI provider needs to be able to access CRDs cluster-wide to list available versions. Signed-off-by: Borja Clemente <bclement@redhat.com>
Update TestScaleFromZero to support both CAPI 1.11+ native Status.Capacity and pre-1.11 annotation-based capacity information. In CAPI 1.11, cluster-api-provider-aws populates Status.Capacity directly on AWSMachineTemplate, making the workaround annotations unnecessary. The HyperShift controller detects this and skips setting annotations when Status.Capacity is present.
The test now:
- First checks AWSMachineTemplate.Status.Capacity (CAPI 1.11+)
- Falls back to MachineDeployment annotations (pre-CAPI 1.11)
- Logs the capacity source for debugging
This makes the test backward compatible and fixes the failure in PR openshift#7590.
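The lookup order described in this commit can be sketched roughly as follows. This is not the test's actual code: `resolveCapacity` is a hypothetical helper, capacity is modeled as plain string maps rather than the real Kubernetes types, and the annotation prefix is an assumption based on the cluster-autoscaler scale-from-zero convention.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed prefix for the scale-from-zero capacity annotations.
const capacityAnnotationPrefix = "capacity.cluster-autoscaler.kubernetes.io/"

// resolveCapacity prefers capacity reported in the template status and falls
// back to capacity annotations. It also returns where the data came from so
// the caller can log the source.
func resolveCapacity(statusCapacity, annotations map[string]string) (map[string]string, string) {
	if len(statusCapacity) > 0 {
		return statusCapacity, "status"
	}
	fromAnnotations := map[string]string{}
	for key, value := range annotations {
		if strings.HasPrefix(key, capacityAnnotationPrefix) {
			fromAnnotations[strings.TrimPrefix(key, capacityAnnotationPrefix)] = value
		}
	}
	if len(fromAnnotations) > 0 {
		return fromAnnotations, "annotations"
	}
	return nil, "none"
}

func main() {
	// Pre-1.11 shape: no status capacity, only annotations.
	annotations := map[string]string{
		capacityAnnotationPrefix + "cpu":    "4",
		capacityAnnotationPrefix + "memory": "16G",
	}
	capacity, source := resolveCapacity(nil, annotations)
	fmt.Printf("capacity=%v source=%s\n", capacity, source)
}
```

Returning the source alongside the values mirrors the commit's third bullet: a failure log that says which path supplied the capacity is much easier to debug.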
Explicitly setting the MinReadySeconds default to 0 in the nodepool controller causes infinite reconciliation: the v1beta1 -> v1beta2 conversion is lossy, so the value flips between 0 and nil. Removing the explicit setting should have no other side effects, since the zero value of the field is the same. Signed-off-by: Borja Clemente <bclement@redhat.com>
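To make the failure mode concrete, here is a small Go sketch of how a round trip that cannot distinguish an explicit 0 from an unset value produces a permanent diff. The `oldSpec` and `newSpec` types and both conversion functions are hypothetical and only model the loss; they are not the real CAPI types or conversion code.

```go
package main

import "fmt"

// oldSpec and newSpec are hypothetical stand-ins for the two API versions.
type oldSpec struct{ MinReadySeconds *int32 }

// In this model the newer shape cannot tell "unset" apart from an explicit 0.
type newSpec struct{ MinReadySeconds int32 }

func toNew(o oldSpec) newSpec {
	var n newSpec
	if o.MinReadySeconds != nil {
		n.MinReadySeconds = *o.MinReadySeconds
	}
	return n
}

func toOld(n newSpec) oldSpec {
	var o oldSpec
	if n.MinReadySeconds != 0 { // an explicit 0 is lost here and comes back as nil
		v := n.MinReadySeconds
		o.MinReadySeconds = &v
	}
	return o
}

// roundTrip models writing the old version and reading it back via the new one.
func roundTrip(o oldSpec) oldSpec { return toOld(toNew(o)) }

// needsUpdate is the comparison a reconciler would make between the desired
// spec and what it reads back from the API server.
func needsUpdate(desired, observed oldSpec) bool {
	if desired.MinReadySeconds == nil || observed.MinReadySeconds == nil {
		return desired.MinReadySeconds != observed.MinReadySeconds
	}
	return *desired.MinReadySeconds != *observed.MinReadySeconds
}

func main() {
	zero := int32(0)
	explicit := oldSpec{MinReadySeconds: &zero}
	unset := oldSpec{}

	fmt.Println("explicit 0 needs update:", needsUpdate(explicit, roundTrip(explicit))) // true
	fmt.Println("unset needs update:", needsUpdate(unset, roundTrip(unset)))            // false
}
```

With the explicit 0, every reconcile sees a difference and writes again; leaving the field unset is stable, which matches the fix in this commit.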
…mplete check Replace the 1-second sleep workaround for OCPBUGS-77922 with a deterministic cross-check of the v1beta2 conversion-data annotation. In CAPI v1.11+, the v1beta1 UpdatedReplicas field maps from deprecated.v1beta1.updatedReplicas rather than the native upToDateReplicas, which can transiently disagree. When v1beta1 fields indicate completion, we now verify against the authoritative v1beta2 status in the conversion-data annotation before declaring complete. Jira: https://issues.redhat.com/browse/OCPBUGS-77922 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
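A rough sketch of the cross-check described here, under several assumptions: the annotation key is taken to be `cluster.x-k8s.io/conversion-data` (the commit log says the PR uses the upstream constant rather than a literal), the annotation value is assumed to be a JSON object with a `status.upToDateReplicas` field, and `rolloutComplete` is a hypothetical helper, not the PR's function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assumed annotation key for the stored v1beta2 object.
const conversionDataAnnotation = "cluster.x-k8s.io/conversion-data"

// hubObject models only the field this sketch reads; the JSON field names
// are assumptions about the v1beta2 status layout.
type hubObject struct {
	Status struct {
		UpToDateReplicas *int32 `json:"upToDateReplicas,omitempty"`
	} `json:"status"`
}

// rolloutComplete reports completion only when the v1beta1 counter and, if
// available, the v1beta2 status stored in the annotation both agree.
func rolloutComplete(desired, v1beta1Updated int32, annotations map[string]string) (bool, error) {
	if v1beta1Updated != desired {
		return false, nil
	}
	raw, ok := annotations[conversionDataAnnotation]
	if !ok {
		// Nothing to cross-check against, so trust the v1beta1 view.
		return true, nil
	}
	var hub hubObject
	if err := json.Unmarshal([]byte(raw), &hub); err != nil {
		// Sketch-only choice: keep the v1beta1 answer and surface the error
		// so the caller can log it.
		return true, fmt.Errorf("unmarshal %s: %w", conversionDataAnnotation, err)
	}
	if hub.Status.UpToDateReplicas == nil {
		return desired == 0, nil
	}
	return *hub.Status.UpToDateReplicas == desired, nil
}

func main() {
	// v1beta1 says 3 of 3 updated, but the v1beta2 status still reports 2.
	annotations := map[string]string{
		conversionDataAnnotation: `{"status":{"upToDateReplicas":2}}`,
	}
	done, err := rolloutComplete(3, 3, annotations)
	fmt.Println(done, err) // false <nil>
}
```

Unlike a fixed sleep, this check does not depend on timing: a transient disagreement between the two views simply yields "not complete yet" until they converge.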
The word uptodate and all its casing variants are a false positive on codespell. They are defined as such in CAPI. Signed-off-by: Borja Clemente <bclement@redhat.com>
|
/test e2e-aws e2e-aks e2e-aws-upgrade-hypershift-operator e2e-azure-self-managed e2e-kubevirt-aws-ovn-reduced |
- Clean up backwardcompat CAPI images logic.
- Simplify APIVersionGetter by removing the version filtering that looked for v1beta1, which is not necessary. Conversion just needs to obtain whichever version is registered for a given CRD. Also improve the comment to reflect the adjustment.
- Use the upstream constant for the conversion-data annotation instead of redefining the value.
- Document capi-provider sharing a role with capi-manager for CRD read access.
- Log an error when unmarshaling the conversion-data annotation fails.
Signed-off-by: Borja Clemente <bclement@redhat.com>
apigetter comments juanma
|
/test e2e-aws e2e-aks e2e-aws-upgrade-hypershift-operator e2e-azure-self-managed e2e-kubevirt-aws-ovn-reduced |
|
/test e2e-aws |
|
@clebs: The following tests failed, say
|
/test e2e-aws |
|
/close |
|
PR needs rebase. |
|
Test Failure Analysis Complete
Job Information
Test Failure Analysis
Error
Summary
The e2e-gke job has two independent test failures, neither caused by the CAPI 1.11 bump. (1)
Root Cause
Two independent, pre-existing GKE platform issues, not caused by the CAPI 1.11 bump:
Failure 1: The HA (HighAvailability) test creates a HostedCluster with pod anti-affinity. On GKE Autopilot, using pod anti-affinity triggers a minimum CPU request of 500m per pod. The etcd StatefulSet only requests 360m CPU for its
A second resource also failed: the
This cascading failure prevented the entire hosted control plane from starting:
The
Failure 2: The metrics forwarder test timed out because Prometheus in the guest cluster couldn't scrape kube-apiserver metrics via the metrics-forwarder proxy. The
Failure 3: The overall test pod timed out at the 2h limit (exit code 127). Teardown was still in progress when the timeout hit, and by then GKE credentials had expired (
Tide "error" state: The PR has merge conflicts (
Recommendations
Evidence
|
What this PR does / why we need it:
Bumps hypershift to use CAPI v1.11 including the following tasks:
v1.11 compatible version in go.mod.
v1beta1 as storage version.
v1beta1 <-> v1beta2.
Which issue(s) this PR fixes:
Fixes CNTRLPLANE-2207
Special notes for your reviewer:
Checklist: