[release-4.20] OCPBUGS-92038, OCPBUGS-92039, OCPBUGS-82147, OCPBUGS-92041, OCPBUGS-92042: Replace OLM-based Istio install with Sail Library #1459
|
@gcs278: This pull request references NE-2286 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.20." or "openshift-4.20.", but it targets "openshift-4.22" instead.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
|
Skipping CI for Draft Pull Request. |
|
/testwith openshift/cluster-ingress-operator/release-4.20/e2e-aws-operator openshift/api#2869 |
0fdfa93 to 9812b49
|
/testwith openshift/cluster-ingress-operator/release-4.20/e2e-aws-operator openshift/api#2869 |
|
Ready for early review, but blocked on getting some Jira Tickets set up and the 4.21 NO-OLM backport to merge to GA (openshift/api#2865) |
|
@gcs278: No Jira issue is referenced in the title of this pull request. |
|
/assign @aswinsuryan |
Cherry-picked from: 955a5c0 openshift#1354
Cherry-picked from: 6d2c6c8 openshift#1402
a287c29 to 0aea5da
Cherry-picked from: 43c978a openshift#1404
Conflicts resolved:
- go.mod: Switched sail-operator replace from aslakknutsen's development fork to the official openshift-service-mesh/sail-operator v0.0.0-20260327145107 (OSSM 3.3.1). Added replace directives to pin k8s.io/api, apimachinery, apiextensions-apiserver, apiserver, client-go, component-base, kube-openapi, controller-runtime, gateway-api, and gnostic-models to their original 4.20 versions, preventing the sail-operator's transitive dependencies from bumping them. This avoids the structured-merge-diff v4/v6 incompatibility and preserves compatibility with the 4.20 openshift/client-go and openshift/library-go.
- vendor/: Re-vendored from scratch with pinned dependencies.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-picked from: b1bbbb7 openshift#1404
0aea5da to 5fcd499
Remove the release.openshift.io/feature-set annotation from the Sail Library ClusterRole and ClusterRoleBinding manifests.
This annotation restricts CVO from deploying these RBAC resources on clusters running the Default feature set. Removing it is required before promoting the GatewayAPIWithoutOLM feature gate to GA.
On 4.22, PR openshift#1393 switched to the release.openshift.io/feature-gate annotation, but CVO on 4.20 does not support that annotation (openshift/cluster-version-operator#1273 was not backported). On 4.21, this was done in a separate PR (openshift#1462), but for 4.20 we include it here to avoid an additional backport PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fdb1ad50-703a-11f1-8ca6-25963c2f707b-0 |
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/02cda910-703b-11f1-8202-b9ab34f52f2b-0 |
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/09d8a390-703b-11f1-9aa0-93a802cd5146-0 |
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fcafd840-703b-11f1-962a-ea9f67a1d4d1-0 |
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/06111ed0-703c-11f1-940d-9b8efc8783d7-0 |
|
@gcs278: This PR was included in a payload test run from openshift/origin#31322
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/10ca2d30-703c-11f1-9e9a-43d6051ce295-0 |
|
/lgtm The changes look good. Is the CI failure related to flaky tests? |
|
@aswinsuryan yea - looks like an unrelated flake. Overall CI is looking good for this PR. /test e2e-azure-operator |
|
/jira refresh |
|
@gcs278:
This pull request references Jira Issue OCPBUGS-92038, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact:
This pull request references Jira Issue OCPBUGS-92039, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-92040, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-92041, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-92042, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact: |
|
BTW the code has been spun through CI in the test PR: openshift/origin#31322 with various payload jobs with the FG promoted (testing noOLM here in this PR). Everything is looking good, no failures for 4.20 No-OLM. |
|
@gcs278:
This pull request references Jira Issue OCPBUGS-92038, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact:
This pull request references Jira Issue OCPBUGS-92039, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-92040, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-92041, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-92042, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact: |
|
@gcs278:
This pull request references Jira Issue OCPBUGS-92038, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-92039, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-82147, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-92041, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-92042, which is valid. 7 validation(s) were run on this bug. Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. |
|
@gcs278: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Backport of the noOLM / Sail Library installation path (NE-2286, shipped in 4.22) to release-4.20. This resolves several fundamental OLM bugs that have no viable OLM-based workaround — most critically OCPBUGS-86778, which blocks all OSSM z-stream upgrades and prevents shipping CVE fixes.
This is part of an approved SBAR to backport the Sail Library (noOLM) from 4.22 to 4.19–4.21. This is an identical backport to the 4.21 PR: #1442 (origin test coverage: openshift/origin#31232).
This PR is intended to merge with the GatewayAPIWithoutOLM feature gate disabled, making it a no-op on merge. The goal is to subsequently enable the gate by default (via openshift/api) to activate the Sail Library path and resolve the OLM issues.
Background
Gateway API on OCP 4.19–4.21 uses the Cluster Ingress Operator (CIO) to install Istio via OLM (OSSM operator). This path has several critical bugs:
In OCP 4.22, NE-2286 replaced OLM with the Sail Library: CIO now installs Istio directly via embedded Helm charts. This feature shipped as GA behind the GatewayAPIWithoutOLM feature gate.
Cherry-picked PRs
istio_sail_installer.go, istio_olm.go refactor, migration.go, status.go, CRD manifests, Sail Library RBAC manifests
Note: #1393 (OCPBUGS-79667: Use feature-gate annotation for Sail Library RBAC) was also a dependency but is being skipped because CVO on this release does not support the release.openshift.io/feature-gate annotation (openshift/cluster-version-operator#1273 was not backported). On 4.21, the release.openshift.io/feature-set annotation was removed in a separate PR (#1462) before GA promotion. For 4.20, the annotation removal is included as the final commit in this PR to avoid an additional backport PR.
Versioning
This backport does not bump the Gateway API CRDs (remain at v1.3.0) or the Istio version (remains at v1.26.2) for the noOLM code path. When the GatewayAPIWithoutOLM feature gate is enabled, the Sail Library will install Istio using the same v1.26.2 version that the OLM path currently uses. This works because the vendored Sail Library (OSSM 3.3.1) still supports Istio 1.26.2.
When noOLM shipped in 4.22, the OLM and noOLM versions were already aligned at 3.3.1, so version separation was not needed. On 4.20, the OLM path is on 3.1.0; keeping both paths at the same Istio version avoids introducing conditional logic or separate deployment manifests in the backport.
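The versioning decision above can be illustrated with a small sketch. This is not the operator's code: the selectInstall helper, the installPath type, and its constants are hypothetical names introduced here, and the guard on a general "Gateway API enabled" flag is an assumption based on the conflict notes in this description. Only the v1.26.2 value comes from the text.

```go
package main

import "fmt"

// istioVersion is the single Istio version that both install paths use on
// release-4.20, as stated in the Versioning section.
const istioVersion = "v1.26.2"

// installPath is a hypothetical label for the mechanism that installs Istio.
type installPath string

const (
	pathNone installPath = "none"
	pathOLM  installPath = "olm"
	pathSail installPath = "sail-library"
)

// selectInstall is a hypothetical helper (not taken from the PR). It returns
// the install path and the Istio version for a given gate state. Because both
// paths share one version, no per-path version logic is needed.
func selectInstall(gatewayAPIEnabled, withoutOLM bool) (installPath, string) {
	if !gatewayAPIEnabled {
		// Assumption: with Gateway API disabled, nothing is installed.
		return pathNone, ""
	}
	if withoutOLM {
		return pathSail, istioVersion
	}
	return pathOLM, istioVersion
}

func main() {
	cases := []struct{ gatewayAPI, withoutOLM bool }{
		{false, false},
		{true, false},
		{true, true},
	}
	for _, c := range cases {
		path, version := selectInstall(c.gatewayAPI, c.withoutOLM)
		fmt.Printf("gatewayAPI=%t withoutOLM=%t -> path=%s istio=%q\n",
			c.gatewayAPI, c.withoutOLM, path, version)
	}
}
```

Flipping the gate changes only the install mechanism in this sketch, never the version, which is the property the backport relies on to avoid separate manifests.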
Conflicts resolved
- pkg/operator/operator.go: Added GatewayAPIWithoutOLM gate alongside existing 4.20 gates (GatewayAPI, GatewayAPIController, RouteExternalCertificate, IngressControllerLBSubnetsAWS, SetEIPForNLBIngressController)
- pkg/operator/controller/status/controller.go: Took incoming noOLM logic (useOLM/useSailLibrary, conditional subscription listing) but wrapped it in the existing 4.20 GatewayAPIEnabled guard
- test/e2e/gateway_api_test.go: Kept 4.20 gatewayAPIControllerEnabled guard, added gatewayAPIWithoutOLMEnabled conditionals inside for Sail Library vs OLM test selection. Kept xcrdNames alongside new istioCRDNames. Removed references to testGatewayAPIInfrastructureAnnotations, testGatewayAPIInternalLoadBalancer, and testGatewayOpenshiftConditions, which were added in separate PRs not present on release-4.20.
- go.mod / vendor/: Added replace directives for openshift/api (fork with gate), sail-operator (official OSSM 3.3.1), and dependency pins (see Dependency Pinning Approach below). Re-vendored from scratch.
Rollout Plan
Phase 1 — Land code (gate OFF)
Phase 2 — TechPreview soak
Phase 3 — GA promotion
TODO: Remove feature-set annotation from Sail Library RBAC. Included in this PR (final commit).
Dependency Pinning Approach
Unlike the 4.21 backport, which bumped k8s and controller-runtime, this backport keeps all dependencies at their original 4.20 versions. The sail-operator (OSSM 3.3.1) requires k8s 0.34 and controller-runtime 0.22, but its pkg/install package only uses basic CRUD operations (client.New, client.Get, client.Create, client.Update) and stable types (metav1, corev1, runtime, rest.Config) that exist unchanged in the 4.20 versions.
To prevent go mod tidy from bumping dependencies transitively, the following replace directives pin modules to their 4.20 versions:
- k8s.io/api
- k8s.io/apimachinery
- k8s.io/client-go
- k8s.io/apiextensions-apiserver
- k8s.io/apiserver
- k8s.io/component-base
- k8s.io/kube-openapi
- sigs.k8s.io/controller-runtime
- sigs.k8s.io/gateway-api
- github.com/google/gnostic-models
Risk assessment: The sail-operator install package uses only stable controller-runtime interfaces (client.Client CRUD operations, pkg/log, pkg/scheme). No APIs introduced in controller-runtime 0.21+ or k8s 0.34+ are used. The structured-merge-diff/v4 vs v6 incompatibility that would arise from bumping k8s is avoided entirely. This approach was validated by building successfully and by auditing every import in the sail-operator's pkg/install, api/v1, and resources packages.
Verification
make builds successfully
🤖 Generated with Claude Code
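The pinning described under Dependency Pinning Approach could also be checked mechanically. The following is a minimal sketch, assuming a simplified line-based reading of go.mod rather than a full parser. The helper names replacedModules and missingPins are hypothetical, and the versions in the sample are placeholders, not the pins used in this PR.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// pinnedModules lists the modules the PR description says are pinned.
var pinnedModules = []string{
	"k8s.io/api",
	"k8s.io/apimachinery",
	"k8s.io/client-go",
	"k8s.io/apiextensions-apiserver",
	"k8s.io/apiserver",
	"k8s.io/component-base",
	"k8s.io/kube-openapi",
	"sigs.k8s.io/controller-runtime",
	"sigs.k8s.io/gateway-api",
	"github.com/google/gnostic-models",
}

// replacedModules returns the module paths that appear on the left-hand side
// of replace directives. It handles the single-line form and the
// "replace ( ... )" block form only; it is a simplification, not a go.mod parser.
func replacedModules(goMod string) map[string]bool {
	replaced := map[string]bool{}
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i]) // drop trailing comments
		}
		switch {
		case line == "":
			continue
		case inBlock && line == ")":
			inBlock = false
			continue
		case !inBlock && line == "replace (":
			inBlock = true
			continue
		case !inBlock && strings.HasPrefix(line, "replace "):
			line = strings.TrimPrefix(line, "replace ")
		case !inBlock:
			continue // not a replace directive
		}
		// A directive reads "old [version] => new [version]".
		f := strings.Fields(line)
		if len(f) >= 3 && (f[1] == "=>" || (len(f) >= 4 && f[2] == "=>")) {
			replaced[f[0]] = true
		}
	}
	return replaced
}

// missingPins reports which wanted modules have no replace directive.
func missingPins(goMod string, want []string) []string {
	have := replacedModules(goMod)
	var missing []string
	for _, m := range want {
		if !have[m] {
			missing = append(missing, m)
		}
	}
	return missing
}

func main() {
	// Illustrative fragment only; versions are placeholders.
	sample := `module example.com/demo

replace k8s.io/api => k8s.io/api v0.0.0-placeholder

replace (
	k8s.io/client-go => k8s.io/client-go v0.0.0-placeholder // pinned
	sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.0.0-placeholder
)
`
	fmt.Println("modules without a replace pin:", missingPins(sample, pinnedModules))
}
```

A check of this kind only confirms that a pin exists for each module; it does not confirm that the pinned version matches the 4.20 baseline, which would still require comparing against the release branch's go.mod.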