Skip to content

NE-2732: Design proposal for Gateway API CRD management mode#2023

Open
rikatz wants to merge 7 commits into
openshift:masterfrom
rikatz:gateway-api-crd-knob
Open

NE-2732: Design proposal for Gateway API CRD management mode#2023
rikatz wants to merge 7 commits into
openshift:masterfrom
rikatz:gateway-api-crd-knob

Conversation

@rikatz
Copy link
Copy Markdown
Member

@rikatz rikatz commented May 26, 2026

This proposal establishes a new knob to allow users to manage their Gateway API CRDs within an OCP cluster, establishing 3 modes: Managed, Unmanaged and ExternalCRDs

This proposal intends to provide users with a knob that allows them to manage the Gateway API CRDs without fully losing support of an OCP cluster, and to rollback on their decision in case they want back OCP Gateway API, and also allow the usage of external third party controllers.

@openshift-ci openshift-ci Bot requested review from alebedev87 and knobunc May 26, 2026 22:31
@gcs278
Copy link
Copy Markdown
Contributor

gcs278 commented May 27, 2026

I will help review
/assign

Comment on lines +903 to +904
There is a desire to backport this feature all the way back to
OpenShift OCP 4.19. The motivation is to allow customers who are
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even OCP 4.18, which would enable a smooth upgrade from 4.18 to 4.19.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we go that back on an API backport?

Comment on lines +542 to +545
This enhancement is fully applicable to OKE. Since the
[gateway-api-without-olm](gateway-api-without-olm.md) enhancement
enables Gateway API on OKE by eliminating OSSM licensing concerns,
all three CRD management modes are available on OKE clusters.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, as noted in the "Backport to OCP 4.19" section, we might backport the API to versions of OKE that do rely on an OLM subscription for OSSM.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this is not a problem, right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that we install OSSM through an OLM subscription shouldn't be a problem. My point is that "the
gateway-api-without-olm enhancement enables Gateway API on OKE by eliminating OSSM licensing concerns" is not relevant when we backport this feature to OCP versions that still depend on OLM for the gateway controller.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Well, I will add some better wording here given we are again discussing about backporting noOLM


## Proposal

Add a new `gatewayAPI` struct field to the `IngressControllerSpec`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better place for the gatewayAPI field would be ingresses.config.openshift.io or even a new ingresses.operator.openshift.io resource. We previously proposed adding a ingresses.operator.openshift.io resource as a home for OperatorLogLevel in #1314. Whether we can backport the addition of a new resource is an open question.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer going to what is less friction at this point. If adding a new resource and backporting is a huge friction, I would not do it. As this subject is already "hot" whatever we can make to get less friction I prefer :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going for least friction - should we do just do it as an annotation or magical GatewayClass name rather than putting it into an API that isn't quite right? 😄 Playing devils advocate

I wonder if we should have a hybrid approach. A "not so good way, but backports nicely" like tell users to create a "openshift-opt-out-crds" controller-name gatewayclass or an annotation on the the ingresses.config.openshift.io object (yea that's ugly I know). We'd start with and backport this solution.

Then a "proper api" for 5.0 moving forward: ingress.operator.openshift.io. We could delay that design for later, and possibly implement some migration logic, or at least a forced migration (Upgradeable=false) in 5.0 to make sure we can deprecate the "not so good way, but backports nicely".

Definitely more work for us, but does simplify the backporting.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC API reviewers are fine with us moving with a new API definition, I will go with this path

@Miciah
Copy link
Copy Markdown
Contributor

Miciah commented May 27, 2026

/assign

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from miciah. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rikatz rikatz changed the title Design proposal for Gateway API CRD management mode NE-2732: Design proposal for Gateway API CRD management mode May 27, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 27, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented May 27, 2026

@rikatz: This pull request references NE-2732 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.

Details

In response to this:

This proposal establishes a new knob to allow users to manage their Gateway API CRDs within an OCP cluster, establishing 3 modes: Managed, Unmanaged and ExternalCRDs

This proposal intends to provide users with a knob that allows them to manage the Gateway API CRDs without fully losing support of an OCP cluster, and to rollback on their decision in case they want back OCP Gateway API, and also allow the usage of external third party controllers.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@everettraven everettraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this enhancement looks pretty good. I've only got a few comments on the API side of things.

### API Extensions

This enhancement introduces a **new CRD** in the
`operator.openshift.io/v1` group. The new resource follows the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, net-new APIs must be introduced in the v1alpha1 group so that we never accidentally ship them by default until the feature is GA.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Miciah you told me we could do v1 directly, so leaving the discussion here for you :)

happy to update the EP with the conclusion

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good! I will change here

Comment on lines +427 to +448
// crdManagementMode specifies how the Cluster Ingress
// Operator manages Gateway API Custom Resource Definitions
// (CRDs) and the associated Gateway controller stack.
//
// When set to "Managed" (the default), CIO installs, owns,
// and upgrades the Gateway API CRDs, protects them with a
// Validating Admission Policy, and deploys the full Gateway
// controller stack (the Istio instance deployed by CIO,
// GatewayClass, Gateway resources). This is the only fully
// supported configuration.
//
// When set to "Unmanaged", CIO does not install or manage
// Gateway API CRDs and does not deploy the Gateway controller
// stack. The cluster administrator or a third-party product
// is responsible for providing their own CRDs and Gateway
// controller. CIO reports observational status only.
//
// When set to "ExternalCRDs", CIO does not manage Gateway
// API CRDs but does deploy the OpenShift Gateway controller
// stack. The cluster administrator brings their own CRDs.
// This is an unsupported configuration intended for
// development and testing.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the Gateway API is very rudimentary because that isn't my wheelhouse, but the scope of management that the field name implies and what these settings actually align with seems to be a bit at odds.

I wonder if a more generic managementMode is more appropriate here? something like:

apiVersion: operator.openshift.io/v1
kind: Ingress
metadata:
  name: cluster
spec:
  gatewayAPI:
    managementMode: Managed | Unmanaged | ControllersOnly

(naming is arbitrary here, feel free to use different mode names).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I actually liked the proposal! Will change here, thanks!

Comment on lines +444 to +448
// When set to "ExternalCRDs", CIO does not manage Gateway
// API CRDs but does deploy the OpenShift Gateway controller
// stack. The cluster administrator brings their own CRDs.
// This is an unsupported configuration intended for
// development and testing.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hesitant to include this mode of operation. It isn't really clear to me what the use case for an end-user setting this mode would be such that they would get some kind of value out of it.

The only thing I can think of is to test if an existing version of the default OpenShift controllers can handle a changed API, but if we are documenting what the supported pairing is by shipping, testing, and vetting a fully-managed option I'm not sure why this would be something an end-user of OpenShift would care to put their cluster in an unsupported state for?

If you folks feel strongly that this should be an option for configuration, I'd prefer we make it abundantly clear via the enum value that this is unsupported with a naming pattern like UnsupportedCRDsOnlyUnmanaged

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is mostly about:

  • layered products (MCP Gateway, RHOAI) wanting to test new CRDs with the current Istio version - eg.: some fields are promoted on a new version of Gateway API that we didn't vetted yet, and users want to test it
  • users that previously deployed their CRD and are willing to upgrade and move toward ours but the process of upgrade means going through the upgrade path

btw on the original proposal from @knobunc this is exactly how it is worded and I am fine making the "Unsupported" word explicit part of the name:

[Possibly] No CRDs, but unsupported Gateway: If you install GWAPI and configure the OpenShift GatewayClass and Gateway, we would run our gateway, but this would be an unsupported configuration

// deploys the full Gateway controller stack (the Istio
// instance deployed by CIO, GatewayClass, Gateway). This is
// the default mode and the only fully supported configuration.
ManagedGatewayAPICRDs GatewayAPIManagementMode = "Managed"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General convention is to use the pattern {alias}{representation}, so:

Suggested change
ManagedGatewayAPICRDs GatewayAPIManagementMode = "Managed"
GatewayAPIManagementModeManaged GatewayAPIManagementMode = "Managed"

// GatewayClass, Gateway). The customer brings their own
// CRDs. This is an UNSUPPORTED configuration useful for
// development, testing, or advanced users.
ControllersOnlyGatewayAPICRDs GatewayAPIManagementMode = "ControllersOnly"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to stick with allowing this mode to be configured, and it is truly an unsupported configuration, please make it abundantly clear by prefixing this mode with Unsupported.

It makes it abundantly clear to any future readers of the configuration that they are in an unsupported configuration mode and it makes users think twice because they have to type out the word Unsupported before flipping to that mode.

Suggested change
ControllersOnlyGatewayAPICRDs GatewayAPIManagementMode = "ControllersOnly"
GatewayAPIManagementModeUnsupportedControllersOnly GatewayAPIManagementMode = "UnsupportedControllersOnly"

Comment on lines +481 to +493
The following conditions are set within
`status.gatewayAPI.conditions`:

| Condition Type | Status | Reason | Description |
|---|---|---|---|
| `CRDsManaged` | `True` | `ManagedByCIO` | CIO is actively managing CRDs |
| `CRDsManaged` | `False` | `Unmanaged` | Administrator chose Unmanaged mode |
| `CRDsManaged` | `False` | `ControllersOnly` | Administrator chose ControllersOnly mode (unsupported) |
| `CRDsPresent` | `True` | `CRDsFound` | Gateway API CRDs are present on the cluster |
| `CRDsPresent` | `False` | `CRDsNotFound` | Gateway API CRDs are not present on the cluster |
| `CRDsCompliant` | `True` | `VersionMatch` | Installed CRDs match the expected version |
| `CRDsCompliant` | `False` | `VersionMismatch` | Installed CRDs do not match the expected version. Message includes the expected and actual versions. |
| `CRDsCompliant` | `Unknown` | `NotApplicable` | Compliance check is not applicable (e.g., Unmanaged mode with no CRDs present) |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, it would be helpful to include this information in some form in the GoDoc for the conditions field of the status so that end-users can have this same information even when using something like kubectl explain ... to get the field documentation of the CR

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jun 1, 2026

@rikatz: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants