add dual-apiserver awareness/control to cluster-olm-operator #204
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,179 @@ | ||
| --- | ||
| # Example HyperShift deployment for cluster-olm-operator | ||
| # This shows how to configure cluster-olm-operator to manage OLMv1 components | ||
| # for a hosted cluster. | ||
| # | ||
| # In this example: | ||
| # - Management cluster runs in namespace: clusters-customer1 | ||
| # - Hosted cluster name: customer1 | ||
| # - Admin kubeconfig secret: admin-kubeconfig | ||
|
|
||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
| name: cluster-olm-operator | ||
| namespace: clusters-customer1 | ||
| labels: | ||
| app: cluster-olm-operator | ||
| hypershift.openshift.io/control-plane-component: cluster-olm-operator | ||
| spec: | ||
| replicas: 1 | ||
| selector: | ||
| matchLabels: | ||
| app: cluster-olm-operator | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app: cluster-olm-operator | ||
| hypershift.openshift.io/control-plane-component: cluster-olm-operator | ||
| spec: | ||
| serviceAccountName: cluster-olm-operator | ||
| securityContext: | ||
| runAsNonRoot: true | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| initContainers: | ||
| - name: copy-catalogd-manifests | ||
| image: quay.io/openshift/origin-olm-catalogd:latest | ||
| imagePullPolicy: IfNotPresent | ||
| command: | ||
| - /bin/sh | ||
| args: | ||
| - -c | ||
| - /cp-manifests /operand-assets | ||
| volumeMounts: | ||
| - mountPath: /operand-assets | ||
| name: operand-assets | ||
| securityContext: | ||
| readOnlyRootFilesystem: true | ||
| terminationMessagePolicy: FallbackToLogsOnError | ||
| - name: copy-operator-controller-manifests | ||
| image: quay.io/openshift/origin-olm-operator-controller:latest | ||
| imagePullPolicy: IfNotPresent | ||
| command: | ||
| - /bin/sh | ||
| args: | ||
| - -c | ||
| - /cp-manifests /operand-assets | ||
| volumeMounts: | ||
| - mountPath: /operand-assets | ||
| name: operand-assets | ||
| securityContext: | ||
| readOnlyRootFilesystem: true | ||
| terminationMessagePolicy: FallbackToLogsOnError | ||
| containers: | ||
| - name: cluster-olm-operator | ||
| image: quay.io/openshift/origin-cluster-olm-operator:latest | ||
| terminationMessagePolicy: FallbackToLogsOnError | ||
| command: | ||
| - /cluster-olm-operator | ||
| args: | ||
| - start | ||
| imagePullPolicy: IfNotPresent | ||
| env: | ||
| # Standard environment variables | ||
| - name: OPERATOR_NAME | ||
| value: cluster-olm-operator | ||
| - name: OPERATOR_IMAGE_VERSION | ||
| value: 4.16.0 | ||
| - name: KUBE_RBAC_PROXY_IMAGE | ||
| value: quay.io/openshift/origin-kube-rbac-proxy:latest | ||
| - name: CATALOGD_IMAGE | ||
| value: quay.io/openshift/origin-olm-catalogd:latest | ||
| - name: OPERATOR_CONTROLLER_IMAGE | ||
| value: quay.io/openshift/origin-olm-operator-controller:latest | ||
|
|
||
| # HyperShift mode configuration | ||
| # Setting these enables HyperShift mode | ||
| - name: HOSTED_KUBECONFIG_SECRET | ||
| value: admin-kubeconfig | ||
| - name: HOSTED_NAMESPACE | ||
| value: clusters-customer1 | ||
|
Comment on lines +86 to +91
Example deployment is missing the env vars required for
This is a downstream consequence of the root-cause mismatch documented in
||
|
|
||
| resources: | ||
| requests: | ||
| cpu: 10m | ||
| memory: 20Mi | ||
| securityContext: | ||
| readOnlyRootFilesystem: true | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: | ||
| - ALL | ||
| volumeMounts: | ||
| - mountPath: /operand-assets | ||
| name: operand-assets | ||
| - mountPath: /tmp | ||
| name: tmp | ||
| volumes: | ||
| - name: operand-assets | ||
| emptyDir: {} | ||
| - name: tmp | ||
| emptyDir: {} | ||
|
|
||
| --- | ||
| # ServiceAccount for cluster-olm-operator | ||
| apiVersion: v1 | ||
| kind: ServiceAccount | ||
| metadata: | ||
| name: cluster-olm-operator | ||
| namespace: clusters-customer1 | ||
|
|
||
| --- | ||
| # RBAC for cluster-olm-operator in management cluster | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: ClusterRole | ||
| metadata: | ||
| name: cluster-olm-operator-management | ||
| rules: | ||
| # Management cluster permissions | ||
| - apiGroups: ["apps"] | ||
| resources: ["deployments"] | ||
| verbs: ["get", "list", "watch", "create", "update", "patch", "delete"] | ||
| - apiGroups: [""] | ||
| resources: ["services", "serviceaccounts", "configmaps"] | ||
| verbs: ["get", "list", "watch", "create", "update", "patch", "delete"] | ||
| - apiGroups: ["rbac.authorization.k8s.io"] | ||
| resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"] | ||
| verbs: ["get", "list", "watch", "create", "update", "patch", "delete"] | ||
| - apiGroups: ["config.openshift.io"] | ||
| resources: ["proxies"] | ||
| verbs: ["get", "list", "watch"] | ||
|
|
||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: ClusterRoleBinding | ||
| metadata: | ||
| name: cluster-olm-operator-management | ||
| roleRef: | ||
| apiGroup: rbac.authorization.k8s.io | ||
| kind: ClusterRole | ||
| name: cluster-olm-operator-management | ||
| subjects: | ||
| - kind: ServiceAccount | ||
| name: cluster-olm-operator | ||
| namespace: clusters-customer1 | ||
|
|
||
| --- | ||
| # Example: admin-kubeconfig secret | ||
| # This secret contains the kubeconfig for the hosted cluster's API server | ||
| # In a real HyperShift deployment, this is created automatically by HyperShift | ||
| apiVersion: v1 | ||
| kind: Secret | ||
| metadata: | ||
| name: admin-kubeconfig | ||
| namespace: clusters-customer1 | ||
| type: Opaque | ||
| data: | ||
| # Base64-encoded kubeconfig for the hosted cluster | ||
| # This would be generated by HyperShift control-plane-operator | ||
| kubeconfig: | | ||
| YXBpVmVyc2lvbjogdjEKY2x1c3RlcnM6Ci0gY2x1c3RlcjoKICAgIGNlcnRpZmljYXRl | ||
| LWF1dGhvcml0eS1kYXRhOiA8YmFzZTY0LWNhLWNlcnQ+CiAgICBzZXJ2ZXI6IGh0dHBz | ||
| Oi8vYXBpLmN1c3RvbWVyMS5leGFtcGxlLmNvbTo2NDQzCiAgbmFtZTogY3VzdG9tZXIx | ||
| CmNvbnRleHRzOgotIGNvbnRleHQ6CiAgICBjbHVzdGVyOiBjdXN0b21lcjEKICAgIHVz | ||
| ZXI6IGFkbWluCiAgbmFtZTogYWRtaW5AY3VzdG9tZXIxCmN1cnJlbnQtY29udGV4dDog | ||
| YWRtaW5AY3VzdG9tZXIxCmtpbmQ6IENvbmZpZwpwcmVmZXJlbmNlczoge30KdXNlcnM6 | ||
| Ci0gbmFtZTogYWRtaW4KICB1c2VyOgogICAgY2xpZW50LWNlcnRpZmljYXRlLWRhdGE6 | ||
| IDxiYXNlNjQtY2xpZW50LWNlcnQ+CiAgICBjbGllbnQta2V5LWRhdGE6IDxiYXNlNjQt | ||
| Y2xpZW50LWtleT4K | ||
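For readers who want to check what the example secret carries, the base64 payload above decodes to the kubeconfig below. The angle-bracket placeholders are part of the encoded example itself, so this is a template rather than a usable credential.

```yaml
# Decoded value of data.kubeconfig from the example Secret (placeholders as encoded)
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <base64-ca-cert>
    server: https://api.customer1.example.com:6443
  name: customer1
contexts:
- context:
    cluster: customer1
    user: admin
  name: admin@customer1
current-context: admin@customer1
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: <base64-client-cert>
    client-key-data: <base64-client-key>
```

The key name `kubeconfig` matters: a secret volume exposes each key as a file, so mounting this secret at `/var/run/secrets/kubeconfig` yields the path `/var/run/secrets/kubeconfig/kubeconfig` that the accompanying documentation passes to `--kubeconfig`.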
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,182 @@ | ||
| # HyperShift Support | ||
|
|
||
| cluster-olm-operator supports running in HyperShift mode, where it manages OLMv1 components (catalogd and operator-controller) for hosted clusters. | ||
|
|
||
| ## Overview | ||
|
|
||
| In HyperShift deployments, cluster-olm-operator can run in the management cluster and manage OLMv1 components that watch hosted cluster API servers. This enables: | ||
|
|
||
| - catalogd to serve catalogs from the management cluster while watching ClusterCatalog resources in the hosted cluster's API server | ||
| - operator-controller to install operators into hosted cluster worker nodes while watching ClusterExtension resources in the hosted cluster's API server | ||
|
|
||
| This corresponds to **Approach 1: Control Plane Placement** as described in the [HyperShift OLMv1 design document](https://github.com/openshift/enhancements/blob/master/enhancements/olm/hypershift-olmv1.md). | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Standalone Mode (Default) | ||
|
|
||
| In standalone OpenShift clusters: | ||
| - cluster-olm-operator runs in `openshift-cluster-olm-operator` namespace | ||
| - catalogd and operator-controller watch the local cluster's API server using in-cluster config | ||
| - Components run in `olmv1-system` namespace | ||
|
|
||
| ### HyperShift Mode | ||
|
|
||
| In HyperShift deployments: | ||
| - cluster-olm-operator runs in the management cluster (in the hosted control plane namespace, e.g., `clusters-customer1`) | ||
| - catalogd and operator-controller watch the **hosted cluster's** API server using a mounted kubeconfig | ||
| - Components are configured with `--kubeconfig` and `--system-namespace` flags | ||
| - The `admin-kubeconfig` secret provides connectivity to the hosted cluster's API server | ||
|
|
||
| ## Configuration | ||
|
|
||
| HyperShift mode is enabled by setting environment variables on the cluster-olm-operator deployment: | ||
|
|
||
| ### Required Environment Variables | ||
|
|
||
| | Variable | Description | Example | | ||
| |----------|-------------|---------| | ||
| | `HOSTED_KUBECONFIG_SECRET` | Name of the secret containing the hosted cluster's kubeconfig | `admin-kubeconfig` | | ||
| | `HOSTED_NAMESPACE` | The hosted control plane namespace in the management cluster | `clusters-customer1` | | ||
|
|
||
| ### Example Deployment Configuration | ||
|
|
||
| ```yaml | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
| name: cluster-olm-operator | ||
| namespace: clusters-customer1 # Hosted control plane namespace | ||
| spec: | ||
| template: | ||
| spec: | ||
| containers: | ||
| - name: cluster-olm-operator | ||
| image: quay.io/openshift/origin-cluster-olm-operator:latest | ||
| env: | ||
| - name: HOSTED_KUBECONFIG_SECRET | ||
| value: admin-kubeconfig | ||
| - name: HOSTED_NAMESPACE | ||
| value: clusters-customer1 | ||
| # ... other environment variables ... | ||
| ``` | ||
|
|
||
| ## How It Works | ||
|
|
||
| When HyperShift mode is detected (via the `HOSTED_KUBECONFIG_SECRET` environment variable), the following changes are made: | ||
|
|
||
| 1. **Kubeconfig Injection Hook**: The `InjectHostedClusterKubeconfigHook` deployment hook is automatically applied to catalogd and operator-controller deployments | ||
|
|
||
| 2. **Volume Mounting**: The hook adds a volume referencing the kubeconfig secret: | ||
| ```yaml | ||
| volumes: | ||
| - name: hosted-kubeconfig | ||
| secret: | ||
| secretName: admin-kubeconfig # Value from HOSTED_KUBECONFIG_SECRET | ||
| ``` | ||
|
|
||
| 3. **Volume Mounts**: The kubeconfig is mounted into all containers: | ||
| ```yaml | ||
| volumeMounts: | ||
| - name: hosted-kubeconfig | ||
| mountPath: /var/run/secrets/kubeconfig | ||
| readOnly: true | ||
| ``` | ||
|
|
||
| 4. **Command-line Flags**: Additional arguments are added to containers: | ||
| ```yaml | ||
| args: | ||
| - --kubeconfig=/var/run/secrets/kubeconfig/kubeconfig | ||
| - --system-namespace=clusters-customer1 # Value from HOSTED_NAMESPACE | ||
| ``` | ||
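Taken together, steps 2 to 4 would leave the catalogd pod template looking roughly like the fragment below. This is a sketch assembled from the snippets above, not output captured from a cluster: the container name `catalogd` is taken from the sample log line in this document, and the position of the new flags relative to any existing arguments is an assumption.

```yaml
# Illustrative only: catalogd Deployment pod template after the hook has run
spec:
  template:
    spec:
      containers:
        - name: catalogd                 # assumed container name (from the sample log output)
          args:
            # ... arguments from the original manifest are kept ...
            - --kubeconfig=/var/run/secrets/kubeconfig/kubeconfig
            - --system-namespace=clusters-customer1   # value of HOSTED_NAMESPACE
          volumeMounts:
            - name: hosted-kubeconfig
              mountPath: /var/run/secrets/kubeconfig
              readOnly: true
      volumes:
        - name: hosted-kubeconfig
          secret:
            secretName: admin-kubeconfig  # value of HOSTED_KUBECONFIG_SECRET
```

The volume name `hosted-kubeconfig` has to match between `volumes` and `volumeMounts`, which is why the verification steps grep for it.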
|
|
||
| ## Components Affected | ||
|
|
||
| The HyperShift configuration is automatically applied to: | ||
|
|
||
| - **catalogd**: Watches ClusterCatalog resources in the hosted cluster's API server | ||
| - **operator-controller**: Watches ClusterExtension resources in the hosted cluster's API server and installs operators into hosted cluster worker nodes | ||
|
|
||
| Both components continue to serve their control plane functions from the management cluster while interacting with hosted cluster API resources. | ||
|
|
||
| ## Upstream Requirements | ||
|
|
||
| For HyperShift mode to work, the upstream components must support: | ||
|
|
||
| - **catalogd**: `--kubeconfig` flag support ([catalogd PR #xyz](https://github.com/operator-framework/catalogd/pull/xyz)) | ||
| - **operator-controller**: `--kubeconfig` flag support ([operator-controller PR #xyz](https://github.com/operator-framework/operator-controller/pull/xyz)) | ||
|
Comment on lines +106 to +107
Replace placeholder PR links before merging.
||
| - Both components: `--system-namespace` flag to specify the namespace context | ||
|
|
||
| ## Detection and Logging | ||
|
|
||
| When cluster-olm-operator starts in HyperShift mode: | ||
|
|
||
| ``` | ||
| I0312 10:15:23.123456 1 builder.go:150] HyperShift mode detected, injecting kubeconfig configuration deployment="catalogd" kubeconfigSecret="admin-kubeconfig" hostedNamespace="clusters-customer1" | ||
| I0312 10:15:23.234567 1 builder.go:150] HyperShift mode detected, injecting kubeconfig configuration deployment="operator-controller" kubeconfigSecret="admin-kubeconfig" hostedNamespace="clusters-customer1" | ||
| ``` | ||
|
Comment on lines +114 to +117
Add language specifiers to fenced code blocks (markdownlint MD040). Both log-output blocks at lines 114 and 121 are missing a language identifier. Proposed fix: add `text` as the language on the opening fence of each block (the one starting with `I0312 10:15:23.123456 ...` and the one starting with `I0312 10:15:23.345678 ...`). Also applies to: 121-124.
Tools: markdownlint-cli2 (0.22.1) [warning] 114-114: Fenced code blocks should have a language specified (MD040, fenced-code-language)
||
|
|
||
| Individual deployment hooks also log their actions: | ||
|
|
||
| ``` | ||
| I0312 10:15:23.345678 1 builder.go:354] Injecting hosted cluster kubeconfig configuration deployment="catalogd" kubeconfigSecret="admin-kubeconfig" hostedNamespace="clusters-customer1" | ||
| I0312 10:15:23.456789 1 builder.go:380] Configured container container="catalogd" kubeconfigPath="/var/run/secrets/kubeconfig/kubeconfig" systemNamespace="clusters-customer1" | ||
| ``` | ||
|
|
||
| ## Verification | ||
|
|
||
| To verify cluster-olm-operator is running in HyperShift mode: | ||
|
|
||
| 1. Check environment variables: | ||
| ```bash | ||
| kubectl get deployment cluster-olm-operator -n clusters-customer1 -o yaml | grep -A2 HOSTED_ | ||
| ``` | ||
|
|
||
| 2. Check catalogd/operator-controller deployments for kubeconfig configuration: | ||
| ```bash | ||
| kubectl get deployment catalogd -n clusters-customer1 -o yaml | grep -A5 "hosted-kubeconfig" | ||
| kubectl get deployment operator-controller -n clusters-customer1 -o yaml | grep "kubeconfig" | ||
| ``` | ||
|
|
||
| 3. Verify components are watching the hosted cluster API: | ||
| ```bash | ||
| # Check catalogd logs | ||
| kubectl logs -n clusters-customer1 deployment/catalogd | grep "kubeconfig" | ||
|
|
||
| # Check operator-controller logs | ||
| kubectl logs -n clusters-customer1 deployment/operator-controller | grep "kubeconfig" | ||
| ``` | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Components not connecting to hosted cluster | ||
|
|
||
| **Symptoms**: catalogd or operator-controller cannot list resources, API connection errors | ||
|
|
||
| **Checks**: | ||
| 1. Verify the `admin-kubeconfig` secret exists and is properly mounted | ||
| 2. Check the secret contains a valid kubeconfig | ||
| 3. Verify network connectivity from management cluster to hosted cluster API server | ||
| 4. Check RBAC permissions in the kubeconfig | ||
|
|
||
| ### Missing environment variables | ||
|
|
||
| **Symptoms**: Components use in-cluster config instead of hosted cluster kubeconfig | ||
|
|
||
| **Solution**: Ensure both `HOSTED_KUBECONFIG_SECRET` and `HOSTED_NAMESPACE` environment variables are set on the cluster-olm-operator deployment | ||
|
|
||
| ### Hook not applied | ||
|
|
||
| **Symptoms**: Deployments don't have kubeconfig volumes or `--kubeconfig` flags | ||
|
|
||
| **Checks**: | ||
| 1. Verify environment variables are set before cluster-olm-operator starts | ||
| 2. Check cluster-olm-operator logs for "HyperShift mode detected" messages | ||
| 3. Verify the deployment controller is processing deployments correctly | ||
|
|
||
| ## References | ||
|
|
||
| - [HyperShift OLMv1 Design Proposal](https://github.com/openshift/enhancements/blob/master/enhancements/olm/hypershift-olmv1.md) | ||
| - [catalogd Documentation](https://github.com/operator-framework/catalogd) | ||
| - [operator-controller Documentation](https://github.com/operator-framework/operator-controller) | ||
| - [HyperShift Documentation](https://hypershift-docs.netlify.app/) | ||
Init containers are missing `allowPrivilegeEscalation: false` (Checkov CKV_K8S_20).
Both `copy-catalogd-manifests` and `copy-operator-controller-manifests` set `readOnlyRootFilesystem: true` but omit `allowPrivilegeEscalation: false` and `capabilities.drop: [ALL]`, which the main container already has. Users copy-pasting this example will deploy with insecure init container defaults.
Proposed fix for both init containers: under each `securityContext`, keep `readOnlyRootFilesystem: true` and add `allowPrivilegeEscalation: false` plus `capabilities.drop: [ALL]`.
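Applied to the first init container, the suggestion would look roughly like the fragment below; the same `securityContext` would go on `copy-operator-controller-manifests`. Everything except the lines marked `# added` is copied from the example manifest in the diff.

```yaml
initContainers:
  - name: copy-catalogd-manifests
    image: quay.io/openshift/origin-olm-catalogd:latest
    imagePullPolicy: IfNotPresent
    command:
      - /bin/sh
    args:
      - -c
      - /cp-manifests /operand-assets
    volumeMounts:
      - mountPath: /operand-assets
        name: operand-assets
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false   # added
      capabilities:                     # added
        drop:                           # added
          - ALL                         # added
    terminationMessagePolicy: FallbackToLogsOnError
```

This brings the init containers in line with the `securityContext` already set on the main `cluster-olm-operator` container.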