fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle#8520
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughWhen reconciling proxy config, if Proxy.Spec.TrustedCA.Name is empty and HostedCluster.Spec.AdditionalTrustBundle is set, the code sets Proxy.Spec.TrustedCA.Name to the user-ca-bundle ConfigMap name. reconcileProxyTrustedCAConfigMap now initializes destCM.Data by copying all key/value pairs from the referenced source ConfigMap, and if AdditionalTrustBundle is present, fetches its control-plane ConfigMap and appends its ca-bundle.crt (certs.UserCABundleMapKey) onto destCM.Data["ca-bundle.crt"] with newline normalization. Tests covering these scenarios were added. Sequence Diagram(s)sequenceDiagram
participant HCO as HostedClusterConfigOperator
participant API as Kubernetes API (Control Plane)
participant SourceCM as Proxy Source ConfigMap
participant ATB_CM as AdditionalTrustBundle ConfigMap
participant DestCM as Destination TrustedCA ConfigMap
HCO->>API: Read Proxy object
alt Proxy.trustedCA empty and AdditionalTrustBundle set
HCO->>HCO: Set Proxy.Spec.TrustedCA.Name = "user-ca-bundle"
end
HCO->>API: Read SourceCM referenced by Proxy.trustedCA
API-->>HCO: return SourceCM data
HCO->>DestCM: initialize destCM.Data = {}
HCO->>HCO: copy all key/value pairs from SourceCM -> destCM.Data
alt AdditionalTrustBundle set
HCO->>API: Read ATB_CM from control plane
API-->>HCO: return ATB_CM data
alt ATB_CM["ca-bundle.crt"] non-empty
HCO->>DestCM: append ATB_CM["ca-bundle.crt"] to destCM.Data["ca-bundle.crt"]
end
end
HCO->>API: Create/Update DestCM with destCM.Data
API-->>HCO: acknowledge create/update
🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dpateriya The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8520 +/- ##
==========================================
+ Coverage 36.30% 39.74% +3.43%
==========================================
Files 764 773 +9
Lines 93015 94913 +1898
==========================================
+ Hits 33772 37721 +3949
+ Misses 56530 54494 -2036
+ Partials 2713 2698 -15
... and 164 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
/retest |
45d3f02 to
4319c77
Compare
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
4319c77 to
7cd780c
Compare
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
VerifiedTested manually on a HyperShift hosted cluster with Before the fix:
After patching HCCO with this fix:
Validation: The full data flow is now: /verified by me |
|
@dpateriya: The DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/verified by me |
|
@dpateriya: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
jparrill
left a comment
There was a problem hiding this comment.
Dropped some comments. Thanks!
@dpateriya the Could someone from QE pick this up and verify it on a hosted cluster with |
|
The initial What was tested (dev environment):
Request: Could someone from QE independently verify this fix covers the expected scenarios (ATB-only, both ATB+proxy.trustedCA, removal/cleanup)? |
7cd780c to
928ff98
Compare
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go`:
- Around line 777-791: The test currently returns immediately when tc.wantErr is
true which skips assertions about the destination ConfigMap; instead, remove the
early return and always run the dest CM checks based on tc.wantDestCMExists so
error-paths also validate whether manifests.ProxyTrustedCAConfigMap(proxyCAName)
was created or not using r.client.Get and apierrors.IsNotFound; keep the
existing expectations for destCM.Data["ca-bundle.crt"] == tc.wantDestCMData when
wantDestCMExists is true and assert apierrors.IsNotFound(err) when it is false.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 3656122c-c14d-412a-b36b-f5db86560311
📒 Files selected for processing (2)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.gocontrol-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
…ted CA bundle When HostedCluster.spec.additionalTrustBundle is set, HCCO already syncs its contents into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle. Any component reading trusted-ca-bundle (including CVO's update service TLS client) therefore cannot verify TLS certificates signed by the internal CA provided via additionalTrustBundle. Fix this in HCCO with two changes: 1. reconcileProxyTrustedCAConfigMap: when both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the destination ConfigMap alongside proxy.trustedCA data. CNO then merges both into trusted-ca-bundle. 2. reconcileConfig (Proxy object reconciliation): when no proxy.trustedCA is configured but additionalTrustBundle is set, point the Proxy object's spec.trustedCA.Name at user-ca-bundle (already written by reconcileUserCertCABundle), so CNO picks up the additionalTrustBundle CA via the standard trusted-ca-bundle path. This benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) and avoids adding HyperShift-specific special-casing in upstream components. Co-authored-by: Cursor <cursoragent@cursor.com>
928ff98 to
b0cc962
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go (1)
1145-1157:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not delete the management-cluster source ConfigMap.
Line 1150/1151 deletes the ConfigMap from
hcp.Namespace, but this ref comes fromhcp.Spec.Configuration.Proxy.TrustedCA. That is the source bundle, so switching away fromproxy.trustedCAor moving to ATB-only will destroy the admin-managed ConfigMap in the control-plane namespace.Suggested fix
- // log and ignore deletion errors, should not disrupt normal workflow - cm.Namespace = hcp.Namespace - if err := r.cpClient.Delete(ctx, cm); err != nil { - log.Error(err, "failed to delete configmap", "name", cm.Name, "namespace", cm.Namespace) - } - cm.Namespace = manifests.ProxyTrustedCAConfigMap("").Namespace if err := r.client.Delete(ctx, cm); err != nil { log.Error(err, "failed to delete configmap in hosted cluster", "name", cm.Name, "namespace", cm.Namespace) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go` around lines 1145 - 1157, The code currently unconditionally deletes the ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA; change the cleanup logic in the resource reconciliation (around currentConfigMapRef handling in resources.go) to skip deleting the management-source ConfigMap: only call r.cpClient.Delete when currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or alternatively verify a controller-owned label/ownerReference before deletion), and continue to delete the hosted-cluster copy with r.client.Delete (using manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1180-1184: The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.
---
Duplicate comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1145-1157: The code currently unconditionally deletes the
ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the
management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA;
change the cleanup logic in the resource reconciliation (around
currentConfigMapRef handling in resources.go) to skip deleting the
management-source ConfigMap: only call r.cpClient.Delete when
currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or
alternatively verify a controller-owned label/ownerReference before deletion),
and continue to delete the hosted-cluster copy with r.client.Delete (using
manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2abf2bdb-eef3-44b5-ad74-0976e2ace52b
📒 Files selected for processing (2)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.gocontrol-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
| if hcp.Spec.AdditionalTrustBundle != nil { | ||
| atbCM := &corev1.ConfigMap{} | ||
| if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil { | ||
| return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err) | ||
| } |
There was a problem hiding this comment.
Use the canonical control-plane ATB ConfigMap here.
This path reads hcp.Spec.AdditionalTrustBundle.Name, but reconcileUserCertCABundle below reads cpomanifests.UserCAConfigMap(hcp.Namespace). If those names ever diverge, the guest user-ca-bundle sync still works while this merge path fails with a not-found.
Suggested fix
- atbCM := &corev1.ConfigMap{}
- if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil {
- return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err)
+ atbCM := cpomanifests.UserCAConfigMap(hcp.Namespace)
+ if err := r.cpClient.Get(ctx, client.ObjectKeyFromObject(atbCM), atbCM); err != nil {
+ return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", atbCM.Namespace, atbCM.Name, err)
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`
around lines 1180 - 1184, The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.
|
@dpateriya: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThe two Konflux Enterprise Contract (EC) verification failures are pre-existing, intermittent infrastructure issues unrelated to PR #8520. The PR only modifies Go source code ( Root CauseThe 6 Enterprise Contract verification failures are intermittent and time-dependent, caused by transient issues in the Konflux EC verification infrastructure — not by any code change in PR #8520. Key evidence for intermittency:
Why only hypershift-operator and not control-plane-operator:
Why push (main) passes but pull-request fails:
Recommendations
Evidence
|
|
@jparrill could you please review this PR once you LGTM, I will review this in a real env. |
Problem
When
HostedCluster.spec.additionalTrustBundleis set, HCCO syncs it intoopenshift-config/user-ca-bundleon the guest cluster. However, that ConfigMap is not referenced by the guest cluster'sproxy.config.openshift.io/clusterobject, so CNO never merges it intoopenshift-config-managed/trusted-ca-bundle.Any component that reads
trusted-ca-bundleto build its TLS trust pool (such as CVO when contacting a customspec.updateService) cannot verify certificates signed by the internal CA that was provided only viaadditionalTrustBundle. The failure surface is:This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via
additionalTrustBundle.Fix
Two changes in HCCO (
reconcilerinresources.go):Change 1 —
reconcileProxyTrustedCAConfigMapWhen both
proxy.trustedCAandadditionalTrustBundleare set, merge theadditionalTrustBundleCA data into the same destination ConfigMap (inopenshift-config) alongside the proxy CA data. CNO then merges both intoopenshift-config-managed/trusted-ca-bundle.Change 2 — Proxy object reconciliation in
reconcileConfigWhen
proxy.trustedCAis not set butadditionalTrustBundleis set, point the guest cluster'sproxy.config.openshift.io/clusterobject'sspec.trustedCA.Nameat"user-ca-bundle"(whichreconcileUserCertCABundlealready writes). CNO then picks it up and merges it intotrusted-ca-bundle.Why here (HCCO) and not in upstream CVO
A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on
--hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers oftrusted-ca-bundle(CVO, MCO, etc.) rather than just CVO.Testing
New
TestReconcileProxyTrustedCAConfigMaptable-driven test covers 4 cases:proxy.trustedCAnoradditionalTrustBundleset → no dest ConfigMap createdproxy.trustedCAset → dest ConfigMap contains proxy CA data onlyadditionalTrustBundleset →reconcileProxyTrustedCAConfigMapreturns early (handled by Change 2)Fixes: https://issues.redhat.com/browse/OCPBUGS-81675
Summary by CodeRabbit
Bug Fixes
Tests