Skip to content

fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle#8520

Open
dpateriya wants to merge 1 commit into
openshift:mainfrom
dpateriya:fix/hcco-additionaltrustbundle-trusted-ca-bundle
Open

fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle#8520
dpateriya wants to merge 1 commit into
openshift:mainfrom
dpateriya:fix/hcco-additionaltrustbundle-trusted-ca-bundle

Conversation

@dpateriya
Copy link
Copy Markdown
Contributor

@dpateriya dpateriya commented May 14, 2026

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

    • Proxy trusted CA handling now sets the proxy to use the user CA bundle when an additional trust bundle is present, avoids removing that bundle during cleanup, and correctly creates/merges CA data so concatenated bundles preserve newline formatting.
  • Tests

    • Added unit tests validating trusted CA creation, error cases, precedence behavior, and correct merging between proxy trusted CA and additional trust bundles.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is invalid:

  • expected the bug to target either version "5.0." or "openshift-5.0.", but it targets "4.22.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Made with Cursor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

When reconciling proxy config, if Proxy.Spec.TrustedCA.Name is empty and HostedCluster.Spec.AdditionalTrustBundle is set, the code sets Proxy.Spec.TrustedCA.Name to the user-ca-bundle ConfigMap name. reconcileProxyTrustedCAConfigMap now initializes destCM.Data by copying all key/value pairs from the referenced source ConfigMap, and if AdditionalTrustBundle is present, fetches its control-plane ConfigMap and appends its ca-bundle.crt (certs.UserCABundleMapKey) onto destCM.Data["ca-bundle.crt"] with newline normalization. Tests covering these scenarios were added.

Sequence Diagram(s)

sequenceDiagram
    participant HCO as HostedClusterConfigOperator
    participant API as Kubernetes API (Control Plane)
    participant SourceCM as Proxy Source ConfigMap
    participant ATB_CM as AdditionalTrustBundle ConfigMap
    participant DestCM as Destination TrustedCA ConfigMap

    HCO->>API: Read Proxy object
    alt Proxy.trustedCA empty and AdditionalTrustBundle set
        HCO->>HCO: Set Proxy.Spec.TrustedCA.Name = "user-ca-bundle"
    end
    HCO->>API: Read SourceCM referenced by Proxy.trustedCA
    API-->>HCO: return SourceCM data
    HCO->>DestCM: initialize destCM.Data = {}
    HCO->>HCO: copy all key/value pairs from SourceCM -> destCM.Data
    alt AdditionalTrustBundle set
        HCO->>API: Read ATB_CM from control plane
        API-->>HCO: return ATB_CM data
        alt ATB_CM["ca-bundle.crt"] non-empty
            HCO->>DestCM: append ATB_CM["ca-bundle.crt"] to destCM.Data["ca-bundle.crt"]
        end
    end
    HCO->>API: Create/Update DestCM with destCM.Data
    API-->>HCO: acknowledge create/update
Loading
🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ❓ Inconclusive The check asks to review "Ginkgo test code" with Describe/It/BeforeEach patterns, but the PR adds standard Go unit tests (func TestX(t *testing.T)) with Gomega matchers, not Ginkgo BDD tests. Clarify: Are the check requirements for all tests or only Ginkgo-based tests? If all tests: the PR tests lack assertion messages per requirement 4. If only Ginkgo: check is not applicable (PASS).
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: merging additionalTrustBundle into proxy trusted CA bundle, which is the primary purpose of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Added tests use standard Go t.Run() table-driven approach, not Ginkgo (It/Describe/Context). Check applies only to Ginkgo tests; not applicable here.
Microshift Test Compatibility ✅ Passed The PR adds only standard Go unit tests (TestReconcileProxyTrustedCAConfigMap, TestReconcileConfigProxyTrustedCA) using testing.T, not Ginkgo e2e tests. The custom check is inapplicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests added; only standard Go unit tests (TestReconcileProxyTrustedCAConfigMap, TestReconcileConfigProxyTrustedCA). Check applies only to Ginkgo e2e tests.
Topology-Aware Scheduling Compatibility ✅ Passed PR does not introduce scheduling constraints. Changes only modify ConfigMap and Proxy object configurations for trust bundle management, with no pod/deployment/workload scheduling logic.
Ote Binary Stdout Contract ✅ Passed No process-level stdout writes found. PR adds standard Go unit tests and production logic with no main/init/TestMain/suite setup code that could corrupt JSON stdout.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed New tests are standard Go unit tests (testing.T), not Ginkgo e2e tests. Check for Ginkgo e2e tests with IPv4/external connectivity is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from jparrill and sjenning May 14, 2026 12:58
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 14, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dpateriya
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels May 14, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 77.77778% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.74%. Comparing base (222a19f) to head (b0cc962).
⚠️ Report is 319 commits behind head on main.

Files with missing lines Patch % Lines
...rconfigoperator/controllers/resources/resources.go 77.77% 5 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8520      +/-   ##
==========================================
+ Coverage   36.30%   39.74%   +3.43%     
==========================================
  Files         764      773       +9     
  Lines       93015    94913    +1898     
==========================================
+ Hits        33772    37721    +3949     
+ Misses      56530    54494    -2036     
+ Partials     2713     2698      -15     
Files with missing lines Coverage Δ
...rconfigoperator/controllers/resources/resources.go 56.35% <77.77%> (+5.83%) ⬆️

... and 164 files with indirect coverage changes

Flag Coverage Δ
cmd-support 32.68% <ø> (+2.65%) ⬆️
cpo-hostedcontrolplane 41.76% <ø> (+4.71%) ⬆️
cpo-other 40.58% <77.77%> (+4.88%) ⬆️
hypershift-operator 50.72% <ø> (+2.83%) ⬆️
other 31.58% <ø> (+3.89%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dpateriya
Copy link
Copy Markdown
Contributor Author

/retest

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 45d3f02 to 4319c77 Compare May 14, 2026 13:11
@dpateriya dpateriya changed the title OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle May 14, 2026
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now correctly creates and merges trusted CA data with any additional trust bundle, ensuring proper certificate validation.

  • Tests

  • Added unit tests covering trusted CA creation and merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 4319c77 to 7cd780c Compare May 14, 2026 13:50
@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now automatically points to the merged user CA bundle when an additional trust bundle is present, and correctly creates and merges trusted CA data to ensure proper certificate validation.

  • Tests

  • Added unit tests covering trusted CA creation and merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya
Copy link
Copy Markdown
Contributor Author

dpateriya commented May 19, 2026

Verified

Tested manually on a HyperShift hosted cluster with spec.additionalTrustBundle set.

Before the fix:

  • proxy.config.openshift.io/cluster on the guest had spec.trustedCA.name: ""
  • openshift-config-managed/trusted-ca-bundle did not contain the internal CA
  • CVO failed with x509: certificate signed by unknown authority when contacting spec.updateService

After patching HCCO with this fix:

  • proxy.config.openshift.io/cluster is now updated with spec.trustedCA.name: "user-ca-bundle"
  • CNO picks up user-ca-bundle and merges it into openshift-config-managed/trusted-ca-bundle
  • CVO reads trusted-ca-bundle for TLS verification and can now trust the internal CA

Validation:

$ oc get proxy cluster -o jsonpath='{.spec.trustedCA.name}' --kubeconfig hosted-kubeconfig
user-ca-bundle

The full data flow is now:

hc.spec.additionalTrustBundle
        │
        ▼
HCCO writes "openshift-config/user-ca-bundle" (already existed)
        │
        ▼  ← YOUR FIX: point proxy.spec.trustedCA.Name at "user-ca-bundle"
        │
CNO sees proxy.spec.trustedCA.Name is set
        │
        ▼
CNO merges it into "openshift-config-managed/trusted-ca-bundle"
        │
        ▼
CVO reads trusted-ca-bundle → TLS now trusts the internal CA → updateService works

/verified by me

@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: The /verified command must be used with one of the following actions: by, later, remove, or bypass. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

Details

In response to this:

Verified

Tested manually on a HyperShift hosted cluster with spec.additionalTrustBundle set.

Before the fix:

  • proxy.config.openshift.io/cluster on the guest had spec.trustedCA.name: ""
  • openshift-config-managed/trusted-ca-bundle did not contain the internal CA
  • CVO failed with x509: certificate signed by unknown authority when contacting spec.updateService

After patching HCCO with this fix:

  • proxy.config.openshift.io/cluster is now updated with spec.trustedCA.name: "user-ca-bundle"
  • CNO picks up user-ca-bundle and merges it into openshift-config-managed/trusted-ca-bundle
  • CVO reads trusted-ca-bundle for TLS verification and can now trust the internal CA

Validation:

$ oc get proxy cluster -o jsonpath='{.spec.trustedCA.name}' --kubeconfig hosted-kubeconfig
user-ca-bundle

The full data flow is now:

  1. HostedCluster.spec.additionalTrustBundle → HCCO syncs to openshift-config/user-ca-bundle (existing behavior)
  2. HCCO sets proxy.config.openshift.io/cluster spec.trustedCA.name = "user-ca-bundle" (this fix)
  3. CNO merges the referenced ConfigMap into openshift-config-managed/trusted-ca-bundle
  4. CVO reads trusted-ca-bundle → TLS verification succeeds for internal update service

/verified

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya
Copy link
Copy Markdown
Contributor Author

/verified by me

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 19, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: This PR has been marked as verified by me.

Details

In response to this:

/verified by me

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@jparrill jparrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped some comments. Thanks!

@jparrill
Copy link
Copy Markdown
Contributor

jparrill commented May 25, 2026

/verified by me

@dpateriya the /verified label should come from someone other than the PR author — ideally someone from QE or another engineer who has independently tested the fix in a real cluster. Self-verification doesn't give us the confidence that the change actually works end-to-end in a live environment, especially for a trust bundle fix like this where the failure mode (silent TLS rejection) is easy to miss in unit tests alone.

Could someone from QE pick this up and verify it on a hosted cluster with additionalTrustBundle + a custom update service behind an internal CA?

@dpateriya
Copy link
Copy Markdown
Contributor Author

dpateriya commented May 25, 2026

The initial /verified label was added based on my own testing in a development environment, and I acknowledge that independent QE verification is needed before this can be considered fully validated.

What was tested (dev environment):

  • Patched HCCO image deployed to a HyperShift hosted cluster
  • Confirmed proxy.config.openshift.io/cluster gets spec.trustedCA.name: user-ca-bundle when only hc.spec.additionalTrustBundle is set
  • Confirmed CNO subsequently creates openshift-config-managed/trusted-ca-bundle with the CA data

Request: Could someone from QE independently verify this fix covers the expected scenarios (ATB-only, both ATB+proxy.trustedCA, removal/cleanup)?

@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 7cd780c to 928ff98 Compare May 25, 2026 14:39
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label May 25, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Problem

When HostedCluster.spec.additionalTrustBundle is set, HCCO syncs it into openshift-config/user-ca-bundle on the guest cluster. However, that ConfigMap is not referenced by the guest cluster's proxy.config.openshift.io/cluster object, so CNO never merges it into openshift-config-managed/trusted-ca-bundle.

Any component that reads trusted-ca-bundle to build its TLS trust pool (such as CVO when contacting a custom spec.updateService) cannot verify certificates signed by the internal CA that was provided only via additionalTrustBundle. The failure surface is:

Unable to retrieve available updates: tls: failed to verify certificate: x509: certificate signed by unknown authority

This affects users running an internal Cincinnati-compatible update service (e.g. OSUS) behind a certificate signed by an internal CA that is provided via additionalTrustBundle.

Fix

Two changes in HCCO (reconciler in resources.go):

Change 1 — reconcileProxyTrustedCAConfigMap

When both proxy.trustedCA and additionalTrustBundle are set, merge the additionalTrustBundle CA data into the same destination ConfigMap (in openshift-config) alongside the proxy CA data. CNO then merges both into openshift-config-managed/trusted-ca-bundle.

Change 2 — Proxy object reconciliation in reconcileConfig

When proxy.trustedCA is not set but additionalTrustBundle is set, point the guest cluster's proxy.config.openshift.io/cluster object's spec.trustedCA.Name at "user-ca-bundle" (which reconcileUserCertCABundle already writes). CNO then picks it up and merges it into trusted-ca-bundle.

Why here (HCCO) and not in upstream CVO

A previous PR (openshift/cluster-version-operator#1370) attempted to fix this in CVO gated on --hypershift. Reviewer @enxebre correctly pointed out that CPO ships with the OCP payload at the same cadence as CVO, so there is no release-speed advantage, and the CVO fix would be unnecessary tech debt. This HCCO fix benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) rather than just CVO.

Testing

go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/ -run "TestReconcileProxyTrustedCAConfigMap|TestReconcileUserCertCABundle" -v

New TestReconcileProxyTrustedCAConfigMap table-driven test covers 4 cases:

  • neither proxy.trustedCA nor additionalTrustBundle set → no dest ConfigMap created
  • only proxy.trustedCA set → dest ConfigMap contains proxy CA data only
  • both set → dest ConfigMap contains merged CA data from both sources
  • only additionalTrustBundle set → reconcileProxyTrustedCAConfigMap returns early (handled by Change 2)

Fixes: https://issues.redhat.com/browse/OCPBUGS-81675

Summary by CodeRabbit

  • Bug Fixes

  • Proxy trusted CA handling now targets the user CA bundle when an additional trust bundle is present, avoids removing that bundle during cleanup, and correctly creates/merges CA data so concatenated bundles preserve newline formatting.

  • Tests

  • Added unit tests validating trusted CA creation, error cases, and correct merging behavior between proxy trusted CA and additional trust bundles.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go`:
- Around line 777-791: The test currently returns immediately when tc.wantErr is
true which skips assertions about the destination ConfigMap; instead, remove the
early return and always run the dest CM checks based on tc.wantDestCMExists so
error-paths also validate whether manifests.ProxyTrustedCAConfigMap(proxyCAName)
was created or not using r.client.Get and apierrors.IsNotFound; keep the
existing expectations for destCM.Data["ca-bundle.crt"] == tc.wantDestCMData when
wantDestCMExists is true and assert apierrors.IsNotFound(err) when it is false.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 3656122c-c14d-412a-b36b-f5db86560311

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd780c and 928ff98.

📒 Files selected for processing (2)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

…ted CA bundle

When HostedCluster.spec.additionalTrustBundle is set, HCCO already syncs
its contents into openshift-config/user-ca-bundle on the guest cluster.
However, that ConfigMap is not referenced by the guest cluster's
proxy.config.openshift.io/cluster object, so CNO never merges it into
openshift-config-managed/trusted-ca-bundle. Any component reading
trusted-ca-bundle (including CVO's update service TLS client) therefore
cannot verify TLS certificates signed by the internal CA provided via
additionalTrustBundle.

Fix this in HCCO with two changes:

1. reconcileProxyTrustedCAConfigMap: when both proxy.trustedCA and
   additionalTrustBundle are set, merge the additionalTrustBundle CA
   data into the destination ConfigMap alongside proxy.trustedCA data.
   CNO then merges both into trusted-ca-bundle.

2. reconcileConfig (Proxy object reconciliation): when no proxy.trustedCA
   is configured but additionalTrustBundle is set, point the Proxy
   object's spec.trustedCA.Name at user-ca-bundle (already written by
   reconcileUserCertCABundle), so CNO picks up the additionalTrustBundle
   CA via the standard trusted-ca-bundle path.

This benefits all consumers of trusted-ca-bundle (CVO, MCO, etc.) and
avoids adding HyperShift-specific special-casing in upstream components.

Co-authored-by: Cursor <cursoragent@cursor.com>
@dpateriya dpateriya force-pushed the fix/hcco-additionaltrustbundle-trusted-ca-bundle branch from 928ff98 to b0cc962 Compare May 25, 2026 15:29
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go (1)

1145-1157: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not delete the management-cluster source ConfigMap.

Line 1150/1151 deletes the ConfigMap from hcp.Namespace, but this ref comes from hcp.Spec.Configuration.Proxy.TrustedCA. That is the source bundle, so switching away from proxy.trustedCA or moving to ATB-only will destroy the admin-managed ConfigMap in the control-plane namespace.

Suggested fix
-			// log and ignore deletion errors, should not disrupt normal workflow
-			cm.Namespace = hcp.Namespace
-			if err := r.cpClient.Delete(ctx, cm); err != nil {
-				log.Error(err, "failed to delete configmap", "name", cm.Name, "namespace", cm.Namespace)
-			}
-
 			cm.Namespace = manifests.ProxyTrustedCAConfigMap("").Namespace
 			if err := r.client.Delete(ctx, cm); err != nil {
 				log.Error(err, "failed to delete configmap in hosted cluster", "name", cm.Name, "namespace", cm.Namespace)
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`
around lines 1145 - 1157, The code currently unconditionally deletes the
ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the
management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA;
change the cleanup logic in the resource reconciliation (around
currentConfigMapRef handling in resources.go) to skip deleting the
management-source ConfigMap: only call r.cpClient.Delete when
currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or
alternatively verify a controller-owned label/ownerReference before deletion),
and continue to delete the hosted-cluster copy with r.client.Delete (using
manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1180-1184: The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.

---

Duplicate comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 1145-1157: The code currently unconditionally deletes the
ConfigMap from hcp.Namespace using r.cpClient.Delete which can remove the
management-cluster source referenced by hcp.Spec.Configuration.Proxy.TrustedCA;
change the cleanup logic in the resource reconciliation (around
currentConfigMapRef handling in resources.go) to skip deleting the
management-source ConfigMap: only call r.cpClient.Delete when
currentConfigMapRef is not equal to hcp.Spec.Configuration.Proxy.TrustedCA (or
alternatively verify a controller-owned label/ownerReference before deletion),
and continue to delete the hosted-cluster copy with r.client.Delete (using
manifests.ProxyTrustedCAConfigMap("").Namespace) as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 2abf2bdb-eef3-44b5-ad74-0976e2ace52b

📥 Commits

Reviewing files that changed from the base of the PR and between 928ff98 and b0cc962.

📒 Files selected for processing (2)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

Comment on lines +1180 to +1184
if hcp.Spec.AdditionalTrustBundle != nil {
atbCM := &corev1.ConfigMap{}
if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil {
return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the canonical control-plane ATB ConfigMap here.

This path reads hcp.Spec.AdditionalTrustBundle.Name, but reconcileUserCertCABundle below reads cpomanifests.UserCAConfigMap(hcp.Namespace). If those names ever diverge, the guest user-ca-bundle sync still works while this merge path fails with a not-found.

Suggested fix
-			atbCM := &corev1.ConfigMap{}
-			if err := r.cpClient.Get(ctx, client.ObjectKey{Namespace: hcp.Namespace, Name: hcp.Spec.AdditionalTrustBundle.Name}, atbCM); err != nil {
-				return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", hcp.Namespace, hcp.Spec.AdditionalTrustBundle.Name, err)
+			atbCM := cpomanifests.UserCAConfigMap(hcp.Namespace)
+			if err := r.cpClient.Get(ctx, client.ObjectKeyFromObject(atbCM), atbCM); err != nil {
+				return fmt.Errorf("failed to get additionalTrustBundle configmap %s/%s: %w", atbCM.Namespace, atbCM.Name, err)
 			}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`
around lines 1180 - 1184, The code is fetching the AdditionalTrustBundle
ConfigMap by hcp.Spec.AdditionalTrustBundle.Name but reconcileUserCertCABundle
uses cpomanifests.UserCAConfigMap(hcp.Namespace), so change the lookup to use
the canonical control-plane ATB name: call r.cpClient.Get with
client.ObjectKey{Name: cpomanifests.UserCAConfigMap(hcp.Namespace).Name,
Namespace: hcp.Namespace} (or otherwise use
cpomanifests.UserCAConfigMap(hcp.Namespace) as the source of truth) when
populating atbCM so the merge path and reconcileUserCertCABundle use the same
ConfigMap name.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 25, 2026

@dpateriya: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hypershift-jira-solve-ci
Copy link
Copy Markdown

Test Failure Analysis Complete

Job Information

  • Prow Job: Red Hat Konflux / hypershift-operator-main-enterprise-contract / hypershift-operator-main
  • Build ID: hypershift-operator-main-enterprise-contract-4zmlz (and hypershift-operator-enterprise-contract-wttd2)
  • PR: fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle #8520fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle
  • Snapshot: hypershift-operator-20260525-152929-000
  • Component: hypershift-operator-main

Test Failure Analysis

Error

Enterprise Contract verify task FAILURE:
  250 success(es), 22 warning(s), 6 failure(s)

Both EC scenarios (hypershift-operator-main-enterprise-contract and
hypershift-operator-enterprise-contract) report identical results.

Summary

The two Konflux Enterprise Contract (EC) verification failures are pre-existing, intermittent infrastructure issues unrelated to PR #8520. The PR only modifies Go source code (resources.go and resources_test.go) and cannot affect container image supply-chain policy verification. The same 6-failure pattern (250 successes / 22 warnings / 6 failures) has been observed on other unrelated PRs (#8527, #8553) which were merged despite these failures. Furthermore, the control-plane-operator component built from the same snapshot passed its EC check with 484 successes and 0 failures. The EC checks are not required for merge in the openshift/hypershift repository.

Root Cause

The 6 Enterprise Contract verification failures are intermittent and time-dependent, caused by transient issues in the Konflux EC verification infrastructure — not by any code change in PR #8520.

Key evidence for intermittency:

Why only hypershift-operator and not control-plane-operator:

  • Containerfile.operator includes Red Hat product metadata labels (com.redhat.component, cpe, etc.) that trigger additional EC policy rules related to product certification and RPM provenance. Containerfile.control-plane uses only OpenShift operator labels and does not trigger these rules.
  • The 6 failing rules are specific to the product metadata/supply-chain validation path that applies to images with Red Hat product labels.

Why push (main) passes but pull-request fails:

  • Push builds produce multi-arch images (x86_64 + others) with full signing and attestation. Pull-request builds produce single-arch temporary images (with image-expires-after: 5d). The EC policy scenarios may treat PR images differently — rules that produce warnings for fully-signed push images may produce failures for temporarily-signed PR images.
Recommendations
  1. No action required on PR fix: OCPBUGS-81675: hcco: merge additionalTrustBundle into proxy trusted CA bundle #8520 — These failures are unrelated to the code changes. The PR can be merged; other PRs with identical EC failures (CNTRLPLANE-3222: Port v1 lifecycle tests to v2 Ginkgo framework #8527, NO-JIRA: fix(e2e): lower pull secret in-place propagation test gate to 4.22 #8553) have already been merged.

  2. Retrigger the EC checks — Since the failures are intermittent, a retry may produce a passing result (as happened with PR CNTRLPLANE-3222: Port v1 lifecycle tests to v2 Ginkgo framework #8527, which failed on one run and passed on a subsequent attempt).

  3. Report the intermittent EC failures to the Konflux/RHTAP team — The pattern of 6 non-deterministic failures in the hypershift-operator-main-enterprise-contract scenario should be reported so the root policy or verification infrastructure issue can be addressed. The affected pipeline runs are:

    • hypershift-operator-main-enterprise-contract-4zmlz
    • hypershift-operator-enterprise-contract-wttd2
  4. Consider marking EC checks as non-blocking (if not already) — Since these checks are already non-required for merge and exhibit intermittent failures, they should be clearly labeled as informational to avoid confusion for PR authors.

Evidence
Evidence Detail
PR #8520 changed files resources.go, resources_test.go only — no Containerfile/pipeline/build changes
control-plane-operator EC (same snapshot) PASSED: 484 successes, 0 warnings, 0 failures
hypershift-operator EC (PR #8520) FAILED: 250 successes, 22 warnings, 6 failures
PR #8579 EC (same day, 03:38 UTC) PASSED: 256 successes, 10 warnings, 0 failures
PR #8520 EC (same day, 15:29 UTC) FAILED: 250 successes, 22 warnings, 6 failures
PR #8527 EC (May 24) FAILED with same 250/22/6 pattern — PR was merged despite failure
PR #8553 EC (May 20) FAILED with 254/24/2 pattern — PR was merged despite failure
Main branch EC (push builds) NEUTRAL: 512 successes, 28 warnings, 0 failures — consistently passes
PR #8563 EC (May 21, before issue) NEUTRAL: 256 successes, 8 warnings, 0 failures
Pipeline run (failure 1) hypershift-operator-main-enterprise-contract-4zmlz
Pipeline run (failure 2) hypershift-operator-enterprise-contract-wttd2
EC checks required for merge? No — PRs #8527 and #8553 merged with identical failures

@dpateriya
Copy link
Copy Markdown
Contributor Author

@jparrill could you please review this PR once you LGTM, I will review this in a real env.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants