Skip to content

OCPBUGS-86008: Gate Route watch on management cluster capability#8484

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
smrtrfszm:smrtrfszm/skip-route-watch
Jun 4, 2026
Merged

OCPBUGS-86008: Gate Route watch on management cluster capability#8484
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
smrtrfszm:smrtrfszm/skip-route-watch

Conversation

@smrtrfszm

@smrtrfszm smrtrfszm commented May 12, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

    • Operator now performs a one-time management-cluster capability detection at startup.
  • Bug Fixes

    • Skips watches for unsupported APIs, preventing startup failures when resources aren’t available.
    • Only monitors Route resources when the management cluster reports Route support, reducing errors and improving stability.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci

openshift-ci Bot commented May 12, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/needs-area labels May 12, 2026
@coderabbitai

coderabbitai Bot commented May 12, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

The operator now does a one-time management-cluster discovery at startup to detect capabilities and stores them in HostedClusterConfigOperatorConfig.ManagementClusterCapabilities. The resources controller's Setup checks that capability checker and registers a watch for OpenShift Route resources only when the Route capability is detected; startup fails if discovery or capability detection returns an error.

Sequence Diagram(s)

sequenceDiagram
    participant Cmd
    participant DiscoveryClient
    participant ManagementClusterAPI
    participant CapabilitiesDetector
    participant ResourcesController
    participant RouteAPI

    Cmd->>DiscoveryClient: create discovery client from kubeconfig
    DiscoveryClient->>ManagementClusterAPI: query API resources
    ManagementClusterAPI-->>DiscoveryClient: API resource list
    DiscoveryClient->>CapabilitiesDetector: provide resource list
    CapabilitiesDetector-->>Cmd: detected capabilities
    Cmd->>ResourcesController: provide ManagementClusterCapabilities
    ResourcesController->>CapabilitiesDetector: check CapabilityRoute
    alt Route capability present
        ResourcesController->>RouteAPI: register watch for Route resources
        RouteAPI-->>ResourcesController: watch started
    else Route capability absent
        ResourcesController-->>ResourcesController: skip Route watch
    end
Loading

Suggested reviewers

  • cblecker
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly summarizes the main change: gating the Route watch on management cluster capability detection, which is the core objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed No Ginkgo tests are present in this PR. The three modified files are source code only, and the PR explicitly excludes unit tests. The custom check is not applicable.
Test Structure And Quality ✅ Passed The PR contains no Ginkgo test files to review. The custom check applies to test code structure/quality, but no tests were added with these changes.
Microshift Test Compatibility ✅ Passed No new Ginkgo e2e tests added. PR adds 25 standard Go unit tests (func Test*) using gomega for assertions, not Ginkgo test framework. MicroShift check does not apply.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No new Ginkgo e2e tests are added in this PR; it only modifies Go controller code for capability-gated Route watch. SNO compatibility check not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies operator config and gates Route watch on capability presence. No scheduling constraints, pod affinity, topology logic, or workload-scheduling code introduced.
Ote Binary Stdout Contract ✅ Passed No stdout violations. No fmt.Print/Println, klog, or init() side effects. Zap uses JSONEncoder defaulting to stderr.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed This PR modifies only non-test source files (resources.go, cmd.go, config.go) containing control-plane operator logic. No Ginkgo e2e tests are added, so the check is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels May 12, 2026
@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from daa5044 to 94461ec Compare May 12, 2026 06:24
@codecov

codecov Bot commented May 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.40%. Comparing base (36dfb1b) to head (1f4af14).
⚠️ Report is 187 commits behind head on main.

Files with missing lines Patch % Lines
...-plane-operator/hostedclusterconfigoperator/cmd.go 0.00% 34 Missing ⚠️
...rconfigoperator/controllers/resources/resources.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8484      +/-   ##
==========================================
- Coverage   40.40%   40.40%   -0.01%     
==========================================
  Files         755      755              
  Lines       93235    93249      +14     
==========================================
  Hits        37675    37675              
- Misses      52858    52872      +14     
  Partials     2702     2702              
Files with missing lines Coverage Δ
...tor/hostedclusterconfigoperator/operator/config.go 0.00% <ø> (ø)
...rconfigoperator/controllers/resources/resources.go 55.35% <0.00%> (-0.02%) ⬇️
...-plane-operator/hostedclusterconfigoperator/cmd.go 0.00% <0.00%> (ø)
Flag Coverage Δ
cmd-support 34.44% <ø> (ø)
cpo-hostedcontrolplane 41.76% <ø> (ø)
cpo-other 40.27% <0.00%> (-0.05%) ⬇️
hypershift-operator 50.72% <ø> (ø)
other 31.54% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from 94461ec to f54aa29 Compare May 14, 2026 22:38
@smrtrfszm smrtrfszm marked this pull request as ready for review May 17, 2026 16:07
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2026
@openshift-ci openshift-ci Bot requested review from cblecker and csrwng May 17, 2026 16:08
@smrtrfszm smrtrfszm changed the title fix(HCCO): gate Route watch on management cluster capability OCPBUGS-86008: Gate Route watch on management cluster capability May 17, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 17, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@smrtrfszm: This pull request references Jira Issue OCPBUGS-86008, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Bug Fixes
  • Only monitor the Route resource when the management cluster reports Route capability, preventing operator startup failures on non-OpenShift management clusters.
  • Controller startup is now more resilient: it detects cluster capabilities and skips incompatible watches so the operator can run on a wider range of management clusters without errors.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label May 17, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@smrtrfszm: This pull request references Jira Issue OCPBUGS-86008, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Bug Fixes
  • Only monitor the Route resource when the management cluster reports Route capability, preventing operator startup failures on non-OpenShift management clusters.
  • Controller startup is now more resilient: it detects cluster capabilities and skips incompatible watches so the operator can run on a wider range of management clusters without errors.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from f54aa29 to 07098a5 Compare May 21, 2026 05:18

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 313-318: The reconcile logic still performs a Get on routev1.Route
even when the management cluster lacks the Route API; update the code to gate
that Route-dependent logic by checking
opts.ManagementClusterCapabilities.Has(capabilities.CapabilityRoute) before
attempting any route operations. Add the capability check either at the caller
that invokes reconcileMetricsForwarder or at the start of
reconcileMetricsForwarder itself so that when CapabilityRoute is false it
returns early (or skips the Route.Get/Route-specific flows) and does not attempt
to interact with routev1.Route.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 1d5b9f2e-167f-44f7-93d1-a03784aaf380

📥 Commits

Reviewing files that changed from the base of the PR and between f54aa29 and 07098a5.

📒 Files selected for processing (3)
  • control-plane-operator/hostedclusterconfigoperator/cmd.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/operator/config.go

@TwoDCube

Copy link
Copy Markdown
Member

/lgtm
/retest

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 21, 2026
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

The hosted-cluster-config-operator unconditionally watches
route.openshift.io/v1 Routes against the management cluster to react
to hostname changes on the metrics-proxy Route. On management clusters
that do not expose the Routes API (e.g. non-OpenShift management
clusters) this watch fails during controller setup and prevents HCCO
from starting.

Detect the management cluster Route capability using the existing
capabilities.DetectManagementClusterCapabilities helper and only
register the watch when route.openshift.io is registered. This mirrors
the pattern already used in other parts of the code.
@smrtrfszm smrtrfszm force-pushed the smrtrfszm/skip-route-watch branch from 07098a5 to 1f4af14 Compare May 21, 2026 12:38
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label May 21, 2026
@TwoDCube

Copy link
Copy Markdown
Member

/lgtm

@TwoDCube

Copy link
Copy Markdown
Member

/retest

@hypershift-jira-solve-ci

Copy link
Copy Markdown

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2057486725835395072 | Cost: $3.4691852500000016 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@TwoDCube

Copy link
Copy Markdown
Member

/retest

1 similar comment
@TwoDCube

Copy link
Copy Markdown
Member

/retest

@smrtrfszm

Copy link
Copy Markdown
Contributor Author

/verified by @smrtrfszm

Before this change on a vanilla kubernetes, HCCO was in a crash loop with error message:

Error: failed to wait for resources caches to sync: timed out waiting for cache to be synced for Kind *v1.Route

After this change on a vanilla kubernetes, HCCO is 1/1 ready:

oc get pod -n master-cluster1 -l hypershift.openshift.io/control-plane-component=hosted-cluster-config-operator
NAME                                              READY   STATUS    RESTARTS   AGE
hosted-cluster-config-operator-7b64db5b59-vlmjd   1/1     Running   0          30m

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 22, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@smrtrfszm: This PR has been marked as verified by @smrtrfszm.

Details

In response to this:

/verified by @smrtrfszm

Before this change on a vanilla kubernetes, HCCO was in a crash loop with error message:

Error: failed to wait for resources caches to sync: timed out waiting for cache to be synced for Kind *v1.Route

After this change on a vanilla kubernetes, HCCO is 1/1 ready:

oc get pod -n master-cluster1 -l hypershift.openshift.io/control-plane-component=hosted-cluster-config-operator
NAME                                              READY   STATUS    RESTARTS   AGE
hosted-cluster-config-operator-7b64db5b59-vlmjd   1/1     Running   0          30m

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@cblecker

Copy link
Copy Markdown
Member

@coderabbitai resume

@coderabbitai

coderabbitai Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Reviews resumed.

@openshift-ci-robot

Copy link
Copy Markdown

@smrtrfszm: This pull request references Jira Issue OCPBUGS-86008, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Operator now performs a one-time management-cluster capability detection at startup.

  • Bug Fixes

  • Skips watches for unsupported APIs, preventing startup failures when resources aren’t available.

  • Only monitors Route resources when the management cluster reports Route support, reducing errors and improving stability.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@control-plane-operator/hostedclusterconfigoperator/cmd.go`:
- Around line 228-238: Add unit tests covering the startup capability-detection
path: write tests that exercise discovery.NewDiscoveryClientForConfig and
capabilities.DetectManagementClusterCapabilities wiring in cmd.go so you verify
(1) successful discovery returns expected mgmtClusterCaps and those capabilities
are injected into controller setup inputs, (2) discovery client creation failure
returns the expected error, and (3) capability detection failure returns the
expected error. Use a fake discovery client (or mock
NewDiscoveryClientForConfig) to simulate success and failure paths and assert
the returned error messages and that the mgmtClusterCaps value is passed into
the controllers' setup/config functions (the code that consumes mgmtClusterCaps
during controller initialization). Ensure tests reference the functions
NewDiscoveryClientForConfig, DetectManagementClusterCapabilities and the
controller setup entrypoints to validate propagation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 4cc20e08-83f7-4c1f-b004-e24b4f32f511

📥 Commits

Reviewing files that changed from the base of the PR and between 07098a5 and 1f4af14.

📒 Files selected for processing (3)
  • control-plane-operator/hostedclusterconfigoperator/cmd.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/operator/config.go

Comment on lines +228 to +238
// Detect the management cluster capabilities once at startup so controllers can
// gate optional behavior (e.g. watching route.openshift.io Routes) without each
// having to build its own discovery client.
cpDiscoveryClient, err := discovery.NewDiscoveryClientForConfig(cpConfig)
if err != nil {
return fmt.Errorf("failed to create management cluster discovery client: %w", err)
}
mgmtClusterCaps, err := capabilities.DetectManagementClusterCapabilities(cpDiscoveryClient)
if err != nil {
return fmt.Errorf("failed to detect management cluster capabilities: %w", err)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add unit coverage for capability detection and config wiring.

Line 228 and Line 341 introduce startup-time capability detection and behavior-driving config injection, but this path has no accompanying tests in the PR. Please add unit tests for at least: discovery success, discovery failure, and capability propagation into controller setup inputs.

As per coding guidelines: **/*.go: Always include unit tests when creating new functions or modifying existing ones.

Also applies to: 320-342

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@control-plane-operator/hostedclusterconfigoperator/cmd.go` around lines 228 -
238, Add unit tests covering the startup capability-detection path: write tests
that exercise discovery.NewDiscoveryClientForConfig and
capabilities.DetectManagementClusterCapabilities wiring in cmd.go so you verify
(1) successful discovery returns expected mgmtClusterCaps and those capabilities
are injected into controller setup inputs, (2) discovery client creation failure
returns the expected error, and (3) capability detection failure returns the
expected error. Use a fake discovery client (or mock
NewDiscoveryClientForConfig) to simulate success and failure paths and assert
the returned error messages and that the mgmtClusterCaps value is passed into
the controllers' setup/config functions (the code that consumes mgmtClusterCaps
during controller initialization). Ensure tests reference the functions
NewDiscoveryClientForConfig, DetectManagementClusterCapabilities and the
controller setup entrypoints to validate propagation.

@openshift-ci

openshift-ci Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker, smrtrfszm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 29, 2026
@csrwng

csrwng commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

/override codecov/patch

@csrwng

csrwng commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

/override codecov/project

@openshift-ci

openshift-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@csrwng: Overrode contexts on behalf of csrwng: codecov/patch

Details

In response to this:

/override codecov/patch

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci

openshift-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@csrwng: Overrode contexts on behalf of csrwng: codecov/project

Details

In response to this:

/override codecov/project

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci

openshift-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@smrtrfszm: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hypershift-jira-solve-ci

Copy link
Copy Markdown

Now I have all the evidence needed. Let me compile the final report.

Test Failure Analysis Complete

Job Information

  • Prow Job: codecov/patch and codecov/project (GitHub Check Runs, not Prow CI)
  • Build ID: Check Run IDs 79544675859 (patch) / 79544671444 (project)
  • PR: #8484 — OCPBUGS-86008: Gate Route watch on management cluster capability
  • Branch: smrtrfszm/skip-route-watch
  • Commit: 1f4af148d55cbfb3518164a36d502e42c75f134b

Test Failure Analysis

Error

codecov/patch: 0.00% of diff hit (target 40.40%) — FAILURE
codecov/project: 40.40% (-0.01%) compared to base 36dfb1b — FAILURE

38 lines in the patch have zero test coverage:
  - control-plane-operator/hostedclusterconfigoperator/cmd.go: 34 lines missing
  - control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go: 4 lines missing

Summary

Both Codecov checks fail because none of the 38 new/modified lines introduced by this PR are exercised by any unit test. The codecov/patch check requires that new code meet the project's coverage target of 40.40%, but the patch achieves 0.00%. The codecov/project check fails because overall project coverage dropped by 0.01% (14 net new uncovered lines). The PR adds a capabilities.DetectManagementClusterCapabilities() call in cmd.go (the HCCO startup path, which has 0% existing coverage) and wraps the Route watch in resources.go behind a capabilities.Has(CapabilityRoute) guard — but neither call site is reached by tests. The support/capabilities package itself is well-tested (existing management_cluster_capabilities_test.go covers detection logic), but the integration points where the capability is created and consumed are not.

Root Cause

The root cause is straightforward: the two files modified by this PR lack unit test coverage for the changed lines.

  1. cmd.go (34 uncovered lines): The entire HostedClusterConfigOperator.Run() method has 0% test coverage. This is the HCCO startup/bootstrap path that wires up the operator config. The PR adds 13 new lines here (discovery client creation, capability detection, and the new ManagementClusterCapabilities struct field assignment), but the surrounding code was already untested — there is no cmd_test.go in the hostedclusterconfigoperator package. The remaining 21 lines counted as "missing" are the re-indented struct literal field assignments, which Codecov counts as modified lines requiring coverage.

  2. resources.go (4 uncovered lines): The Setup() function in the resources controller has partial coverage (55.35%). The PR adds a capability guard (if opts.ManagementClusterCapabilities.Has(capabilities.CapabilityRoute)) around the existing Route watch call. While resources_test.go exists and tests many paths through Setup(), the test does not exercise the Route watch registration path — specifically, it doesn't set up ManagementClusterCapabilities on the HostedClusterConfigOperatorConfig struct in test fixtures. The 4 uncovered lines are the new if condition, the c.Watch(...) call, the error return, and the closing brace.

The capability detection logic itself (DetectManagementClusterCapabilities, Has(), isAPIResourceRegistered) is thoroughly tested in support/capabilities/management_cluster_capabilities_test.go. The gap is specifically at the consumption sites.

Recommendations
  1. For resources.go (quick win): Update the existing resources_test.go to populate ManagementClusterCapabilities on the test's HostedClusterConfigOperatorConfig. Use the existing MockCapabilityChecker from the capabilities package:

    opts.ManagementClusterCapabilities = &capabilities.MockCapabilityChecker{
        MockHas: func(caps ...capabilities.CapabilityType) bool { return true },
    }

    This would cover the new guard and the Route watch path (4 lines).

  2. For cmd.go (structural gap): The Run() method is a large integration-style bootstrap function with 0% coverage. Adding meaningful unit tests for it would require significant refactoring (dependency injection for the discovery client, config, manager, etc.). Given the existing project baseline, this is likely acceptable tech debt — the capability detection logic is already tested at the unit level in management_cluster_capabilities_test.go.

  3. If coverage gates are blocking merge: The codecov/project threshold is very tight (-0.01%). Consider whether the Codecov checks are configured as required status checks — if so, at minimum fix resources.go coverage (item 1) to recover the 4 lines. The cmd.go gap is pre-existing (0% baseline) and the PR simply adds to already-uncovered code.

  4. Alternatively: If these Codecov checks are advisory (not required for merge), the failures can be noted and the PR merged on the basis that the core logic is tested and the integration points follow the same untested pattern as the rest of cmd.go.

Evidence
Evidence Detail
codecov/patch result 0.00% of diff hit (target 40.40%) — 38 lines with zero coverage
codecov/project result 40.40% (-0.01%) compared to base 36dfb1b — 14 net new uncovered lines
cmd.go baseline coverage 0.00% (unchanged by this PR — no cmd_test.go exists)
resources.go baseline coverage 55.35% → dropped to 55.33% (-0.02%)
Files changed cmd.go, resources.go, config.go (3 files, no new test files)
config.go coverage 0.00% — struct definition only, marked <ø> (no executable code)
Capabilities test coverage management_cluster_capabilities_test.go exists and tests DetectManagementClusterCapabilities, Has(), isAPIResourceRegistered
MockCapabilityChecker Available in support/capabilities package for test use — not used by this PR's changed files
resources_test.go Exists but does not set ManagementClusterCapabilities on test config
Net lines added +14 lines (93,235 → 93,249), all 14 uncovered

@openshift-merge-bot openshift-merge-bot Bot merged commit 25817d4 into openshift:main Jun 4, 2026
41 of 43 checks passed
@openshift-ci-robot

Copy link
Copy Markdown

@smrtrfszm: Jira Issue Verification Checks: Jira Issue OCPBUGS-86008
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-86008 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

What this PR does / why we need it:

The hosted-cluster-config-operator unconditionally watches route.openshift.io/v1 Routes against the management cluster to react to hostname changes on the metrics-proxy Route. On management clusters that do not expose the Routes API (e.g. non-OpenShift management clusters) this watch fails during controller setup and prevents HCCO from starting.

Detect the management cluster Route capability using the existing capabilities.DetectManagementClusterCapabilities helper and only register the watch when route.openshift.io is registered. This mirrors the pattern already used in other parts of the code.

Which issue(s) this PR fixes:

Fixes #OCPBUGS-86008

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Operator now performs a one-time management-cluster capability detection at startup.

  • Bug Fixes

  • Skips watches for unsupported APIs, preventing startup failures when resources aren’t available.

  • Only monitors Route resources when the management cluster reports Route support, reducing errors and improving stability.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@smrtrfszm smrtrfszm deleted the smrtrfszm/skip-route-watch branch June 4, 2026 15:09
@openshift-merge-robot

Copy link
Copy Markdown
Contributor

Fix included in release 5.0.0-0.nightly-2026-06-06-100407

@smrtrfszm

Copy link
Copy Markdown
Contributor Author

/cherry-pick release-4.22

@openshift-cherrypick-robot

Copy link
Copy Markdown

@smrtrfszm: new pull request created: #8692

Details

In response to this:

/cherry-pick release-4.22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants