OCPBUGS-85584: Fix stale api.openshift.com labels on HostedControlPlane#8507
OCPBUGS-85584: Fix stale api.openshift.com labels on HostedControlPlane#8507dustman9000 wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@dustman9000: This pull request references Jira Issue OCPBUGS-85584, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
📝 WalkthroughWalkthroughThe Sequence Diagram(s)sequenceDiagram
participant Reconciler as Reconciler
participant HC as HostedCluster
participant HCP as HostedControlPlane
participant API as Kubernetes API
Reconciler->>API: Get HostedCluster (HC)
API-->>Reconciler: HC (labels)
Reconciler->>API: Get HostedControlPlane (HCP)
API-->>Reconciler: HCP (labels)
alt HCP has stale api.openshift.com/* labels
Reconciler->>API: Patch HCP to delete stale api.openshift.com/* labels
API-->>Reconciler: Update response
end
Reconciler->>API: Patch HCP to add/update api.openshift.com/* labels from HC
API-->>Reconciler: Update response
🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dustman9000 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
2a35868 to
f024b83
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go (1)
2426-2442: ⚡ Quick winAdd/adjust unit tests for the new bidirectional label sync path.
Please add coverage for both stale-label removal and prefix scoping so regressions here are caught.
As per coding guidelines, "Unit test any code changes and additions".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go` around lines 2426 - 2442, Add unit tests that exercise the api.openshift.com label sync logic using the hcluster and hcp objects (the code that iterates over hcp.Labels and hcluster.Labels and uses strings.HasPrefix("api.openshift.com")). Write tests that (1) verify stale-label removal: start with an api.openshift.com/* label present on hcp but missing on hcluster and assert the label is deleted after running the sync routine, and (2) verify prefix scoping/correct copying: ensure labels that start with the exact "api.openshift.com" prefix on hcluster are copied to hcp and that other labels (similar but not matching the prefix) are not touched. Place these cases in the hostedcluster controller tests (e.g., hostedcluster_controller_test.go) and invoke the function or reconcile path that performs the hcp/hcluster label sync so regressions are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2432-2440: The label-prefix checks currently use
strings.HasPrefix(key, "api.openshift.com") which can match keys outside the
intended namespace; update both occurrences to use the exact managed prefix
"api.openshift.com/" so only labels under api.openshift.com/* are processed
(modify the two HasPrefix checks around the loops that reference hcluster.Labels
and hcp.Labels).
---
Nitpick comments:
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2426-2442: Add unit tests that exercise the api.openshift.com
label sync logic using the hcluster and hcp objects (the code that iterates over
hcp.Labels and hcluster.Labels and uses strings.HasPrefix("api.openshift.com")).
Write tests that (1) verify stale-label removal: start with an
api.openshift.com/* label present on hcp but missing on hcluster and assert the
label is deleted after running the sync routine, and (2) verify prefix
scoping/correct copying: ensure labels that start with the exact
"api.openshift.com" prefix on hcluster are copied to hcp and that other labels
(similar but not matching the prefix) are not touched. Place these cases in the
hostedcluster controller tests (e.g., hostedcluster_controller_test.go) and
invoke the function or reconcile path that performs the hcp/hcluster label sync
so regressions are caught.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 06ee7aec-1edd-42a0-8e8b-ff1d514ac695
📒 Files selected for processing (1)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8507 +/- ##
=======================================
Coverage 40.00% 40.00%
=======================================
Files 751 751
Lines 92838 92843 +5
=======================================
+ Hits 37137 37146 +9
+ Misses 53014 53011 -3
+ Partials 2687 2686 -1
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go (1)
2431-2440:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse the exact managed prefix
api.openshift.com/in both loops.Line 2432 and Line 2439 currently match
"api.openshift.com"(no trailing slash), which can include unintended labels outsideapi.openshift.com/*.Proposed fix
+ apiOpenShiftComLabelPrefix := "api.openshift.com/" for key := range hcp.Labels { - if strings.HasPrefix(key, "api.openshift.com") { + if strings.HasPrefix(key, apiOpenShiftComLabelPrefix) { if _, exists := hcluster.Labels[key]; !exists { delete(hcp.Labels, key) } } } for key, val := range hcluster.Labels { - if strings.HasPrefix(key, "api.openshift.com") { + if strings.HasPrefix(key, apiOpenShiftComLabelPrefix) { hcp.Labels[key] = val } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go` around lines 2431 - 2440, The prefix match in both loops uses "api.openshift.com" and should be the exact managed prefix "api.openshift.com/"; update the strings.HasPrefix checks in the loops that iterate hcp.Labels and hcluster.Labels to use "api.openshift.com/" so only labels under that namespace (api.openshift.com/*) are considered when deleting from hcp.Labels and when copying into hcp.Labels.
🧹 Nitpick comments (2)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go (2)
359-370: ⚡ Quick winTighten assertions to catch unexpected non-target label mutations.
The current checks won’t fail if reconcile adds extra non-
api.openshift.com/*labels. Assert there are no unexpected labels at all to make this test regression-proof.Suggested assertion tightening
for key, val := range test.expectedLabels { if hcp.Labels[key] != val { t.Errorf("expected label %s=%s, got %s", key, val, hcp.Labels[key]) } } for key := range hcp.Labels { - if strings.HasPrefix(key, "api.openshift.com") { - if _, expected := test.expectedLabels[key]; !expected { - t.Errorf("unexpected label %s=%s still on HCP", key, hcp.Labels[key]) - } - } + if _, expected := test.expectedLabels[key]; !expected { + t.Errorf("unexpected label %s=%s on HCP", key, hcp.Labels[key]) + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go` around lines 359 - 370, The test currently only verifies expected api.openshift.com/* labels and that no unexpected api.openshift.com/* keys remain, but it doesn't fail if reconcile added extra non-target labels; update the assertions around hcp.Labels and test.expectedLabels so they assert exact equality: verify the count matches (len(hcp.Labels) == len(test.expectedLabels)) and iterate over every key in hcp.Labels to ensure each key exists in test.expectedLabels and has the expected value (using the existing hcp and test.expectedLabels variables), failing the test on any missing, extra, or mismatched label.
322-344: ⚡ Quick winUse required “When … it should …” test case naming format.
Please rename the new table test case titles to follow the required deterministic
When ... it should ...pattern for unit tests.As per coding guidelines,
Always use "When ... it should ..." format for describing test cases when creating unit tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go` around lines 322 - 344, Rename the table-driven test case "name" values in the test table inside hostedcluster_controller_test.go to follow the deterministic "When ... it should ..." format; specifically update entries currently titled "copies api.openshift.com labels from HC to HCP", "removes stale api.openshift.com labels from HCP", "preserves non-api.openshift.com labels on HCP", and "handles nil HC labels" to descriptive phrases like "When HC has api.openshift.com labels it should copy them to HCP", "When HC labels change it should remove stale api.openshift.com labels from HCP", "When HCP has non-api.openshift.com labels it should preserve them", and "When HC labels are nil it should result in no api.openshift.com labels on HCP" so the test names (the name field) use the required "When ... it should ..." pattern in the test table used by the test function that validates label sync behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2431-2440: The prefix match in both loops uses "api.openshift.com"
and should be the exact managed prefix "api.openshift.com/"; update the
strings.HasPrefix checks in the loops that iterate hcp.Labels and
hcluster.Labels to use "api.openshift.com/" so only labels under that namespace
(api.openshift.com/*) are considered when deleting from hcp.Labels and when
copying into hcp.Labels.
---
Nitpick comments:
In
`@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go`:
- Around line 359-370: The test currently only verifies expected
api.openshift.com/* labels and that no unexpected api.openshift.com/* keys
remain, but it doesn't fail if reconcile added extra non-target labels; update
the assertions around hcp.Labels and test.expectedLabels so they assert exact
equality: verify the count matches (len(hcp.Labels) == len(test.expectedLabels))
and iterate over every key in hcp.Labels to ensure each key exists in
test.expectedLabels and has the expected value (using the existing hcp and
test.expectedLabels variables), failing the test on any missing, extra, or
mismatched label.
- Around line 322-344: Rename the table-driven test case "name" values in the
test table inside hostedcluster_controller_test.go to follow the deterministic
"When ... it should ..." format; specifically update entries currently titled
"copies api.openshift.com labels from HC to HCP", "removes stale
api.openshift.com labels from HCP", "preserves non-api.openshift.com labels on
HCP", and "handles nil HC labels" to descriptive phrases like "When HC has
api.openshift.com labels it should copy them to HCP", "When HC labels change it
should remove stale api.openshift.com labels from HCP", "When HCP has
non-api.openshift.com labels it should preserve them", and "When HC labels are
nil it should result in no api.openshift.com labels on HCP" so the test names
(the name field) use the required "When ... it should ..." pattern in the test
table used by the test function that validates label sync behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 205166cf-a1bc-42e3-a2d1-62a0e1b237cc
📒 Files selected for processing (2)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.gohypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
OCPBUGS-85584 The label sync from HostedCluster to HostedControlPlane was additive-only. When OCM removes a label from HC (e.g., clearing limited support), the label was never removed from HCP. This caused downstream consumers like Route Monitor Operator to permanently skip probe creation for fully supported clusters. Fix: before copying HC labels to HCP, remove any api.openshift.com/* labels from HCP that no longer exist on HC.
f024b83 to
f5a37aa
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go (2)
322-344: ⚡ Quick winNormalize subtest names to
When ... it should ...format.These new test titles are descriptive, but they don’t follow the required naming convention for unit tests.
As per coding guidelines, Always use "When ... it should ..." format for describing test cases when creating unit tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go` around lines 322 - 344, Update the test case "name" strings in the table-driven tests that exercise label sync (the entries with name values like "copies api.openshift.com labels from HC to HCP", "removes stale api.openshift.com labels from HCP", "preserves non-api.openshift.com labels on HCP", "handles nil HC labels") to follow the required "When ... it should ..." format (e.g. "When HC has api.openshift.com labels it should copy them to HCP"). Keep the test logic and expectedLabels/hcLabels/hcpLabels unchanged; only change the human-readable name fields used by t.Run so they conform to the convention.
359-370: ⚡ Quick winPrefer gomega assertions for these label assertions.
Replacing manual
t.Errorfchecks with gomega matchers will align with repo test style and produce cleaner failures.As per coding guidelines, Prefer gomega for unit test assertions in Go tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go` around lines 359 - 370, Replace the manual t.Errorf label checks in the test (the loops that inspect hcp.Labels and test.expectedLabels) with gomega assertions: for each expected label use g.Expect(hcp.Labels).To(HaveKeyWithValue(key, val)) (or g.Expect(hcp.Labels).To(HaveKey(key)) plus value assertion) and for the unexpected-label check use g.Expect(hcp.Labels).ToNot(HaveKey(key)) or g.Expect(hcp.Labels).ToNot(HaveKeyWithValue(key, _)) as appropriate; ensure you use the existing g := gomega.NewWithT(t) test harness at the top of the test so failures produce gomega-style output and replace both t.Errorf occurrences accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go`:
- Line 365: The stale-label check currently uses strings.HasPrefix(key,
"api.openshift.com") which can match unintended keys; update the prefix to
include the slash so it only matches managed OCM label keys by using
strings.HasPrefix(key, "api.openshift.com/") where the check occurs (the
strings.HasPrefix call operating on the variable `key` in the hostedcluster
controller test).
---
Nitpick comments:
In
`@hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go`:
- Around line 322-344: Update the test case "name" strings in the table-driven
tests that exercise label sync (the entries with name values like "copies
api.openshift.com labels from HC to HCP", "removes stale api.openshift.com
labels from HCP", "preserves non-api.openshift.com labels on HCP", "handles nil
HC labels") to follow the required "When ... it should ..." format (e.g. "When
HC has api.openshift.com labels it should copy them to HCP"). Keep the test
logic and expectedLabels/hcLabels/hcpLabels unchanged; only change the
human-readable name fields used by t.Run so they conform to the convention.
- Around line 359-370: Replace the manual t.Errorf label checks in the test (the
loops that inspect hcp.Labels and test.expectedLabels) with gomega assertions:
for each expected label use g.Expect(hcp.Labels).To(HaveKeyWithValue(key, val))
(or g.Expect(hcp.Labels).To(HaveKey(key)) plus value assertion) and for the
unexpected-label check use g.Expect(hcp.Labels).ToNot(HaveKey(key)) or
g.Expect(hcp.Labels).ToNot(HaveKeyWithValue(key, _)) as appropriate; ensure you
use the existing g := gomega.NewWithT(t) test harness at the top of the test so
failures produce gomega-style output and replace both t.Errorf occurrences
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 12160745-be70-4ee3-b562-f84a6ce5df7f
📒 Files selected for processing (2)
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.gohypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
|
@dustman9000: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
cblecker
left a comment
There was a problem hiding this comment.
Fix looks correct — delete-then-copy is the right approach, and the nil map case is handled cleanly. A few minor style suggestions inline.
| // on the HC, then copy current values. This ensures label removals (e.g., | ||
| // when limited support is cleared) propagate to the HCP. | ||
| for key := range hcp.Labels { | ||
| if strings.HasPrefix(key, "api.openshift.com") { |
There was a problem hiding this comment.
nit: Since you're already touching both loops, would you mind pulling the prefix out into a const with a trailing slash? Something like:
const apiOpenShiftComLabelPrefix = "api.openshift.com/"Without the trailing slash, this technically matches anything that starts with api.openshift.com — including a hypothetical api.openshift.company/foo. A shared const fixes that and avoids the duplicated string literal across both loops.
| expectedLabels map[string]string | ||
| }{ | ||
| { | ||
| name: "copies api.openshift.com labels from HC to HCP", |
There was a problem hiding this comment.
nit: The project convention for test case names is Gherkin-style — "When... it should...". Would you mind renaming these? Something like:
"When HC has api.openshift.com labels, it should copy them to HCP""When HC removes an api.openshift.com label, it should remove the stale label from HCP""When HCP has non-api.openshift.com labels, it should preserve them""When HC labels are nil, it should remove all api.openshift.com labels from HCP"
| } | ||
| err := reconcileHostedControlPlane(hcp, hc, false, false, func() (map[string]string, error) { return nil, nil }) | ||
| if err != nil { | ||
| t.Fatal(err) |
There was a problem hiding this comment.
nit: The rest of this file uses gomega — would be nice to stay consistent here. Switching to g.Expect(err).ToNot(HaveOccurred()) and g.Expect(hcp.Labels).To(Equal(test.expectedLabels)) would also let you collapse the two verification loops below into a single assertion, which is simpler and catches unexpected mutations to any label — not just the api.openshift.com ones.
| } | ||
| // All labels on the HostedCluster with this special prefix are copied | ||
| // Those are labels set by OCM | ||
| // Sync api.openshift.com/* labels from HostedCluster to HostedControlPlane. |
There was a problem hiding this comment.
nit: Most of this comment describes what the code does, which the two loops already make clear. The useful parts are that OCM owns these labels and why delete-before-copy is needed. Maybe condense to:
// These labels are managed by OCM. Delete-then-copy ensures removals
// on the HostedCluster propagate; a copy-only loop leaves stale labels.
Summary
Fixes stale
api.openshift.com/*labels on HostedControlPlane CRs that persist after being removed from the HostedCluster.Problem
The label sync in
reconcileHostedControlPlane(line 2426-2432) only copies labels from HC to HCP. It never removes HCP labels that were deleted from HC.When OCM/CS removes
api.openshift.com/limited-supportfrom the HostedCluster (via ManifestWork when LS is cleared), the label stays on the HostedControlPlane forever.Reproduction (stage, 2026-05-13)
Reproduced on stage cluster
test-hcp-stable-420(2q4jijtpbjlrtjavc29s8n54bnah70kl), MChs-mc-13vnfh7o0:api.openshift.com/limited-support: "true". Correct.ocm delete. OCM confirmed 0 LS reasons."true"."true". Label removal never propagated.Also confirmed on production clusters
dai-test-hcpandclairp02ue1(0 LS reasons in OCM, stale label on HCP).Impact
Route Monitor Operator checks this label on the HCP to skip probe creation for LS clusters. Stale labels cause RMO to permanently skip synthetic monitoring probe creation for fully supported clusters. ~60% of clusters on some RHOBS cells have no synthetic probes due to this bug.
Fix
Before copying HC labels to HCP, remove any
api.openshift.com/*labels from HCP that no longer exist on HC. Includes 4 unit tests covering label copy, stale removal, non-OCM label preservation, and nil HC labels.Jira
Summary by CodeRabbit
Release Notes
Bug Fixes
Tests