Skip to content

fix(hostedclustersizing): add override fall back#7554

Draft
SudoBrendan wants to merge 1 commit into
openshift:mainfrom
SudoBrendan:sudobrendan/fix-size-override-fallback
Draft

fix(hostedclustersizing): add override fall back#7554
SudoBrendan wants to merge 1 commit into
openshift:mainfrom
SudoBrendan:sudobrendan/fix-size-override-fallback

Conversation

@SudoBrendan

@SudoBrendan SudoBrendan commented Jan 20, 2026

Copy link
Copy Markdown

What this PR does / why we need it:

Today, if you set the size override annotation, but that size isn't an exact string match of what's in the config, your cluster will remain unsized rather than falling back to the standard logic (autoscale / node size). The only indication this has happened is a log message in the controller.

NOTE: I think this seems odd - do others agree this is a bug, not a feature? Is there a reason to prefer an unsized cluster when the override is set incorrectly that I'm unaware of?

Which issue(s) this PR fixes:

Adjacent work I happened to discover while trying to refine what CNTRLPLANE-2581 would be

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

When ClusterSizeOverrideAnnotation is set to a value not found in the
ClusterSizingConfiguration, the controller now gracefully falls back to
autoscaling (if enabled) or node count sizing, rather than skipping
sizing entirely.

Signed-off-by: Brendan Bergen <bbergen@redhat.com>
Assisted-by: Claude Opus 4.5 (via Cursor)
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 20, 2026
@openshift-ci

openshift-ci Bot commented Jan 20, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented Jan 20, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added do-not-merge/needs-area area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release labels Jan 20, 2026
@openshift-ci

openshift-ci Bot commented Jan 20, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: SudoBrendan
Once this PR has been reviewed and has the lgtm label, please assign devguyio for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci

openshift-ci Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor

@SudoBrendan: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 11, 2026
@openshift-ci

openshift-ci Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hypershift-jira-solve-ci

Copy link
Copy Markdown

I now have all the evidence to produce the report. Let me compile the final analysis.

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

tide: Not mergeable. PR has a merge conflict.

Summary

The tide error is not a test failure — it is a merge-conflict blocker. PR #7554 was created on 2026-01-20 and has never been rebased. The branch is now 1,350 commits behind main and has merge conflicts in both files it touches (hostedclustersizing_controller.go and hostedclustersizing_controller_test.go). The direct cause of the conflict is PR #8309 ("Enable gocyclo linter and reduce cyclomatic complexity"), merged 2026-05-07, which refactored the same hostedclustersizing_controller.go file with 370 changed lines (224 additions, 146 deletions) — extracting helper functions and restructuring the code that PR #7554 also modifies. Additionally, 3 Konflux Enterprise Contract checks fail because the branch carries stale .tekton/ pipeline task bundle digests from before PR #8048 (merged 2026-03-23). All actual Prow CI jobs (e2e-azure-self-managed, security, unit, verify-workflows) passed. The PR is also still in Draft state with do-not-merge/work-in-progress and needs-rebase labels.

Root Cause

Merge conflict due to a 4-month-old un-rebased branch, with conflicting refactors on main.

The PR modifies hostedclustersizing_controller.go (lines ~208–260) to change the sizing priority logic — making override annotation fall back gracefully to autoscaling or node-count sizing instead of silently ignoring invalid overrides. Since the PR was created on 2026-01-20, two commits on main modified the same file:

  1. PR #8309 (merged 2026-05-07) — Major refactor of hostedclustersizing_controller.go to reduce cyclomatic complexity. This commit changed 370 lines in the controller (224 additions, 146 deletions) and 613 lines in the test file. It extracted helper functions and restructured the reconcile() method — the exact same function PR fix(hostedclustersizing): add override fall back #7554 modifies. This is the primary source of the merge conflict.

  2. PR with commit aee93fe8e865 (merged 2026-04-16) — Lint fix that renamed unused parameters to _ in the same file. Minor (2-line change) but contributes to the divergence.

The branch is 1,350 commits behind main (status: diverged, ahead by 1, behind by 1,350), making GitHub report the PR as CONFLICTING with mergeStateStatus: DIRTY.

Secondary issue — Konflux Enterprise Contract failures (3 checks):
The .tekton/ pipeline YAML files on the branch reference Tekton task bundle digests that have been rotated out of the trusted catalog since PR #8048 updated them on 2026-03-23. This causes trusted_task.trusted policy violations in all Enterprise Contract validation checks. This is a staleness issue, not a code problem.

Importantly: All actual CI test jobs passed — e2e-azure-self-managed (success), security (success), unit (success, context retired), verify-workflows (success, context retired). The code change itself is not causing any test failures.

Recommendations
  1. Rebase onto main — This is the only action needed to resolve both the tide merge conflict and the Konflux Enterprise Contract failures:

    git fetch origin main
    git rebase origin/main
    # Resolve conflicts in hostedclustersizing_controller.go and _test.go
    # The refactor in PR #8309 restructured the reconcile() method —
    # re-apply the override-fallback logic on top of the new structure
    git push --force-with-lease
  2. Expect non-trivial conflict resolution — PR CNTRLPLANE-3324: Enable gocyclo linter and reduce cyclomatic complexity #8309's refactor extracted helper functions from the reconcile() method. The sizing-priority changes from this PR will need to be re-applied to the new code structure. Review the current main version of hostedclustersizing_controller.go carefully before resolving.

  3. Mark PR as ready for review — The PR is currently in Draft state with do-not-merge/work-in-progress label. After rebasing, convert it from draft and remove the WIP label to allow tide to consider it for merging.

  4. Re-trigger required CI jobs after rebase — the unit and verify-workflows contexts show as "retired without replacement", so verify which required checks are currently configured for the repo.

Evidence
Evidence Detail
Tide error "Not mergeable. PR has a merge conflict." (state: ERROR, 2026-05-11T23:24:32Z)
PR age Created 2026-01-20, last commit 2026-01-20 — ~4 months without rebase
Branch divergence 1,350 commits behind main, 1 ahead (status: diverged)
GitHub merge state mergeable: CONFLICTING, mergeStateStatus: DIRTY
Conflicting file hostedclustersizing_controller.go — modified by both PR #7554 and PR #8309
Primary conflict source PR #8309 (merged 2026-05-07): gocyclo refactor, 370 changed lines in controller, 613 in test file
Secondary conflict source Commit aee93fe8e865 (merged 2026-04-16): unparam lint fix, 2-line change
PR labels needs-rebase, do-not-merge/work-in-progress, area/hypershift-operator
PR state Draft (isDraft: true)
Konflux EC failures 3 checks fail due to stale .tekton/ task bundle digests (rotated since PR #8048, merged 2026-03-23)
Passing CI jobs e2e-azure-self-managed ✅, security ✅, unit ✅ (retired), verify-workflows ✅ (retired)
Code changes (PR #7554) 2 Go files only: sizing controller + tests — unrelated to CI/pipeline config

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant