
upgrade AKS only if needed and wait until cluster upgrade is finished #4870

Merged
openshift-merge-bot[bot] merged 3 commits into main from wait_until_cluster_upgrade_has_finished on Apr 15, 2026

@tmstff tmstff commented Apr 14, 2026

https://redhat.atlassian.net/browse/ARO-25883

What

Makes sure that the az CLI AKS upgrade is only triggered when needed, and that the script that calls it (upgrade-aks-cluster.sh) waits until the upgrade has finished if one was triggered.

Why

#4836 upgrades the AKS cluster using a shell script (upgrade-aks-cluster.sh) that performs the upgrade via the az CLI, but we experienced problems during the rollout. @geoberle and @raelga deduced that the upgrade causes problems when it runs in parallel with other pipeline steps. To resolve this, @raelga created #4868, which makes the following steps wait for the upgrade-aks-cluster step. Because that step does not itself wait for the upgrade to complete, #4868 alone is not sufficient. This PR closes that gap.

Testing

  • manual testing (target version lower than, equal to, or higher than the actual version; via direct script execution and also via make personal-dev-env)
  • e2e test will (hopefully) show that everything still works afterwards (the upgrade is not triggered by the test itself but is handled in the rollout)

Special notes for your reviewer

Copilot AI review requested due to automatic review settings April 14, 2026 11:44

tmstff commented Apr 14, 2026

/assign @raelga @geoberle

Copilot AI left a comment

Pull request overview

This PR improves the AKS upgrade step used by the dev-infrastructure pipelines by making the upgrade-aks-cluster.sh script (1) run upgrades only when the control plane or any node pool is below the configured target version, and (2) block subsequent pipeline steps by waiting for the AKS upgrade to complete.

Changes:

  • Added version comparison logic to detect whether an AKS upgrade is necessary (control plane + node pools).
  • Added an explicit az aks wait --updated call after triggering an upgrade to ensure completion before the step finishes.
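The two changes listed above could look roughly like the following. This is a minimal sketch, not the code from upgrade-aks-cluster.sh: the helper names `version_lt`, `any_below`, and `upgrade_if_needed` are hypothetical, versions are assumed to be plain numeric `major.minor.patch` strings, and `mapfile` assumes bash 4 or newer. Whether the real script triggers the upgrade with or without `--no-wait` is not stated in this conversation.

```shell
#!/usr/bin/env bash

# version_lt A B: exit 0 when version A is strictly lower than version B.
# Assumes purely numeric dot-separated components (for example 1.32.4).
version_lt() {
    local IFS=.
    local -a a=($1) b=($2)
    local i x y
    for i in 0 1 2; do
        x=${a[i]:-0}
        y=${b[i]:-0}
        if (( 10#$x < 10#$y )); then return 0; fi
        if (( 10#$x > 10#$y )); then return 1; fi
    done
    return 1   # versions are equal, so A is not lower than B
}

# any_below TARGET VERSION...: exit 0 when at least one VERSION is below TARGET.
any_below() {
    local target=$1 v
    shift
    for v in "$@"; do
        if version_lt "$v" "$target"; then return 0; fi
    done
    return 1
}

# upgrade_if_needed RESOURCE_GROUP CLUSTER TARGET_VERSION (hypothetical wrapper).
# Reads the control plane and node pool versions, upgrades only when one of
# them is below the target, then blocks until the cluster reports "updated".
upgrade_if_needed() {
    local rg=$1 cluster=$2 target=$3
    local control_plane
    local -a pools
    control_plane=$(az aks show --resource-group "$rg" --name "$cluster" \
        --query kubernetesVersion --output tsv)
    mapfile -t pools < <(az aks nodepool list --resource-group "$rg" \
        --cluster-name "$cluster" --query '[].orchestratorVersion' --output tsv)
    if any_below "$target" "$control_plane" "${pools[@]}"; then
        az aks upgrade --resource-group "$rg" --name "$cluster" \
            --kubernetes-version "$target" --yes
        az aks wait --resource-group "$rg" --name "$cluster" --updated
    else
        echo "cluster is already at or above $target, skipping upgrade"
    fi
}
```

Comparing every node pool as well as the control plane means a cluster whose control plane is current but whose pools lag behind still gets upgraded, while a cluster already at or above the target is left alone.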


Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh
Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh
Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh
Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh Outdated

tmstff commented Apr 14, 2026

Currently the timeout for shell steps in ev2 is 1h, so we won't be able to increase this further. @geoberle and I suggest keeping it this way until we have a better solution with fleet.
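One way to live with a fixed step timeout is to bound the wait inside the script, so it fails with a clear message before the step itself is killed. The following is a minimal sketch under assumptions: the helper `wait_until`, the check `cluster_is_updated`, and the 55-minute budget are all hypothetical and not taken from this PR.

```shell
#!/usr/bin/env bash

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds; returns 1 once TIMEOUT_SECONDS have passed.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        if (( SECONDS >= deadline )); then
            echo "timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval"
    done
}

# Hypothetical check: succeeds once the cluster reports a finished operation.
cluster_is_updated() {
    local state
    state=$(az aks show --resource-group "$1" --name "$2" \
        --query provisioningState --output tsv)
    [ "$state" = "Succeeded" ]
}

# Example call (not executed here), leaving 5 minutes of headroom below 1h:
#   wait_until 3300 30 cluster_is_updated "$RESOURCE_GROUP" "$CLUSTER_NAME"
```

`az aks wait` also accepts `--timeout` and `--interval`, which may be enough on their own; the helper above only makes the budget and the failure message explicit.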

@raelga raelga requested a review from Copilot April 14, 2026 13:16

raelga commented Apr 14, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Apr 14, 2026
@geoberle
Collaborator

/lgtm


openshift-ci bot commented Apr 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: geoberle, raelga, tmstff

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh
Comment thread dev-infrastructure/scripts/upgrade-aks-cluster.sh

tmstff commented Apr 14, 2026

/test e2e-parallel

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD e93e7c1 and 2 for PR HEAD e3d5109 in total

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 5802938 and 1 for PR HEAD e3d5109 in total

@openshift-merge-bot openshift-merge-bot bot merged commit 110f641 into main Apr 15, 2026
23 checks passed
@openshift-merge-bot openshift-merge-bot bot deleted the wait_until_cluster_upgrade_has_finished branch April 15, 2026 00:10