upgrade AKS only if needed and wait until cluster upgrade is finished #4870
Pull request overview
This PR improves the AKS upgrade step used by the dev-infrastructure pipelines by making the upgrade-aks-cluster.sh script (1) run upgrades only when the control plane or any node pool is below the configured target version, and (2) block subsequent pipeline steps by waiting for the AKS upgrade to complete.
Changes:
- Added version comparison logic to detect whether an AKS upgrade is necessary (control plane + node pools).
- Added an explicit az aks wait --updated call after triggering an upgrade to ensure completion before the step finishes.
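The two changes above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of upgrade-aks-cluster.sh: the function names (version_lt, needs_upgrade, upgrade_if_needed) and their arguments are hypothetical, and the version comparison assumes a sort that supports -V (for example GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; helper names are illustrative, not taken from the PR.

# Succeeds if version $1 is strictly lower than version $2.
# Assumes `sort -V` (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeeds if any of the given current versions is below the target.
# Usage: needs_upgrade TARGET CURRENT [CURRENT...]
needs_upgrade() {
  local target="$1"
  shift
  local current
  for current in "$@"; do
    if version_lt "$current" "$target"; then
      return 0
    fi
  done
  return 1
}

# Upgrades only when the control plane or a node pool is behind,
# then blocks until the cluster reports the update as complete.
# Usage: upgrade_if_needed RESOURCE_GROUP CLUSTER_NAME TARGET_VERSION
upgrade_if_needed() {
  local rg="$1" cluster="$2" target="$3"
  local control_plane node_pools

  control_plane="$(az aks show --resource-group "$rg" --name "$cluster" \
    --query kubernetesVersion --output tsv)"
  node_pools="$(az aks nodepool list --resource-group "$rg" --cluster-name "$cluster" \
    --query '[].orchestratorVersion' --output tsv)"

  # $node_pools is intentionally unquoted: one version per node pool.
  # shellcheck disable=SC2086
  if needs_upgrade "$target" "$control_plane" $node_pools; then
    az aks upgrade --resource-group "$rg" --name "$cluster" \
      --kubernetes-version "$target" --yes
    az aks wait --resource-group "$rg" --name "$cluster" --updated
  else
    echo "AKS cluster $cluster is already at $target or newer; skipping upgrade."
  fi
}
```

A pipeline step could then call something like upgrade_if_needed "$RESOURCE_GROUP" "$CLUSTER_NAME" "$TARGET_VERSION", where those variable names are again placeholders. Keeping the comparison in small functions makes it possible to check the decision logic without touching a real cluster.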
Currently the timeout for shell steps in ev2 is 1h, so we won't be able to increase this further. @geoberle and I suggest keeping it this way until we have a better solution with fleet.
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: geoberle, raelga, tmstff
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
/test e2e-parallel
https://redhat.atlassian.net/browse/ARO-25883
What
Makes sure that the az cli AKS upgrade is only triggered when needed, and that the script that calls it (upgrade-aks-cluster.sh) waits until the upgrade is finished if one was triggered.
Why
#4836 upgrades the AKS cluster using a shell script (upgrade-aks-cluster.sh) that performs the upgrade via the az cli, but we experienced problems during the rollout. @geoberle and @raelga deduced that the upgrade causes problems when it runs in parallel with other pipeline steps. To resolve this, @raelga created #4868, which makes sure that the following steps wait for the upgrade-aks-cluster step. Because that step does not itself wait for the upgrade to complete, #4868 alone is not sufficient. This PR closes that gap.
Testing
Special notes for your reviewer