OCPBUGS-81505: Fix update-hosts.sh crash when no hosts registered yet#684
rwsu wants to merge 1 commit into openshift:master
When update-hosts.service starts, it may call the infra-envs hosts API before any hosts have registered with assisted-service. In this race condition, assisted-service returns an error JSON object instead of an empty array, causing 'jq -r .[].id' to fail with "Cannot index string with string 'id'". With set -e, this kills the script before it can patch the install ignition on any host.

Fix by skipping the hosts API call until the cluster reaches 'ready' status, which guarantees all hosts have registered and been validated. Patching also continues through 'preparing-for-installation' to ensure all hosts are updated before disk installation begins.

Assisted-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
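To make the failure mode above concrete, here is a minimal sketch, not the code from this PR. It assumes the hosts response is either a JSON array of host objects or a JSON error object. The helper names `extract_host_ids` and `should_patch_hosts` are hypothetical and used for illustration only. The first helper type-checks the response in jq so that a non-array payload yields no IDs instead of an error. The second expresses the status gate described above as a predicate.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print one host ID per line when the response is a
# JSON array; print nothing for any other JSON value, such as an error object.
# The shape of the error payload is an assumption, not taken from the source.
extract_host_ids() {
  local response=$1
  jq -r 'if type == "array" then .[].id else empty end' <<<"$response"
}

# Hypothetical helper: succeed only for the two cluster statuses in which
# the PR description says patching should happen.
should_patch_hosts() {
  local cluster_status=$1
  [[ $cluster_status == "ready" || $cluster_status == "preparing-for-installation" ]]
}
```

With a guard like this, a script running under set -e would see an empty ID list when given an error object, where a bare `.[].id` filter would exit non-zero. Whether silently skipping such a response is desirable depends on the caller, so this shows the mechanism and is not a recommendation.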
@rwsu: This pull request references Jira Issue OCPBUGS-81505, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: rwsu
# Wait until cluster is ready (all hosts registered and validated) before patching.
# This avoids querying the hosts API before any hosts have registered, which causes
# assisted-service to return an error object instead of an empty array.
if [[ $cluster_status != "ready" && $cluster_status != "preparing-for-installation" ]]; then
@rwsu the service that restarts the registry must run at the very beginning, whereas update-hosts was used to configure the host after the reboot. (Not strictly related, but as a longer-term strategy update-hosts should not be used at all for the live ISO, so in any case it is better to stay away from it.)
After speaking with @andfasano, the right solution is to move the reconfigure scripts to the bootstrap ignition. We should also test with the HA topology, since the problem occurs on worker nodes, which do not have the IRI registry running on them.
Closing. Replaced by #687
@rwsu: This pull request references Jira Issue OCPBUGS-81505. The bug has been updated to no longer refer to the pull request using the external bug tracker.