[cinder-csi-plugin] Wait for volume availability before attach #3124
hemna wants to merge 1 commit into
ControllerPublishVolume now waits for the volume to reach 'available'
or 'in-use' status before calling the Cinder attachment API.
Previously, if the CO called ControllerPublishVolume immediately after
CreateVolume, the volume could still be in 'creating' state on the
backend. This caused Cinder to reject the attachment with a 409 Conflict
('status must be available or downloading to reserve, but the current
status is creating'), forcing the CO to retry blindly.
The new behavior uses a context-aware poll (WaitVolumeTargetStatusWithContext)
that respects the gRPC request deadline. The volume status is checked
every 3 seconds until it reaches a target state, enters an error state,
or the context expires. This eliminates unnecessary 409 errors against
Cinder and reduces time-to-attach for volumes still being provisioned.
Signed-off-by: Walter Boring <waboring@hemna.com>
What this PR does / why we need it:
ControllerPublishVolume now waits for the volume to reach 'available' or 'in-use' status before calling the Cinder attachment API.

Previously, if the CO called ControllerPublishVolume immediately after CreateVolume, the volume could still be in 'creating' state on the backend. This caused Cinder to reject the attachment with a 409 Conflict:

status must be available or downloading to reserve, but the current status is creating

This forced the CO (external-attacher) to retry blindly, generating unnecessary API calls to Cinder and delaying volume attachment. The issue is most pronounced with storage backends where volume creation is asynchronous and takes several seconds (e.g., when volumes are backed by network-attached storage).
How the fix works:
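A new WaitVolumeTargetStatusWithContext method uses wait.PollUntilContextCancel, which checks the volume status every 3 seconds until the volume reaches 'available' or 'in-use', enters an error state, or the gRPC request context expires.

If the context expires before the volume is ready, the driver returns FAILED_PRECONDITION, which tells the CO to retry with exponential backoff.

Which issue this PR fixes (if applicable):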
fixes #
Special notes for reviewers:
The volumeReadyPollInterval is set to 3 seconds, which balances responsiveness against API load on the Cinder service. The total wait time is bounded by the gRPC context deadline (typically 15-30s depending on the external-attacher configuration), not a fixed backoff.

The existing WaitVolumeTargetStatus (used in ControllerExpandVolume) is left unchanged to avoid disrupting existing behavior.

Release note: