SNOW-3192362: Fix auth callback recv blind sleep with MSG_DONTWAIT#2804
Draft
sfc-gh-fpawlowski wants to merge 1 commit intomainfrom
Draft
SNOW-3192362: Fix auth callback recv blind sleep with MSG_DONTWAIT#2804sfc-gh-fpawlowski wants to merge 1 commit intomainfrom
sfc-gh-fpawlowski wants to merge 1 commit intomainfrom
Conversation
Replace time.sleep(cooldown) with select.select([client_socket], [], [], cooldown) in the MSG_DONTWAIT retry path so the cooldown is data-aware. If data arrives during the wait, select returns early and the next recv succeeds immediately, fixing flaky test_auth_callback_success failures on loaded CI runners. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
SNOW-3192362: Fix auth callback recv blind sleep with MSG_DONTWAIT
Summary
time.sleep(cooldown)withselect.select([client_socket], [], [], cooldown)in theMSG_DONTWAITretry path of_try_receive_block, making the cooldown data-awaretest_auth_callback_successfailures (particularly withdontwait=true, timeout=0.05) on loaded CI runnersRoot Cause
In
_try_receive_block(src/snowflake/connector/auth/_http_server.py:194-201), theMSG_DONTWAITcode path uses a blindtime.sleep()between non-blockingrecvattempts:With
timeout=0.05andmax_attempts=15:attempt_timeout = 0.05 / 15 = ~3.3msrecvcheck (0ms useful work) + 3.3ms blind sleepCompare with the
dontwait=falsepath:recvblocks for up toattempt_timeout, returning as soon as data arrives. Every millisecond is spent actively waiting for data.The core issue:
time.sleep()is not socket-aware. Data arriving on the socket during the sleep cannot wake it up. The process sleeps for the full cooldown duration regardless of whether data is available.Fix
One-line change in
src/snowflake/connector/auth/_http_server.py, line 201:select.select()monitors the socket's read-readiness with a timeout ofcooldown. If data arrives during the wait,selectreturns early and the nextrecvsucceeds immediately. If no data arrives,selecttimes out aftercooldown— same wallclock behavior as the originaltime.sleep.Why this is correct and safe
MSG_DONTWAITpath —BlockingIOErroris never raised withoutMSG_DONTWAIT, so thedontwait=falsepath is completely untouched_use_msg_dont_wait()returnsFalseon Windows (lines 26-30), so this code path is Unix-only whereselectworks on socketsselectalready imported and used — module-level import at line 9, used in_try_pollat line 180selectreturns readable butrecvstill raisesBlockingIOError, the existing retry loop handles it gracefullytimeimport stays — still used in the bind retry loop (line 116)Test plan
python -m pytest test/unit/test_auth_callback_server.py -v— all 318 tests passpython -m pytest test/unit/test_auth_callback_server.py -v -k "test_auth_callback_success"— the specific failing testspython -m pytest test/unit/ -x --timeout=120