Deflake the issue-1363 tests: wait for lifespan startup instead of sleeping #2879
Merged
…eeping The three tests in test_1363_race_condition_streamable_http.py waited a fixed 0.1s for the ServerThread to start the app lifespan before sending requests. On a loaded CI runner the thread is sometimes not ready in time, so the first request reaches handle_request() before the session manager's task group exists and the test fails with "RuntimeError: Task group is not initialized" (seen intermittently on both Ubuntu and Windows jobs). Replace the fixed sleep with a threading.Event that the server thread sets once lifespan startup has completed; the tests wait for it (bounded at 5s) before sending the first request.
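The readiness handshake described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the ServerThread body, the task_group attribute, and the wait_until_ready helper are hypothetical stand-ins, and the standard library's asyncio.to_thread is used here in place of the anyio.to_thread.run_sync call the tests use, so that the sketch runs without third-party packages.

```python
import asyncio
import threading


class ServerThread(threading.Thread):
    """Hypothetical stand-in for the test's server thread (not the SDK class)."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self.ready = threading.Event()  # set once startup has completed
        self._stop_requested = threading.Event()
        self.task_group = None  # stand-in for the session manager's task group

    def run(self) -> None:
        # The thread owns its own event loop, as in the tests.
        asyncio.run(self._serve())

    async def _serve(self) -> None:
        # Stand-in for lifespan startup; the delay simulates a slow CI runner.
        await asyncio.sleep(0.05)
        self.task_group = object()
        # Signal readiness only after the task group exists.
        self.ready.set()
        # Keep the loop alive until the test asks the server to stop.
        while not self._stop_requested.is_set():
            await asyncio.sleep(0.01)

    def stop(self) -> None:
        self._stop_requested.set()


async def wait_until_ready(server: ServerThread, timeout: float = 5.0) -> None:
    # Event.wait() blocks, so it runs in a worker thread to keep the
    # test's own event loop responsive while waiting.
    ready = await asyncio.to_thread(server.ready.wait, timeout)
    if not ready:
        raise TimeoutError("server thread did not finish startup in time")


async def main() -> bool:
    server = ServerThread()
    server.start()
    try:
        # Replaces a fixed sleep: returns as soon as startup is done.
        await wait_until_ready(server)
        # A first request sent here would find the task group initialized.
        return server.task_group is not None
    finally:
        server.stop()
        server.join(timeout=5)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Compared with a fixed sleep, the wait returns as soon as the server is ready on a fast machine and tolerates a slow one up to the timeout, at which point it fails with an explicit error rather than an unrelated one from the request path.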
felixweinberger
approved these changes
Jun 15, 2026
The three tests in tests/issues/test_1363_race_condition_streamable_http.py synchronize with their server thread via a fixed await anyio.sleep(0.1). On a loaded CI runner the thread sometimes isn't ready within that window, and the test fails with "RuntimeError: Task group is not initialized. Make sure to use run()."
This replaces the fixed sleep with a readiness handshake.
Motivation and Context
These tests (added in #1384) start a ServerThread that spins up its own event loop, enters the Starlette lifespan, and only then sets the session manager's task group. The test waits a fixed 0.1s and then sends requests from its own event loop via httpx.ASGITransport. When the thread loses the scheduling race (easy under pytest -n auto plus coverage on a 4-vCPU runner), the first request reaches handle_request() while _task_group is still None and the test fails.
This has flaked 6 times since 2026-05-27, on both Ubuntu and Windows, across unrelated PRs, e.g. 3.14/locked/windows and 3.14/locked/ubuntu. Because the test job is continue-on-error, these never turn a run red, so they're easy to miss.
The fix: the server thread sets a threading.Event once lifespan startup has completed (at that point the task group is guaranteed to exist), and the tests wait on it (bounded at 5s, via anyio.to_thread.run_sync) instead of sleeping. The trailing anyio.sleep(0.2) that gives the original #1363 race its detection window is deliberately left untouched: the tests still exercise exactly the same request paths and log checks.
How Has This Been Tested?
--flake-finder --flake-runs=50 -n 4.
ServerThread.run(): the old tests reproduce the exact CI error, the fixed tests pass. Repeated on a windows-latest runner (Python 3.14): unpatched + delay fails all three tests with the same RuntimeError; patched + delay passes; patched 50× per test under -n 4 passes 150/150.
./scripts/test passes at 100% coverage; ruff/pyright clean.
Breaking Changes
None — test-only change.
Types of changes
Checklist
Additional context
All three tests in the file shared the same fixed-sleep pattern; only test_race_condition_invalid_accept_headers happened to be the one observed failing in CI, but the delayed-thread experiment shows the other two fail the same way, so all three call sites are converted.
AI Disclaimer