Remove custom `till_deferred_has_result(...)` in favor of `HomeserverTestCase.wait_on_thread(...)` to drive async Rust (Tokio runtime/thread pool) by MadLittleMods · Pull Request #19867 · element-hq/synapse

MadLittleMods · 2026-06-18T22:17:41Z

Remove custom till_deferred_has_result(...) in favor of HomeserverTestCase.wait_on_thread(...) to drive async Rust (Tokio runtime/thread pool).

This spawned from adding some more async Rust things in #19846 and noticing that we already have a pattern to use instead of the custom till_deferred_has_result(...) that has crept into a few files.

Dev notes

time.sleep(0) ("Suspend execution of the calling thread [...]")
os.sched_yield() ("Voluntarily relinquish the CPU.")

std::thread::sleep(std::time::Duration::from_secs(1));

Pull Request Checklist

- Pull request is based on the develop branch
- Pull request includes a changelog file. The entry should:
  - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
  - Use markdown where necessary, mostly for code blocks.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
  - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
- Code style is correct (run the linters)

MadLittleMods · 2026-06-18T22:26:00Z

        events.USE_FROZEN_DICTS = False

-    def wait_on_thread(self, deferred: Deferred, timeout: int = 10) -> None:
+    def wait_on_thread(


wait_on_thread(...) was first introduced in matrix-org/synapse#5475. It's unclear from that PR itself, but I'm guessing it was added because of the defer_to_thread(...) usage in the media repo.

MadLittleMods · 2026-06-18T22:26:23Z

-    def wait_on_thread(self, deferred: Deferred, timeout: int = 10) -> None:
+    def wait_on_thread(
+        self,
+        awaitable: Awaitable[TV],


Changed this to accept any awaitable (Deferred or Coroutine).

MadLittleMods · 2026-06-18T22:43:35Z

            time.sleep(0.01)
+            # Advance the Twisted reactor as the thread may have scheduled something on
+            # the reactor to run (like `reactor.callFromThread(...)`)
+            self.reactor.advance(0)


Changed this from advancing time by 0.01 each iteration to 0. We shouldn't need to advance the Twisted reactor time at all, which is bolstered by the fact that the tests still pass.

MadLittleMods · 2026-06-18T22:44:41Z

+            # Advance the Twisted reactor as the thread may have scheduled something on
+            # the reactor to run (like `reactor.callFromThread(...)`)
+            self.reactor.advance(0)


Changed the order to advance the Twisted reactor after we give other threads some time to make progress. The thinking is that the thread can do some work, potentially scheduling some things on the reactor, and then we unblock that.

Since we're iterating in a loop, the order probably doesn't matter that much but perhaps this makes more sense.
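The ordering described here (give other threads real time first, then advance the reactor) can be illustrated without Twisted. This is only a minimal sketch: `FakeReactor`, `call_from_thread`, and this standalone `wait_on_thread` are hypothetical stand-ins built on the standard library, not the actual Synapse or Twisted implementation.

```python
import threading
import time
from collections import deque
from typing import Any, Callable, Deque, List


class FakeReactor:
    """Hypothetical stand-in for a test reactor: callbacks scheduled from other
    threads only run when the test thread calls `advance(...)`."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Deque[Callable[[], Any]] = deque()

    def call_from_thread(self, callback: Callable[[], Any]) -> None:
        # Safe to call from any thread; the callback is only queued here.
        with self._lock:
            self._pending.append(callback)

    def advance(self, amount: float) -> None:
        # Run everything queued so far. `amount` is ignored in this sketch
        # because no fake time needs to pass.
        with self._lock:
            callbacks = list(self._pending)
            self._pending.clear()
        for callback in callbacks:
            callback()


def wait_on_thread(
    reactor: FakeReactor, results: List[Any], timeout: float = 10.0
) -> Any:
    """Block until `results` has an entry, giving other threads real time to run."""
    deadline = time.monotonic() + timeout
    while not results:
        if time.monotonic() > deadline:
            raise TimeoutError("Timed out waiting for the background thread")
        # First, give other threads some real time to make progress...
        time.sleep(0.01)
        # ...then run anything they scheduled back onto the reactor.
        reactor.advance(0)
    return results[0]


def run_example() -> str:
    reactor = FakeReactor()
    results: List[str] = []

    def background_work() -> None:
        # Pretend to do some work off the main thread, then hand the result
        # back via the reactor (like `reactor.callFromThread(...)`).
        time.sleep(0.05)
        reactor.call_from_thread(lambda: results.append("done"))

    threading.Thread(target=background_work, daemon=True).start()
    return wait_on_thread(reactor, results)
```

In this sketch the result only becomes visible once the test thread calls `advance(0)`, which is why the sleep comes first: the background thread needs real time to queue its callback before there is anything for the reactor to run.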

MadLittleMods · 2026-06-18T22:55:05Z

        requester = self.get_success(
-            self.till_deferred_has_result(
-                self._auth.get_user_by_access_token("some_token")
-            )
+            # We have to wait for the async Rust HTTP client (running on the Tokio
+            # thread pool) to do its thing (see `create_deferred(...)` usage)
+            self.wait_on_thread(self._auth.get_user_by_access_token("some_token"))
        )


Overall, not very satisfied with this API shape. I think it's good enough as another iteration. Perhaps a better name?

Ideally, we could just do self.get_success(self._auth.get_user_by_access_token("some_token")) which would drive things to completion regardless of the kind of work necessary (Python or Rust).

But adding real sleeps to get_success(...) for everything would result in slowing down the entire test suite.

We could potentially add a new attribute to HomeserverTestCase like drive_work_on_threads (like the existing needs_threadpool) which would conditionally sleep for real in get_success(...). This is half-decent, but having to know this obscure detail to make things work kinda sucks.

For example, this is how needs_threadpool is defined on a test case basis:

synapse/tests/media/test_media_storage.py

Lines 71 to 72 in d3fc819

class MediaStorageTests(unittest.HomeserverTestCase):
    needs_threadpool = True

Even better would be to automatically detect when there is work to be done on the Tokio thread pool and only then do the real sleep loop. We would probably have to detect this using the Tokio RuntimeMetrics. I think it would be better to explore this as a follow-up though.
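To make the opt-in attribute idea concrete, here is a rough sketch of what a per-class flag could look like. Everything in it is hypothetical: `SketchTestCase` is not `HomeserverTestCase`, `drive_work_on_threads` does not exist in Synapse, and this `get_success(...)` polls a plain callable rather than driving a Deferred.

```python
import threading
import time
from typing import Callable


class SketchTestCase:
    """Hypothetical base class; not Synapse's `HomeserverTestCase`."""

    # Hypothetical opt-in flag, mirroring how `needs_threadpool` is set per class.
    drive_work_on_threads = False

    def get_success(
        self, is_done: Callable[[], bool], max_iterations: int = 100
    ) -> bool:
        """Poll `is_done`, only sleeping for real when the test class opts in."""
        for _ in range(max_iterations):
            if is_done():
                return True
            if self.drive_work_on_threads:
                # Real sleep so that other threads get a chance to run.
                time.sleep(0.01)
        return is_done()


class FastTests(SketchTestCase):
    # Default: never sleep for real, so this case stays as fast as before.
    pass


class ThreadedTests(SketchTestCase):
    # Opt in because these tests wait on work done by another thread.
    drive_work_on_threads = True


def start_background_work(delay: float) -> Callable[[], bool]:
    """Start a thread that finishes after `delay` seconds; return a done-check."""
    finished = threading.Event()

    def work() -> None:
        time.sleep(delay)
        finished.set()

    threading.Thread(target=work, daemon=True).start()
    return finished.is_set
```

With this shape, `ThreadedTests().get_success(start_background_work(0.05))` waits for the thread, while a class that leaves the flag off spins through its iterations without sleeping. The downside raised above still applies: a test author has to know to set the flag.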

erikjohnston · 2026-06-19

So I think that we can actually change the sleep to be time.sleep(0), since that is apparently enough to signal that other threads should run. I've tried it locally and the tests in tests/handlers/test_oauth_delegation.py and tests/synapse_rust/test_http_client.py seem to pass?

At which point this could be added to get_success I think?

MadLittleMods · 2026-06-23

time.sleep(0) does seem to work 👍. The docs for time.sleep(...) ("Suspend execution of the calling thread [...]") also mention that you can use os.sched_yield() ("Voluntarily relinquish the CPU."), which also works.

https://discuss.python.org/t/time-sleep-0-yield-behaviour/27185/5 mentions that time.sleep(0) releases the GIL while os.sched_yield() doesn't. But I think that is now fixed (looks like it was part of Python 3.10): python/cpython#96078

It's unclear what's better, constant context switching or just allowing some time. I would have assumed that constant context switching would have some impact but the results from #19871 show that it doesn't seem to slow things down in a noticeable way.
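For reference, the two yielding options discussed in this thread can be compared with a small standalone sketch. The helper names (`yield_to_other_threads`, `wait_for`) and the `use_sched_yield` parameter are made up for illustration and are not part of Synapse.

```python
import os
import threading
import time


def yield_to_other_threads(use_sched_yield: bool = False) -> None:
    """Give other threads a chance to run without sleeping for a fixed duration."""
    if use_sched_yield and hasattr(os, "sched_yield"):
        # `os.sched_yield()` is only available on Unix platforms.
        os.sched_yield()
    else:
        # A zero-length sleep still suspends the calling thread.
        time.sleep(0)


def wait_for(
    event: threading.Event, timeout: float = 5.0, use_sched_yield: bool = False
) -> bool:
    """Spin until `event` is set, yielding on every iteration."""
    deadline = time.monotonic() + timeout
    while not event.is_set():
        if time.monotonic() > deadline:
            return False
        yield_to_other_threads(use_sched_yield)
    return True


def run_example(use_sched_yield: bool = False) -> bool:
    done = threading.Event()

    def background_work() -> None:
        # Pure-Python work that needs the GIL to make progress.
        total = sum(range(100_000))
        if total > 0:
            done.set()

    threading.Thread(target=background_work, daemon=True).start()
    return wait_for(done, use_sched_yield=use_sched_yield)
```

Both variants spin continuously rather than pausing for a fixed interval, which is the "constant context switching" trade-off mentioned above.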

MadLittleMods · 2026-06-18T22:56:02Z

+            # We have to wait for the async Rust HTTP client (running on the Tokio
+            # thread pool) to do its thing (see `create_deferred(...)` usage)
+            self.wait_on_thread(self._auth.get_user_by_access_token("some_token"))


The comment is repetitive but it's nice to know why the wait_on_thread(...) complication is being used here.

MadLittleMods · 2026-06-23T19:58:40Z

Closing in favor of #19871

@erikjohnston

…nc Rust (Tokio runtime/thread pool) (#19871)

This means you can use `get_success(...)` anywhere regardless of what kind of work needs to be done.

Spawning from adding some more async Rust things in #19846 and wanting something more standard instead of the custom `till_deferred_has_result(...)` that has crept into a few files.

Alternative to #19867 spurred on by [this comment](#19867 (comment)) from @erikjohnston

### How does this work?

Previously, `get_success(...)` just ran in a hot-loop advancing the Twisted reactor clock, which didn't give any time for other threads to do some work or acquire the GIL if necessary (whenever there is a hand-off from Rust to Python, we need the GIL).

Now, `get_success(...)` loops until we see a result (or until we hit the ~0.1s real-time timeout). In the loop, we call [`time.sleep(0)`](https://docs.python.org/3/library/time.html#time.sleep) which will "Suspend execution of the calling thread [...]" (CPU and GIL) to allow other threads to do some work. Then like before, we advance the Twisted reactor clock to run any scheduled callbacks, which includes anything the other threads may have scheduled.

### Does this slow down the entire test suite?

Seems just as fast as before. Run times vary by minutes both before and after, but they fall within the same range. (see PR for actual before/after timings)

MadLittleMods added 4 commits June 18, 2026 16:01

Remove till_deferred_has_result from `tests/synapse_rust/test_http_…

0d3fbf9

…client.py`

Refine wait_on_thread

f23daba

Refine

1bd4c9d

Add changelog

488f055

MadLittleMods added rust Z-Rust labels Jun 18, 2026

Fix trial-olddeps failing with `builtins.TypeError: 'type' object i…

5801844

…s not subscriptable` ``` File "/home/runner/work/synapse/synapse/tests/unittest.py", line 481, in HomeserverTestCase ) -> Deferred[TV]: builtins.TypeError: 'type' object is not subscriptable ```

MadLittleMods changed the title ~~Remove custom till_deferred_has_result(...) in favor of HomeserverTestCase.wait_on_thread(...)~~ Remove custom till_deferred_has_result(...) in favor of HomeserverTestCase.wait_on_thread(...) to drive async Rust (Tokio runtime/thread pool) Jun 18, 2026

MadLittleMods marked this pull request as ready for review June 18, 2026 23:27

MadLittleMods requested a review from a team as a code owner June 18, 2026 23:27

Try time.sleep(...) and os.sched_yield()

ef59b6d

See #19867 (comment)

This was referenced Jun 19, 2026

Try time.sleep(0) when we pump across all tests #19870

Closed

Update HomeserverTestCase.get_success(...) and friends to drive async Rust (Tokio runtime/thread pool) #19871

Merged

MadLittleMods added the A-Testing label Jun 22, 2026

MadLittleMods closed this Jun 23, 2026

MadLittleMods wants to merge 6 commits into develop from madlittlemods/remove-till_deferred_has_result
