Improve indexing performance: batch stat()s with io_uring #2821
Open
dcolascione wants to merge 1 commit into djcb:master
Add support for using io_uring to batch stat() calls when scanning maildir directories. This approach reduces the number of individual syscalls by processing stats in batches of up to 16384 files at a time; it also allows the kernel to continue doing kernel-internal stat()s while we do indexing work.

Performance impact is moderate but noticeable on a real-world maildir with ~490k messages:

Without io_uring: 17.5s real time (0.75s user, 5.6s sys)
With io_uring: 15.4s real time (0.74s user, 8.4s sys)

The higher sys time with io_uring reflects the batch processing happening in kernel space rather than repeated userspace->kernel transitions.

Only enable io_uring for maildir directories since:

1. They're the only directories likely to be large enough to benefit
2. This ensures the io_uring instance isn't used concurrently

The feature can be disabled at runtime via MU_DISABLE_IO_URING=1 or at build time via -Diouring=disabled. Requires liburing >= 2.3.
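For readers unfamiliar with the shape of this change, here is a minimal sketch of the batching loop and the runtime opt-out described above. It is not the code from this PR: `io_uring_allowed`, `stat_all`, `StatResult`, and `kMaxBatch` are hypothetical names, the exact-match check on `"1"` is an assumption about how the variable is interpreted, and plain `stat()` stands in for the io_uring submission so that the example builds without liburing.

```cpp
#include <sys/stat.h>

#include <algorithm>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: io_uring is allowed unless MU_DISABLE_IO_URING is "1".
// (Assumption: the real check may accept other values as well.)
bool io_uring_allowed() {
    const char* v = std::getenv("MU_DISABLE_IO_URING");
    return !(v != nullptr && std::string(v) == "1");
}

// Result of one stat request (hypothetical type).
struct StatResult {
    std::string path;
    int         err;  // 0 on success, otherwise the errno value
    struct stat st;   // valid only when err == 0
};

// Upper bound on requests per batch, taken from the PR description.
constexpr std::size_t kMaxBatch = 16384;

// Stat every path, walking the list in batches of at most batch_size.
// Stand-in: each request is a plain stat() call here. An io_uring version
// would instead queue one statx request per path in [first, last), submit
// the whole batch at once, and collect the completions afterwards.
std::vector<StatResult> stat_all(const std::vector<std::string>& paths,
                                 std::size_t batch_size = kMaxBatch) {
    std::vector<StatResult> results;
    results.reserve(paths.size());
    if (batch_size == 0)
        batch_size = 1;  // avoid an infinite loop on a zero batch size

    for (std::size_t first = 0; first < paths.size(); first += batch_size) {
        const std::size_t last = std::min(first + batch_size, paths.size());
        for (std::size_t i = first; i < last; ++i) {
            StatResult r{paths[i], 0, {}};
            if (::stat(paths[i].c_str(), &r.st) != 0)
                r.err = errno;  // record the failure, keep going
            results.push_back(std::move(r));
        }
    }
    return results;
}
```

A caller would check `io_uring_allowed()` once at startup and fall back to the ordinary per-file path when it returns false. Per-request errors are kept in the result rather than aborting the batch, since a message file can disappear between the directory scan and the stat.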
Owner
Looks interesting. Will take a bit longer to review this (I'm not too familiar with io_uring); I don't think it will happen before 1.12.9 (which should be out in the coming week). The performance improvements are modest, so we'll have to weigh that against the added complexity. Anyway, thanks for doing this!