Re-arm wakeup doorbell on multishot poll termination #80

Merged: vadimskipin merged 1 commit into main from vskipin/doorbell-rearm on Jun 18, 2026

@vadimskipin (Collaborator)

The per-CPU wakeup doorbell is a single multishot poll (IORING_OP_POLL_ADD with IORING_POLL_ADD_MULTI) submitted once at init: wakeThread writes the eventfd, the multishot poll posts a CQE, and the parked io_uring_enter2 call returns. But a multishot poll is not permanent: the kernel ends it (final CQE with IORING_CQE_F_MORE cleared) on CQ overflow, which IORING_FEAT_NODROP does not prevent (NODROP preserves the CQEs, not the poll's arming). The completion handler never checked F_MORE and never re-armed, so a single termination left the doorbell deaf for the life of the processor: wakeThread's eventfd_write produced no CQE, and the CPU woke only via its park timeout (up to maxWaitNs), a sticky cross-CPU wakeup-latency cliff.
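A minimal sketch of the check the handler was missing, not the code from this PR. `Cqe` is a stand-in for `struct io_uring_cqe`, `kCqeFMore` copies the value of IORING_CQE_F_MORE so the example builds without kernel headers, and `kDoorbellTag`, `DoorbellEvent` and `classify` are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors IORING_CQE_F_MORE from <linux/io_uring.h>; redefined locally so the
// sketch compiles without kernel headers.
constexpr std::uint32_t kCqeFMore = 1U << 1;

// Hypothetical user_data tag identifying the doorbell poll.
constexpr std::uint64_t kDoorbellTag = 0xD00BE11;

// Stand-in for struct io_uring_cqe (same three fields).
struct Cqe {
    std::uint64_t user_data;
    std::int32_t res;
    std::uint32_t flags;
};

enum class DoorbellEvent { NotDoorbell, StillArmed, Terminated };

// Classifies one CQE. A doorbell CQE without F_MORE is the poll's last one:
// later eventfd writes post no completion until the poll is armed again.
inline DoorbellEvent classify(const Cqe& cqe) {
    if (cqe.user_data != kDoorbellTag) return DoorbellEvent::NotDoorbell;
    return (cqe.flags & kCqeFMore) ? DoorbellEvent::StillArmed
                                   : DoorbellEvent::Terminated;
}
```

A handler that treats every doorbell CQE as `StillArmed` reproduces the bug described above: the `Terminated` case passes unnoticed and nothing re-arms the poll.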

Factor the arming into ProcessorState::enqueueDoorbell, used by both initialize and handleCompletionQueueSlow. The latter now re-arms when a doorbell CQE arrives with F_MORE cleared; the re-arm happens after the CQ is fully drained, outside the for-each-cqe iteration. Unlike in enqueueWakeup, a skipped re-arm is unrecoverable, so enqueueDoorbell retries through a full SQ ring instead of skipping.
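The shape of the fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `FakeRing` is a hypothetical in-memory stand-in for the submission side of a ring, the free functions `enqueueDoorbell` and `handleCompletions` only borrow the idea of the real methods, and "retry through a full SQ" is modelled as flush-then-retry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kCqeFMore = 1U << 1;       // value of IORING_CQE_F_MORE
constexpr std::uint64_t kDoorbellTag = 0xD00BE11;  // hypothetical user_data tag

struct Cqe {
    std::uint64_t user_data;
    std::uint32_t flags;
};

// Hypothetical stand-in for a ring's submission queue.
struct FakeRing {
    explicit FakeRing(std::size_t cap) : sqCapacity(cap) {}

    std::size_t sqCapacity;
    std::vector<std::uint64_t> sq;         // queued SQEs, by user_data
    std::vector<std::uint64_t> submitted;  // SQEs already handed off
    int flushes = 0;

    // Returns false when the SQ is full, as a real get-SQE call might.
    bool tryEnqueue(std::uint64_t tag) {
        if (sq.size() >= sqCapacity) return false;
        sq.push_back(tag);
        return true;
    }

    // Hands every queued SQE off, freeing the SQ.
    void submit() {
        submitted.insert(submitted.end(), sq.begin(), sq.end());
        sq.clear();
        ++flushes;
    }
};

// Arms the doorbell. On a full SQ it flushes and retries instead of skipping,
// because a skipped re-arm would leave the doorbell deaf.
void enqueueDoorbell(FakeRing& ring) {
    assert(ring.sqCapacity > 0 && "a zero-entry SQ can never accept the SQE");
    while (!ring.tryEnqueue(kDoorbellTag)) {
        ring.submit();
    }
}

// Drains all CQEs first, then re-arms once if the doorbell's final CQE
// (F_MORE cleared) was among them. Returns whether a re-arm was queued.
bool handleCompletions(FakeRing& ring, const std::vector<Cqe>& cqes) {
    bool rearm = false;
    for (const Cqe& cqe : cqes) {
        if (cqe.user_data == kDoorbellTag && !(cqe.flags & kCqeFMore)) {
            rearm = true;
        }
        // Other completions would be dispatched here.
    }
    if (rearm) enqueueDoorbell(ring);  // outside the iteration
    return rearm;
}
```

Recording the termination in a flag and acting on it after the loop keeps submission work out of the CQE iteration, which matches the ordering the description calls for.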

@vadimskipin merged commit 11e5317 into main on Jun 18, 2026
14 checks passed
@vadimskipin deleted the vskipin/doorbell-rearm branch on June 18, 2026 at 11:45