Skip to content

Don't use callbacks when performing sync IO with async handles#126845

Open
Copilot wants to merge 5 commits intomainfrom
copilot/remove-callbacks-sync-io
Open

Don't use callbacks when performing sync IO with async handles#126845
Copilot wants to merge 5 commits intomainfrom
copilot/remove-callbacks-sync-io

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 13, 2026

Description

When performing synchronous IO on async file handles (FILE_FLAG_OVERLAPPED), RandomAccess.Windows.cs previously used ThreadPoolBoundHandle.UnsafeAllocateNativeOverlapped with an IOCompletionCallback and a custom CallbackResetEvent class that tracked a reference count to handle the race between the IOCP callback and the caller's GetOverlappedResult.

This is unnecessary. Setting the low-order bit of hEvent in the OVERLAPPED structure prevents the I/O completion from being queued to the completion port, eliminating the callback entirely.

Changes:

  • Allocate NativeOverlapped via NativeMemory.AllocZeroed instead of ThreadPoolBoundHandle
  • Set EventHandle = handle | 1 to suppress IOCP notification
  • Replace CallbackResetEvent with plain ManualResetEvent
  • Free with NativeMemory.Free in finally — no ref counting needed
  • Remove s_callback, AllocateCallback(), GetNativeOverlappedForAsyncHandle(), CallbackResetEvent class
  • Remove EnsureThreadPoolBindingInitialized() from sync-over-async paths (no longer uses ThreadPoolBoundHandle)
  • Remove unnecessary overlapped->InternalLow = IntPtr.Zero in EOF path (only needed for ThreadPoolBoundHandle.FreeNativeOverlapped assertion, not NativeMemory.Free)

Net: −74 lines, +13 lines. Reduces allocations and complexity with no behavioral change.

Copilot AI requested review from Copilot and removed request for Copilot April 13, 2026 17:28
Copilot AI linked an issue Apr 13, 2026 that may be closed by this pull request
…vent for sync IO with async handles

Instead of using ThreadPoolBoundHandle callbacks and reference counting when
performing synchronous IO with async file handles, allocate the NativeOverlapped
directly via NativeMemory and set the low-order bit of hEvent to suppress I/O
completion port notifications. This eliminates the CallbackResetEvent class,
the s_callback field, and the AllocateCallback/GetNativeOverlappedForAsyncHandle
methods, reducing allocations and complexity.

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d7a61400-c63f-401a-9503-0313459af6f7

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 13, 2026 17:58
Copilot AI changed the title [WIP] Remove callbacks for sync IO operations with async handles Don't use callbacks when performing sync IO with async handles Apr 13, 2026
Copilot AI requested a review from adamsitnik April 13, 2026 18:04
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please address my feedback (nits)

…appedWithEventHandle

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/bcdd8838-718e-4561-abf4-90c9c2deb905

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 14, 2026 05:47
Copilot AI requested a review from adamsitnik April 14, 2026 05:48
@adamsitnik adamsitnik marked this pull request as ready for review April 14, 2026 05:53
Copilot AI review requested due to automatic review settings April 14, 2026 05:53
@adamsitnik
Copy link
Copy Markdown
Member

@EgorBot -windows_x64 -windows_arm64 --filter "*Perf_FileStream.Read(Asynchronous" "*Perf_FileStream.Write(Asynchronous"

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR simplifies the synchronous I/O-over-async-handle paths in RandomAccess.Windows.cs by removing the ThreadPool-bound overlapped allocation + IOCP callback/ref-counting, and instead using an event handle with the low-order bit set to suppress IOCP completion queuing.

Changes:

  • Replace ThreadPoolBoundHandle-allocated NativeOverlapped + callback/reset-event ref counting with NativeMemory.AllocZeroed + ManualResetEvent.
  • Set OVERLAPPED.hEvent to eventHandle | 1 to suppress IOCP completion port notifications.
  • Remove now-unneeded callback plumbing (s_callback, callback allocator, and CallbackResetEvent).

Comment on lines 112 to 116
{
if (overlapped != null)
{
resetEvent.ReleaseRefCount(overlapped);
NativeMemory.Free(overlapped);
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NativeMemory.Free(overlapped) is unconditional in finally. If the WaitOne() above throws (e.g., ThreadInterruptedException) or the method otherwise exits before the OS has finished the overlapped operation, freeing this memory can result in use-after-free when the kernel writes completion state / signals the event. Consider restructuring so the overlapped is only freed after completion is guaranteed (e.g., use GetOverlappedResult(..., bWait: true) instead of managed WaitOne, or ensure completion in a finally before freeing).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g., ThreadInterruptedException

ThreadInterruptedException is legacy, I will defer to other code reviewers.

Comment on lines 198 to 202
{
if (overlapped != null)
{
resetEvent.ReleaseRefCount(overlapped);
NativeMemory.Free(overlapped);
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as the read path: NativeMemory.Free(overlapped) runs in finally even if the overlapped I/O hasn't completed (e.g., if resetEvent.WaitOne() throws due to thread interruption). That can lead to use-after-free when the operation completes. Consider waiting via GetOverlappedResult(..., bWait: true) (avoiding managed WaitOne) or otherwise guaranteeing completion before freeing the NativeOverlapped.

Copilot uses AI. Check for mistakes.
// "If the file handle associated with the completion packet was previously associated with an I/O completion port
// [...] setting the low-order bit of hEvent in the OVERLAPPED structure prevents the I/O completion
// from being queued to a completion port."
result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1;
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitHandle.SafeWaitHandle.DangerousGetHandle() | 1 is likely not valid as written because DangerousGetHandle() returns IntPtr (which doesn't support bitwise operators). Consider explicitly casting to nuint/nint before OR-ing and then casting back, and optionally add a Debug.Assert that the original handle's low bit is 0 before setting it.

Suggested change
result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1;
nint eventHandle = (nint)waitHandle.SafeWaitHandle.DangerousGetHandle();
Debug.Assert((eventHandle & 1) == 0);
result->EventHandle = (IntPtr)(eventHandle | 1);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nint and IntPtr are the same thing

@adamsitnik
Copy link
Copy Markdown
Member

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

@github-actions
Copy link
Copy Markdown
Contributor

🤖 Copilot Code Review — PR #126845

Note

This review was AI-generated by Copilot using multi-model analysis (Claude Opus 4.6, GPT-5.3-Codex, Claude Sonnet 4.5).

Holistic Assessment

Motivation: Justified. The existing CallbackResetEvent / ThreadPoolBoundHandle callback mechanism is complex—it requires reference counting, a static IOCompletionCallback, and EnsureThreadPoolBindingInitialized calls, all because the I/O completion port would still receive notifications even though the caller only wants synchronous semantics. The Windows API provides a documented, simpler alternative.

Approach: Correct and well-targeted. The PR leverages the documented Windows technique of setting the low-order bit of hEvent in the OVERLAPPED structure to suppress I/O completion port queuing ([GetQueuedCompletionStatus docs]((learn.microsoft.com/redacted) With no IOCP notification, there is no callback, no race condition between callback and GetOverlappedResult, and therefore no need for reference counting. The NativeOverlapped can be allocated directly via NativeMemory.AllocZeroed and freed unconditionally in the finally block.

Summary: ✅ LGTM. The change is a clean simplification that removes ~70 lines of complex callback/reference-counting machinery in favor of a well-documented Windows API pattern. All three reviewing models independently verified the core correctness. No new public API surface, no behavioral changes for callers. The code is safe in all realistic execution paths.


Detailed Findings

✅ Low-order bit of hEvent — Correctly implemented

Setting result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1 at line 725 is the documented Windows mechanism to suppress IOCP queueing while still allowing the kernel to signal the event on I/O completion. GetOverlappedResult with bWait: false remains valid after the event wait—Windows strips the low-order bit when interpreting the event handle. All three models verified this is correct.

✅ Removal of EnsureThreadPoolBindingInitialized — Correct

The old code needed ThreadPoolBinding to allocate the NativeOverlapped via UnsafeAllocateNativeOverlapped. The new code allocates directly via NativeMemory.AllocZeroed, so no thread pool binding is required for the sync-on-async paths. The async code paths (lines 250, 343, 558, 655) still call EnsureThreadPoolBindingInitialized(), so async operations are unaffected. If a handle is never used for async operations, the IOCP binding is simply never needed—the event-based signaling works regardless.

DangerousGetHandle lifetime — Safe

The ManualResetEvent is created at method entry (line 70/156) and disposed in the finally block (line 118/204) after the overlapped is freed. The underlying event handle remains valid for the entire duration of the I/O operation. This matches the pattern used by the old code (resetEvent.SafeWaitHandle.DangerousGetHandle() without DangerousAddRef).

✅ Removal of else error branches — Correct

The old else branches (for when errorCode was neither ERROR_IO_PENDING nor ERROR_SUCCESS) existed solely to decrement the callback reference count, since the callback would not be invoked on immediate failure. With no callback in the new design, there is no reference count to manage. The finally block's NativeMemory.Free(overlapped) handles cleanup for all code paths.

✅ Memory safety in finally — Correct for all realistic paths

By the time the finally block runs, the I/O operation is either:

  1. Completed synchronously (ERROR_SUCCESS) — safe to free
  2. Completed after wait (ERROR_IO_PENDINGWaitOne() blocks until done) — safe to free
  3. Failed immediately (other error code, I/O never started) — safe to free

CallbackResetEvent and AllocateCallback removal — Clean

The entire callback infrastructure (s_callback, AllocateCallback, CallbackResetEvent with its ref-counting ReleaseRefCount) is correctly removed. No other code in the repository references these types.

✅ Test coverage — Adequate

The sync-on-async file handle paths (ReadSyncUsingAsyncHandle, WriteSyncUsingAsyncHandle) are exercised by existing RandomAccess tests that open files with FileOptions.Asynchronous and perform synchronous reads/writes. No new behavioral surface is introduced, so existing test coverage is sufficient.

Generated by Code Review for issue #126845 ·

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 14, 2026

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

🤖 Copilot Code Review — PR #126845

Note

This review was AI/Copilot-generated via the code-review skill, using multi-model analysis (Gemini 2.5 Pro + GPT-5.3-Codex + my own assessment).

Holistic Assessment

Motivation: Justified. The old code used ThreadPoolBoundHandle.UnsafeAllocateNativeOverlapped with a CallbackResetEvent ref-counting scheme to handle the race between the IOCP callback and GetOverlappedResult. This was necessary because the event handle alone did not suppress IOCP notifications. The Windows API provides a documented mechanism (setting the low-order bit of hEvent) to prevent completion packets from being queued to the IOCP, making the entire callback infrastructure unnecessary.

Approach: Sound. The hEvent | 1 technique is well-documented in the GetQueuedCompletionStatus documentation. This eliminates the callback, the ref counting, and the thread pool binding requirement for these sync paths — a clean simplification with −71/+13 lines.

Summary: ⚠️ Needs Human Review. The core approach is correct and well-documented. One concern about resource lifetime during exceptional WaitOne() interruption was flagged by one model but is pre-existing in the old code. Human reviewers should confirm they are comfortable with this pre-existing behavior.


Detailed Findings

✅ Core Approach — hEvent low-order bit to suppress IOCP notifications

Setting result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1 is the canonical Windows pattern for performing overlapped I/O on an IOCP-bound handle without queuing a completion packet. This is documented in the GetQueuedCompletionStatus and OVERLAPPED documentation. Both sub-agent models and my own analysis confirm this is correct.

✅ Removal of EnsureThreadPoolBindingInitialized() from sync paths

Since the overlapped is now allocated via NativeMemory.AllocZeroed rather than ThreadPoolBoundHandle.UnsafeAllocateNativeOverlapped, there is no need to initialize the thread pool binding for these sync-over-async paths. The async paths (QueueAsyncReadFile, etc.) still call EnsureThreadPoolBindingInitialized() correctly. Verified via grep that all remaining async callers still invoke it.

✅ Removal of CallbackResetEvent ref-counting and else error branch

The old else branch (for errors other than ERROR_IO_PENDING/ERROR_SUCCESS) only called resetEvent.ReleaseRefCount(overlapped) to balance the ref count since the callback wouldn't fire. With no callback in the new code, this branch is unnecessary — the finally block handles cleanup uniformly. The entire CallbackResetEvent class, s_callback field, and AllocateCallback() method are correctly removed.

DangerousGetHandle() lifetime safety

The ManualResetEvent is created at method entry and not disposed until the finally block, which runs after the overlapped I/O has completed (via WaitOne + GetOverlappedResult). The native event handle remains valid for the full duration of the overlapped operation in all normal execution paths.

⚠️ Resource lifetime if WaitOne() throws (pre-existing) — Needs Human Review

One model flagged that if resetEvent.WaitOne() throws (e.g., ThreadInterruptedException), the finally block will free the NativeOverlapped* and dispose the event while the kernel I/O may still be pending. However, this concern is pre-existing:

  • In the old code, the finally block called resetEvent.ReleaseRefCount(overlapped) (decrementing ref count to 1, deferring the free) but still called resetEvent.Dispose(), closing the event handle the kernel is trying to signal.
  • In both old and new code, the fixed block exits when the exception propagates, unpinning the buffer the kernel is writing to.

So the old code was equally vulnerable to buffer and event-handle use-after-free in this scenario — it only deferred the overlapped free itself. ThreadInterruptedException is legacy, and @adamsitnik already noted he defers to other reviewers on this concern. The new code is not materially worse than the old code for this edge case.

💡 overlapped->InternalLow = IntPtr.Zero in EOF path

The line overlapped->InternalLow = IntPtr.Zero (line 104) with the comment "failure to do so looks like we are freeing a pending overlapped later" was present in the old code. In the old code, this was needed because ThreadPoolBoundHandle.FreeNativeOverlapped asserts the overlapped is not pending. With NativeMemory.Free, no such assertion exists. This line is harmless but may be removable as a follow-up cleanup. Low priority.


Test Coverage

The existing test file RandomAccess/Mixed.Windows.cs specifically tests all combinations of sync/async read/write on both File.OpenHandle handles (with ThreadPoolBinding initialized) and raw CreateFileW handles (without binding). This covers the exact paths modified by this PR.

No New Public API Surface

No ref assembly changes detected. No new public members introduced. API approval check is not applicable.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • learn.microsoft.com
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)
    • Triggering command: /usr/bin/curl curl -L --silent REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@adamsitnik
Copy link
Copy Markdown
Member

💡 overlapped->InternalLow = IntPtr.Zero in EOF path

The line overlapped->InternalLow = IntPtr.Zero (line 104) with the comment "failure to do so looks like we are freeing a pending overlapped later" was present in the old code. In the old code, this was needed because ThreadPoolBoundHandle.FreeNativeOverlapped asserts the overlapped is not pending. With NativeMemory.Free, no such assertion exists. This line is harmless but may be removable as a follow-up cleanup. Low priority.

@copilot please address this

… pending assertion

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0470d835-6cc6-4d14-a53b-8b2f99bc8427

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 14, 2026

💡 overlapped->InternalLow = IntPtr.Zero in EOF path

The line overlapped->InternalLow = IntPtr.Zero (line 104) with the comment "failure to do so looks like we are freeing a pending ov...

Removed the overlapped->InternalLow = IntPtr.Zero line and its comment in the EOF path of ReadSyncUsingAsyncHandle. This was only needed because ThreadPoolBoundHandle.FreeNativeOverlapped asserts that the overlapped is not pending — NativeMemory.Free has no such assertion. The two remaining occurrences in async paths (lines 273, 579) still use ThreadPoolBoundHandle.FreeNativeOverlapped and correctly remain. Done in 99c34d4.

@adamsitnik adamsitnik added the tenet-performance Performance related issue label Apr 14, 2026
@adamsitnik
Copy link
Copy Markdown
Member

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 14, 2026

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

🤖 Copilot Code Review — PR #126845

Note

This review was AI/Copilot-generated via the code-review skill, using multi-model analysis (Gemini 2.5 Pro + GPT-5.3-Codex + my own assessment).

Holistic Assessment

Motivation: Justified. The old code used ThreadPoolBoundHandle.UnsafeAllocateNativeOverlapped with a CallbackResetEvent ref-counting scheme to handle the race between the IOCP callback and GetOverlappedResult. The Windows API provides a documented, simpler mechanism (setting the low-order bit of hEvent) to suppress IOCP completion queuing entirely.

Approach: Sound. The hEvent | 1 technique is well-documented in the GetQueuedCompletionStatus documentation. This eliminates the callback, the ref counting, the CallbackResetEvent class, and the thread pool binding requirement for sync paths — a clean simplification.

Summary: ✅ LGTM. All three review models (Gemini 2.5 Pro, GPT-5.3-Codex, and my own analysis) agree the change is correct. No blocking or advisory issues remain after the latest commit (99c34d4) removed the unnecessary InternalLow zeroing.


Detailed Findings

✅ Core Approach — hEvent low-order bit to suppress IOCP notifications (all 3 models agree)

Setting result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1 is the canonical Windows pattern for overlapped I/O on an IOCP-bound handle without queuing a completion packet. Well-documented and correct.

NativeMemory.AllocZeroed / NativeMemory.Free lifecycle (all 3 models agree)

The overlapped is allocated per-call in the try block and unconditionally freed in finally. Since the method waits for completion (WaitOne + GetOverlappedResult) before reaching finally in all normal paths, the lifetime is correctly scoped.

✅ Removal of EnsureThreadPoolBindingInitialized() from sync paths (all 3 models agree)

Since the overlapped is now allocated via NativeMemory.AllocZeroed rather than ThreadPoolBoundHandle.UnsafeAllocateNativeOverlapped, the thread pool binding is unnecessary for these paths. All remaining async callers (QueueAsyncReadFile, QueueAsyncWriteFile, etc.) still call EnsureThreadPoolBindingInitialized().

✅ Removal of CallbackResetEvent and the else error branch (all 3 models agree)

The old else branch only balanced the ref count since the callback wouldn't fire on immediate errors. With no callback, this is unnecessary — the finally block handles cleanup uniformly. The entire CallbackResetEvent class, s_callback field, and AllocateCallback() method are correctly removed.

✅ Removal of overlapped->InternalLow = IntPtr.Zero in EOF path (all 3 models agree)

This was needed because ThreadPoolBoundHandle.FreeNativeOverlapped asserts the overlapped is not pending. NativeMemory.Free has no such assertion. The two remaining occurrences (lines 273, 579 in async read paths) correctly remain since they still use ThreadPoolBoundHandle.FreeNativeOverlapped.

✅ Thread safety (all 3 models agree)

Each call creates a dedicated ManualResetEvent and NativeOverlapped, ensuring operations on the same file handle from different threads don't interfere. The hEvent | 1 pattern isolates each operation from the IOCP.

DangerousGetHandle() lifetime safety

The ManualResetEvent is created at method entry and disposed only in finally after the I/O completes. The native event handle remains valid throughout the overlapped operation.

✅ Test coverage

The existing RandomAccess/Mixed.Windows.cs tests all combinations of sync/async read/write on both File.OpenHandle handles (with ThreadPoolBinding) and raw CreateFileW handles (without binding), covering the exact paths modified.


No New Public API Surface

No ref assembly changes detected. No new public members introduced.

Copy link
Copy Markdown
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, to be honest I have not expected such nice performance improvements (EgorBot/Benchmarks#122).

This is definitely going to improve synchronous std out and err reads as well.

@adamsitnik adamsitnik requested a review from stephentoub April 14, 2026 12:09
@adamsitnik
Copy link
Copy Markdown
Member

@EgorBot -windows_x64 -windows_arm64 --filter "*Perf_FileStream.Read(Asynchronous" "*Perf_FileStream.Write(Asynchronous" "Perf_Process.ReadOutputToEnd"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-System.IO tenet-performance Performance related issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Don't use callbacks when performing sync IO with async handles

4 participants