Don't use callbacks when performing sync IO with async handles#126845
Don't use callbacks when performing sync IO with async handles#126845
Conversation
…vent for sync IO with async handles Instead of using ThreadPoolBoundHandle callbacks and reference counting when performing synchronous IO with async file handles, allocate the NativeOverlapped directly via NativeMemory and set the low-order bit of hEvent to suppress I/O completion port notifications. This eliminates the CallbackResetEvent class, the s_callback field, and the AllocateCallback/GetNativeOverlappedForAsyncHandle methods, reducing allocations and complexity. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d7a61400-c63f-401a-9503-0313459af6f7 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
|
Tagging subscribers to this area: @dotnet/area-system-io |
adamsitnik
left a comment
There was a problem hiding this comment.
@copilot please address my feedback (nits)
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
…appedWithEventHandle Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/bcdd8838-718e-4561-abf4-90c9c2deb905 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
|
@EgorBot -windows_x64 -windows_arm64 --filter "*Perf_FileStream.Read(Asynchronous" "*Perf_FileStream.Write(Asynchronous" |
There was a problem hiding this comment.
Pull request overview
This PR simplifies the synchronous I/O-over-async-handle paths in RandomAccess.Windows.cs by removing the ThreadPool-bound overlapped allocation + IOCP callback/ref-counting, and instead using an event handle with the low-order bit set to suppress IOCP completion queuing.
Changes:
- Replace
ThreadPoolBoundHandle-allocatedNativeOverlapped+ callback/reset-event ref counting withNativeMemory.AllocZeroed+ManualResetEvent. - Set
OVERLAPPED.hEventtoeventHandle | 1to suppress IOCP completion port notifications. - Remove now-unneeded callback plumbing (
s_callback, callback allocator, andCallbackResetEvent).
| { | ||
| if (overlapped != null) | ||
| { | ||
| resetEvent.ReleaseRefCount(overlapped); | ||
| NativeMemory.Free(overlapped); | ||
| } |
There was a problem hiding this comment.
NativeMemory.Free(overlapped) is unconditional in finally. If the WaitOne() above throws (e.g., ThreadInterruptedException) or the method otherwise exits before the OS has finished the overlapped operation, freeing this memory can result in use-after-free when the kernel writes completion state / signals the event. Consider restructuring so the overlapped is only freed after completion is guaranteed (e.g., use GetOverlappedResult(..., bWait: true) instead of managed WaitOne, or ensure completion in a finally before freeing).
There was a problem hiding this comment.
e.g.,
ThreadInterruptedException
ThreadInterruptedException is legacy, I will defer to other code reviewers.
| { | ||
| if (overlapped != null) | ||
| { | ||
| resetEvent.ReleaseRefCount(overlapped); | ||
| NativeMemory.Free(overlapped); | ||
| } |
There was a problem hiding this comment.
Same concern as the read path: NativeMemory.Free(overlapped) runs in finally even if the overlapped I/O hasn't completed (e.g., if resetEvent.WaitOne() throws due to thread interruption). That can lead to use-after-free when the operation completes. Consider waiting via GetOverlappedResult(..., bWait: true) (avoiding managed WaitOne) or otherwise guaranteeing completion before freeing the NativeOverlapped.
| // "If the file handle associated with the completion packet was previously associated with an I/O completion port | ||
| // [...] setting the low-order bit of hEvent in the OVERLAPPED structure prevents the I/O completion | ||
| // from being queued to a completion port." | ||
| result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1; |
There was a problem hiding this comment.
waitHandle.SafeWaitHandle.DangerousGetHandle() | 1 is likely not valid as written because DangerousGetHandle() returns IntPtr (which doesn't support bitwise operators). Consider explicitly casting to nuint/nint before OR-ing and then casting back, and optionally add a Debug.Assert that the original handle's low bit is 0 before setting it.
| result->EventHandle = waitHandle.SafeWaitHandle.DangerousGetHandle() | 1; | |
| nint eventHandle = (nint)waitHandle.SafeWaitHandle.DangerousGetHandle(); | |
| Debug.Assert((eventHandle & 1) == 0); | |
| result->EventHandle = (IntPtr)(eventHandle | 1); |
There was a problem hiding this comment.
nint and IntPtr are the same thing
|
@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR |
🤖 Copilot Code Review — PR #126845Note This review was AI-generated by Copilot using multi-model analysis (Claude Opus 4.6, GPT-5.3-Codex, Claude Sonnet 4.5). Holistic AssessmentMotivation: Justified. The existing Approach: Correct and well-targeted. The PR leverages the documented Windows technique of setting the low-order bit of Summary: ✅ LGTM. The change is a clean simplification that removes ~70 lines of complex callback/reference-counting machinery in favor of a well-documented Windows API pattern. All three reviewing models independently verified the core correctness. No new public API surface, no behavioral changes for callers. The code is safe in all realistic execution paths. Detailed Findings✅ Low-order bit of
|
🤖 Copilot Code Review — PR #126845Note This review was AI/Copilot-generated via the code-review skill, using multi-model analysis (Gemini 2.5 Pro + GPT-5.3-Codex + my own assessment). Holistic AssessmentMotivation: Justified. The old code used Approach: Sound. The Summary: Detailed Findings✅ Core Approach —
|
@copilot please address this |
… pending assertion Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0470d835-6cc6-4d14-a53b-8b2f99bc8427 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Removed the |
|
@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR |
🤖 Copilot Code Review — PR #126845Note This review was AI/Copilot-generated via the code-review skill, using multi-model analysis (Gemini 2.5 Pro + GPT-5.3-Codex + my own assessment). Holistic AssessmentMotivation: Justified. The old code used Approach: Sound. The Summary: ✅ LGTM. All three review models (Gemini 2.5 Pro, GPT-5.3-Codex, and my own analysis) agree the change is correct. No blocking or advisory issues remain after the latest commit (99c34d4) removed the unnecessary Detailed Findings✅ Core Approach —
|
adamsitnik
left a comment
There was a problem hiding this comment.
LGTM, to be honest I have not expected such nice performance improvements (EgorBot/Benchmarks#122).
This is definitely going to improve synchronous std out and err reads as well.
|
@EgorBot -windows_x64 -windows_arm64 --filter "*Perf_FileStream.Read(Asynchronous" "*Perf_FileStream.Write(Asynchronous" "Perf_Process.ReadOutputToEnd" |
Description
When performing synchronous IO on async file handles (
FILE_FLAG_OVERLAPPED),RandomAccess.Windows.cspreviously usedThreadPoolBoundHandle.UnsafeAllocateNativeOverlappedwith anIOCompletionCallbackand a customCallbackResetEventclass that tracked a reference count to handle the race between the IOCP callback and the caller'sGetOverlappedResult.This is unnecessary. Setting the low-order bit of
hEventin theOVERLAPPEDstructure prevents the I/O completion from being queued to the completion port, eliminating the callback entirely.Changes:
NativeOverlappedviaNativeMemory.AllocZeroedinstead ofThreadPoolBoundHandleEventHandle = handle | 1to suppress IOCP notificationCallbackResetEventwith plainManualResetEventNativeMemory.Freeinfinally— no ref counting neededs_callback,AllocateCallback(),GetNativeOverlappedForAsyncHandle(),CallbackResetEventclassEnsureThreadPoolBindingInitialized()from sync-over-async paths (no longer usesThreadPoolBoundHandle)overlapped->InternalLow = IntPtr.Zeroin EOF path (only needed forThreadPoolBoundHandle.FreeNativeOverlappedassertion, notNativeMemory.Free)Net: −74 lines, +13 lines. Reduces allocations and complexity with no behavioral change.