Fix agent memory leak: InvokeAsync to SendAsync #10
Merged
GordonBeeming merged 1 commit into main on Mar 26, 2026
…sync

The agent was leaking ~74GB over 21 hours via 296K unreleased 256K VM_ALLOCATE blocks (GC heap segments).

Root cause: all SignalR hub calls used InvokeAsync (two-way RPC), which allocates pending invocation tracking state for each call, even for one-way methods like Heartbeat. Combined with NativeAOT, this pending state accumulated native memory that the GC never released back to the OS.

Changes:
- Switch all one-way SignalR calls (Heartbeat, SessionStatusChanged, DirectoryListing, ReportAllSessions, UpdateStatus) from InvokeAsync to SendAsync. Only RegisterAgent (which returns a result) keeps InvokeAsync.
- Add CancellationToken parameter to all SignalR methods and thread stoppingToken through from Worker.cs callers.
- Bound the Closed event reconnection loop with the stopping token instead of while(true), preventing unbounded retries during shutdown.
- Dispose Process objects immediately when HealthCheck detects a crashed session, rather than waiting for the 1-hour cleanup cycle.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: GitButler <gitbutler@gitbutler.com>
Pull request overview
This PR addresses a reported long-running memory growth issue in the agent by switching one-way SignalR hub calls from InvokeAsync (request/response) to SendAsync (fire-and-forget), and by propagating cancellation throughout the agent’s SignalR and reconnection logic.
Changes:
- Replace one-way SignalR `InvokeAsync` calls with `SendAsync`, keeping `InvokeAsync` only for `RegisterAgent`.
- Propagate `CancellationToken` through agent-to-hub calls and bound the manual reconnection loop by the stopping token.
- Dispose `Process` handles immediately when a crashed session is detected during health checks.
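The first two changes can be pictured with a minimal sketch. This is not the code from `SignalRConnectionManager.cs`, which is not shown on this page. The `HubCalls` wrapper, its method signatures, the `registration` argument, and the empty `AgentRegistrationResult` class are assumptions for illustration. Only the hub method names `Heartbeat` and `RegisterAgent` and the `InvokeAsync`/`SendAsync` split come from the PR. It assumes the `Microsoft.AspNetCore.SignalR.Client` package is referenced.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

// Hypothetical wrapper used only for this sketch.
public sealed class HubCalls
{
    private readonly HubConnection _connection;

    public HubCalls(HubConnection connection) => _connection = connection;

    // One-way call: SendAsync returns once the message has been sent and
    // does not wait for a completion message from the hub.
    // The real Heartbeat method may take arguments; none are assumed here.
    public Task HeartbeatAsync(CancellationToken cancellationToken) =>
        _connection.SendAsync("Heartbeat", cancellationToken);

    // Two-way call: InvokeAsync<T> waits for the hub method's return value,
    // so it is kept for the one call that actually returns a result.
    public Task<AgentRegistrationResult> RegisterAgentAsync(
        object registration, CancellationToken cancellationToken) =>
        _connection.InvokeAsync<AgentRegistrationResult>(
            "RegisterAgent", registration, cancellationToken);
}

// Placeholder: the real AgentRegistrationResult shape is not shown in the PR.
public sealed class AgentRegistrationResult
{
}
```

A caller in a background worker would pass its stopping token straight through, for example `await hubCalls.HeartbeatAsync(stoppingToken);`, so a pending send is abandoned when the host shuts down.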
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/ClaudeNest.Agent/Worker.cs | Passes the stopping token through to all SignalR send/report operations. |
| src/ClaudeNest.Agent/Services/SignalRConnectionManager.cs | Converts hub calls to SendAsync, adds token propagation, and bounds manual reconnect by stopping token. |
| src/ClaudeNest.Agent/Services/SessionManager.cs | Disposes session Process handle on crash detection during HealthCheckAsync. |
GordonBeeming added a commit that referenced this pull request on Mar 26, 2026:

- Catch OperationCanceledException explicitly in the SignalR reconnection loop so shutdown cancellation logs cleanly instead of appearing as a reconnection failure with noisy backoff warnings.
- Remove Process.Dispose() from HealthCheckAsync crash detection to avoid racing with SpawnProcessAsync/MonitorAdoptedProcessAsync that may still hold the Process handle. The owning code path remains responsible for disposal; the 1-hour cleanup catches any stragglers.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: GitButler <gitbutler@gitbutler.com>
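A reconnection loop along the lines this commit describes might look roughly like the sketch below. The `ReconnectSketch` class, the `Attach` method, the backoff values, and the console logging are assumptions for illustration. The PR only states that the loop is bounded by the stopping token and that `OperationCanceledException` is caught explicitly. It assumes the `Microsoft.AspNetCore.SignalR.Client` package is referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

// Hypothetical helper used only for this sketch.
public static class ReconnectSketch
{
    public static void Attach(HubConnection connection, CancellationToken stoppingToken)
    {
        connection.Closed += async _ =>
        {
            // Assumed backoff: start at 1 second, double up to a 30 second cap.
            var delay = TimeSpan.FromSeconds(1);

            // Bounded by the stopping token instead of while (true).
            while (!stoppingToken.IsCancellationRequested)
            {
                try
                {
                    await Task.Delay(delay, stoppingToken);
                    await connection.StartAsync(stoppingToken);
                    return; // Reconnected.
                }
                catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
                {
                    // Shutdown was requested: exit quietly rather than
                    // reporting a reconnection failure.
                    return;
                }
                catch (Exception ex)
                {
                    Console.Error.WriteLine(
                        $"Reconnect failed, retrying in {delay.TotalSeconds}s: {ex.Message}");
                    delay = TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 30));
                }
            }
        };
    }
}
```

Placing the cancellation catch before the general one keeps shutdown out of the failure path, so no backoff warning is logged when the host is simply stopping.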
Summary
- Switch one-way SignalR calls from `InvokeAsync` to `SendAsync` to eliminate pending invocation state accumulation that caused ~74GB memory growth over 21 hours
- Add `CancellationToken` propagation to all SignalR methods
- Bound the `Closed` event reconnection loop with the stopping token
- Dispose `Process` objects immediately on crash detection in `HealthCheckAsync`

Root Cause

The agent used `InvokeAsync` (two-way RPC) for all SignalR calls, including one-way methods like `Heartbeat`, `SessionStatusChanged`, etc. Each `InvokeAsync` allocates a `TaskCompletionSource` and registers it in an internal pending invocations dictionary. With NativeAOT, this tracking state accumulated as 256K VM_ALLOCATE blocks that were never released, reaching 296K blocks (74GB) over 21 hours of runtime.

`SendAsync` is fire-and-forget: it serializes and sends without allocating any pending state. Only `RegisterAgent` (which actually returns `AgentRegistrationResult`) needs `InvokeAsync`.

Test plan

- `dotnet build`: full solution builds clean
- `dotnet test`: all integration tests pass
- `vmmap --summary $(pgrep claudenest-agent)` after deployment to confirm no rapid 256K segment growth

🤖 Generated with Claude Code