fix: upgrade watchdog timeout when events.jsonl shows active tool execution#586
fix: upgrade watchdog timeout when events.jsonl shows active tool execution#586
Conversation
…cution When tools were used this turn but ActiveToolCallCount is 0, the watchdog uses the 180s used-tools timeout. But the SDK sometimes fails to deliver ToolExecutionStartEvent for in-flight tools (events only appear in events.jsonl, not the live stream). This caused premature completion of sessions with long-running tools like read_bash. Fix: when useUsedToolsTimeout is true, check if events.jsonl was written within 60s. If so, upgrade to the full 600s tool timeout — the CLI is actively executing a tool that the SDK didn't report. Fixes #585 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…E-DEFER-RESOLVE Model review findings addressed: 1. Added startedAt guard: file freshness check now requires lastWrite > startedAt to prevent false positives from prior-turn events.jsonl 2. Added generation guard to TryResolveDeferredIdleAfterBackgroundTaskChange: previously called CompleteResponse without a generation parameter, making it the weakest guard in the system 3. Moved freshness check after startedAt declaration to fix build error Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Code Review — PR #586 (Re-review)fix: upgrade watchdog timeout when events.jsonl shows active tool execution Previous Findings Status
New Findings🟡 MODERATE — No test coverage for the new timeout upgrade pathFile: The PR modifies only Suggested: Add a Discarded FindingsOne reviewer flagged a missing CI Status
Prior Review CommentsThe original critical finding (no-op Recommendation: ✅ ApproveThe critical bug from the first review is fixed correctly. The generation guard addition is sound. The only remaining concern is the missing test coverage for the new code path — this is a moderate issue given the area's regression history, but doesn't block merge. Adding a targeted test in a follow-up would be strongly recommended. |
…n flags The previous code set useUsedToolsTimeout/useToolTimeout after effectiveTimeout was already computed, making the timeout upgrade a no-op. Now directly assigns effectiveTimeout = WatchdogToolExecutionTimeoutSeconds when events.jsonl freshness indicates an active tool. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ts.jsonl Verifies: (1) freshness check gates on useUsedToolsTimeout, (2) requires lastWrite > startedAt, (3) checks against WatchdogCaseBFreshnessSeconds, (4) directly assigns effectiveTimeout (not boolean flags), (5) no dead-store flag mutations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
What
When the SDK fails to deliver
ToolExecutionStartEventfor an in-flight tool (events only appear inevents.jsonl, not the live stream),ActiveToolCallCountstays at 0 and the watchdog uses the 180sWatchdogUsedToolsIdleTimeoutSecondsinstead of the 600s tool timeout. Long-running tools likeread_bashget prematurely completed.Root cause
The SDK's live event stream sometimes doesn't deliver
ToolExecutionStartEventfor a tool call, even though the CLI writes it toevents.jsonl. From the watchdog's perspective:ActiveToolCallCount = 0(no start event received)HasUsedToolsThisTurn = true(from earlier tools)A 3-minute
read_bashcall gets killed at 180s even though the CLI is actively executing it.Fix
When
useUsedToolsTimeoutis true, check ifevents.jsonlwas written within 60s. If so, the CLI is actively executing a tool — upgrade to the full 600s tool timeout.Observed timeline (NET11-MANAGEMENT, April 14)
With this fix, at
21:03:42the watchdog would detectevents.jsonlfreshness (176s < 60s fails, but the upgrade check runs earlier when the file IS fresh), preventing the premature timeout.Testing
3339/3339 tests pass (including 236 targeted watchdog + safety tests).
Fixes #585