fix(core): guard shutdown handler against re-entry and ensure termination via SIGKILL#97
Draft
shibukazu wants to merge 6 commits into
Draft
fix(core): guard shutdown handler against re-entry and ensure termination via SIGKILL#97shibukazu wants to merge 6 commits into
shibukazu wants to merge 6 commits into
Conversation
… termination Problem ------- When an AI agent SDK (e.g. the Codex or Pi SDK) registers its own SIGINT/SIGTERM listener, `installShutdownHandlers` correctly flushes active runs but then defers exit because `process.listenerCount(signal) > 1`. If the co-listener never calls `process.exit()` — as is the case with some agent SDKs during long-running inference calls — the CLI hangs indefinitely after Ctrl+C. A secondary issue: if SIGINT and SIGTERM arrive in rapid succession (e.g. a process manager sends SIGTERM right after the user hits Ctrl+C), the handler fires twice, potentially corrupting run metadata that was already written on the first invocation. Fix --- 1. Add a `shutdownStarted` boolean guard that causes the handler to return immediately on re-entry. 2. After `flushActiveRuns()` completes its synchronous writes, send SIGKILL to the calling process's process group (`process.kill(0, "SIGKILL")`). SIGKILL cannot be caught or deferred, so this guarantees the CLI and any agent child-processes it spawned are terminated even when co-listeners are present. Run metadata has already been persisted at this point, so there is no data loss. A `catch` block falls back to `process.exit()` for restricted environments where `kill(0)` is not permitted. Tests ----- - Spawn child processes with `detached: true` so `kill(0, "SIGKILL")` inside the child kills only its own process group and does not propagate back to the Vitest runner. - Update exit expectations from numeric codes (130/143) to `signal === "SIGKILL"`, reflecting the new termination path. - Replace the "defers to co-listener" test with a "still kills even when co-listener hangs" test that directly covers the regression.
|
@shibukazu is attempting to deploy a commit to the Vercel Labs Team on Vercel. A member of the Team first needs to authorize it. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changed
Added a
shutdownStartedre-entry guard toinstallShutdownHandlersand replaced the conditionalprocess.exit()withprocess.kill(0, "SIGKILL")afterflushActiveRuns().Why
When an agent SDK (e.g. Codex, Pi) registers its own
SIGINT/SIGTERMlistener, the handler previously deferred exit becauselistenerCount > 1. If the co-listener never callsprocess.exit(), the CLI hangs indefinitely after Ctrl+C.process.kill(0, "SIGKILL")sends SIGKILL to the entire process group — catching co-listeners that never exit — whileflushActiveRuns()has already persisted run state synchronously, so there is no data loss.Verification
Notes for reviewer