fix(rpc-child-controller): unref detached children so parent Pi can exit (FIX-6)#3
Conversation
|
Thanks for splitting this out. The Please add focused proof before this lands:
A real The code change itself is small and I don’t see a source-level objection; this is about matching the proof floor for a process-liveness fix. |
65024d7 to
d36c619
Compare
…xit (FIX-6) Found in real-Pi smoke test (Pi 0.75.5): The RPC child controller spawns step children with detached:true (non-Win) and piped stdio. Without child.unref(), libuv tracks the ChildProcess handle as a live reference even after the child has exited cleanly and all stdio has closed. The parent Pi process therefore cannot exit after `pi -p ...` returns its terminal response. Symptom: `pi -p` hangs at the shell prompt after the model's terminal JSON. Mark had to open a new terminal after every smoke run. Fix: call this.child.unref?.() immediately after spawn. Exit/close/stderr listeners still fire after unref(); completion handling is unchanged. unref() is the standard Node.js pattern for detached:true + piped stdio when the parent uses Promise-based completion tracking instead of event-loop liveness as the wait mechanism.
d36c619 to
51e4939
Compare
|
Thanks — I updated this PR with the checked-in lifecycle proof you asked for and rebased it onto current Current state:
What changed:
this.child = this.spawnProcess(invocation.command, invocation.args, {
cwd: this.options.cwd,
shell: false,
stdio: ["pipe", "pipe", "pipe"],
detached: process.platform !== "win32",
});
this.child.unref?.();
Validation after rebasing onto current
I reproduced that baseline gate failure on current Repro commands: # PR branch
cd /tmp/pi-multiagent-upstream
git checkout pr1-fix-6-child-unref
pnpm run typecheck
pnpm test
git diff --check
pnpm run gate # expected in this env: tests pass, then baseline smoke:fake-pi /tmp assertion fails
# Baseline comparison
git checkout --detach upstream/main
pnpm run gate # expected in this env: 244/244 tests pass, then same smoke:fake-pi /tmp assertion failsI also ran independent local read-only reviews against the final branch state; no blockers were found. |
tiziano-contorno
left a comment
There was a problem hiding this comment.
Thanks — the revised branch adds the lifecycle proof requested. I verified the checked-in unref regression test, normal closeout, cancel/timeout termination paths, typecheck, full tests, smoke, pack/load checks, source-size/public-doc checks, diff whitespace, and the independent unref probe. Merging with the maintenance release.
This is one of three focused PRs splitting out the usable parts of #2 against current
main, as requested in your review.What
Single-line fix in
extensions/multiagent/src/rpc-child-controller.ts(line 58):this.child = this.spawnProcess(invocation.command, invocation.args, { cwd: this.options.cwd, shell: false, stdio: ["pipe", "pipe", "pipe"], detached: process.platform !== "win32" }); +this.child.unref?.();Why
Found in real-Pi smoke testing on Pi 0.75.5.
RpcChildControllerspawns step children withdetached: true(non-Windows) and piped stdio. Withoutchild.unref(), libuv keeps the parent's event loop alive on the dead-childChildProcesshandle even after the child has exited cleanly and stdio has closed.Symptom:
pi -p ...hangs at the shell prompt after the model's terminal JSON.The fix mirrors the existing
timeoutTimer.unref?.()pattern at the same call site (line 62 inrpc-child-controller.ts) andexitCloseTimer.unref?.()(line 299).Required checks against this branch
npx tsc --noEmitnode --test tests/*.test.tsnode tests/check-source-size.tsnode tests/check-public-docs.tsgit diff --check HEAD~1..HEADSide-finding (out-of-scope but worth flagging)
tests/smoke-fake-pi.tsfails on cleanorigin/mainwith an assertion mismatch:# agent_team terminal notice|Objective:|...untrusted status evidence; run_status/step_result for artifactsThis appears to be a stale test expectation against a recent rendering refactor, completely independent of this PR. Filing separately if you'd like, or happy to include a small follow-up PR for it.
Discipline notes
## Unreleasedonly (1 sentence)fix(rpc-child-controller): ...)Related
Closes a portion of #2 (separately scoped). Two more focused PRs follow:
GRAPH_BODY_FIELDStrimoutputLimitknobs as a small featureThree proposal-issues for the larger features (scheduling, persistent state + reattach, worktree isolation) will be opened separately so we can agree on shape before code, as you requested.