Skip to content

fix(shell): honor AbortSignal so --timeout cancels in practice (swamp-club#247)#1317

Closed
stack72 wants to merge 3 commits intomainfrom
worktree-transient-foraging-falcon
Closed

fix(shell): honor AbortSignal so --timeout cancels in practice (swamp-club#247)#1317
stack72 wants to merge 3 commits intomainfrom
worktree-transient-foraging-falcon

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 5, 2026

Summary

Closes swamp-club#247.

PR #1305 added --timeout to model method run and workflow run but
documented it as cooperative-only because built-in models did not honor
the abort signal. This wires it through in two layers so the flag works
in practice and surfaces the cancelled envelope #1305 originally
promised.

Layer 1 — built-in honoring AbortSignal

  • command/shell passes context.signal into executeProcess and
    re-throws AbortError ahead of its silent-swallow paths so the abort
    is not buried as exitCode = -1.
  • executeProcess uses Deno.Command's native signal option (Deno
    2.7.14 sends SIGTERM, matching the existing timeout-path semantics)
    and explicitly normalizes signal-driven kills to AbortError on all
    three execution branches (streaming, buffered+timeoutMs, simple
    buffered). The simple-buffered branch switches from command.output()
    to command.spawn() + process.output() because the former silently
    ignores CommandOptions.signal under Deno 2.7.14.

Layer 2 — driver-layer error-type preservation

Without this, the abort works but the JSON envelope reads
code: "method_execution_failed" instead of cancelled, because the
driver wire boundary flattens exceptions to a string.

  • ExecutionResult gains a cancelled?: boolean flag.
  • RawExecutionDriver sets it when the failure was an AbortError.
  • DefaultMethodExecutionService re-throws as
    DOMException(\"AbortError\") when driverResult.cancelled is set
    (both call sites) so libswamp's run.ts handler routes it to
    cancelled().

End-to-end output is now:

```json
{
"error": "Operation was cancelled.",
"code": "cancelled"
}
```

Help text

Dropped the "Cooperative — only honored by methods that check
AbortSignal" caveat on --timeout for both model method run and
workflow run. Built-ins now honor it transparently; extension-model
methods still must check ctx.signal.

Test Plan

  • 3 new shell_model_test.ts tests: AbortError surfaces, no data
    record on abort, ignoreExitCode does not swallow AbortError
  • 2 new process_executor_test.ts tests for the buffered branches
    (the streaming-mode signal test was already there)
  • 2 new raw_execution_driver_test.ts tests for the
    cancelled-flag contract
  • New integration/shell_timeout_test.ts drives the CLI binary
    against sleep 30 with --timeout 1s and asserts
    code: \"cancelled\"
  • `deno check` / `deno lint` / `deno fmt --check` clean
  • Full suite: 5468 passed, 0 failed, 28 ignored
  • Manual scratch-repo: `swamp model method run sleep10 execute
    --input run='sleep 10' --timeout 1s --json` aborts in ~1s
    (was 10.5s) with the cancelled envelope above

Follow-up

systeminit/swamp-uat's
`tests/cli/adversarial/method_run_timeout_test.ts` works around this
bug today by using a custom `abort-aware-sleep` extension fixture
instead of `command/shell`. After this PR lands and a swamp release is
tagged, that test should be updated to also exercise `command/shell`
directly. The existing fixture-based variant should stay as an
extension-model signal-handling regression guard.

🤖 Generated with Claude Code

stack72 and others added 3 commits May 5, 2026 20:10
Closes swamp-club#247.

PR #1305 added `--timeout` to `model method run` and `workflow run` but
documented it as cooperative-only because built-in models did not honor
the abort signal. This wires it through in two layers so the flag works
in practice and surfaces the `cancelled` envelope the original PR
promised.

Layer 1 — built-in honoring AbortSignal:
- `command/shell` now passes `context.signal` into `executeProcess` and
  re-throws AbortError ahead of the catch-block swallow paths so the
  abort is not buried as `exitCode = -1`.
- `executeProcess` uses Deno.Command's native `signal` option (Deno
  2.7.14 sends SIGTERM, matching the existing timeout-path semantics)
  and explicitly normalizes signal-driven kills to AbortError on all
  three execution branches (streaming, buffered+timeoutMs, simple
  buffered). The simple-buffered branch switches from `command.output()`
  to `command.spawn() + process.output()` because the former silently
  ignores `CommandOptions.signal` under Deno 2.7.14.

Layer 2 — driver-layer error-type preservation:
- `ExecutionResult` gains a `cancelled?: boolean` flag.
- `RawExecutionDriver` sets it when the failure was an AbortError, so
  the type information survives the wire boundary that flattens
  exceptions to a string.
- `DefaultMethodExecutionService` re-throws as `DOMException("AbortError")`
  when `driverResult.cancelled` is set (both call sites) so libswamp's
  `run.ts` handler routes it to `cancelled()` instead of
  `methodExecutionFailed()`.

Tests:
- Unit: 3 new shell_model tests (AbortError surfaces, no data record on
  abort, ignoreExitCode does not swallow AbortError); 2 new
  process_executor tests for the buffered branches; 2 new raw driver
  tests for the cancelled-flag contract.
- Integration: new `integration/shell_timeout_test.ts` drives the CLI
  binary against `sleep 30` with `--timeout 1s` and asserts `code:
  "cancelled"` in the JSON envelope.

Help text on `model method run --timeout` and `workflow run --timeout`
updated to drop the "Cooperative — only honored by methods that check
AbortSignal" caveat now that built-ins honor it transparently;
extension-model methods still must check `ctx.signal`.

Follow-up: swamp-uat's
`tests/cli/adversarial/method_run_timeout_test.ts` works around this
bug today by using a custom abort-aware-sleep extension fixture
instead of `command/shell`. After this lands and a swamp release is
tagged, that test should be updated (alongside the existing fixture-
based variant, which remains useful as an extension-model regression
guard) to also exercise `command/shell` directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Linux CI revealed that Deno.Command's native `signal` option does not
reliably propagate `AbortSignal.any([signal, AbortSignal.timeout(ms)])`
— the exact pattern libswamp's `withTimeout` produces. Direct
AbortControllers worked (so unit tests passed on Linux) but the
integration test took the full 30s sleep instead of aborting at 1s.

Replace the native option with a manual `addEventListener('abort')` +
`process.kill('SIGTERM')` pattern, extracted into `attachSignalKill`
and applied to all three executeProcess branches. Same kill semantics
(SIGTERM), same surface contract (AbortError on signal-driven kill),
but works uniformly across platforms.

Add a regression test that uses the exact `AbortSignal.any + timeout`
shape so a future "simplification" back to native-signal-only fails at
the unit level rather than in flaky integration coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Linux CI failure root cause: dash forks rather than execs for single-command
`sh -c` invocations, so SIGTERM to the parent shell leaves an orphaned
grandchild holding the stdout/stderr pipes open. macOS bash exec's the
single command (parent PID becomes the command) so the orphan never appears
— which is why the bug only surfaced on Linux CI.

When the orphan keeps the write end of the pipe alive, the deno-side
read loops in `streamLines` never get EOF, and the wider Promise.all
over [stdout, stderr, status] blocks until the orphan exits naturally
(30s for `sleep 30`).

Fix: thread the abort signal into `streamLines` so each `reader.read()`
races against `signal.aborted`. When the signal fires, the loop
releases the reader and returns regardless of pipe state.

Add a regression test that runs `/bin/dash -c 'sleep 30'` with a 500ms
abort and asserts elapsed < 5s. dash is reliably present on Linux CI
and on macOS too, so this exact failure mode is now caught at unit-test
time rather than only in CI integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@stack72
Copy link
Copy Markdown
Contributor Author

stack72 commented May 5, 2026

Closing — see swamp-club#247 for diagnosis writeup. The wiring fix works in production but the test surface needed (cross-platform shell fork behavior, sanitizer cleanup) is bigger than #247 was scoped for and deserves its own design pass.

@stack72 stack72 closed this May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant