fix(tests): disable fileParallelism to eliminate subprocess SIGKILL flakiness#259

Open
flightlesstux wants to merge 1 commit into mksglu:main from
flightlesstux:fix/test-suite-parallel-subprocess-sigkill

Fixes #258

Summary

  • Root cause confirmed: spawnSync subprocesses in hook suites were SIGKILL'd externally (status=null, signal=SIGKILL, stderr="") when vitest ran their files concurrently across fork workers. The signal propagates out of worker teardown under load.
  • Set fileParallelism: false so files execute sequentially. Tests within a file already ran sequentially (spawnSync is synchronous), so the only thing sacrificed is inter-file concurrency.
  • Dropped the CI-only retry: 2, which was masking the race rather than fixing it.
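
For reference, the two config changes described above could look roughly like the sketch below. This is an assumption about the shape of the file, not a copy of the repository's actual vitest config: only `fileParallelism: false` and the removal of `retry` come from this PR, and the previous `retry` expression shown in the comment is a guess at how a CI-only retry is typically written.

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Run test files one after another instead of spreading them
    // across parallel workers.
    fileParallelism: false,

    // Intentionally no `retry` setting. A CI-only retry is commonly
    // written as something like `retry: process.env.CI ? 2 : 0`
    // (assumed form); removing it lets a real failure surface on the
    // first attempt instead of being retried away.
  },
});
```

Any other options already present in the project's config would stay as they are.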

Diagnostic Evidence

Single-file runs of every failing suite: green.
Full suite baseline: 6 failed / 45, 12 tests fail, 3 consecutive runs.
With fileParallelism: false: 45 passed / 45, 3 consecutive runs (~36s wall time, up from ~10s).
Signal was captured by augmenting runHook temporarily — subprocess died before producing any output.
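
A minimal sketch of that kind of temporary instrumentation, assuming a helper built on `spawnSync`. The names `runHookWithDiagnostics` and `HookResult` are hypothetical and the project's real `runHook` is not shown in this PR; the point is only that `spawnSync` reports `status` and `signal` separately, so an externally killed child can be told apart from one that exited with a non-zero code.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape, for illustration only.
interface HookResult {
  status: number | null; // exit code, or null if the child was killed by a signal
  signal: string | null; // e.g. "SIGKILL", or null on a normal exit
  stdout: string;
  stderr: string;
  killedBySignal: boolean;
}

// Runs `node <args>` synchronously and records how the child ended.
function runHookWithDiagnostics(args: string[], input = ""): HookResult {
  const result = spawnSync(process.execPath, args, {
    input,
    encoding: "utf8",
  });

  const diagnostics: HookResult = {
    status: result.status,
    signal: result.signal,
    // stdout/stderr can be null if the process failed to spawn at all.
    stdout: result.stdout ?? "",
    stderr: result.stderr ?? "",
    killedBySignal: result.signal !== null,
  };

  if (diagnostics.killedBySignal) {
    // Temporary logging: a killed child typically leaves both streams empty.
    console.error(
      `[diag] child killed: signal=${diagnostics.signal} ` +
        `status=${diagnostics.status} stderrLength=${diagnostics.stderr.length}`,
    );
  }

  return diagnostics;
}
```

A result with `status === null`, `signal === "SIGKILL"`, and empty `stderr` matches the pattern reported in the Summary.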

Trade-off

Local wall-time increases from ~10s to ~36s on an 8-CPU machine. Acceptable for a <1-minute suite in exchange for deterministic green runs on every machine.

A more surgical alternative (vitest projects splitting hook-subprocess suites into their own sequential pool) was considered and rejected — more config surface for a suite that already completes in under a minute.

Test plan

  • npx vitest run — 3 consecutive full runs, 44/44 files pass, 0 flakes
  • No source code touched — config-only change
  • CI run will validate on Linux once this PR opens

Co-authored-by: Ercan Ermis <eposta@ercanermis.com>

Several suites spawn Node subprocesses via spawnSync (hook runners) that
load the better-sqlite3 native addon. When vitest ran those files
concurrently across fork workers, child processes were intermittently
SIGKILL'd (empty stdout/stderr, status=null) — signal propagation from
worker-teardown under load.

Set `fileParallelism: false` so files run sequentially. Tests within a
file already run sequentially (spawnSync is synchronous), so the only
loss is inter-file concurrency. Also drop the CI-only `retry` that was
papering over the symptom.

Verified: 3/3 full runs green (44/44 files, ~36s wall time).

Fixes mksglu#258

Co-authored-by: Ercan Ermis <eposta@ercanermis.com>

Development

Successfully merging this pull request may close these issues.

Flaky test suite: ~12 tests SIGKILL'd under parallel fork pool (spawnSync + better-sqlite3)