(test-infrastructure) [Tech Debt]: Vitest fork workers become orphaned processes on test interruption

## Problem Description

Vitest fork workers are not properly cleaned up when test runs are interrupted or exit unexpectedly, resulting in orphaned processes (PPID 1) that persist indefinitely after the parent test runner exits.

## Environment

- **Vitest Version**: 3.2.4
- **Node Version**: Current system
- **OS**: macOS (Darwin 24.6.0)
- **Pool Configuration**: `forks` with `maxForks: 4`

## Evidence

Seven Vitest worker processes were found running even though the configuration caps the pool at 4:

```bash
$ ps -p 38327,38328,38329,38330,40284,41935,41936 -o pid,ppid,lstart,command
PID  PPID STARTED                      COMMAND
38327     1 Wed Nov 19 08:54:28 2025     node (vitest 1)     
38328     1 Wed Nov 19 08:54:28 2025     node (vitest 2)     
38329     1 Wed Nov 19 08:54:28 2025     node (vitest 3)     
38330     1 Wed Nov 19 08:54:28 2025     node (vitest 4)     
40284     1 Wed Nov 19 08:54:40 2025     node (vitest 3)     
41935     1 Wed Nov 19 08:54:46 2025     node (vitest 2)     
41936     1 Wed Nov 19 08:54:46 2025     node (vitest 1)
```

**Analysis:**
- All processes show **PPID 1** (launchd on macOS), meaning their parent exited and they were re-parented as orphans
- Three distinct batches from three separate interrupted test runs:
  - First run (08:54:28): 4 workers (vitest 1-4) - all orphaned
  - Second run (08:54:40): 1 orphaned worker (vitest 3)
  - Third run (08:54:46): 2 orphaned workers (vitest 1-2)
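
The batch breakdown above can be reproduced mechanically by grouping the `ps` rows on their start time. The sketch below is illustrative only: the sample rows are copied from the output above, and `groupByStart` is a hypothetical helper name, not part of the repository.

```javascript
// Rows copied from the Evidence section (pid, ppid, lstart, command).
const sample = `
38327     1 Wed Nov 19 08:54:28 2025     node (vitest 1)
38328     1 Wed Nov 19 08:54:28 2025     node (vitest 2)
38329     1 Wed Nov 19 08:54:28 2025     node (vitest 3)
38330     1 Wed Nov 19 08:54:28 2025     node (vitest 4)
40284     1 Wed Nov 19 08:54:40 2025     node (vitest 3)
41935     1 Wed Nov 19 08:54:46 2025     node (vitest 2)
41936     1 Wed Nov 19 08:54:46 2025     node (vitest 1)
`;

// Group orphaned processes (PPID 1) by their start timestamp.
function groupByStart(psOutput) {
  const rowPattern =
    /^\s*(\d+)\s+(\d+)\s+(\w{3} \w{3}\s+\d+ [\d:]{8} \d{4})\s+(.+?)\s*$/;
  const batches = new Map();
  for (const line of psOutput.split("\n")) {
    const match = rowPattern.exec(line);
    if (!match) continue; // skips blank lines and the header row
    const [, pid, ppid, started, command] = match;
    if (Number(ppid) !== 1) continue; // keep only re-parented processes
    if (!batches.has(started)) batches.set(started, []);
    batches.get(started).push({ pid: Number(pid), command });
  }
  return batches;
}

for (const [started, workers] of groupByStart(sample)) {
  console.log(`${started}: ${workers.length} worker(s)`);
}
```

Each distinct start time corresponds to one interrupted run, which is how the 4 + 1 + 2 split is derived.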

## Current Configuration

```javascript
// vitest.config.js:18-29
export default defineConfig({
  test: {
    pool: "forks",
    poolOptions: {
      forks: {
        maxForks: 4,  // Controlled parallelism
        minForks: 1,
      },
    },
    forceExit: true,  // Only affects main process, not workers
    testTimeout: 10000,
    hookTimeout: 10000,
  },
});
```

## Root Cause

1. **Worker processes don't receive termination signals** when parent Vitest process exits unexpectedly (Ctrl+C, crash, forced exit)
2. `forceExit: true` only applies to the **main Vitest process**, not fork workers
3. Process cleanup handlers in test setup files run **within workers** (not in parent), so they can't kill sibling workers
4. Known upstream issue: https://github.com/vitest-dev/vitest/issues/3909

## Current Cleanup Logic (Insufficient)

```javascript
// test/setup.js:14-48
global.childProcesses = [];

const cleanup = () => {
  for (const child of global.childProcesses) {
    if (child && !child.killed) {
      child.kill("SIGTERM");
    }
  }
};

process.on("exit", cleanup);
process.on("SIGINT", () => { cleanup(); process.exit(0); });
process.on("SIGTERM", () => { cleanup(); process.exit(0); });
```

**Why this doesn't work:**
- Only tracks processes manually added to `global.childProcesses`
- Vitest spawns fork workers directly (never added to our tracking)
- Cleanup runs inside each worker process, so it cannot reach sibling workers
- Exit handlers never run on SIGKILL and may not run when the process crashes
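
Because setup code only runs inside a worker, one thing a worker can do is watch its own parent and exit once it has been re-parented. The following is a minimal sketch of that idea, not Vitest's API and not a confirmed fix: `isOrphaned` and `startParentWatchdog` are hypothetical names, the 1-second poll interval is an arbitrary assumption, and the approach relies on `process.ppid` changing when the original parent exits (POSIX behavior).

```javascript
// Hypothetical addition to test/setup.js: each worker watches its own parent.

// True once the parent PID differs from the one recorded at startup,
// which on POSIX systems means the original parent has exited.
function isOrphaned(originalPpid, currentPpid) {
  return currentPpid !== originalPpid;
}

function startParentWatchdog({
  intervalMs = 1000, // assumed poll interval
  onOrphaned = () => process.exit(1), // exit also triggers "exit" handlers
} = {}) {
  const originalPpid = process.ppid;
  const timer = setInterval(() => {
    // process.ppid is re-read on every access, so it reflects re-parenting.
    if (isOrphaned(originalPpid, process.ppid)) {
      clearInterval(timer);
      onOrphaned();
    }
  }, intervalMs);
  timer.unref(); // the watchdog itself must not keep the worker alive
  return timer;
}

// In the setup file this would be called once at load time:
// startParentWatchdog();
```

This does not help when the worker's event loop is blocked, and it only lets each worker clean up itself, so it complements rather than replaces a fix in the parent.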

## Impact

- **Resource waste**: Orphaned processes consume memory/CPU indefinitely
- **Port conflicts**: Workers may hold resources (ports, file locks)
- **Developer confusion**: "Why do I have 7 processes when config says 4?"
- **CI resource accumulation**: On persistent runners, every interrupted build can leave up to `maxForks` orphans behind, so the count grows with each run
- **False test failures**: Orphaned workers may interfere with new test runs

## Workarounds

### Immediate: Manual cleanup
```bash
pkill -9 -f "node \\(vitest"
```

### Short-term: Pre-test cleanup hook
```json
{
  "scripts": {
    "pretest": "pkill -9 -f 'node \\(vitest' || true",
    "test": "vitest run"
  }
}
```

### Medium-term: Test wrapper script
```bash
#!/bin/bash
# scripts/test-wrapper.sh
# The "(" is escaped because pkill treats the pattern as an extended regex.
trap 'pkill -9 -f "node \\(vitest"' EXIT INT TERM
vitest run "$@"
```
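
A pattern-based `pkill` also hits workers that belong to other concurrent runs. A narrower variant, sketched below under assumptions, starts Vitest as the leader of its own process group and signals only that group. The file name `scripts/run-vitest.cjs` and the helpers `killGroup` and `runInOwnGroup` are hypothetical, the sketch is POSIX-only (negative PIDs are not supported on Windows), and it assumes fork workers stay in the runner's process group, which has not been verified against Vitest internals.

```javascript
// scripts/run-vitest.cjs (hypothetical file name)
const { spawn } = require("node:child_process");

// Signal every process in the group led by `pid`. On POSIX a negative PID
// targets the whole process group. `kill` is injectable for testing.
function killGroup(pid, signal = "SIGTERM", kill = (p, s) => process.kill(p, s)) {
  try {
    kill(-pid, signal);
    return true;
  } catch (err) {
    if (err.code === "ESRCH") return false; // group is already gone
    throw err;
  }
}

function runInOwnGroup(command, args) {
  // detached: true makes the child a process-group leader; processes it
  // forks inherit that group unless they detach themselves.
  const child = spawn(command, args, { detached: true, stdio: "inherit" });

  // The child no longer shares our group, so forward interrupts explicitly.
  process.on("SIGINT", () => killGroup(child.pid, "SIGINT"));
  process.on("SIGTERM", () => killGroup(child.pid, "SIGTERM"));

  child.on("exit", (code, signal) => {
    killGroup(child.pid, "SIGKILL"); // sweep workers that outlived the runner
    process.exit(code ?? (signal ? 1 : 0));
  });
  return child;
}
```

Ending the file with `runInOwnGroup("npx", ["vitest", "run", ...process.argv.slice(2)])` would let `node scripts/run-vitest.cjs` stand in for the shell wrapper. Like the shell version, it cannot help if the wrapper itself is killed with SIGKILL.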

### Alternative: Switch to threads pool
```javascript
// vitest.config.js
pool: "threads",  // Better cleanup behavior, less isolation
poolOptions: {
  threads: {
    maxThreads: 4,
    minThreads: 1,
  },
}
```

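**Trade-off**: The threads pool cleans up more reliably because workers are threads inside the main process, but it gives up process-level isolation. Tests that depend on per-process state (for example `process.chdir()`), on native addons, or on CommonJS module isolation may behave differently.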

## Proposed Solutions

### Option 1: Pre-test cleanup (Quick fix)
Add automated cleanup before test runs to prevent accumulation.

**Pros:**
- Simple to implement
- Works immediately
- No code changes to test infrastructure

**Cons:**
- Doesn't prevent orphaning, only cleans up after
- May kill legitimate concurrent test runs
- Band-aid solution

### Option 2: Switch to threads pool (If isolation not critical)
Migrate from `pool: 'forks'` to `pool: 'threads'`.

**Pros:**
- Better cleanup behavior by default
- Lower overhead than process forks
- Was Vitest's default pool before 2.0 (the default is now `forks`), so it is a well-exercised code path

**Cons:**
- Less process isolation
- May not work if CommonJS isolation is required
- Requires testing to verify no behavior changes

### Option 3: Monitor Vitest upstream (Long-term)
Track https://github.com/vitest-dev/vitest/issues/3909 for official fix.

**Pros:**
- Proper fix at root cause level
- No workarounds needed
- Benefits entire community

**Cons:**
- Timeline uncertain
- Need workaround in meantime

## Acceptance Criteria

- [ ] No orphaned Vitest processes after test runs (successful or interrupted)
- [ ] While tests run, `pgrep -f 'node \(vitest'` lists at most `maxForks` worker processes; after the run exits, it lists none
- [ ] Solution works across: normal exit, Ctrl+C, process kill, crash
- [ ] CI builds don't accumulate orphaned processes over multiple runs
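
The first and last criteria could be checked automatically with a small script that fails when any Vitest worker has PPID 1. This is a sketch under assumptions: the file name `scripts/check-orphans.cjs` and the helper names are hypothetical, and the command pattern is taken from the `node (vitest N)` process titles shown in the Evidence section, so it may need adjusting on other platforms or Vitest versions.

```javascript
// scripts/check-orphans.cjs (hypothetical file name)
const { execFileSync } = require("node:child_process");

// Snapshot of all processes as "pid ppid command" rows without a header.
function listProcesses() {
  return execFileSync(
    "ps",
    ["-A", "-o", "pid=", "-o", "ppid=", "-o", "args="],
    { encoding: "utf8" }
  );
}

// Return the rows that look like Vitest workers re-parented to PID 1.
function findOrphanedWorkers(psOutput) {
  return psOutput
    .split("\n")
    .map((line) => /^\s*(\d+)\s+(\d+)\s+(.*)$/.exec(line))
    .filter(Boolean)
    .map(([, pid, ppid, command]) => ({
      pid: Number(pid),
      ppid: Number(ppid),
      command: command.trim(),
    }))
    .filter((p) => p.ppid === 1 && /^node \(vitest \d+\)/.test(p.command));
}

// Report orphans and return an exit code suitable for CI (0 = clean).
function checkForOrphans(psOutput = listProcesses()) {
  const orphans = findOrphanedWorkers(psOutput);
  for (const { pid, command } of orphans) {
    console.error(`orphaned worker: pid=${pid} ${command}`);
  }
  return orphans.length === 0 ? 0 : 1;
}
```

Adding `process.exitCode = checkForOrphans();` as the last line and running the script after the test step would turn leftover workers into a failed build.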

## References

- Vitest fork cleanup issue: https://github.com/vitest-dev/vitest/issues/3909
- Related issue: https://github.com/vitest-dev/vitest/issues/3569
- Vitest common errors: https://vitest.dev/guide/common-errors
- Configuration: `vitest.config.js:18-29`
- Cleanup logic: `test/setup.js:14-48`

## Questions for Discussion

1. Is CommonJS isolation via forks required for our tests?
2. Should we switch to threads pool if isolation isn't critical?
3. Should we implement pre-test cleanup as an interim solution?
4. Should we add CI monitoring to detect orphaned process accumulation?
