Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions documentation/PULL_REQUESTS/2026-05-01-fix-flaky-1mb-test.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
## Problem

During the `v1.1.0` release, one test flaked once on CI — a test that checks we can stream about 1 MB of output through the wrapper without losing bytes. It passed on retry and has passed on every run since, but flaky release-gate tests are dangerous: they either block shipping or erode confidence in the gate.

Investigation revealed this wasn't just test flakiness — it was a real production bug. When our wrapper finishes a command, it writes a final newline and then exits. But Node's `process.exit()` is synchronous and doesn't wait for pipe buffers to drain. Under heavy load (~1 MB of prior output), the final bytes could still be in-flight to the operating system when the process tears down, getting silently dropped. A user running something like `fullcontext npm test > output.txt` could end up with a truncated file missing its last bytes — rarely, but possibly.

## Solution

Two fixes, both shipped on this branch:

1. **Production fix (the real blocker):** Stop calling `process.exit()` after writing output. Instead, set the exit code and let Node exit naturally once all buffered writes have reached the operating system. This is the pattern Node's own documentation recommends for exactly this bug class.

2. **Test hardening (defense in depth):** Update the three streaming tests to wait for each stream to fully end before asserting on collected output, rather than waiting only for the child process to close. These races could theoretically produce false failures under different loads. The producer must emit correctly; the consumer must read everything emitted. Both sides are now right.

## Impact

- Eliminates a rare but real data-loss bug in production output.
- Fixes the CI flake at the gate, not just masks it.
- All 35 tests still pass, locally and in CI, across both fixes.

# Credits

- Nabs (Architect)
- JENA (Lead Developer)
76 changes: 49 additions & 27 deletions src/cli.test.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
import { test } from 'node:test';
import { strict as assert } from 'node:assert';
import { spawn, spawnSync } from 'node:child_process';
import { once } from 'node:events';
import * as path from 'node:path';
import type { Readable } from 'node:stream';

const CLI = path.resolve(__dirname, 'index.js');

Expand All @@ -20,6 +22,26 @@ function runCli(command: string) {
};
}

/**
* Collect every byte emitted by a Readable stream and return the concatenated
* Buffer. Attaches a 'data' listener synchronously, then awaits the 'end'
* event so the returned Buffer is guaranteed to contain every chunk the
* stream ever emits.
*
* Prefer this over `Buffer.concat(chunks)` gated on `child.on('close')`:
* 'close' on the child process can fire while the parent-side Readable still
* has queued 'data' events, which causes intermittent truncation under load.
* 'end' fires only after the last 'data' event has been dispatched.
*/
async function collectStream(stream: Readable): Promise<Buffer> {
const chunks: Buffer[] = [];
stream.on('data', (chunk: Buffer) => {
chunks.push(chunk);
});
await once(stream, 'end');
return Buffer.concat(chunks);
}

test('transforms multi-line stdout into single line', () => {
const r = runCli('printf "a\\nb\\nc\\n"');
assert.equal(r.stdout, '[1] a [2] b [3] c\n');
Expand Down Expand Up @@ -70,11 +92,14 @@ test('streams stdout incrementally', async () => {
const exitTime = new Promise<number>((resolve) => {
child.on('close', () => resolve(Date.now()));
});
// collectStream awaits 'end', so the returned Buffer contains every chunk.
const outputPromise = collectStream(child.stdout);

const chunks: Buffer[] = [];
child.stdout.on('data', (c) => chunks.push(c));

const [t1, t2] = await Promise.all([firstChunkTime, exitTime]);
const [t1, t2, outputBuf] = await Promise.all([
firstChunkTime,
exitTime,
outputPromise,
]);

// The first chunk must arrive meaningfully before exit.
// Use a 200ms margin to be robust on slow CI.
Expand All @@ -84,8 +109,7 @@ test('streams stdout incrementally', async () => {
);

// Final bytes unchanged from the batch implementation
const final = Buffer.concat(chunks).toString('utf8');
assert.equal(final, '[1] one [2] two\n');
assert.equal(outputBuf.toString('utf8'), '[1] one [2] two\n');
});

test('streams stderr incrementally and independently from stdout', async () => {
Expand All @@ -95,15 +119,16 @@ test('streams stderr incrementally and independently from stdout', async () => {
{ stdio: ['ignore', 'pipe', 'pipe'] },
);

const outChunks: Buffer[] = [];
const errChunks: Buffer[] = [];
child.stdout.on('data', (c) => outChunks.push(c));
child.stderr.on('data', (c) => errChunks.push(c));

await new Promise((resolve) => child.on('close', resolve));
// Collect both streams to their 'end' event to guarantee no queued
// 'data' chunks are missed when the child closes.
const [outBuf, errBuf] = await Promise.all([
collectStream(child.stdout),
collectStream(child.stderr),
]);
await once(child, 'close');

assert.equal(Buffer.concat(outChunks).toString('utf8'), '[1] out\n');
assert.equal(Buffer.concat(errChunks).toString('utf8'), '[1] err\n');
assert.equal(outBuf.toString('utf8'), '[1] out\n');
assert.equal(errBuf.toString('utf8'), '[1] err\n');
});

test('exits cleanly when downstream pipe closes early (EPIPE)', async () => {
Expand Down Expand Up @@ -264,16 +289,18 @@ test('streams ~1 MB of output without loss or corruption', async () => {
stdio: ['ignore', 'pipe', 'pipe'],
});

const chunks: Buffer[] = [];
child.stdout.on('data', (c: Buffer) => chunks.push(c));

const exit = await new Promise<number | null>((resolve) => {
child.on('close', (code) => resolve(code));
});
// Collect stdout and wait for the child to close in parallel. collectStream
// awaits the stream's 'end' event, which fires only after every 'data'
// event has been dispatched — eliminating the close-vs-data race that
// caused intermittent missing-trailing-newline failures on CI.
const [outputBuf, [exit]] = await Promise.all([
collectStream(child.stdout),
once(child, 'close') as Promise<[number | null]>,
]);

assert.equal(exit, 0, 'expected clean exit');

const output = Buffer.concat(chunks).toString('utf8');
const output = outputBuf.toString('utf8');

// Starts with the first line marker and no leading space.
assert.ok(
Expand All @@ -285,12 +312,7 @@ test('streams ~1 MB of output without loss or corruption', async () => {
assert.ok(output.endsWith('\n'), 'expected trailing newline');
assert.ok(!output.endsWith('\n\n'), 'expected exactly one trailing newline');

// Line-marker count equals LINE_COUNT. Each marker is `[N] ` where N is
// the 1-based index. The number of markers is the number of matches of
// the `\[\d+\] ` pattern minus the leading one (which has no space prefix)
// — actually, the count of ` \[\d+\] ` (leading-space form) plus 1 for the
// first marker. Simpler: count `] ` occurrences where the preceding chars
// look like `[N`.
// Line-marker count equals LINE_COUNT.
const markerMatches = output.match(/\[\d+\] /g) ?? [];
assert.equal(
markerMatches.length,
Expand Down
30 changes: 26 additions & 4 deletions src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,9 @@ function executeCommand(command: string): void {
// Handle spawn errors (e.g., shell not found)
child.on('error', (err: Error) => {
process.stderr.write(`[1] fullcontext: ${err.message}\n`);
process.exit(1);
// See child.on('close') below: set exitCode and let the runtime exit
// naturally so buffered stderr bytes are flushed before termination.
process.exitCode = 1;
});

// Forward SIGINT/SIGTERM to the entire child tree so a Ctrl-C in the
Expand Down Expand Up @@ -176,12 +178,32 @@ function executeCommand(command: string): void {
process.exit(0);
}

// Flush any partial lines and emit trailing newlines.
// Flush any partial lines and emit trailing newlines. These writes go
// into Node's internal stdout/stderr buffers and are drained to the OS
// asynchronously by libuv.
stdoutTransformer.end();
stderrTransformer.end();

// Preserve exit code from child process
process.exit(code ?? 1);
// Do NOT call process.exit(): it is synchronous and does not wait for
// buffered pipe writes to drain. Under load (e.g. ~1 MB of prior
// output), the final bytes — including the trailing newline written
// by stdoutTransformer.end() — can still be in Node's internal write
// buffer or libuv's write queue at the moment of exit and would be
// dropped, producing silently truncated output (observed as an
// intermittent "expected trailing newline" failure in the 1 MB
// streaming test on slow CI runners, and reproducible in the wild as
// `fullcontext big-cmd > file.txt` producing a file missing its last
// bytes).
//
// Setting process.exitCode and returning lets the runtime exit
// naturally once every pending stdout/stderr write has been accepted
// by the OS. Pending libuv write requests are tracked as active
// requests on the stdio handles and keep the event loop alive; the
// SIGINT/SIGTERM listeners registered via process.on() are backed by
// unref'd uv_signal_t handles and do not hold the loop open. This is
// the pattern Node's own docs prescribe for this exact class of bug
// (see https://nodejs.org/api/process.html#processexit ).
process.exitCode = code ?? 1;
});
}

Expand Down
Loading