Add standalone fork-instrument crash reproducer by mho22 · Pull Request #701 · Automattic/kandelo

mho22 · 2026-06-15T16:17:46Z

Summary

add benchmarks/fork/chain.c, a recursive walker shaped after Zend's zend_compile_short_circuiting so wasm-fork-instrument's per-frame cost is measurable on a binary apart from PHP itself
add benchmarks/fork/build.sh to produce baseline and fork-instrumented variants from one chain.c source
add benchmarks/fork/run.mjs to binary-search the maximum recursion depth that survives V8's call-stack budget on each variant
exit non-zero from run.mjs when forkinstr's max depth is not strictly less than baseline's — that invariant is what the benchmark exists to verify

Why

PR #669 asks for a standalone test case apart from PHP that reproduces the V8 RangeError: Maximum call stack size exceeded on deep recursion through fork-instrumented frames. Iterating on candidate fork-instrument changes via the full PHP build+run cycle takes ~45 minutes; this benchmark drops that to ~10 seconds.

The PHP crash composes two causes: unbounded recursion in zend_compile_short_circuiting (fixed PHP-side by #669) and ~8 extra wasm locals per instrumented frame from wasm-fork-instrument's switch-dispatch state machine. `chain.c` bounds recursion by its `depth` argument so it reproduces only the kandelo-side cause in isolation; any candidate reduction of the per-frame cost can be validated by watching the forkinstr depth move.

Notes

ABI impact: none. Adds a benchmark harness only; no kernel, host, or program changes.
Empirical numbers on macOS arm64 / Node v24.13.0, deterministic across 4 runs:

variant `walk` declared locals max depth before `RangeError` % of baseline

baseline 4 9 972 100%

forkinstr 12 6 648 66.7%

Differential = 3 324 frames lost (33% drop) attributable to wasm-fork-instrument's switch-dispatch state machine adding ~8 locals per frame on the recursive function. The ratio (~1.50×) is the load-bearing finding; absolute numbers are platform/tier-dependent.

Testing

bash scripts/dev-shell.sh bash benchmarks/fork/build.sh produces identical wasm sizes (3882 / 4587 bytes) and walk-locals counts (4 / 12) across runs
node --experimental-wasm-exnref benchmarks/fork/run.mjs exits 0 with baseline 9 972 / forkinstr 6 648
with the wasms manually swapped so forkinstr > baseline, the same command exits 1 with INVARIANT FAILED: ... — confirms the self-test fires on regression
bash scripts/check-abi-version.sh exits 0 (benchmark touches no kernel/host/programs/ABI)

## Summary - add `benchmarks/fork/chain.c`, a recursive walker shaped after Zend's `zend_compile_short_circuiting` so wasm-fork-instrument's per-frame cost is measurable on a binary apart from PHP itself - add `benchmarks/fork/build.sh` to produce baseline and fork- instrumented variants from one chain.c source - add `benchmarks/fork/run.mjs` to binary-search the maximum recursion depth that survives V8's call-stack budget on each variant - exit non-zero from `run.mjs` when forkinstr's max depth is not strictly less than baseline's — that invariant is what the benchmark exists to verify ## Why PR #669 asks for a standalone test case apart from PHP that reproduces the V8 `RangeError: Maximum call stack size exceeded` on deep recursion through fork-instrumented frames. Iterating on candidate fork-instrument changes via the full PHP build+run cycle takes ~45 minutes; this benchmark drops that to ~10 seconds. The PHP crash composes two causes: unbounded recursion in `zend_compile_short_circuiting` (fixed PHP-side by #669) and ~8 extra wasm locals per instrumented frame from wasm-fork-instrument's switch-dispatch state machine. `chain.c` bounds recursion by its `depth` argument so it reproduces only the kandelo-side cause in isolation; any candidate reduction of the per-frame cost can be validated by watching the forkinstr depth move. ## Testing - `bash scripts/dev-shell.sh bash benchmarks/fork/build.sh` produces identical wasm sizes (3882 / 4587 bytes) and walk-locals counts (4 / 12) across runs - `node --experimental-wasm-exnref benchmarks/fork/run.mjs` is deterministic at baseline 9 972 / forkinstr 6 648 across 4 runs on macOS arm64 / Node v24.13.0; exits 0 - with the wasms manually swapped so forkinstr > baseline, the same command exits 1 with `INVARIANT FAILED: ...` — confirms the self-test fires on regression - `bash scripts/check-abi-version.sh` exits 0 (benchmark touches no kernel/host/programs/ABI) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PR #701's recursive MRE showed that wasm-fork-instrument's synthetic locals reduce V8's survivable recursion depth. PR #713 already removed unused catch-state locals from no-catch functions, but fork-path functions still declared frame_ptr and call_idx locals just to cache state already present in the continuation frame header. Reuse the existing frame header instead. REWIND now moves the save-buffer cursor to the active frame and loads frame.call_index from offset +4 for top-level and nested switch dispatch. UNWIND call sites write their call index into that same header field before branching to the shared postamble. The wpk_fork_* exports, save-buffer header, and per-frame offsets stay unchanged. Measured with /tmp/kandelo-pr701-mre on Node v24.15.0 after rebuilding tools/bin/wasm-fork-instrument from this branch: baseline locals=4/max_survived=9154, PR #713 catch-state-only locals=10/max_survived=6865, deeper frame-header-state locals=8/max_survived=7846. Historical bead notes recorded original instrumented locals=12/max_survived=6639 and PR #713 catch-state-only locals=10/max_survived=7469; the PR table labels run context because historical and fresh rows are not strictly cross-run comparable. Also tighten PR/reporting guidance so nontrivial runtime, ABI-adjacent, generated-code, package-artifact, or measurement-sensitive work carries explanatory commit bodies and before/after evidence tables with comparability notes. Validation run: - bash scripts/dev-shell.sh bash scripts/build-fork-instrument-tool.sh - focused switch_dispatch, instrument, and large_dispatcher tests on aarch64-apple-darwin - bash scripts/dev-shell.sh cargo test -p fork-instrument --target aarch64-apple-darwin - bash scripts/dev-shell.sh bash scripts/ci-run-test-suite.sh fork-instrument - bash scripts/dev-shell.sh bash scripts/check-abi-version.sh - git diff --check for the touched fork-instrument and reporting-guidance files Not run: broader kernel, host, libc, POSIX, Sortix, and browser suites, because this patch is isolated to wasm-fork-instrument generated Wasm shape/tests/docs/reporting guidance and does not change runtime, syscall, libc, host, package, VFS, or browser behavior. cargo fmt was not run because cargo fmt is not installed in the dev shell.

brandonpayton mentioned this pull request Jun 18, 2026

fork-instrument: reduce synthetic frame-state locals #713

Draft

brandonpayton mentioned this pull request Jun 18, 2026

fork-instrument: replay pure scalar fork inputs #714

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add standalone fork-instrument crash reproducer#701

Add standalone fork-instrument crash reproducer#701
mho22 wants to merge 1 commit into
mainfrom
benchmark-fork-instrument

mho22 commented Jun 15, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mho22 commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Notes

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mho22 commented Jun 15, 2026 •

edited

Loading