fix(wall): break SIGPROF self-chain in signal handler install (#319)

Open
szegedi wants to merge 2 commits into main from fix/signal-handler-self-chain

Conversation


szegedi commented May 4, 2026

Summary

Fix infinite recursion in SignalHandler::HandleProfilerSignal, observed under dd-trace-js with Node.js 20 on x86_64 Linux as ~430 stack frames of HandleProfilerSignal calling itself, each frame paused just after the indirect `callq *%r14` (the call through old_handler_func_).

Root cause

The else-branch of SignalHandler::Install() captures whatever the previous SIGPROF disposition was into old_handler_func_:

```cpp
installed_ = (sigaction(SIGPROF, &sa, &old_handler_) == 0);
old_handler_func_.store(old_handler_.sa_sigaction, std::memory_order_relaxed);
```

If, at that moment, SIGPROF was already pointing at &HandleProfilerSignal while installed_ == false, we end up storing our own address into old_handler_func_, and the next signal delivery loops forever via:

```cpp
auto old_handler = old_handler_func_.load(...);
if (!old_handler) return;
...
old_handler(sig, info, context);  // recurses into HandleProfilerSignal
```

The state (installed_ == false, SIGPROF == &HandleProfilerSignal) requires a non-pprof party to write &HandleProfilerSignal back into SIGPROF after our Restore() cleared installed_. Static analysis of the single-instance pprof + V8 SIGPROF interaction rules V8 out: StartInternal always runs cpuProfiler_->Start (which is what may trigger V8's SignalHandler::IncreaseSamplerCount → V8.Install) before pprof.IncreaseUseCount, and V8.Install only runs when V8's process-wide client_count_ transitions 0→1, at which point pprof has not yet installed in this lifecycle. V8's old_signal_handler_ can therefore never hold &HandleProfilerSignal, so V8.Restore can never write us back into SIGPROF.

The realistic trigger is therefore two copies of @datadog/pprof loaded in the same process (the dlopen key for native addons is the file path, so a bundled copy under node_modules/dd-trace/node_modules/@datadog/pprof plus a top-level dependency at node_modules/@datadog/pprof are two distinct instances with independent static state). With that, the sequence

```
A.Install      ; SIGPROF=A, installed_A=true,  pp_old_A=DEFAULT
B.Install      ; SIGPROF=B, installed_B=true,  pp_old_B=A
A.Restore      ; SIGPROF=DEFAULT, installed_A=false
B.Restore      ; SIGPROF=A,       installed_B=false   ← B writes A back
A.Install (2)  ; captures A as its own old_handler        ← self-loop
```

leaves A's old_handler_func_ pointing at A's own HandleProfilerSignal. A non-pprof third-party SIGPROF user that captured our handler while pprof was active and later restores it produces the same shape.

Fix

Refuse to chain to ourselves in the install path. If the captured previous disposition is &HandleProfilerSignal, store nullptr in old_handler_func_ so the handler's existing if (!old_handler) return; fast-path kicks in. The signal is dropped instead of recursing — same observable behavior the handler already has when SIGPROF was previously unhandled.

This is a strict improvement: in the two-instance case, the other loaded copy still chains to us once via its own old_handler_func_, but a single hop is harmless; only the self-loop runs away.

Test plan

  • Builds cleanly with npm run build.
  • Existing test suite still passes.
  • Manual reproduction (multi-instance @datadog/pprof load and verify no recursion).

Jira: [PROF-14467]

If SIGPROF is already pointing at HandleProfilerSignal when Install()
runs the else-branch (installed_ == false), the captured "previous"
handler is our own function. Storing that into old_handler_func_ turns
the next signal delivery into infinite recursion via the chained call
in HandleProfilerSignal — observed in the wild as ~430 stack frames of
HandleProfilerSignal calling itself before the thread blows its stack.

Refuse to chain to ourselves: if the captured previous disposition is
&HandleProfilerSignal, store nullptr instead, so the handler fast-paths
out via the existing `if (!old_handler) return;` check.

github-actions Bot commented May 4, 2026

Overall package size

Self size: 2.01 MB
Deduped: 2.38 MB
No deduping: 2.38 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| source-map | 0.7.6 | 185.63 kB | 185.63 kB |
| pprof-format | 2.2.1 | 163.06 kB | 163.06 kB |
| node-gyp-build | 4.8.3 | 13.81 kB | 13.81 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

Review thread on bindings/profilers/wall.cc (outdated)

```cpp
// SIGPROF user restored its own saved disposition — which it had
// captured while we were the active handler — after we Restore()'d),
// calling prev would loop forever on the next signal.
old_handler_func_.store(prev == &HandleProfilerSignal ? nullptr : prev,
```
A reviewer commented:

If old_handler_func_ is null then we won't be able to forward SIGPROF to the v8 profiler and profiling won't work. Maybe we should return an error in StartImpl instead (since this can only occur at profiler startup).

szegedi (Author)

Okay, this required some propagation of boolean flags, but you're right that it's the actual correct solution; thanks for catching it. The whole thing is weird – I saw it in a single crash report today, and I'm not sure how it can arise; working theory is that the customer somehow managed to run the profiler twice? I really don't know. Anyhow, let me know if this now makes sense. It's a bit more complicated now 😁

Storing nullptr in old_handler_func_ avoided the runaway recursion but
left the wall profiler's SIGPROF handler unable to forward signals to
v8's sampler, which silently produces empty profiles. Plumb the install
result out through IncreaseUseCount and StartInternal so StartImpl can
roll back the just-started v8 cpu profile and return a Result error to
the JS caller. The condition can only arise on the very first install
(installed_ == false branch), so StopCore's restart path cannot hit it.
```cpp
auto startProcessCpuTime = startProcessCpuTime_;

if (restart) {
  profileId_ = StartInternal();
```
szegedi (Author)

We're capturing profileId_ in StartInternal() now so the rollback on line 914 has the right id to stop.

@szegedi szegedi added the semver-patch Bug or security fixes, mainly label May 4, 2026