fix(js): prevent pool slot starvation under load#6896
fix(js): prevent pool slot starvation under load#6896AuditeMarlow wants to merge 1 commit intoprojectdiscovery:devfrom
Conversation
WalkthroughPropagates caller context into JS execution by setting opts.Context, replaces pool Add/Done with context-aware AddWithContext, and adds atomic guards plus watchdog goroutines to ensure pooled and non-pooled slots are released exactly once on cancellation or deadline. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Executor
participant Pool
participant Watchdog
participant Cleanup
Client->>Executor: ExecuteProgram(request)
Executor->>Executor: derive ctx (with timeout/deadline)
Executor->>Pool: AddWithContext(opts.Context)
alt slot acquired
Pool-->>Executor: slot acquired
Executor->>Watchdog: start cancellation watcher (goroutine)
Executor->>Executor: run JS runtime using opts.Context
par normal run
Executor->>Executor: execute script
and cancellation path
Watchdog->>Watchdog: wait for ctx.Done() or done signal
alt ctx.Done()
Watchdog->>Pool: call Done() (CAS guard ensures single call)
end
end
Executor->>Cleanup: deferred cleanup
Cleanup->>Pool: call Done() if not already (CAS)
Cleanup->>Watchdog: close done channel
Cleanup-->>Client: return result
else AddWithContext fails
Pool-->>Executor: error (deadline/full)
Executor-->>Client: return error
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1618147 to
4a26c99
Compare
Zombie goroutines from timed-out JS executions held pool slots indefinitely: Add() blocked with context.Background(), and defer Done() only ran when the goroutine eventually completed. Under load, both pools (80 pooled + 20 non-pooled slots) filled with zombies, silently dropping all subsequent matches. Three changes fix slot lifecycle management: 1. Propagate the 20s deadline context into ExecuteProgram (compiler.go) so both execution paths can respect the deadline. 2. Replace Add() with AddWithContext(ctx) in both pool.go and non-pool.go so goroutines waiting for a slot fail fast when the deadline expires instead of blocking indefinitely. 3. Add a watchdog goroutine that releases the slot when the deadline expires, even if the zombie is still running. An atomic.Bool ensures exactly one Done() call between the watchdog and the normal defer path.
4a26c99 to
6f7f4b7
Compare
Proposed changes
Partially addresses #6894 — JS protocol templates silently drop all matches
under load. This PR fixes pool slot starvation (the primary cause); a follow-up
is needed to propagate execution contexts into JS library network calls.
Root cause: zombie goroutines from timed-out JS executions hold pool slots
indefinitely.
compiler.gocreates a 20s deadline context but passes theoriginal (no-deadline) context to
ExecuteProgram. Bothpooljsc.Add()andephemeraljsc.Add()block withcontext.Background(), so goroutines waitingfor a slot never fail fast. Once a slot is acquired, JS library network calls
use
context.TODO()— they hang on non-matching targets with no deadline.ExecFuncWithTwoReturnsabandons the caller after 20s, but the goroutinecontinues holding its pool slot via
defer Done(). Both pools (80 pooled + 20non-pooled slots) fill with zombies; subsequent executions time out waiting for
slots.
Three changes fix slot lifecycle management:
Propagate the deadline context (
compiler.go): setopts.Context = ctxbefore calling
ExecuteProgramso the 20s deadline flows through to bothexecution paths.
Fail-fast slot acquisition (
pool.go,non-pool.go): replaceAdd()with
AddWithContext(ctx). Goroutines waiting for a slot return immediatelywhen the deadline expires instead of blocking indefinitely.
Watchdog goroutine (
pool.go,non-pool.go): a background goroutinelistens on
ctx.Done()and releases the pool slot when the deadline expires,even if the zombie is still running. An
atomic.Boolensures exactly oneDone()call between the watchdog and the normal defer path.Note: this PR does not address
context.TODO()usage in the 16 JS libraryfiles. That is a follow-up concern. The watchdog already prevents permanent pool
starvation; context propagation into JS libraries would make zombies exit faster.
Proof
Setup
1. Start a Redis container (no password):
2. Save as
listeners.pyand start 25 TCP listeners (15s response delay):Each listener accepts a connection, waits 15 seconds, then responds with
garbage. The 15s delay is within Nuclei's default JS execution timeout (20s)
but long enough to hold pool slots. Open ports cause
isPortOpenpre-conditionsto pass, so JS templates execute against all listeners and consume pool slots.
listeners.py
3. Generate target list (25 haystack + 1 Redis last):
Baseline — single target (identical on both dev and fix)
7 matches.
Under load — dev branch (567794f), 3 consecutive runs
0 matches every time. All results silently dropped.
Under load — this branch, 3 consecutive runs
9–33 matches per run (vs 0 on dev). All three runs find
redis-info(Export, pooled path).
redis-default-logins(non-Export, 20-slot path)recovered in 2 of 3 runs. The remaining variance is caused by
context.TODO()in JS library network calls — zombie goroutines still hold slots for up to 15s
(the listener delay) before completing, limiting throughput on the crowded
20-slot non-pooled path. That is a follow-up fix.
Unit tests
Three new tests in
pkg/js/compiler/pool_test.goverify the fix mechanics:TestAddWithContextRespectsDeadline— slot acquisition fails fast whendeadline expires (vs blocking indefinitely with
Add())TestWatchdogReleasesSlotOnDeadline— watchdog frees a slot held by a zombie,allowing a new execution to acquire it
TestPoolExhaustionRecovery— full starvation/recovery cycle: all slotsfilled with zombies, deadlines expire, subsequent acquisitions succeed
Checklist
Summary by CodeRabbit
Improvements
Tests