
fix(js): prevent pool slot starvation under load #6896

Open
AuditeMarlow wants to merge 1 commit into projectdiscovery:dev from guardian360:fix/js-pool-slot-starvation

Conversation


AuditeMarlow commented Feb 13, 2026

Proposed changes

Partially addresses #6894 — JS protocol templates silently drop all matches
under load. This PR fixes pool slot starvation (the primary cause); a follow-up
is needed to propagate execution contexts into JS library network calls.

Root cause: zombie goroutines from timed-out JS executions hold pool slots
indefinitely. compiler.go creates a 20s deadline context but passes the
original (no-deadline) context to ExecuteProgram. Both pooljsc.Add() and
ephemeraljsc.Add() block with context.Background(), so goroutines waiting
for a slot never fail fast. Once a slot is acquired, JS library network calls
use context.TODO() — they hang on non-matching targets with no deadline.
ExecFuncWithTwoReturns abandons the caller after 20s, but the goroutine
continues holding its pool slot via defer Done(). Both pools (80 pooled + 20
non-pooled slots) fill with zombies; subsequent executions time out waiting for
slots.

Three changes fix slot lifecycle management:

  1. Propagate the deadline context (compiler.go): set opts.Context = ctx
    before calling ExecuteProgram so the 20s deadline flows through to both
    execution paths.

  2. Fail-fast slot acquisition (pool.go, non-pool.go): replace Add()
    with AddWithContext(ctx). Goroutines waiting for a slot return immediately
    when the deadline expires instead of blocking indefinitely.

  3. Watchdog goroutine (pool.go, non-pool.go): a background goroutine
    listens on ctx.Done() and releases the pool slot when the deadline expires,
    even if the zombie is still running. An atomic.Bool ensures exactly one
    Done() call between the watchdog and the normal defer path.

Note: this PR does not address context.TODO() usage in the 16 JS library
files. That is a follow-up concern. The watchdog already prevents permanent pool
starvation; context propagation into JS libraries would make zombies exit faster.

Proof

Setup

1. Start a Redis container (no password):

docker run -d --name redis-repro -p 6379:6379 redis:latest

2. Save as listeners.py and start 25 TCP listeners (15s response delay):

Each listener accepts a connection, waits 15 seconds, then responds with
garbage. The 15s delay is within Nuclei's default JS execution timeout (20s)
but long enough to hold pool slots. Open ports cause isPortOpen pre-conditions
to pass, so JS templates execute against all listeners and consume pool slots.

listeners.py
#!/usr/bin/env python3
"""Start TCP listeners that accept connections but don't speak Redis."""
import socket
import threading
import signal
import sys
import time

PORT_START = 20000
COUNT = int(sys.argv[1]) if len(sys.argv) > 1 else 25
DELAY = 15

def handle_conn(conn):
    try:
        conn.recv(1024)
        time.sleep(DELAY)
        conn.sendall(b"-ERR not redis\r\n")
    except Exception:
        pass
    finally:
        conn.close()

def listener(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(('127.0.0.1', port))
    s.listen(5)
    while True:
        try:
            conn, _ = s.accept()
            threading.Thread(target=handle_conn, args=(conn,), daemon=True).start()
        except OSError:
            break

for p in range(PORT_START, PORT_START + COUNT):
    threading.Thread(target=listener, args=(p,), daemon=True).start()

print(f"Listening on {COUNT} ports ({PORT_START}-{PORT_START + COUNT - 1}), delay={DELAY}s")
signal.pause()

$ python3 listeners.py 25

3. Generate target list (25 haystack + 1 Redis last):

python3 -c "
for i in range(20000, 20025):
    print(f'localhost:{i}')
print('localhost:6379')
" > targets.txt

Baseline — single target (identical on both dev and fix)

$ nuclei -t ~/nuclei-templates/javascript/ -target localhost:6379 -no-mhe -silent
[redis-default-logins] [javascript] [high] localhost:6379 [passwords=""]
[redis-default-logins] [javascript] [high] localhost:6379 [passwords="iamadmin"]
[redis-default-logins] [javascript] [high] localhost:6379 [passwords="root"]
[redis-default-logins] [javascript] [high] localhost:6379 [passwords="password"]
[redis-default-logins] [javascript] [high] localhost:6379 [passwords="admin"]
[redis-info] [javascript] [info] localhost:6379
[snmpv3-detect] [javascript] [info] localhost:6379

7 matches.

Under load — dev branch (567794f), 3 consecutive runs

$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
0
$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
0
$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
0

0 matches every time. All results silently dropped.

Under load — this branch, 3 consecutive runs

$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
15
$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
33
$ nuclei -t ~/nuclei-templates/javascript/ -l targets.txt -no-mhe -silent | wc -l
9

9–33 matches per run (vs 0 on dev). All three runs find redis-info
(Export, pooled path). redis-default-logins (non-Export, 20-slot path)
recovered in 2 of 3 runs. The remaining variance is caused by context.TODO()
in JS library network calls — zombie goroutines still hold slots for up to 15s
(the listener delay) before completing, limiting throughput on the crowded
20-slot non-pooled path. That is a follow-up fix.

Unit tests

Three new tests in pkg/js/compiler/pool_test.go verify the fix mechanics:

  • TestAddWithContextRespectsDeadline — slot acquisition fails fast when
    deadline expires (vs blocking indefinitely with Add())
  • TestWatchdogReleasesSlotOnDeadline — watchdog frees a slot held by a zombie,
    allowing a new execution to acquire it
  • TestPoolExhaustionRecovery — full starvation/recovery cycle: all slots
    filled with zombies, deadlines expire, subsequent acquisitions succeed
$ go test -race -v -run 'TestAddWithContext|TestWatchdog|TestPoolExhaustion' ./pkg/js/compiler/...
=== RUN   TestAddWithContextRespectsDeadline
--- PASS: TestAddWithContextRespectsDeadline (0.05s)
=== RUN   TestWatchdogReleasesSlotOnDeadline
--- PASS: TestWatchdogReleasesSlotOnDeadline (0.07s)
=== RUN   TestPoolExhaustionRecovery
--- PASS: TestPoolExhaustionRecovery (0.20s)
PASS

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • Improvements

    • Script execution now respects outer cancellation and deadlines, making timeouts more predictable.
    • Resource pool handling and cleanup are more robust, ensuring slots are reliably released when executions are cancelled or time out.
  • Tests

    • Added tests validating deadline-aware acquisitions, watchdog-driven slot release on deadline, and pool exhaustion recovery.

auto-assign bot requested a review from dwisiswant0 on February 13, 2026 at 12:51

coderabbitai bot commented Feb 13, 2026

Walkthrough

Propagates caller context into JS execution by setting opts.Context, replaces pool Add/Done with context-aware AddWithContext, and adds atomic guards plus watchdog goroutines to ensure pooled and non-pooled slots are released exactly once on cancellation or deadline.

Changes

  • Compiler entry (pkg/js/compiler/compiler.go): sets opts.Context = ctx before invoking ExecuteProgram so the inner execution observes outer cancellation/deadlines.
  • Non-pooled execution (pkg/js/compiler/non-pool.go): replaces Add()/Done() with AddWithContext(opts.Context), handles acquisition errors, and adds a sync/atomic flag, a done channel, and a cancellation watcher to ensure Done() is called exactly once.
  • Pooled execution (pkg/js/compiler/pool.go): adds sync/atomic, stores ctx in the runtime context, uses AddWithContext(opts.Context) with error handling, and introduces an atomic slotReleased flag, a watchdog goroutine, and deterministic deferred cleanup to guarantee a single Done() invocation and close the watcher channel.
  • Tests (pkg/js/compiler/pool_test.go): adds tests validating AddWithContext deadline behavior, watchdog-driven slot release on deadline, and pool exhaustion/recovery with zombie workers.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Executor
    participant Pool
    participant Watchdog
    participant Cleanup

    Client->>Executor: ExecuteProgram(request)
    Executor->>Executor: derive ctx (with timeout/deadline)
    Executor->>Pool: AddWithContext(opts.Context)
    alt slot acquired
        Pool-->>Executor: slot acquired
        Executor->>Watchdog: start cancellation watcher (goroutine)
        Executor->>Executor: run JS runtime using opts.Context
        par normal run
            Executor->>Executor: execute script
        and cancellation path
            Watchdog->>Watchdog: wait for ctx.Done() or done signal
            alt ctx.Done()
                Watchdog->>Pool: call Done() (CAS guard ensures single call)
            end
        end
        Executor->>Cleanup: deferred cleanup
        Cleanup->>Pool: call Done() if not already (CAS)
        Cleanup->>Watchdog: close done channel
        Cleanup-->>Client: return result
    else AddWithContext fails
        Pool-->>Executor: error (deadline/full)
        Executor-->>Client: return error
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I set a context, snug and bright,
Watched slots free when day turned night,
Atomic paws and a tiny bell,
Deadlines rang, the pool did swell,
Hooray — the rabbit hops, all's right! 🎉

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 66.67%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3)
  • Description Check: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: the title 'fix(js): prevent pool slot starvation under load' directly and clearly summarizes the main change.
  • Merge Conflict Detection: no merge conflicts detected when merging into dev.



No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
pkg/js/compiler/pool.go (1)

145-172: Watchdog pattern is correct; consider extracting the shared helper.

The CAS-guarded watchdog + defer cleanup is identical in both pool.go and non-pool.go. The logic is sound, but the duplication could be eliminated with a small helper, e.g.:

♻️ Optional: extract shared watchdog helper
// acquireSlot acquires a pool slot respecting ctx and returns a release
// function that is safe to call exactly once (idempotent).  A background
// watchdog releases the slot if ctx expires before release is called.
func acquireSlot(ctx context.Context, pool *syncutil.AdaptiveWaitGroup) (release func(), err error) {
	if err := pool.AddWithContext(ctx); err != nil {
		return nil, err
	}
	var released atomic.Bool
	done := make(chan struct{})
	go func() {
		select {
		case <-ctx.Done():
			if released.CompareAndSwap(false, true) {
				pool.Done()
			}
		case <-done:
		}
	}()
	return func() {
		close(done)
		if released.CompareAndSwap(false, true) {
			pool.Done()
		}
	}, nil
}

Usage in both files would reduce to:

release, err := acquireSlot(opts.Context, pooljsc)
if err != nil {
    return nil, err
}
defer release()
pkg/js/compiler/pool_test.go (1)

118-127: Zombie goroutines linger for 10 s after the test returns.

The sleeping goroutines outlive the test function. While not a correctness issue (they're goroutine-safe), they add 10 s of background noise to the test process. Consider using a context-aware sleep so they exit promptly:

♻️ Suggested improvement
-		// Zombie worker: blocks for 10s simulating a hung network call.
-		go func() {
-			defer func() {
-				close(done)
-				if released.CompareAndSwap(false, true) {
-					pool.Done()
-				}
-			}()
-			time.Sleep(10 * time.Second)
-		}()
+		// Zombie worker: blocks simulating a hung network call, but
+		// exits promptly once the test's context is canceled.
+		zombieCtx := ctx // capture per-iteration
+		go func() {
+			defer func() {
+				close(done)
+				if released.CompareAndSwap(false, true) {
+					pool.Done()
+				}
+			}()
+			select {
+			case <-time.After(10 * time.Second):
+			case <-zombieCtx.Done():
+				// Still sleep briefly so we don't beat the watchdog.
+				time.Sleep(50 * time.Millisecond)
+			}
+		}()


AuditeMarlow force-pushed the fix/js-pool-slot-starvation branch from 1618147 to 4a26c99 on February 13, 2026 at 13:07. The updated commit message:
Zombie goroutines from timed-out JS executions held pool slots
indefinitely: Add() blocked with context.Background(), and defer Done()
only ran when the goroutine eventually completed. Under load, both
pools (80 pooled + 20 non-pooled slots) filled with zombies, silently
dropping all subsequent matches.

Three changes fix slot lifecycle management:

1. Propagate the 20s deadline context into ExecuteProgram (compiler.go)
   so both execution paths can respect the deadline.

2. Replace Add() with AddWithContext(ctx) in both pool.go and
   non-pool.go so goroutines waiting for a slot fail fast when the
   deadline expires instead of blocking indefinitely.

3. Add a watchdog goroutine that releases the slot when the deadline
   expires, even if the zombie is still running. An atomic.Bool
   ensures exactly one Done() call between the watchdog and the
   normal defer path.
