Modules: Concurrency issue with module context fix #4603
pm-nikhil-vaidya wants to merge 16 commits into
Conversation
Add a holdReadLock mechanism to prevent concurrent map access in moduleContexts during parallel bidder hook execution. This fixes panics caused by concurrent reads and writes to the moduleContexts map when multiple bidders execute hooks in parallel. This is an alternate implementation to PR prebid#4603, using a simpler approach: a boolean flag that conditionally holds the read lock for the entire duration of hook execution in bidder stages.

Changes:
- Add a holdReadLock boolean field to executionContext
- Add a getModuleContextLocked() helper that reads without acquiring locks
- Modify executeGroup() to conditionally hold RLock for the entire execution
- Enable holdReadLock for ExecuteBidderRequestStage and ExecuteRawBidderResponseStage

The RWMutex is now held for the entire duration that hook goroutines are executing in bidder stages, preventing saveModuleContexts() from acquiring the write lock while reads are in progress.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Please see #4631 as an alternative approach, which uses the read-lock when calling the parallel bidder hook stages. NOTE: I wrote it with Claude Code and need to review it myself - it may not be exactly what I wanted - I may want it to have better control over the lock than it currently does. I just walked through it and it seems fine: each of the hookExecutor methods calls saveModuleContexts AFTER calling executeStage, so holding a single lock for the duration of executeGroup is fine. I might have grabbed and held the lock at each hook invocation, but conceptually you can't return until all hooks are called, so it's perfectly fine to hold only one read lock. 👍
CORRECTION from Gemini: implementing a lock around the map is apparently the best/fastest option.
I'll have to take another look at this when I'm fresh. Could be ready to go 🤞
@postindustria-code can you please review?
@pm-nikhil-vaidya - Happy New Year - can you take a look at the diff and see if you are comfortable applying it?
@scr-oath I have implemented the solution you suggested, as it removes the intermediate copy while copying the module context and removes the extra lock mechanism while reading. I have also added a check in the Insert() function in case the map is nil:

```go
func (mc *ModuleContext) Insert(seq iter.Seq2[string, any]) {
	if mc == nil {
		return
	}
	mc.Lock()
	defer mc.Unlock()
	if mc.data == nil {
		mc.data = make(map[string]any)
	}
	maps.Insert(mc.data, seq)
}
```

You can review and let me know if any more changes are required.
I understand the motivation behind introducing All() for performance reasons. That makes sense, especially given how hot this path can be. However, I see a possible deadlock issue with its usage. ModuleContext.All() holds an RLock for the entire duration of iteration and executes caller-provided code while that lock is held. This creates a structural deadlock hazard.
For instance, this pattern is very natural and currently deadlocks: GetAll() seems to be safer to use, although it is a bit slower and uses copies. @pm-nikhil-vaidya @scr-oath what are your thoughts on this?
Quick follow-up / clarification: The previous comment was also from me (@legendko) — just posted from my personal GitHub account by mistake. Apologies for any confusion; I wanted to continue the discussion here from this account. While we’re on the topic, I’d like to raise one additional concern, since this is a breaking API change for module authors:
@bsardo @scr-oath @postindustria-code I’ve rolled back to the previous implementation. I’ve also retained the approach suggested by @scr-oath for cases where a user prefers performance over the potential risk of a deadlock.
```go
mc.Lock()
defer mc.Unlock()
if mc.data == nil {
	mc.data = make(map[string]any)
```

suggestion: Make with the right size for the copy if nil.

```diff
-	mc.data = make(map[string]any)
+	mc.data = make(map[string]any, len(data))
```
```go
var emptyMapIter iter.Seq2[string, any] = func(yield func(string, any) bool) {}

// All returns an iterator over key-value pairs from the module context with read lock held
func (mc *ModuleContext) All() iter.Seq2[string, any] {
```

suggestion: remove this

I thought we had decided against exposing All due to its risk of deadlock if held in reverse order with another lock.
```go
}

// Insert adds the key-value pairs from seq to the module context with write lock held
func (mc *ModuleContext) Insert(seq iter.Seq2[string, any]) {
```

suggestion: remove this

I thought we decided against exposing Insert also for deadlock concerns.
@pm-nikhil-vaidya we discussed @scr-oath's comments today. Let's remove All() and Insert().
scr-oath left a comment
All three previous change requests have been addressed:
- ✅ make(map[string]any, len(data)) — properly sized allocation in SetAll when data is nil
- ✅ All() iterator removed — eliminates deadlock risk from holding the lock across iteration
- ✅ Insert() removed — same concern addressed
Concurrency analysis of put(): The copy-based approach (existingCtx.SetAll(mCtx.GetAll())) is safe. Go evaluates mCtx.GetAll() first (acquires/releases mCtx.RLock), then passes the returned plain map to existingCtx.SetAll() (acquires/releases existingCtx.Lock). No two ModuleContext locks are ever held simultaneously. No reverse-order deadlock is possible since ModuleContext methods never reference back to moduleContexts.
nitpick (non-blocking): SetAll comment says "replaces all data" but maps.Copy merges into the existing map (existing keys not in data are preserved). The merge behavior is correct for this use case (accumulating context across stages), but the comment could say "merges data into the context" to be precise.
LGTM — the RWMutex + copy-based approach is sound, modules are all updated correctly, and tests are comprehensive.
bf0b766
scr-oath left a comment
Re-reviewing after my previous approval was dismissed. The only delta since my last review is the v3→v4 import path bump from the master merge — the logic is unchanged.
One blocking issue found: data race in Get/GetAll where mc.data is read outside the lock.
Previous change-requests (sized allocation in SetAll, removed All() iterator, removed Insert()) are all still correctly addressed. The concurrency design is sound aside from the pre-lock nil check.
```go
// Get retrieves a value from the module context with read lock
func (mc *ModuleContext) Get(key string) (any, bool) {
	if mc == nil || mc.data == nil {
```
issue (blocking): mc.data == nil read outside the lock is a data race.
Set (line 72) writes mc.data under a write lock. Reading mc.data here without holding any lock is a concurrent read/write — go test -race will flag this. Same issue in GetAll at line 79.
Fix: move the data == nil check inside the lock, or drop it entirely since NewModuleContext() always initializes data:
func (mc *ModuleContext) Get(key string) (any, bool) {
if mc == nil {
return nil, false
}
mc.RLock()
defer mc.RUnlock()
if mc.data == nil {
return nil, false
}
val, ok := mc.data[key]
return val, ok
}Apply the same pattern to GetAll.
```go
	return result
}

// SetAll replaces all data in the context
```
issue (non-blocking): Comment says "replaces all data" but maps.Copy merges — existing keys not present in data are preserved.
Since the only call site (moduleContexts.put) wants merge semantics, the implementation is correct. Just fix the comment to say "merges data into the context" to avoid misleading module authors.
```go
mc.ctxs[moduleName] = newCtx

// Add new data to existing context
existingCtx.SetAll(mCtx.GetAll())
```
suggestion (non-blocking): Extract mCtx.GetAll() before acquiring moduleContexts.Lock to avoid nested locking.
Currently safe because GetAll copies-then-releases and SetAll acquires its own lock — no two ModuleContext locks are ever held simultaneously. But holding moduleContexts.Lock while calling into ModuleContext methods is a fragile invariant. Moving the snapshot outside the lock makes it explicit:
```go
func (mc *moduleContexts) put(moduleName string, mCtx *hookstage.ModuleContext) {
	if mc == nil {
		return
	}
	newData := mCtx.GetAll()
	mc.Lock()
	defer mc.Unlock()
	existingCtx, ok := mc.ctxs[moduleName]
	if !ok {
		mc.ctxs[moduleName] = mCtx
		return
	}
	existingCtx.SetAll(newData)
}
```

This will partially address the nested locking issue. Since SetAll also acquires a lock, nested locking won’t be completely eliminated. Therefore, we can retain the existing implementation.
```go
	}},
}
return ret, nil
}
```

suggestion (non-blocking): The "get async request from module context" + error-analytics pattern is duplicated across HandleProcessedAuctionHook and HandleAuctionResponseHook (4 blocks total). Consider extracting a helper to cut ~40 lines of duplication:

```go
func getAsyncRequest(mc *hookstage.ModuleContext, prefix string) (*AsyncRequest, *hookanalytics.Analytics) { ... }
```
This change is a refactoring of the RTD module; we can take it in a separate PR. This PR mainly targets the concurrency issue.
CI build failure — two lines still use the old ModuleContext map indexing. The same pattern was already fixed on lines 364, 547, and 801. Apply the same migration to these two:

```go
// Lines 262 and 692 — change:
assert.NotNil(t, entrypointResult.ModuleContext[asyncRequestKey])
// to:
asyncRequest, _ := entrypointResult.ModuleContext.Get(asyncRequestKey)
assert.NotNil(t, asyncRequest)
```
scr-oath left a comment
Re-reviewing after the fix commits.
The prior blocking race is fixed. Get and GetAll in hooks/hookstage/invocation.go now check mc.data == nil inside the read lock, ordered behind Set/SetAll's lazy init under the write lock. The SetAll doc comment was also corrected from "replaces" to "merges". The two CI-breaking ModuleContext[asyncRequestKey] map-indexes in modules/scope3/rtd/module_test.go are now .Get(...). The new wurfl device-detection migration is a clean, mechanical port that mirrors the pattern used by the other modules in this PR.
Verification:
- Race fix — invocation.go lines 58–65 (Get) and 85–92 (GetAll): RLock taken, then nil check, then map read. Symmetric on the write side at lines 72–78 (Set) and 100–106 (SetAll). No window remains.
- put() in hooks/hookexecution/context.go:63 — the copy-based merge is still deadlock-free: mCtx and existingCtx are distinct *ModuleContext instances, and ModuleContext methods never call back into moduleContexts. The lock graph is acyclic.
- CI fixes — both leftover map-index sites in module_test.go migrated; the assertion semantics (assert.NotNil on interface nil) are preserved because Get returns untyped nil on a miss.
LGTM. One non-blocking nit on the wurfl error message left as a line comment.
Reviewer notes:
- The two carryover non-blocking suggestions from my prior review (extract the snapshot before put's outer lock; a helper for the duplicated scope3 Get + type-assert pattern) remain unaddressed. Author's call — neither blocks merge.
```diff
-	rawHeaders, ok := invocationCtx.ModuleContext[wurflHeaderCtxKey].(map[string]string)
+	headers, ok := invocationCtx.ModuleContext.Get(wurflHeaderCtxKey)
 	if !ok {
 		return result, hookexecution.NewFailure("invalid module context type")
```
nitpick (non-blocking): error message is misleading on this branch.
Get returning ok=false means the key is missing, not that the value's type is wrong. The next branch (line 110) handles the actual type-assertion failure with the same string, so the two failure modes are no longer distinguishable.
Consider:
```go
headers, ok := invocationCtx.ModuleContext.Get(wurflHeaderCtxKey)
if !ok {
	return result, hookexecution.NewFailure("missing module context entry")
}
rawHeaders, ok := headers.(map[string]string)
if !ok {
	return result, hookexecution.NewFailure("invalid module context type")
}
```
#4239