fix: deduplicate concurrent WorkOS session refreshes #384

Open
mcarlson wants to merge 1 commit into stage from fix/session-refresh-race

mcarlson (Collaborator) commented Mar 13, 2026

Summary

  • Adds in-memory deduplication for concurrent WorkOS session refresh attempts
  • When multiple requests arrive with the same expired session cookie, only one refresh() call is made to WorkOS — concurrent requests await the same promise
  • WorkOS refresh tokens are single-use (rotated on each refresh), so without this fix, only the first concurrent refresh succeeds and all others fail with 401

Root Cause Analysis

The Bug

When the studio page fires multiple concurrent API requests (batch submission sends up to 5 at a time via Promise.allSettled), and the user's WorkOS access token has expired, all concurrent requests race to refresh the same session. WorkOS refresh tokens are single-use (rotated on refresh). Only 1 request wins; the others fail.

Failure Trace

  1. Studio batch submit (useBatchSubmit.ts:84) fires up to 5 concurrent requests via Promise.allSettled
  2. All 5 requests arrive at requireAuth middleware with the same expired session cookie
  3. Each calls authenticateWorkOS() → authenticateAndGetWorkOSSession(), which returns null (expired)
  4. Each calls refreshWorkOSSession() — but WorkOS refresh tokens are single-use
  5. Request #1 wins the refresh, gets a new sealed session, and updates the cookie in its response
  6. Requests #2-5 fail the refresh (token already consumed) → caught by the catch block → returns null
  7. null result → handleWorkOSAuthFailure() clears the wos-session cookie and returns 401
  8. Frontend axios interceptor catches any 401 → calls logout() → navigates to sign-in
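
The race in the trace above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `FakeTokenProvider` and `simulateBatch` are hypothetical names, and the provider only models the single-use (rotated) refresh token behavior described in this PR rather than calling the WorkOS SDK.

```typescript
// Hypothetical stand-in for a provider that rotates refresh tokens on use.
// This is NOT the WorkOS SDK; it only models the single-use behavior.
class FakeTokenProvider {
  private validToken = "rt-0";
  private rotations = 0;

  async refresh(refreshToken: string): Promise<string> {
    // The check and the rotation run synchronously, so the first caller wins
    // and every later caller presenting the same token is rejected.
    if (refreshToken !== this.validToken) {
      throw new Error("refresh token already consumed");
    }
    this.rotations += 1;
    const next = `rt-${this.rotations}`;
    this.validToken = next;
    await Promise.resolve(); // yield once to stand in for network latency
    return next;
  }
}

// Fires `requestCount` refreshes with the same stale token, mirroring the
// batch pattern, and counts how many succeed.
async function simulateBatch(
  provider: FakeTokenProvider,
  staleToken: string,
  requestCount: number,
): Promise<{ succeeded: number; failed: number }> {
  const results = await Promise.allSettled(
    Array.from({ length: requestCount }, () => provider.refresh(staleToken)),
  );
  const succeeded = results.filter((r) => r.status === "fulfilled").length;
  return { succeeded, failed: results.length - succeeded };
}
```

With five concurrent callers sharing one stale token, this model yields one success and four failures, which matches steps 5 and 6.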

Why Only Studio

The studio page is the only feature that fires concurrent authenticated API requests. Normal pages make sequential requests, so the refresh happens once and subsequent requests use the new cookie. Studio's batch pattern creates the race.

Contributing Factors

  1. No refresh mutex on backend (this PR fixes this)
  2. Aggressive frontend logout — interceptor calls logout() on any 401 with no debounce (fixed in companion frontend PR)
  3. Cookie cleared on auth failure: handleWorkOSAuthFailure clears the wos-session cookie, so even if request #1 set a new cookie, the 401 response from request #2 clears it
  4. Polling compounds the race: useStudioJobProgress.ts polls every 10s with multiple concurrent GET requests
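
For factor 2, the companion frontend change is not shown in this PR, so the following is only one possible shape of such a guard, under assumptions: `createLogoutGuard`, `windowMs`, and the injected `now` clock are hypothetical names, and the sketch is framework-agnostic rather than tied to the axios interceptor API. It collapses a burst of 401 responses into a single `logout()` call; it does not by itself prevent the sign-out, which is what the backend fix addresses.

```typescript
// Hypothetical guard: at most one logout() per `windowMs`, however many
// 401 responses arrive in that window. Strictly a leading-edge guard,
// not a trailing debounce.
function createLogoutGuard(
  logout: () => void,
  windowMs: number,
  now: () => number = Date.now, // injectable clock, for testing
): (status: number) => boolean {
  let lastLogoutAt = Number.NEGATIVE_INFINITY;

  // Returns true only when this call actually triggered logout().
  return function handleStatus(status: number): boolean {
    if (status !== 401) return false;
    const t = now();
    if (t - lastLogoutAt < windowMs) return false; // burst suppressed
    lastLogoutAt = t;
    logout();
    return true;
  };
}
```

A response interceptor could pass each response status to the returned function, so that five failing batch requests produce one navigation to sign-in instead of five.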

Changes

  • src/utils/workos.util.ts: Added pendingRefreshes Map that caches in-flight refresh promises keyed by authToken. The promise is cleaned up via .finally() after resolution.
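
A minimal sketch of the deduplication described above, under stated assumptions: `pendingRefreshes` and `authToken` are the names used in this PR, while `refreshOnce`, `doRefresh`, and the `RefreshResult` shape are hypothetical and stand in for the real refresh call in workos.util.ts, which is not reproduced here.

```typescript
// Hypothetical result shape; the real refresh returns a sealed session or null.
type RefreshResult = { sealedSession: string } | null;

// In-flight refresh promises, keyed by the session cookie value (authToken).
const pendingRefreshes = new Map<string, Promise<RefreshResult>>();

// Returns the in-flight promise for `authToken` if one exists; otherwise
// starts a refresh and caches its promise until it settles.
function refreshOnce(
  authToken: string,
  doRefresh: (authToken: string) => Promise<RefreshResult>, // assumed refresh call
): Promise<RefreshResult> {
  const inFlight = pendingRefreshes.get(authToken);
  if (inFlight) return inFlight;

  const promise = doRefresh(authToken).finally(() => {
    // Clean up on success or failure so a later request can refresh again.
    pendingRefreshes.delete(authToken);
  });
  pendingRefreshes.set(authToken, promise);
  return promise;
}
```

Because the function is not declared `async`, concurrent callers receive the same promise object, so the single-use refresh token is consumed once per expired cookie. The map is per-process memory, so this sketch only deduplicates requests handled by the same server instance.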

Companion PR

Test plan

  • Deploy to stage and verify studio batch submission no longer signs user out
  • Verify normal login/logout still works
  • Verify session refresh still works for single requests

🤖 Generated with Claude Code

When multiple requests arrive with the same expired session cookie,
only one refresh call is made to WorkOS. Concurrent requests await
the same promise instead of each consuming the single-use refresh
token independently.

Fixes studio page sign-out caused by batch submission race condition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>