Skip to content

fix: handle race condition in recordExternalDbSyncDeletion#1466

Merged
N2D4 merged 4 commits into
devfrom
devin/1779398237-fix-external-db-sync-deletion-race
May 22, 2026
Merged

fix: handle race condition in recordExternalDbSyncDeletion#1466
N2D4 merged 4 commits into
devfrom
devin/1779398237-fix-external-db-sync-deletion-race

Conversation

@N2D4
Copy link
Copy Markdown
Contributor

@N2D4 N2D4 commented May 21, 2026

When concurrent operations delete the same row (e.g., password change bulk-deletes all tokens while a sign-out tries to record deletion of the same token), the INSERT...SELECT finds 0 rows and throws.

Now insertedCount === 0 logs via captureError instead of throwing, since the concurrent bulk operation already recorded the deletion. insertedCount > 1 still throws as a real data integrity issue.

Adds E2E tests that reproduce the race condition by admin-changing a password (bulk token deletion) then attempting sign-out/session-delete with the stale session.

Link to Devin session: https://app.devin.ai/sessions/7aa48634542b49e897ce85ffaaf0a029
Requested by: @N2D4

Summary by CodeRabbit

  • Bug Fixes
    • Prevented server errors during concurrent deletion flows so sign-out and session deletion now return appropriate error responses (e.g., 401, 404) instead of crashing.
  • Tests
    • Added regression tests that reproduce concurrent-deletion/race conditions to ensure stable responses when stale or bulk-deleted tokens or sessions are used.

Review Change Stack

When concurrent operations delete the same row (e.g., password change
bulk-deletes all tokens while a concurrent sign-out tries to record
deletion of the same token), the INSERT...SELECT in
recordExternalDbSyncDeletion finds 0 rows.

Previously this threw a StackAssertionError. Now:
- insertedCount === 0: log via captureError (race condition, not a bug)
- insertedCount > 1: still throws (data integrity issue)

Affected entity types:
- ProjectUserRefreshToken
- VerificationCode_TEAM_INVITATION

Co-Authored-By: Konstantin Wohlwend <n2d4xc@gmail.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR that start with 'DevinAI' or '@devin'.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@vercel
Copy link
Copy Markdown

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
stack-auth-hosted-components Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-auth-mcp Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-auth-skills Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-backend Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-dashboard Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-demo Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-docs Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-preview-backend Ready Ready Preview, Comment May 21, 2026 11:51pm
stack-preview-dashboard Ready Ready Preview, Comment May 21, 2026 11:51pm

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

📝 Walkthrough

Walkthrough

Multiple branches of recordExternalDbSyncDeletion no longer assert an exact inserted-row count; they execute the raw INSERT INTO "DeletedRow" ... FOR UPDATE and return. Added E2E regression tests reproducing bulk refresh-token/session deletion races; test imports now include backendContext.

Changes

External DB sync race condition handling

Layer / File(s) Summary
Relax deleted-row assertions across targets
apps/backend/src/lib/external-db-sync.ts
Removed insertedCount capture and exact-count assertions from many target.tableName branches (e.g., ProjectUser, ContactChannel, Team, TeamMember, TeamMemberDirectPermission, ProjectUserDirectPermission, UserNotificationPreference, ProjectUserRefreshToken, ProjectUserOAuthAccount, VerificationCode_TEAM_INVITATION) so each branch now executes the raw INSERT INTO "DeletedRow" ... FOR UPDATE and returns.
E2E regression tests for session deletion races
apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts
Added two regression tests reproducing races where refresh tokens/sessions are bulk-deleted while a stale access token remains; asserts endpoints return 401/404 (not 500) and updated imports to include backendContext.

Sequence Diagram

sequenceDiagram
  participant Client
  participant SessionsAPI
  participant ExternalDbSync
  participant Database
  Client->>SessionsAPI: DELETE /api/v1/auth/sessions/current (stale access token)
  SessionsAPI->>ExternalDbSync: recordExternalDbSyncDeletion(target.tableName)
  ExternalDbSync->>Database: INSERT INTO "DeletedRow" ... FOR UPDATE
  Database-->>ExternalDbSync: insertedCount
  ExternalDbSync-->>SessionsAPI: return (no throw on 0)
  SessionsAPI-->>Client: 401/404 or other non-500 response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped where counts once caused a flare,

zero rows met silence in the air.
Tests chased the race, no panic in sight —
endpoints answered, the night stayed light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately identifies the main change: fixing a race condition in the recordExternalDbSyncDeletion function, which is the core modification in the backend file.
Description check ✅ Passed The description explains the race condition, how it's handled, and mentions E2E tests added. However, it lacks clear structure with sections and doesn't follow a standard format.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch devin/1779398237-fix-external-db-sync-deletion-race

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Konstantin Wohlwend <n2d4xc@gmail.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR fixes a race condition in recordExternalDbSyncDeletion where concurrent bulk-delete operations (e.g., a password change bulk-deleting all refresh tokens) would cause a subsequent individual deletion to find 0 rows in the INSERT…SELECT, previously throwing a hard assertion error and returning a 500.

  • In the ProjectUserRefreshToken and VerificationCode_TEAM_INVITATION branches, insertedCount === 0 now calls captureError (logging the event) instead of throwing, since the concurrent bulk operation is presumed to have already recorded the DeletedRow entry atomically. insertedCount > 1 still throws as a genuine data integrity error.
  • Two E2E tests are added to reproduce both the sign-out path and the sessions CRUD DELETE path after an admin-triggered password change invalidates all refresh tokens.

Confidence Score: 4/5

The change is safe to merge — it converts a hard crash into a logged warning for a known-benign concurrent-delete scenario, and the new behaviour is covered by two E2E tests.

The logic for the two fixed branches is sound: if the bulk delete and its DeletedRow insert are atomic (same transaction), insertedCount === 0 reliably means the entry was already recorded. The one remaining question is whether ProjectUserOAuthAccount and the other unchanged branches can be reached through similar concurrent bulk-delete paths.

apps/backend/src/lib/external-db-sync.ts — specifically the ProjectUserOAuthAccount branch and the other table-type branches that still throw unconditionally on insertedCount !== 1.

Important Files Changed

Filename Overview
apps/backend/src/lib/external-db-sync.ts Race condition fix: insertedCount === 0 in the ProjectUserRefreshToken and VerificationCode_TEAM_INVITATION branches now calls captureError instead of throwing, correctly handling concurrent bulk deletions. Other table types (ProjectUserOAuthAccount, ContactChannel, TeamMember, etc.) still throw on zero rows and may need the same treatment if they participate in bulk deletes.
apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts Two new E2E tests added to reproduce the race condition (sign-out and session CRUD delete after admin-triggered bulk token deletion). Both tests correctly assert non-500 status codes and include inline snapshots for the expected 401/404 responses.

Sequence Diagram

sequenceDiagram
    participant Client as Client (sign-out)
    participant BulkOp as Admin (password change)
    participant DB as PostgreSQL

    BulkOp->>DB: "DELETE FROM ProjectUserRefreshToken WHERE userId=X"
    Client->>DB: "INSERT INTO DeletedRow SELECT ... FOR UPDATE (finds 0 rows)"
    DB-->>Client: "insertedCount = 0"
    Note over Client: "OLD: throw StackAssertionError (500)"
    Note over Client: "NEW: captureError (log) + return normally"
    BulkOp->>DB: "INSERT INTO DeletedRow (bulk records deletion)"
    DB-->>BulkOp: committed
    Client-->>Client: "Returns 401 REFRESH_TOKEN_NOT_FOUND_OR_EXPIRED"
Loading

Comments Outside Diff (1)

  1. apps/backend/src/lib/external-db-sync.ts, line 452-456 (link)

    P2 Potential unhandled race for ProjectUserOAuthAccount

    ProjectUserOAuthAccount still throws whenever insertedCount !== 1. If there is any code path that performs a bulk delete of OAuth accounts (e.g., cascading deletes when a user is fully removed), a concurrent individual deletion could hit the same race condition and produce a 500 assertion error — the exact bug this PR is fixing for ProjectUserRefreshToken. It may be worth confirming whether bulk deletions of OAuth accounts are possible, and if so, applying the same captureError treatment here.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: apps/backend/src/lib/external-db-sync.ts
    Line: 452-456
    
    Comment:
    **Potential unhandled race for `ProjectUserOAuthAccount`**
    
    `ProjectUserOAuthAccount` still throws whenever `insertedCount !== 1`. If there is any code path that performs a bulk delete of OAuth accounts (e.g., cascading deletes when a user is fully removed), a concurrent individual deletion could hit the same race condition and produce a 500 assertion error — the exact bug this PR is fixing for `ProjectUserRefreshToken`. It may be worth confirming whether bulk deletions of OAuth accounts are possible, and if so, applying the same `captureError` treatment here.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
apps/backend/src/lib/external-db-sync.ts:452-456
**Potential unhandled race for `ProjectUserOAuthAccount`**

`ProjectUserOAuthAccount` still throws whenever `insertedCount !== 1`. If there is any code path that performs a bulk delete of OAuth accounts (e.g., cascading deletes when a user is fully removed), a concurrent individual deletion could hit the same race condition and produce a 500 assertion error — the exact bug this PR is fixing for `ProjectUserRefreshToken`. It may be worth confirming whether bulk deletions of OAuth accounts are possible, and if so, applying the same `captureError` treatment here.

Reviews (1): Last reviewed commit: "fix: handle null userAuth type in test" | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts (3)

10-10: ⚡ Quick win

Assert auth context presence instead of silently coercing to undefined.

This regression test depends on having captured the pre-race client auth; silent coercion can mask setup breakage.

Suggested change
-  const savedAuth = backendContext.value.userAuth ?? undefined;
+  const savedAuth = backendContext.value.userAuth;
+  if (savedAuth == null) {
+    throw new Error("Expected backendContext.value.userAuth after sign-up.");
+  }

As per coding guidelines, **/*.{ts,tsx}: “Fail early, fail loud. Fail fast with an error instead of silently continuing.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts`
at line 10, The test silently coerces backendContext.value.userAuth to undefined
via "const savedAuth = backendContext.value.userAuth ?? undefined", which masks
setup failures; change this to fail fast by asserting presence of userAuth
(e.g., throw or use your test assertion helper) and assign the non-null value to
savedAuth only after the check so savedAuth and backendContext.value.userAuth
are guaranteed defined (replace the null-coalescing line with an explicit check
for backendContext.value.userAuth and an error/expect if missing).

26-27: ⚡ Quick win

Prefer inline snapshots over standalone status matchers in this test file.

You already snapshot full responses; the extra .toBe(...)/.not.toBe(...) assertions are inconsistent with the test guideline and can be removed or converted to inline snapshots.

As per coding guidelines, **/*.test.{ts,tsx}: “When writing tests, prefer .toMatchInlineSnapshot over other selectors in Vitest tests.”

Also applies to: 52-53, 60-60, 77-78

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts`
around lines 26 - 27, Remove the redundant status assertions like
expect(response.status).not.toBe(500) and instead rely on the existing response
snapshot; either delete those .toBe/.not.toBe lines or fold the status into the
inline snapshot by ensuring the call to expect(response).toMatchInlineSnapshot
captures the status field. Update all occurrences referencing the response
status (the expect(response.status)... assertions found alongside
expect(response).toMatchInlineSnapshot) so tests follow the guideline of
preferring .toMatchInlineSnapshot in these specs.

61-63: ⚡ Quick win

Remove any from session selection and use a defensive type guard.

Using any here bypasses type safety and weakens the test’s assumptions.

Suggested change
-  const nonCurrentSession = listRes.body.items.find((s: any) => !s.is_current_session);
-  expect(nonCurrentSession).toBeDefined();
+  const nonCurrentSession = listRes.body.items.find(
+    (s: unknown): s is { id: string; is_current_session: false } =>
+      typeof s === "object" &&
+      s !== null &&
+      "id" in s &&
+      typeof s.id === "string" &&
+      "is_current_session" in s &&
+      s.is_current_session === false
+  );
+  if (nonCurrentSession == null) {
+    throw new Error("Expected a non-current session after creating a second session.");
+  }

As per coding guidelines, **/*.{ts,tsx}: “Try to avoid the any type. Whenever you need to use any, leave a comment explaining why you're using it.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts`
around lines 61 - 63, Replace the use of `any` when searching for a non-current
session by adding a defensive type guard and using it with Array.find on
`listRes.body.items`: create a type predicate (e.g., `function isSession(obj:
unknown): obj is Session`) that checks for the expected properties including
`is_current_session`, then call `const nonCurrentSession =
listRes.body.items.find(isSession && (s => !s.is_current_session));` (or find
using a separate `isNonCurrentSession` predicate) so the result is properly
typed instead of `any`; ensure the test imports/defines the `Session` shape and
asserts `nonCurrentSession` is defined.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts`:
- Line 10: The test silently coerces backendContext.value.userAuth to undefined
via "const savedAuth = backendContext.value.userAuth ?? undefined", which masks
setup failures; change this to fail fast by asserting presence of userAuth
(e.g., throw or use your test assertion helper) and assign the non-null value to
savedAuth only after the check so savedAuth and backendContext.value.userAuth
are guaranteed defined (replace the null-coalescing line with an explicit check
for backendContext.value.userAuth and an error/expect if missing).
- Around line 26-27: Remove the redundant status assertions like
expect(response.status).not.toBe(500) and instead rely on the existing response
snapshot; either delete those .toBe/.not.toBe lines or fold the status into the
inline snapshot by ensuring the call to expect(response).toMatchInlineSnapshot
captures the status field. Update all occurrences referencing the response
status (the expect(response.status)... assertions found alongside
expect(response).toMatchInlineSnapshot) so tests follow the guideline of
preferring .toMatchInlineSnapshot in these specs.
- Around line 61-63: Replace the use of `any` when searching for a non-current
session by adding a defensive type guard and using it with Array.find on
`listRes.body.items`: create a type predicate (e.g., `function isSession(obj:
unknown): obj is Session`) that checks for the expected properties including
`is_current_session`, then call `const nonCurrentSession =
listRes.body.items.find(isSession && (s => !s.is_current_session));` (or find
using a separate `isNonCurrentSession` predicate) so the result is properly
typed instead of `any`; ensure the test imports/defines the `Session` shape and
asserts `nonCurrentSession` is defined.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb1181ed-536c-4aa8-88bd-db2e970b436b

📥 Commits

Reviewing files that changed from the base of the PR and between 0df594a and 7542903.

📒 Files selected for processing (2)
  • apps/backend/src/lib/external-db-sync.ts
  • apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/index.test.ts

…return

Co-Authored-By: Konstantin Wohlwend <n2d4xc@gmail.com>
The INSERT...SELECT queries filter by primary key so insertedCount can
only be 0 (race condition - concurrent bulk delete already handled it)
or 1. Neither case warrants an assertion.

Co-Authored-By: Konstantin Wohlwend <n2d4xc@gmail.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
apps/backend/src/lib/external-db-sync.ts (1)

129-151: 💤 Low value

PR description mentions logging via captureError for the 0-row case, but no logging is implemented.

The PR description states: "the PR description states this case is logged via captureError because a concurrent bulk operation likely already recorded the deletion." However, the current implementation just executes the SQL and returns silently when 0 rows are inserted.

If the intent is to silently accept 0 rows (which is reasonable given the race condition is benign), this is fine—but the PR description should be updated. If observability for race conditions is desired, consider capturing the result:

Optional: Add observability for race conditions
-    await tx.$executeRaw(Prisma.sql`
+    const insertedCount = await tx.$executeRaw(Prisma.sql`
       INSERT INTO "DeletedRow" (
         ...
       FOR UPDATE
     `);
+    if (insertedCount === 0) {
+      captureError("recordExternalDbSyncDeletion-race", new Error(
+        `Race condition: ${target.tableName} row already deleted (tenancyId=${target.tenancyId})`
+      ));
+    }

     return;

This pattern would apply to all 10 branches in this function. Given this is optional observability and the race condition is benign, this is a nice-to-have rather than a requirement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/external-db-sync.ts` around lines 129 - 151, The SQL
block that inserts into "DeletedRow" via tx.$executeRaw for ProjectUser
currently ignores the 0-row result despite the PR claiming captureError logging;
either update the PR description to remove the logging claim or add
observability: capture the INSERT result (rows affected) after tx.$executeRaw
and, when it is 0, call captureError with context including target.tenancyId and
target.projectUserId (and the tableName "ProjectUser") to record the benign race
condition; apply the same pattern to the other branches in this function to
match the PR description.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@apps/backend/src/lib/external-db-sync.ts`:
- Around line 129-151: The SQL block that inserts into "DeletedRow" via
tx.$executeRaw for ProjectUser currently ignores the 0-row result despite the PR
claiming captureError logging; either update the PR description to remove the
logging claim or add observability: capture the INSERT result (rows affected)
after tx.$executeRaw and, when it is 0, call captureError with context including
target.tenancyId and target.projectUserId (and the tableName "ProjectUser") to
record the benign race condition; apply the same pattern to the other branches
in this function to match the PR description.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5c58687-2ec6-40fd-a2e0-c029112da39d

📥 Commits

Reviewing files that changed from the base of the PR and between 476bba8 and acfb73f.

📒 Files selected for processing (1)
  • apps/backend/src/lib/external-db-sync.ts

@N2D4 N2D4 merged commit 600a0d6 into dev May 22, 2026
31 of 35 checks passed
@N2D4 N2D4 deleted the devin/1779398237-fix-external-db-sync-deletion-race branch May 22, 2026 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant