Move OAuth state to Neon Postgres (atomic one-time-use) behind STORE_BACKEND flag #184

Merged
JakeSCahill merged 4 commits into feature/mcp-email-auth from feature/mcp-oauth-neon-store on Jun 22, 2026

@JakeSCahill

Stacked on #181 (feature/mcp-email-auth) — base will retarget to main once #181 merges. Review the compare against feature/mcp-email-auth for the isolated diff.

What & why

Moves the OAuth authorization server's one-time-use / transactional state (auth requests, auth codes, refresh tokens, families) from Netlify Blobs to Neon Postgres (Netlify DB), behind a STORE_BACKEND flag.

This closes the one real correctness gap the Blobs code already documents: Blobs has no compare-and-swap, so one-time-use is implemented as read-then-delete or read-then-mark. Two concurrent requests with the same auth code (or refresh token) can both observe it as unused before either consumes it. The case that matters is the refresh token: a legitimate rotation and a concurrent rotation with a stolen copy can both succeed, which defeats the family reuse detection (the theft signal). Postgres fixes this with a single atomic statement.
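The race is easy to reproduce in memory. The following is a minimal sketch, not the PR's code: `makeStore`, `consumeReadThenDelete`, `consumeAtomic`, and `countWinners` are hypothetical names, and the in-memory store only imitates a key-value API with async `get`/`delete` and no compare-and-swap. It is not the Netlify Blobs API.

```javascript
// In-memory stand-in for a key-value store with async get/delete and no
// compare-and-swap (hypothetical; not the Netlify Blobs API).
function makeStore(entries) {
  const map = new Map(entries);
  return {
    async get(key) { return map.get(key) ?? null; },
    async delete(key) { map.delete(key); },
    // Read and delete in one uninterruptible step, standing in for a single
    // atomic SQL statement.
    async take(key) {
      const record = map.get(key) ?? null;
      map.delete(key);
      return record;
    },
  };
}

// Read-then-delete: a second caller can read between the two awaits.
async function consumeReadThenDelete(store, key) {
  const record = await store.get(key);
  if (!record) return null;
  await store.delete(key);
  return record;
}

// Atomic consume: at most one caller ever receives the record.
async function consumeAtomic(store, key) {
  return store.take(key);
}

// Fire two consumes at once against a fresh store and count the winners.
async function countWinners(consume) {
  const store = makeStore([["code-1", { subject: "user-1" }]]);
  const results = await Promise.all([
    consume(store, "code-1"),
    consume(store, "code-1"),
  ]);
  return results.filter(Boolean).length;
}

async function demo() {
  return {
    readThenDelete: await countWinners(consumeReadThenDelete),
    atomic: await countWinners(consumeAtomic),
  };
}

demo().then((result) => console.log(result));
// prints { readThenDelete: 2, atomic: 1 }
```

Both read-then-delete callers see the record because each one reads before either deletes. The atomic variant hands the record to exactly one caller.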

How

  • Backend selector (lib/oauth/store.mjs): same exported interface; picks db/blobs.mjs (default) or db/neon.mjs by STORE_BACKEND. Callers don't change. DCR clients stay on Blobs (plain persistence — no atomicity benefit, smaller migration surface).
  • Atomic consume on Neon: takeAuthCode → UPDATE … SET used=true WHERE used=false AND expires_at>now() RETURNING *; refresh rotation → consumeRefresh does UPDATE … WHERE used=false RETURNING *. Exactly one concurrent caller wins; the loser is treated as reuse → family revoked, restoring theft detection under races.
  • Netlify DB native flow: uses @netlify/database (getDatabase(), zero-config, reads NETLIFY_DATABASE_URL, fail-closed). Schema lives in netlify/database/migrations/, auto-applied on deploy — including to per-preview DB branches.
  • TTL cleanup: a daily scheduled function (oauth-cleanup.mjs) deletes expired requests/codes and past-expiry refresh tokens, then sweeps empty families. No-ops unless STORE_BACKEND=neon. Bounds growth (Blobs never GC'd these).
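The atomic consume in the list above might look roughly like the sketch below. It rests on assumptions: the table name `oauth_auth_codes`, the `code` column, the `$1` key predicate, and the `query(text, params)` helper (which resolves to an array of rows) are all hypothetical. The PR description elides the table name and does not show the client call.

```javascript
// Hypothetical table and column names; only the used/expires_at predicates
// and RETURNING * come from the PR description.
const TAKE_AUTH_CODE_SQL = `
  UPDATE oauth_auth_codes
     SET used = true
   WHERE code = $1
     AND used = false
     AND expires_at > now()
  RETURNING *`;

// `query(text, params)` is a hypothetical helper that runs one SQL statement
// and resolves to an array of rows.
async function takeAuthCode(query, code) {
  const rows = await query(TAKE_AUTH_CODE_SQL, [code]);
  // Zero rows means the code is unknown, expired, or already consumed.
  return rows.length === 1 ? rows[0] : null;
}
```

Because the check (`used = false`) and the write (`SET used = true`) happen in one statement, Postgres serializes concurrent updates to the same row. The second updater re-evaluates the `WHERE` clause after the first commits and matches zero rows.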

Rollout (safe, reversible)

  1. Deploy with STORE_BACKEND unset → stays on Blobs, zero behavior change.
  2. Set STORE_BACKEND=neon on the deploy-preview context only → previews exercise Neon (own DB branch, migrations auto-applied).
  3. Verify the flow on a preview, then set neon on production.
  4. Roll back instantly by resetting STORE_BACKEND=blobs — no code revert.

Cutover note: flipping blobs→neon doesn't migrate rows, so live refresh tokens (in Blobs) won't exist in Neon — users re-authenticate once. Auth codes (60s TTL) are unaffected in practice. Flip during low traffic.

Tests

  • All existing suites pass on the default (Blobs) backend; 56 passing.
  • tests/mcp-oauth-neon.test.ts proves the fix against real Postgres (skipped unless TEST_NEON_URL is set — a fake honoring atomic semantics would prove nothing): two concurrent auth-code consumes / refresh rotations yield exactly one winner; cleanup removes expired rows. Run against a disposable Neon branch (it truncates those tables), never the primary.

Out of scope (deferred, tracked separately)

User capture, rate-limit counters, and the dev signing key stay on Blobs. Security review also flagged (non-blocking): DNS-rebinding residual in the CIMD SSRF guard, and showing client name/consent on the login interstitial before phase-2 durable identity.

@JakeSCahill JakeSCahill requested a review from a team as a code owner June 22, 2026 10:42
@netlify

netlify Bot commented Jun 22, 2026

Deploy Preview for redpanda-documentation ready!

Name Link
🔨 Latest commit 3a81089
🔍 Latest deploy log https://app.netlify.com/projects/redpanda-documentation/deploys/6a391eacc7f44d0008b3984c
😎 Deploy Preview https://deploy-preview-184--redpanda-documentation.netlify.app
Lighthouse
1 paths audited
Performance: 85 (🟢 up 18 from production)
Accessibility: 92 (🔴 down 2 from production)
Best Practices: 92 (no change from production)
SEO: 83 (no change from production)
PWA: -

Split the OAuth state store into pluggable backends behind store.mjs:
- db/blobs.mjs: the current Netlify Blobs implementation (extracted, no
  behavior change), default backend.
- db/neon.mjs: a Neon Postgres backend whose one-time-use consumes are
  atomic single statements (UPDATE/DELETE ... RETURNING), closing the
  read-then-delete race Blobs can't (no compare-and-swap).
- store.mjs: thin selector by STORE_BACKEND (default blobs); DCR clients
  stay on Blobs (plain persistence, no atomicity benefit).
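A minimal sketch of such a selector follows. The `loaders` map and `selectBackend` are hypothetical names, and the inline objects stand in for the real `db/blobs.mjs` and `db/neon.mjs` modules. Rejecting an unrecognized value is an assumption; the PR does not say how unknown values are handled.

```javascript
// Stand-ins for db/blobs.mjs and db/neon.mjs. Each loader is a function so the
// Neon side is only touched when it is selected, mirroring a lazy import.
const loaders = new Map([
  ["blobs", () => ({ backend: "blobs" })],
  ["neon", () => ({ backend: "neon" })],
]);

function selectBackend(env) {
  const name = env.STORE_BACKEND || "blobs"; // unset means Blobs
  const load = loaders.get(name);
  // Assumption: an unrecognized value is rejected rather than silently ignored.
  if (!load) throw new Error(`Unknown STORE_BACKEND: ${name}`);
  return load();
}

console.log(selectBackend({}).backend);                        // prints "blobs"
console.log(selectBackend({ STORE_BACKEND: "neon" }).backend); // prints "neon"
```

Keeping the choice in one function means callers import a single module and never branch on the backend themselves.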

Replaces the non-atomic markRefreshUsed with consumeRefresh: on Neon only
one of two concurrent refreshes wins the row; the loser is treated as
reuse and the family is revoked, restoring theft detection under races.
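The rotation path can be sketched as follows. The in-memory store is a hypothetical stand-in whose `consumeRefresh` flips `used` exactly once, as the single-statement UPDATE does on Neon. The names `makeRefreshStore`, `familyOf`, `revokeFamily`, `isRevoked`, and `rotate` are assumptions, not the PR's interface.

```javascript
// In-memory stand-in for the refresh-token store (hypothetical interface).
function makeRefreshStore() {
  const tokens = new Map();  // token -> { familyId, used }
  const revoked = new Set(); // revoked family ids
  let counter = 0;
  return {
    issue(familyId) {
      const token = `rt-${++counter}`;
      tokens.set(token, { familyId, used: false });
      return token;
    },
    // Stand-in for an atomic UPDATE … WHERE used=false RETURNING *:
    // the check and the write happen in one uninterruptible step.
    async consumeRefresh(token) {
      const row = tokens.get(token);
      if (!row || row.used) return null;
      row.used = true;
      return row;
    },
    async familyOf(token) { return tokens.get(token)?.familyId ?? null; },
    async revokeFamily(familyId) { revoked.add(familyId); },
    async isRevoked(familyId) { return revoked.has(familyId); },
  };
}

async function rotate(store, token) {
  const row = await store.consumeRefresh(token);
  if (!row) {
    // Lost the race or replayed an old token: treat it as reuse and revoke
    // the whole family, which also invalidates the winner's new token.
    const familyId = await store.familyOf(token);
    if (familyId) await store.revokeFamily(familyId);
    return { ok: false, reason: "reuse" };
  }
  return { ok: true, next: store.issue(row.familyId) };
}
```

Running two `rotate` calls concurrently on the same token yields one `ok: true` and one `reason: "reuse"`, and the family ends up revoked. This is the theft signal that the non-atomic path could miss.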

Neon driver is imported lazily so the default path needs no DB or dep.
No caller behavior changes on the default backend; all 56 tests pass.

- Migration SQL (db/migrations/0001_oauth_state.sql) for the four
  one-time-use/transactional tables, with expires_at indexes.
- cleanupExpired() + a daily scheduled function (oauth-cleanup.mjs) that
  deletes expired requests/codes and past-expiry refresh tokens, then
  sweeps empty families. No-ops unless STORE_BACKEND=neon. Bounds growth.
- @neondatabase/serverless dependency (HTTP driver; no Drizzle — the
  atomic ops are single hand-written statements).
- Real-Postgres concurrency tests (tests/mcp-oauth-neon.test.ts), skipped
  unless TEST_NEON_URL is set: prove two concurrent auth-code consumes /
  refresh rotations yield exactly one winner, and cleanup removes expired
  rows. A fake can't prove atomicity, so these require a real DB.
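The cleanup pass from the list above could be sketched as an ordered set of deletes. Everything specific here is an assumption: the table and column names, the `query(text)` helper, and the function name `cleanupExpiredSketch`, which is deliberately not the PR's `cleanupExpired()` signature.

```javascript
// Hypothetical table and column names, for illustration only.
const CLEANUP_STATEMENTS = [
  "DELETE FROM oauth_auth_requests WHERE expires_at <= now()",
  "DELETE FROM oauth_auth_codes WHERE expires_at <= now()",
  "DELETE FROM oauth_refresh_tokens WHERE expires_at <= now()",
  // Families go last so tokens deleted above no longer keep them alive.
  `DELETE FROM oauth_refresh_families f
    WHERE NOT EXISTS (
      SELECT 1 FROM oauth_refresh_tokens t WHERE t.family_id = f.id
    )`,
];

// `query(text)` is a hypothetical helper that runs one SQL statement.
async function cleanupExpiredSketch(query, env) {
  // No-op unless the Neon backend is active.
  if (env.STORE_BACKEND !== "neon") return { ran: false, statements: 0 };
  for (const sql of CLEANUP_STATEMENTS) {
    await query(sql);
  }
  return { ran: true, statements: CLEANUP_STATEMENTS.length };
}
```

Sweeping families after tokens matters for ordering: a family only becomes empty once its expired tokens are gone.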

56 tests pass; 3 Neon tests skip without a DB URL.

The database is provisioned and attached to the redpanda-documentation
site. Wire the code to Netlify's managed flow:

- Use @netlify/database (the package db init installed) instead of the
  raw @neondatabase/serverless driver: neon.mjs now connects via
  getDatabase().httpClient (zero-config, reads NETLIFY_DATABASE_URL,
  fail-closed if absent).
- Move the schema into Netlify's auto-applied migrations directory
  (netlify/database/migrations/), so it's applied on deploy — including
  to per-preview DB branches. Removes the hand-rolled migrations path.
- Update the atomicity test to the new path + @netlify/database client.

56 tests pass; 3 Neon tests skip without TEST_NEON_URL.

@JakeSCahill JakeSCahill force-pushed the feature/mcp-oauth-neon-store branch from e69189b to 3a81089 June 22, 2026 11:38
@JakeSCahill JakeSCahill merged commit 3d76b12 into feature/mcp-email-auth Jun 22, 2026
3 of 4 checks passed
@JakeSCahill JakeSCahill deleted the feature/mcp-oauth-neon-store branch June 22, 2026 11:40