fix(auth): suppress signup notifications for non-fresh users#1372
Conversation
The signup Discord webhook and MailerLite subscribe fire from two hooks in lib/auth.ts: databaseHooks.user.create.after (gated on emailVerified) and emailVerification.afterEmailVerification (no gate). Both can deliver a user object whose row is not actually freshly created. The trigger observed in prod on 2026-05-25 14:12 UTC: a POST /sign-up/email landed for an email (jerrygithub360@gmail.com) that already had an OAuth-only account from 2026-04-27. Better Auth's email/password sign-up flow sends an OTP for any email it is asked about; once verified, the afterEmailVerification hook fires for the already-existing user and notifyDiscordSignup posts a false-positive "New signup" alert to the incident channel. Three subsequent password-sign-in attempts failed with "Credential account not found" because the existing user is OAuth-only. Loki confirmed the firing sequence end-to-end. Add a small freshness helper (lib/auth-notification-guard.ts) and wrap both call sites with it: notifications fire only when user.createdAt is within the last hour. Real signups complete OTP / verification-link flows well within that window; any "re-event" path that surfaces an older user object is now silently filtered out of the signup channel. The window is intentionally generous (1 hour, not 5 min) to cover slow inbox-check verification flows without being so long that re-events slip through. This fix is independent of and additive to KEEP-621 (Cloudflare Turnstile on /sign-up/email), which had not yet shipped to prod at the time of the observed incident. Tests: - lib/auth-notification-guard.ts: new helper, single exported function - tests/unit/auth-notification-guard.test.ts: 9 cases covering fresh user, slow-OTP user, expired user, the 2026-05-25 regression case, exact-window-boundary, ISO-string timestamp, missing field, null field, invalid date string - pnpm type-check: clean - pnpm check: 0 diagnostics on changed files; repo-wide error count decreases by 1 (the previously-present "[Auth] afterEmailVerification fired" console.log is removed)
Address code review feedback on PR #1372: - Widen NEW_SIGNUP_NOTIFICATION_WINDOW_MS from 1h to 24h. The 1h ceiling collides with Better Auth's default 3600s verification-link TTL (a click at minute 59-60 with any network skew falls past the cutoff) and with the OTP-resend path where a user who started signup hours earlier and finally enters the OTP would be silently dropped. 24h comfortably absorbs both flows plus app-pod clock skew while still suppressing the re-verification-of-existing-user case the guard exists for. Export the constant so tests can reference it directly. - Change the boundary check from `<` to `<=` so the exact window tick is treated as fresh; closes a one-millisecond hole at the upper edge. - Tighten the parameter type from `Date | string | null` to `Date | null`. Better Auth's adapter always passes a Date object (`internalAdapter` -> Drizzle -> pg returns `Date` for `timestamp` columns), so the string branch was unreachable in current callers but exposed a latent timezone-naive-string parsing hazard if anyone ever called the helper with a naive ISO string (`new Date(str)` is locale-dependent without an offset). Dropping the string branch eliminates that surface and matches Better Auth's User type (`createdAt: Date`). - Restore the multi-line ternary in lib/auth.ts:570-577 to the single-line form that ships on staging. This was a formatter rule that I had reverted in the previous commit thinking it was unrelated; it's the canonical Biome-preferred form. - Strengthen the OAuth-path call-site comment: `user.create.after` only fires on actual inserts in current Better Auth (per `db/with-hooks.mjs:25-42`), so the freshness check there is pure belt-and-suspenders against a future adapter change. Real OAuth signups have `createdAt = now` and pass it trivially. Tests: - Expand from 9 to 11 cases. New: exact-window-boundary-passes (now inclusive), 1-hour-link-TTL with margin (Better Auth default flow), future-dated createdAt (NTP skew), epoch-zero. Pin the constant by importing NEW_SIGNUP_NOTIFICATION_WINDOW_MS so future changes to the window are reflected uniformly. Keep the 28-day production regression anchored on the actual 2026-04-27 incident timestamp. - pnpm type-check: clean - pnpm check (scoped to changed files): 0 diagnostics, no fixes applied - npx vitest run tests/unit/auth-notification-guard.test.ts: 11/11 pass
|
Code review feedback addressed in BLOCKER B-1 (window too tight) — addressed.
WARN W-1 (change-email path also suppressed) — intentional, now noted in comments. WARN W-2, W-4 (type-shape hazards) — addressed. WARN W-3 (boundary semantics didn't match comment) — addressed by WARN W-5 (OAuth-path guard is defense-in-depth) — kept the guard, updated the comment to call it out explicitly. Cost is one boolean predicate; benefit is robustness if Better Auth's hook semantics ever drift. WARN W-6 (test should pin the constant) — addressed. WARN W-7 (test coverage gaps) — most addressed. New cases: future-dated createdAt (NTP skew), epoch zero, exact-boundary inclusive, 1-hour-link-TTL with margin. ISO-string and Unix-millis cases dropped along with the type tightening (no longer reachable). WARN W-8 (clock skew tolerance was zero) — absorbed by the 24h window. INFO I-1, I-2, I-3, I-4 — comment + test docstring tweaks; literal-date kept in the regression test because the value is the actual prod-incident timestamp ( Adjacent finding filed as KEEP-634 (assigned Joel, High, SEV-1 project): the underlying Better Auth Verification:
Ready for re-review. |
🧹 PR Environment Cleaned UpThe PR environment has been successfully deleted. Deleted Resources:
All resources have been cleaned up and will no longer incur costs. |
ℹ️ No PR Environment to Clean UpNo PR environment was found for this PR. This is expected if:
|
Summary
lib/auth-notification-guard.ts) that gatesnotifyDiscordSignupandsubscribeToMailerLiteonuser.createdAtbeing within the last hour.lib/auth.ts(databaseHooks.user.create.afterandemailVerification.afterEmailVerification) with the guard.What was firing the false-positive ping
Last night the signup Discord channel pinged for
jerry george / jerrygithub360@gmail.com / OAutheven though that user has existed since 2026-04-27. Loki + the prod DB nailed the flow end-to-end (timestamps in UTC):The user row's
created_at = 2026-04-27 21:02:58.926Z,updated_atbumped to2026-05-25 14:12:44.445Z(the verification touchedupdated_atbut notcreated_at). Only one accounts row exists,provider_id = github, withcreated_at4 milliseconds after the user row's — confirming the original signup was OAuth-only, not email/password.So the actor: posted
/sign-up/emailwith an existing email, received an OTP in that mailbox, entered it. Better Auth firedafterEmailVerificationagainst the existing user row (better-auth'supdateUserdoesn't resetcreatedAt), and our notification helper happily pinged the signup channel because it had no concept of "is this user actually new". Three follow-up password-sign-in attempts failed because there is nocredentialaccounts row.Why the existing guards didn't catch it
databaseHooks.user.create.afteris gated onemailVerified, which is the right gate for an OAuth signup but doesn't help here because that hook didn't fire at all (no new user row was created).afterEmailVerificationhad no guard at all. The original commit body that introduced this notification path (2026-03-23) assumedafterEmailVerificationonly fires on first-signup verification. That's true in normal use, but Better Auth's hook is event-shaped ("an email got verified") not state-shaped ("a user just signed up"). When/sign-up/emailis hit with an existing email, the synthetic-anti-enumeration response combined withemailOTP.sendVerificationOnSignUp: truecauses an OTP to flow to the real inbox; verifying it then firesafterEmailVerificationagainst the existing row./sign-up/emailonly shipped to prod 10 hours after this incident (PR release: To Prod #1359 merged 2026-05-26 00:08 UTC; the false ping fired 2026-05-25 14:12 UTC). Even with Turnstile, an actor who clears the CAPTCHA can still trigger the same chain.Loki sweep: campaign or one-off?
48-hour search:
Sign-up attempt for existing email: 1 event total — this incident only.Credential account not found: 11 events across 5 emails — 4 of 5 are staff accounts with 1-5 fumbled attempts each (normal "wrong password" volume). Not a campaign signature.Fix
The 1-hour window is generous enough for slow OTP / verification-link flows (OTP expiry is 5 min but resends and email-link clicks can stretch the window) while suppressing any "re-event" path that surfaces a user row older than an hour. The same guard wraps both notification call sites in
lib/auth.ts.This fix narrows the predicate to the state-shape the original commit intended (
a user just signed up) instead of trusting the event-shape that Better Auth actually delivers (an email was verified). It does not depend on Better Auth's hook semantics being correct.Test plan
pnpm type-checkclean (exit 0)pnpm check0 diagnostics on changed files; repo-wide error count drops by 1 (the staleconsole.log("[Auth] afterEmailVerification fired", ...)is removed)npx vitest run tests/unit/auth-notification-guard.test.ts-> 9/9 passnpx vitest run: 6200 pass, 14 fail — all 14 are pre-existing E2E-Vitest failures requiring local Postgres :5433 / LocalStack SQS; none touchlib/auth*Out of scope, filed as separate Linear ticket (Joel / High / Security Incident project)
The deeper "/sign-up/email leaks OTP to existing-email inbox -> existing user can be re-verified by mailbox-holder" footgun. Three potential mitigations: set
sendVerificationOnSignUp: falseand request OTPs explicitly only for confirmed-new users; wireemailAndPassword.onExistingUserSignUpto send an account-recovery email instead; or open a Better Auth upstream issue to make/email-otp/verify-emailmirror/verify-email's "skip-if-already-verified" guard.