Skip to content

release: To Prod#1375

Merged
eskp merged 16 commits into
prodfrom
staging
May 26, 2026
Merged

release: To Prod#1375
eskp merged 16 commits into
prodfrom
staging

Conversation

@suisuss

@suisuss suisuss commented May 26, 2026

Copy link
Copy Markdown

No description provided.

suisuss and others added 11 commits May 26, 2026 11:31
… aliases

Rename MCP execution-tool params for consistency, keeping the old names as
deprecated aliases for one release:
- all three tools: network -> chain_id
- execute_transfer: recipient_address -> to_address

Handlers resolve canonical-then-alias and map onto the existing route body
contract (chainId, recipientAddress), so the HTTP routes and validation are
unchanged. The per-call wallet override from the original ticket is deferred
to a follow-up (multi-wallet-per-org model required).
The signup Discord webhook and MailerLite subscribe fire from two hooks
in lib/auth.ts: databaseHooks.user.create.after (gated on emailVerified)
and emailVerification.afterEmailVerification (no gate). Both can deliver
a user object whose row is not actually freshly created.

The trigger observed in prod on 2026-05-25 14:12 UTC: a POST /sign-up/email
landed for an email (jerrygithub360@gmail.com) that already had an
OAuth-only account from 2026-04-27. Better Auth's email/password sign-up
flow sends an OTP for any email it is asked about; once verified, the
afterEmailVerification hook fires for the already-existing user and
notifyDiscordSignup posts a false-positive "New signup" alert to the
incident channel. Three subsequent password-sign-in attempts failed with
"Credential account not found" because the existing user is OAuth-only.
Loki confirmed the firing sequence end-to-end.

Add a small freshness helper (lib/auth-notification-guard.ts) and wrap
both call sites with it: notifications fire only when user.createdAt is
within the last hour. Real signups complete OTP / verification-link
flows well within that window; any "re-event" path that surfaces an
older user object is now silently filtered out of the signup channel.

The window is intentionally generous (1 hour, not 5 min) to cover slow
inbox-check verification flows without being so long that re-events
slip through.

This fix is independent of and additive to KEEP-621 (Cloudflare
Turnstile on /sign-up/email), which had not yet shipped to prod at the
time of the observed incident.

Tests:
- lib/auth-notification-guard.ts: new helper, single exported function
- tests/unit/auth-notification-guard.test.ts: 9 cases covering fresh
  user, slow-OTP user, expired user, the 2026-05-25 regression case,
  exact-window-boundary, ISO-string timestamp, missing field, null
  field, invalid date string
- pnpm type-check: clean
- pnpm check: 0 diagnostics on changed files; repo-wide error count
  decreases by 1 (the previously-present "[Auth] afterEmailVerification
  fired" console.log is removed)
…ol fields

Drives a real MCP Client <-> McpServer over an in-memory transport so the
SDK's own Zod validation runs the full tools/call path: confirms the renamed
schemas accept both canonical (chain_id, to_address) and deprecated
(network, recipient_address) payloads, and reject a payload missing the
required amount before it reaches the handler.
MCP clients read the tool schema on every discovery, so there is no need to
carry the old field names - a hard rename is safe. Dropping the aliases lets
the renamed fields (chain_id, to_address) stay required (z.string()) instead
of optional with a route-level fallback, so the SDK enforces presence at the
schema boundary again. Renames the test file accordingly and asserts the old
names are now rejected.
…-field tests

Two fixes:
- execute_contract_call dropped the payable value: the route read body.ethValue
  but the handler sends value. Route now reads body.value (the documented and
  MCP-exposed name) and maps it to writeContractCore's ethValue param. Also
  corrects the value description/docs from wei to ether-decimal, matching how
  writeContractCore parses it (ethers.parseEther).
- The previous commit renamed the test file via git mv but did not stage the
  rewritten body, so it shipped the old alias-era assertions and failed CI.
  Stage the no-alias assertions that match the current schema.
Address code review feedback on PR #1372:

- Widen NEW_SIGNUP_NOTIFICATION_WINDOW_MS from 1h to 24h. The 1h ceiling
  collides with Better Auth's default 3600s verification-link TTL (a
  click at minute 59-60 with any network skew falls past the cutoff)
  and with the OTP-resend path where a user who started signup hours
  earlier and finally enters the OTP would be silently dropped. 24h
  comfortably absorbs both flows plus app-pod clock skew while still
  suppressing the re-verification-of-existing-user case the guard
  exists for. Export the constant so tests can reference it directly.

- Change the boundary check from `<` to `<=` so the exact window tick
  is treated as fresh; closes a one-millisecond hole at the upper edge.

- Tighten the parameter type from `Date | string | null` to
  `Date | null`. Better Auth's adapter always passes a Date object
  (`internalAdapter` -> Drizzle -> pg returns `Date` for `timestamp`
  columns), so the string branch was unreachable in current callers
  but exposed a latent timezone-naive-string parsing hazard if anyone
  ever called the helper with a naive ISO string (`new Date(str)` is
  locale-dependent without an offset). Dropping the string branch
  eliminates that surface and matches Better Auth's User type
  (`createdAt: Date`).

- Restore the multi-line ternary in lib/auth.ts:570-577 to the
  single-line form that ships on staging. This was a formatter
  rule that I had reverted in the previous commit thinking it was
  unrelated; it's the canonical Biome-preferred form.

- Strengthen the OAuth-path call-site comment: `user.create.after`
  only fires on actual inserts in current Better Auth (per
  `db/with-hooks.mjs:25-42`), so the freshness check there is pure
  belt-and-suspenders against a future adapter change. Real OAuth
  signups have `createdAt = now` and pass it trivially.

Tests:
- Expand from 9 to 11 cases. New: exact-window-boundary-passes (now
  inclusive), 1-hour-link-TTL with margin (Better Auth default flow),
  future-dated createdAt (NTP skew), epoch-zero. Pin the constant by
  importing NEW_SIGNUP_NOTIFICATION_WINDOW_MS so future changes to the
  window are reflected uniformly. Keep the 28-day production
  regression anchored on the actual 2026-04-27 incident timestamp.

- pnpm type-check: clean
- pnpm check (scoped to changed files): 0 diagnostics, no fixes applied
- npx vitest run tests/unit/auth-notification-guard.test.ts: 11/11 pass
…llet-override

feat: standardise execution-tool field names with deprecated aliases
Anonymous Better Auth sessions carry a session cookie and a user with
twoFactorEnabled = false, so the proxy MFA gate blocked them with a 403
mfa_enrollment_required on /api/workflows and a redirect to /enroll-mfa
on page loads. Anonymous users have no permanent identity to enroll, so
short-circuit them via isAnonymousUserShape before the enrollment and
step-up checks, matching how the totp endpoints already gate them.
Keying the exemption off isAnonymousUserShape was fail-open: its name and
email heuristics are user-controllable (name is editable via /api/user),
so a real MFA-mandated user could bypass the gate by renaming themselves
"Anonymous". Check the authoritative is_anonymous column directly instead.
Add a regression test asserting a real user named "Anonymous" is still
blocked.
The navbar bottom-border pseudo-element used width: 100vw, which includes
the vertical scrollbar gutter and made the element wider than the document,
producing a few pixels of page-level horizontal scroll. The left: 0; right: 0
already span the navbar width, so the width declaration was redundant.

Also add overflow-x: clip on html and body as a guard against future
horizontal overflow. clip (not hidden) avoids creating a scroll container,
so position: sticky on the navbar and sidebar keeps working.
…-scroll

fix: prevent horizontal page scroll in docs-site
…webhook

fix(auth): suppress signup notifications for non-fresh users
fix: exempt anonymous sessions from mandatory-MFA gate
In production better-auth prefixes session cookies with __Secure-
because useSecureCookies is true. The Set-Cookie response from
auth.api.signInEmail (the first link in the strict-signin chain)
was parsed with a regex split that assumed cookie names start with
[A-Za-z], silently dropping the two_factor cookie and feeding only
the session-clearing pair into the next chained call. verifyTOTP
then threw INVALID_TWO_FACTOR_COOKIE and the user saw "Sign-in
failed at session step" on the Authenticator step.

The OAuth path bypasses this chain (mints the session via direct
Drizzle insert in oauth-mfa-finalize), which is why MFA worked for
Google sign-in but failed for credential sign-in in production
only. Dev and staging without __Secure- cookies passed.

Replace the regex parse with Headers.getSetCookie(), which returns
each Set-Cookie as a discrete string. Extract the read + cookie-
header-build helpers into lib/auth-cookie-chain.ts so they can be
unit-tested and reused. Add a regression test that exercises the
production __Secure-prefixed names; the previous parse would fail
it.
eskp and others added 2 commits May 26, 2026 14:36
Review of the initial fix found two follow-ups:

1. readAllSetCookies fell back to headers.get("set-cookie") when the
   runtime lacked getSetCookie(). That fallback returns the
   comma-joined Set-Cookie string as a single array element, which
   the downstream attribute-strip then truncates to the first
   name=value pair only — the exact failure mode the helper exists
   to prevent. Throw instead so the failure is loud and impossible
   to silently reintroduce on a future runtime swap or in a unit-
   test harness that stubs Headers.

2. /api/auth/strict-signin/start had an inline copy of the same
   "getSetCookie() else fallback" pattern. Replace with a direct
   call to readAllSetCookies so both routes share the same loud-
   failure path.

Tests: add an assertion that readAllSetCookies throws when
getSetCookie is unavailable, and a toHaveLength(2) check in the
end-to-end chain test so a future "we joined them again" refactor
is caught immediately.
…ie-prefix

fix(auth): credential MFA sign-in failed at session step in prod
@eskp eskp merged commit 63429d5 into prod May 26, 2026
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants