release: to prod#1540

Merged
joelorzet merged 46 commits into prod from staging on Jun 12, 2026

Conversation

@joelorzet

No description provided.

joelorzet added 30 commits June 11, 2026 14:27
Shared utilities for the rate-limit header work:
- rate-limit-headers: applyRateLimitHeaders/rateLimitHeaders emit
  X-RateLimit-Limit/Remaining/Reset, Retry-After on 429, and an optional
  X-Poll-Interval-Hint, on both Response and NextResponse.
- http-status: named HttpStatus codes to replace bare numeric literals.
Enrich every request-rate limiter to carry limit/remaining/reset and wire
applyRateLimitHeaders into the endpoints, on both success and 429:
- Shared limiters: execute REST, MCP transport, IP-limited OAuth and MCP
  workflow catalog/call/listing.
- Per-surface limiters: ai/generate, agentic-wallet provision (hour bucket),
  invitations fetch, workflow vote.
- Status/long-poll endpoints also emit X-Poll-Interval-Hint (0 once terminal).
- Concurrency guards (workflow execute + webhook) get best-effort headers.

Anti-abuse limiters (MFA dual-factor, strict-signin, verify-ip) keep
Retry-After only, with no remaining-budget header, to avoid disclosing a
caller's guess budget; added the previously-missing Retry-After header where
it lived only in the JSON body. Better Auth's built-in limiter now also
mirrors its non-standard X-Retry-After onto the standard Retry-After.

Status-code literals in the touched routes are migrated to the HttpStatus
constant.

KEEP-501
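
A minimal sketch of the header set described above, assuming a hypothetical `RateLimitInfo` shape and assuming `X-RateLimit-Reset` is expressed in epoch seconds (the commit does not state the unit). It is not the PR's actual `rateLimitHeaders` implementation; it only illustrates that `Retry-After` appears on denial alone and that a poll-interval hint of `0` is still emitted.

```typescript
// Hypothetical limiter metadata; field names are assumptions for illustration.
interface RateLimitInfo {
  limit: number; // maximum requests per window
  remaining: number; // requests left in the current window
  resetAtMs: number; // epoch milliseconds at which the window resets
  allowed: boolean; // false when the request was denied (429)
}

function rateLimitHeaders(
  info: RateLimitInfo,
  nowMs: number,
  pollIntervalHintSeconds?: number,
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(info.limit),
    "X-RateLimit-Remaining": String(Math.max(0, info.remaining)),
    // Assumption: reset is reported as epoch seconds.
    "X-RateLimit-Reset": String(Math.ceil(info.resetAtMs / 1000)),
  };
  if (!info.allowed) {
    // Retry-After is only meaningful on a denial; never report less than 1s.
    const seconds = Math.max(1, Math.ceil((info.resetAtMs - nowMs) / 1000));
    headers["Retry-After"] = String(seconds);
  }
  if (pollIntervalHintSeconds !== undefined) {
    // Compare against undefined so a hint of 0 (terminal state) is kept.
    headers["X-Poll-Interval-Hint"] = String(pollIntervalHintSeconds);
  }
  return headers;
}
```

Passing `nowMs` explicitly keeps the helper deterministic in unit tests.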
Add the X-RateLimit-Limit/Remaining/Reset and Retry-After reference to the
API error docs, note the anti-abuse endpoints that intentionally omit the
remaining-budget header, and document X-Poll-Interval-Hint on the execution
status endpoint.
Unit tests for the header helper (standard headers, Retry-After only on
denial, poll-interval hint including zero, header preservation) and for the
limit/remaining/reset metadata returned by each enriched limiter.
…ders

Some Response variants (redirects) carry immutable headers and throw on
.set(); fall back to rebuilding the response with the rate-limit headers
merged in.
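
The fallback can be sketched as follows, using only the standard Fetch `Response` and `Headers` globals. The helper name `withHeaders` is hypothetical, and a real `NextResponse` may carry extra state (such as cookies) that this plain rebuild does not account for.

```typescript
// Hypothetical helper: try to set headers in place, rebuild on failure.
function withHeaders(
  response: Response,
  extra: Record<string, string>,
): Response {
  try {
    for (const [name, value] of Object.entries(extra)) {
      response.headers.set(name, value); // throws TypeError if immutable
    }
    return response;
  } catch {
    // Copy the existing headers into a fresh, mutable Headers object.
    const merged = new Headers(response.headers);
    for (const [name, value] of Object.entries(extra)) {
      merged.set(name, value);
    }
    return new Response(response.body, {
      status: response.status,
      statusText: response.statusText,
      headers: merged,
    });
  }
}
```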
The dual-factor limiter is enforced inside the shared requireDualFactor helper,
and its 429 callers rendered the failure via { status } only, dropping the
retry hint entirely. Add a dualFactorErrorResponse helper that surfaces the
standard Retry-After header on the 429 (anti-abuse limiter, so no
remaining-budget headers) and route every caller through it; thread the value
through the wallet/withdraw validator that returns a plain result object.
…-After

Better Auth emits only X-Retry-After on its 429s. The normalizer was adding
the standard Retry-After but leaving X-Retry-After in place, so auth 429s
carried two headers for the same thing. Copy the value to Retry-After and
delete X-Retry-After so the response carries exactly one, standard, signal.
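
As a rough illustration of the copy-then-delete behavior, assuming a hypothetical `normalizeRetryAfter` helper that works on a copy of the headers (the PR's actual normalizer is not shown here):

```typescript
// Hypothetical normalizer: move X-Retry-After onto the standard Retry-After.
function normalizeRetryAfter(headers: Headers): Headers {
  const out = new Headers(headers); // work on a mutable copy
  const legacy = out.get("X-Retry-After");
  if (legacy !== null) {
    out.set("Retry-After", legacy);
    out.delete("X-Retry-After"); // leave exactly one retry signal
  }
  return out;
}
```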
Backs Stripe-style Idempotency-Key support for mutating API endpoints. A row
is reserved per (organization, scope, key); the response is stored once the
work finishes and replayed for retries within the TTL. Unique index on
(organization_id, scope, idempotency_key); expires_at index for sweeping.
Hand-authored migration, matching the repo convention (drizzle-kit generate
is blocked by pre-existing snapshot collisions).
beginIdempotent reserves a slot race-safely (insert-or-conflict on the unique
index) and resolves to replay / conflict (409 + originalExecutionId) /
in_progress (409) / proceed. recordIdempotentResponse stores a 2xx for replay
and releases the lock on any other status so the client can retry. A short
processing lock (10m, crash recovery) extends to 24h on completion.
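
The four outcomes can be illustrated with an in-memory sketch. Everything here is an assumption for illustration: the `IdempotencyStore` class is hypothetical, a `Map` stands in for the unique index, and "conflict" is assumed to mean the same key reused with a different request hash. The real helper reserves the slot with an insert-or-conflict against the database.

```typescript
type Outcome =
  | { kind: "proceed" }
  | { kind: "replay"; status: number; body: string }
  | { kind: "conflict" }
  | { kind: "in_progress" };

interface Row {
  requestHash: string;
  expiresAtMs: number;
  response?: { status: number; body: string };
}

const PROCESSING_TTL_MS = 10 * 60 * 1000; // short crash-recovery lock
const COMPLETED_TTL_MS = 24 * 60 * 60 * 1000; // replay window

class IdempotencyStore {
  private rows = new Map<string, Row>();

  private id(org: string, scope: string, key: string): string {
    return JSON.stringify([org, scope, key]);
  }

  begin(org: string, scope: string, key: string, requestHash: string, nowMs: number): Outcome {
    const id = this.id(org, scope, key);
    const existing = this.rows.get(id);
    if (existing === undefined || existing.expiresAtMs <= nowMs) {
      // No live row: reserve the slot under the short processing lock.
      this.rows.set(id, { requestHash, expiresAtMs: nowMs + PROCESSING_TTL_MS });
      return { kind: "proceed" };
    }
    if (existing.requestHash !== requestHash) return { kind: "conflict" };
    if (existing.response === undefined) return { kind: "in_progress" };
    return { kind: "replay", ...existing.response };
  }

  record(org: string, scope: string, key: string, status: number, body: string, nowMs: number): void {
    const id = this.id(org, scope, key);
    const row = this.rows.get(id);
    if (row === undefined) return;
    if (status >= 200 && status < 300) {
      row.response = { status, body }; // store the 2xx for replay
      row.expiresAtMs = nowMs + COMPLETED_TTL_MS;
    } else {
      this.rows.delete(id); // release so the client can retry
    }
  }
}
```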
Direct-execution writes (transfer, contract-call, check-and-execute, node,
protocol passthrough), the workflow webhook, workflow execute, and workflow
create now reserve and replay on an Idempotency-Key header. Keys are scoped per
organization (webhook and workflow execute additionally per workflow).
Read-only and dry-run paths are unaffected.
create_workflow, execute_workflow, execute_transfer, execute_contract_call, and
execute_check_and_execute gain an optional idempotency_key argument that callApi
forwards as the Idempotency-Key header, so the REST layer stays the single
source of truth for dedup.
Idempotency section in the direct-execution reference (replay / conflict /
in-progress semantics, per-org scope, 24h window) and the conflict error codes
in the API error reference.
Request-hash key-order independence, outcome-to-response mapping, and response
recording (finalize on 2xx with resource-id extraction, release otherwise).
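
Key-order independence of the request hash can be achieved by canonicalizing the body before hashing. The sketch below is one way to do that with Node's `crypto` module; `canonicalize` and `requestHash` are hypothetical names, and the PR's actual hashing scheme may differ.

```typescript
import { createHash } from "node:crypto";

// Recursively rebuild objects with their keys in sorted order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value; // primitives pass through unchanged
}

// Hash the canonical JSON so {a, b} and {b, a} produce the same digest.
function requestHash(body: unknown): string {
  const canonical = JSON.stringify(canonicalize(body));
  return createHash("sha256").update(canonical).digest("hex");
}
```

Array order is deliberately preserved, since reordering array elements usually changes the meaning of a request.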
…ent_secret

Authorization codes were persisted in plaintext while refresh tokens on the
same store were sha256-hashed. Store the code hash and look it up by hash so a
database read alone cannot yield a replayable code.
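
A minimal sketch of hash-at-rest for authorization codes, assuming a hypothetical in-memory `AuthCodeStore` in place of the real table. Only the SHA-256 digest is stored, and lookup re-hashes the presented code.

```typescript
import { createHash, randomBytes } from "node:crypto";

const hashCode = (code: string): string =>
  createHash("sha256").update(code).digest("hex");

// Hypothetical store: a Map keyed by code hash stands in for the database.
class AuthCodeStore {
  private byHash = new Map<string, { clientId: string }>();

  // Returns the plaintext code to hand to the client; it is never persisted.
  issue(clientId: string): string {
    const code = randomBytes(32).toString("base64url");
    this.byHash.set(hashCode(code), { clientId });
    return code;
  }

  // Looks the code up by hash and deletes it so it is single-use.
  consume(code: string): { clientId: string } | undefined {
    const hash = hashCode(code);
    const row = this.byHash.get(hash);
    this.byHash.delete(hash);
    return row;
  }
}
```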

The token endpoint loaded the OAuth client but never verified clientSecretHash,
leaving it dead code on both grant paths. Persist the registered
token_endpoint_auth_method and require a valid client_secret (via body or HTTP
Basic) for confidential clients on the authorization_code and refresh_token
grants. Public PKCE clients ("none") are unaffected; existing rows default to
"none" so they keep working.
The session JWT carries an apiKeyId (the `key` claim) that was reconstructed on
the slow path without re-checking it against the request principal. A leaked
MCP_SESSION_SECRET would let any authenticated caller forge a session pinned to
a different principal's apiKeyId. Require the JWT `key` claim to match the
freshly-authenticated caller on the resolve and delete paths, and check the
cached entry's apiKeyId on the fast path. authenticate() already validates the
caller's key live, so this also rejects sessions whose underlying key was
revoked or rotated.
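
The binding check can be sketched as below. The `SessionClaims` and `CachedSession` shapes and the `resolveSession` function are hypothetical, and the sketch assumes the JWT signature has already been verified and the caller already authenticated before this point.

```typescript
// Hypothetical shapes; only the `key` claim name comes from the commit message.
interface SessionClaims {
  sessionId: string;
  key: string; // apiKeyId the session was issued for
}

interface CachedSession {
  sessionId: string;
  apiKeyId: string;
}

function resolveSession(
  claims: SessionClaims,
  callerApiKeyId: string,
  cache: Map<string, CachedSession>,
): CachedSession | null {
  // Slow-path guard: the token must be pinned to the authenticated caller.
  if (claims.key !== callerApiKeyId) return null;

  const cached = cache.get(claims.sessionId);
  if (cached !== undefined) {
    // Fast-path guard: the cached entry must belong to the same caller.
    return cached.apiKeyId === callerApiKeyId ? cached : null;
  }

  const rebuilt: CachedSession = {
    sessionId: claims.sessionId,
    apiKeyId: claims.key,
  };
  cache.set(claims.sessionId, rebuilt);
  return rebuilt;
}
```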
…shing

wrapWithSessionTokenHash only hashes tokens at the database layer. Enabling
better-auth secondaryStorage would write plaintext session tokens to the cache
and re-open the DB/cache-read replay path. Fail CI if secondaryStorage appears
in lib/auth.ts so the bypass cannot be enabled without re-implementing token
hashing at the cache layer first.
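
One way such a CI guard could work is a plain text scan of the file. The function below is a hypothetical sketch, not the PR's actual check; a CI script would read `lib/auth.ts`, call it, and exit non-zero when the result is non-empty.

```typescript
// Hypothetical guard: return the 1-based line numbers that mention the option.
function findForbiddenOption(
  source: string,
  option: string = "secondaryStorage",
): number[] {
  return source
    .split("\n")
    .flatMap((line, index) => (line.includes(option) ? [index + 1] : []));
}
```

A substring scan also flags comments that mention the option, which errs on the side of failing the build.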
# Conflicts:
#	app/api/ai/generate/route.ts
#	app/api/auth/[...all]/route.ts
#	app/api/execute/[executionId]/status/route.ts
#	app/api/execute/check-and-execute/route.ts
#	app/api/execute/contract-call/route.ts
#	app/api/execute/node/route.ts
#	app/api/execute/transfer/route.ts
#	app/api/mcp/workflows/[slug]/listing/route.ts
#	app/api/oauth/register/route.ts
#	app/api/workflow/[workflowId]/execute/route.ts
# Conflicts:
#	app/api/execute/[...slug]/route.ts
#	app/api/execute/check-and-execute/route.ts
#	app/api/execute/contract-call/route.ts
#	app/api/execute/node/route.ts
#	app/api/execute/transfer/route.ts
The rate-limit work incidentally reflowed lib/oauth-mfa-cookie.ts via biome.
The moved sign() line re-triggered a pre-existing CodeQL js/insufficient-password-hash
false positive (HMAC-SHA256 cookie MAC, not password storage). Restore the file to
match staging so it leaves the PR diff entirely; it is unrelated to rate-limit headers.
Restore the file to staging so the reflowed sign() line leaves the PR diff,
clearing a pre-existing CodeQL js/insufficient-password-hash false positive
(HMAC-SHA256 cookie MAC, not password storage).
The route now imports @/lib/idempotency, which pulls in server-only and broke
suite collection (0 tests). Stub the wrapper to a pass-through; idempotency has
its own unit coverage.
The branch had drifted ~70 files from staging (stale versions carried from an
old base) unrelated to idempotency. Those stale files tripped the ultracite
lint (it cannot parse biome output when the tree has errors). Restore every
non-feature file to current staging so the PR diff is exactly the 17
idempotency files, which are lint-clean.
The branch had drifted ~76 files from staging (stale versions from an old base)
unrelated to rate-limit headers, breaking the build. Restore every non-feature
file to current staging so the PR diff is exactly the rate-limit changes.
…sion-cookie-threat-model

# Conflicts:
#	drizzle/meta/_journal.json
Only require client_secret for clients that registered client_secret_post
or client_secret_basic. The previous check demanded a secret for any value
other than "none", which wrongly rejected PKCE clients whose auth method is
unset (rows predating the column default to "none").
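
The corrected predicate can be sketched as an allow-list rather than a "not none" check. The function name `requiresClientSecret` is hypothetical.

```typescript
// Only these registered methods make a client confidential.
function requiresClientSecret(method: string | null | undefined): boolean {
  return method === "client_secret_post" || method === "client_secret_basic";
}
```

With the previous `method !== "none"` form, an unset or unrecognized value would have demanded a secret; the allow-list treats those clients as public.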
joelorzet added 16 commits June 12, 2026 10:40
…h test

The staging merge added an org-membership re-check on the refresh path; the
existing test mocked the surrounding modules but not @/lib/workflow/access,
so the happy-path cases hit the real db and threw.
The PR routes dual-factor failures through the new dualFactorErrorResponse
helper; the test mocked @/lib/mfa/dual-factor without it, so the call threw
and returned 500 instead of the expected status.
- advertise the binding limiter (call bucket) on a successful public MCP
  tools/call instead of the looser list bucket
- emit Retry-After and rate-limit headers on the webhook execution-limit 429
- derive the agentic-wallet reset/Retry-After from the SQL hour boundary
  rather than the Node process clock
- attach rate-limit headers to the status-route 404 that already consumed a slot
- namespace the reservation scope per action so two distinct operations
  sharing a body and key no longer collide on one record
- heartbeat the processing lock so a long on-chain confirmation is not
  reclaimed and re-broadcast mid-flight
- drive finalize-vs-release off the execution outcome: keep failed records
  for replay instead of deleting a row whose tx may already be broadcast
- fence finalize/release/heartbeat on a per-acquire lock version and bound
  a single request's worst-case runtime below the processing-lock TTL
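
The per-acquire fencing in the last bullet can be illustrated with an in-memory sketch. The `FencedLock` class is hypothetical; the real implementation presumably compares a version column in the database. The point is that a holder whose lock was reclaimed cannot finalize, release, or heartbeat over the new holder.

```typescript
// Hypothetical in-memory lock with a version that increases on each acquire.
class FencedLock {
  private version = 0;
  private expiresAtMs = 0;
  private finalized = false;

  // Returns the new lock version, or null while another holder is live.
  acquire(nowMs: number, ttlMs: number): number | null {
    if (this.finalized || this.expiresAtMs > nowMs) return null;
    this.version += 1;
    this.expiresAtMs = nowMs + ttlMs;
    return this.version;
  }

  // Extends the lock only for the holder of the current version.
  heartbeat(version: number, nowMs: number, ttlMs: number): boolean {
    if (this.finalized || version !== this.version) return false;
    this.expiresAtMs = nowMs + ttlMs;
    return true;
  }

  // Marks the work complete only for the holder of the current version.
  finalize(version: number): boolean {
    if (this.finalized || version !== this.version) return false;
    this.finalized = true;
    return true;
  }
}
```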
…ients

- treat a client as confidential only when it explicitly registers
  client_secret_post or client_secret_basic, so a PKCE client that omits
  the field is not forced into a 401 on token exchange
- keep legacy rows public so secret enforcement applies only to clients
  registered after this change
…-threat-model

fix(mcp): harden MCP OAuth codes and bind session tokens to caller
…-webhooks

feat: Idempotency-Key support for mutating API endpoints
# Conflicts:
#	app/api/execute/[...slug]/route.ts
#	app/api/execute/check-and-execute/route.ts
#	app/api/execute/contract-call/route.ts
#	app/api/execute/node/route.ts
#	app/api/execute/transfer/route.ts
…onse-headers

feat: rate-limit response headers across all rate-limited endpoints
@joelorzet joelorzet requested review from a team, OleksandrUA, eskp and suisuss and removed request for a team June 12, 2026 22:41
@joelorzet joelorzet merged commit adf8351 into prod Jun 12, 2026
25 of 27 checks passed
