Skip to content

release: To Prod#1534

Merged
suisuss merged 68 commits into
prodfrom
staging
Jun 12, 2026
Merged

release: To Prod#1534
suisuss merged 68 commits into
prodfrom
staging

Conversation

@suisuss

@suisuss suisuss commented Jun 12, 2026

Copy link
Copy Markdown

No description provided.

suisuss and others added 30 commits June 3, 2026 21:48
Gas-credit allocations were snapshotted once per billing period and never
re-derived. When an org's plan or plan_overrides changed mid-period, the
allocation kept the cap from whatever plan it had when the row was first
written. Hit in prod: an org upgraded free -> enterprise stayed funded at the
free-tier cap for the rest of the period.

resolveAllocation now re-derives the cap on every read and raises the stored
allocation to it via onConflictDoUpdate guarded by setWhere
(allocated_cents < cap). Mid-cycle downgrades intentionally do NOT reduce the
current period's allocation (raise-only), to avoid clawing back credits an org
is already relying on.

Read-time self-heal was chosen over an event-driven refresh in the upgrade
handlers because it also covers manual enterprise provisioning (direct DB
edit, no code path to hook) and corrects existing stale rows on next read - so
no migration or backfill script is needed. No schema change.
…egistration

Open, unauthenticated Dynamic Client Registration validated redirect_uris
only with new URL(), so dangerous schemes (javascript:, data:, file:,
custom app schemes) and non-loopback http passed and were stored verbatim,
then used as redirect targets at the authorize step (F-001).

Add isAllowedRedirectUri() enforcing the OAuth 2.1 BCP / RFC 8252 allowlist
(https any host; http only for loopback) at registration, plus a
defense-in-depth guard on the authorize page for pre-existing rows. KEEP-726.
F-003: the dual-factor gate logged the user's live TOTP code in
cleartext (providedTotp) on every verify, a replayable secrets-in-logs
leak. Delete the leftover KEEP-471 [mfa-debug] diagnostics: the
plaintext-TOTP console.info and the no-two_factor-row console.warn in
the dual-factor gate, and the OAuth verify-mfa block that logged email
PII plus ran an extra two_factor query per login. Logging-only removal,
no control-flow change. Permanent oauth-callback traces are untouched.
…e reserved-field bypass

The /api/execute/node route gated ownership, wallet, and spending-cap checks on top-level integrationId/network only while spreading the full caller config into the step input. A caller could nest integrationId or network inside config to skip gating and still reach the step, enabling cross-tenant credential use and cap/wallet bypass. Gating now derives from a merged view of top-level and config values, and stepInput strips the reserved keys (network, integrationId, _context) so only route-resolved trusted values reach the step. Finding A-02 (KEEP-788).
… IDOR

F-002: getIntegration/updateIntegration/deleteIntegration fell back to a
createdBy-only filter when no active org was present, letting a user with a
null-org context GET/PUT/DELETE org-owned integrations they created in ANY
org, bypassing org-membership and deactivation-cascade gates.

Constrain the no-org personal branch to org-less rows (organizationId IS NULL)
so org-owned rows are unreachable via the personal path, while legitimate
personal (no-org) integrations stay accessible. Adds unit tests proving the
WHERE conditions for the personal vs org branches (KEEP-727).
F-009 (KEEP-734): POST /api/user/totp/setup overwrote the TOTP secret and
wiped backup codes with only a single live session as proof, enabling
account lockout/DoS and an MFA takeover when chained with /enroll. Gate
that rotation behind requireDualFactor (authenticator + email OTP) once
the account is fully enrolled, matching /disable; first-time and
pending-signup enrollment stay ungated. Also scope the /enroll session
path's requires_mfa clear to the just-rotated session instead of every
session, so a parallel quarantined session stays quarantined.
…hild

Close F-010: the sandbox grandchild emitted its result via the async
process.stdout.write and never exited, leaving an event-loop turn in which
escaped user code could schedule a later forged sentinel that the parent's
lastIndexOf-based parser would select over the genuine v8 payload. writeResult
now writes the sentinel line synchronously to fd 1 (write-all loop) and then
process.exit(0)s, so the legitimate result is provably the final bytes the
child ever emits. Wire format (v8.serialize) is unchanged, so all consumers
and the BigInt/Map fidelity contracts are untouched. Refs KEEP-735.
… sinks

OAuth Bearer tokens carry an mcp:read|write|admin scope that the MCP tool
wrapper enforces, but the REST routes those tools forward to dropped it and
ran any authenticated token. A read-only token could POST directly to the
execute and workflow-mutation endpoints and perform writes (A-03 / KEEP-789).

Surface the token scope through the execute and dual-auth contexts and add a
requireScope guard (undefined scope = non-OAuth full access for kh_ keys,
sessions, and internal callers) at the fund and state-changing sinks.
…eter

Condition node expressions were evaluated server-side with new Function(), and the regex validator gating them was bypassable (bare global calls such as fetch(...) and unicode-escaped constructor access), enabling SSRF and remote code execution in the workflow runtime (A-01).

Replace the new Function() evaluation with a self-contained tokenizer, parser, and strictly allowlisted tree-walking interpreter that can only read the resolved values and apply a fixed operator and method set. Harden the code generator to neutralize unsupported syntax in displayed and exported source.

Refs: KEEP-787 (A-01).
…ualfactor-logs-user-submitted-totp-code-in

fix(mfa): remove plaintext TOTP code and PII from debug diagnostics
…ion-getputdelete-falls-back-to-createdby-ex

fix(integrations): scope createdBy fallback to org-less rows to close IDOR
…-accepts-arbitrary-redirect_uri-schemes-oauth

fix(oauth): enforce redirect_uri scheme allowlist on dynamic client registration
…enode-trusts-reserved-fields-nested-in-config

fix(execute): derive node-execution gating from merged config to close reserved-field bypass
…ret-rotation-requires-no-step-up-stolen

fix(mfa): require step-up to rotate an already-enrolled TOTP secret
…-node-expression-validator-is-bypassable-in

fix(workflow): evaluate condition expressions with a safe AST interpreter
Closes A-04 (KEEP-790): OAuth/MCP tokens were never re-validated for current org membership. Re-check membership on every access-token use and refresh-grant exchange, and revoke a user's MCP OAuth refresh tokens for an org when they leave it.
The unauthenticated GET /api/mcp/workflows/[slug]/listing returned full
workflow nodes (contract addresses, webhook URLs, calldata) to anonymous
callers. Add getWorkflowListingPublic projecting an explicit nodes-free
allowlist and use it on the public route; the authenticated per-workflow
MCP server keeps getWorkflowListing for trigger detection. KEEP-739 / F-014.
…te sinks

The scope guard added across the REST execute and workflow-mutation routes
missed two OAuth-reachable write sinks: POST /api/workflows/import (creates a
workflow) and DELETE /api/workflows/[workflowId]/executions (purges execution
rows and logs). A read-scoped token could reach both. Add the same
requireScope(scope, SCOPE_MCP_WRITE) gate, placed before any body parse or DB
access, and extend the scope-enforcement suite to cover both sinks.
…P gate bypass

The better-auth bearer() plugin was unused scaffolding (introduced in
2e17fc6). With it registered, a better-auth session token could be
presented as an Authorization: Bearer header, which sidestepped the
cookie-based origin/CSRF checks and the per-session MFA and IP step-up
gates that key off the session cookie.

Nothing in the app sends a session token as a bearer credential. kh_ API
keys, OAuth JWTs, and wfb_ webhook tokens each have independent auth paths
and are unaffected by this removal. deviceAuthorization() and twoFactor()
and all other plugins remain registered unchanged.

Adds tests/unit/auth-plugins.test.ts as a regression pin: it asserts no
plugin with id "bearer" is registered while deviceAuthorization and
twoFactor still are.

Refs: KEEP-746 (F-021)
…pe-mcpreadwriteadmin-is-enforced-only-at-the

fix(oauth): enforce token scope at REST execute and workflow-mutation sinks
Addresses PR review (F-010): the flush-then-hard-exit timing fix did not
close the forgery vector, because an escaped child can disable process.exit
and let a pre-scheduled timer append a later sentinel that the parent's
stdout lastIndexOf() then selects.

Move the result off stdout entirely. The child now emits its v8-serialized
outcome as a single length-prefixed frame (4-byte big-endian count + raw v8
bytes) on a dedicated pipe (fd 3), written through an fs.writeSync reference
captured at child startup. Both parents (in-pod plugins/code/steps/run-code
and the standalone @keeperhub/sandbox service) spawn with a 4th pipe, read
only the FIRST complete frame, and never deserialize stdout (left purely
user-facing and drained). This removes the 6-byte substring-scan attack
surface and makes post-result async forgery lose to the genuine first frame
even when the escape reassigns process.exit or fs.writeSync.

Reconciles the sentinel inconsistency the review flagged: the bare-"RESULT"
stdout readers are gone; the HTTP path (client.ts <-> index.ts) keeps its
own consistently-framed sentinel for trusted server-to-server framing.

Residual (documented): a fully-escaped child can still write a forged frame
synchronously BEFORE the genuine result, occupying frame #0; eliminating
parent-side v8.deserialize of untrusted bytes would drop structured-clone
fidelity and is a separate product decision.

Tests: forgery rejection via stdout, a later fd-3 frame, process.exit
disabled, and fs.writeSync reassigned; pure first-frame-wins/oversize reader
unit tests; and a larger-than-pipe-buffer multi-chunk round-trip.
Password change now revokes all sessions except the caller's current
one, and password reset deletes every session, so a stolen cookie
cannot survive credential rotation (KEEP-730, F-005).
…llocation-backfill

fix: self-heal gas-credit allocation on mid-period plan change
The A-04 membership re-check added an isUserMemberOfOrganization call to
authenticateOAuthToken; the existing deactivated-user unit test exercises
the real module and its db mock lacks the member query, so the active-user
case failed. Stub membership=true so the deactivation behaviour is tested in
isolation.
suisuss added 2 commits June 12, 2026 14:58
The run-log error message used the destructive (red) Alert variant, which is
jarring for what is often a transient, retryable system notice. Drop the variant
so it falls back to the neutral default (card background, muted text); the code
in the message still conveys the failure.
…failures

feat: surface system error codes and phantom executions on workflow runs
suisuss added 5 commits June 12, 2026 15:40
projectNodesForPublicFeed only read data.config.actionType / triggerType,
so nodes carrying the legacy top-level data.actionType (the shape the rest
of the codebase - lib/mcp/calldata.ts, lib/mcp/listing-validators.ts -
already accepts) projected a blank action type and the marketplace icon
fell back to a generic box. Read both shapes and normalize the colon form
(code:run-code -> code/run-code) so the icon lookup's split("/") resolves
the brand. Output shape is unchanged. Add legacy-shape and precedence tests.
…eaks

The projection functions returned Record<string, unknown>[] and the public
feed response is typed SavedWorkflow[] at the client, so the stripped runtime
shape did not match the declared type: a future reader of
node.data.config.<secret> would compile and silently get undefined, and the
route could spread the full row back in without a compile error.

Add strict PublicFeedNode / PublicFeedEdge types (config holds only the two
render hints) as the projection return types, and pin the route's feed map to
a PublicFeedWorkflow element type so dropping the projection fails to compile.
Coerce projected scalars (asString / asNumber) so the narrow types are sound
rather than asserted. No change to the rendered shape; projection tests
unchanged and green.
Organization API keys are org-owned, not user-owned, and minting one
already requires admin/owner role plus dual-factor. Revoking them on a
password reset took down the whole team's automation on a routine,
usually-benign event for no security gain. Keep the user-scoped
credential revocations (sessions, api keys, MCP refresh tokens, MCP auth
codes, device codes); genuine compromise is handled by account
deactivation, whose cascade already revokes org keys.
Drives the real reset handler against a real Postgres to prove the
user-scoped credential rows are actually gone after a reset and the
org-owned API key is left intact, complementing the mocked spec which
only asserts a delete was issued.
…-resetchange-does-not-invalidate-existing

fix(auth): invalidate sessions on password reset and change
…isting-get-returns-full-workflow-nodes-to

fix(mcp): drop workflow nodes from public listing endpoint
@suisuss suisuss merged commit c90121e into prod Jun 12, 2026
44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants