
release: To Prod #1466

Merged
suisuss merged 43 commits into prod from staging
Jun 5, 2026

Conversation

@suisuss

@suisuss suisuss commented Jun 5, 2026


No description provided.

suisuss added 30 commits June 3, 2026 23:04
Migrate workflow ownership from the creating user to the organization, and
add org/member/workflow deactivation cascades. userId becomes createdBy
(audit only); the org is the authoritative owner.

- Anonymous accounts now get an org; account-link re-parents content (lib/auth.ts)
- workflows.organizationId NOT NULL + backfill migration (0100); create paths stamp/guard org
- Access authority is org membership only (lib/workflow/access.ts)
- Execution authority is org-deactivation; retire creator-user owner_deactivated
  (executable.ts, scheduler, execute, webhook, MCP call, keeperhub-executor/*)
- Deactivation cascades: workflows/organization.deactivatedAt, triggers (0099),
  block enable of a deactivated workflow, deny access to a deactivated org

NOT DONE (deferred): Phase E runtime credential/integration + API-key repoint to
the org principal (security-critical, all-or-nothing). Migration 0098 (deactivatedAt
columns) is not yet generated; 0099/0100 need journal entries. Not type-checked or
tested (no local Node toolchain); draft for review.
…t suite

Complete the ownership migration (Phase E) and verify the whole branch (Phase F)
in a dockerized Node toolchain (node:24, pnpm 9, postgres:16).

Phase E - runtime execution principal + credentials + api-key to the org:
- isIntegrationUsable/filterUnauthorizedIntegrationIds support the ORG principal
  (userId null + organizationId): entitled to its org's organization-visibility
  integrations; private and per-user specific_members grants do not resolve
- all six validateWorkflowIntegrations gates and the database-query runtime
  principal moved to the org principal together (gate matches runtime)
- webhook wfb_ key now requires current membership of the workflow's org
- ownerId stays as createdBy attribution only (build-executor-input)

Phase F - migrations + verification:
- 0098 deactivation columns (hand-authored: drizzle-kit generate is broken by a
  pre-existing snapshot-parent collision at 0081-0089), journal entries 98-100
- full migration chain 0000-0100 applied cleanly to a fresh database
- live psql trigger matrix: NOT NULL enforced; workflow/org deactivation blocks
  executions; owner cascade fires only when no active owner remains; KH001
  session block intact
- tsc clean; scoped lint clean; unit suite 308 files / 5728 tests green
- stale tests reshaped to the org-ownership contract (access, dual-auth, x402,
  mcp meta-tools, soft-delete, scheduler lifecycle incl. real cascade e2e);
  e2e fixtures seed owning orgs; seeds require an org
Review of the listing/marketplace/Hub surfaces confirmed publishing, payments,
quotas, and reachability are already fully org-keyed. This commit closes the
remaining data + code hygiene:

- migration 0101: normalize workflows.is_anonymous to false. The flag encoded
  "null-org logged-out session", a state that no longer exists; legacy rows
  backfilled into orgs by 0100 were wrongly hidden from their own org's
  workflow list by the is_anonymous filter
- remove the dead anonymous branches from the list/create/import/duplicate
  routes (org-scoped only) and duplicate's unreachable retire-source branch;
  annotate the claim route as retired-in-practice (dialog removal follow-up)
- stop exposing userId/isAnonymous in the public templates feed (unconsumed)
- document audit-only semantics: workflow_executions.userId (createdBy
  lineage; authority is the org), CALL_ROUTE_COLUMNS.userId, and
  getUserIdFromExecution (live: per-user RPC prefs in generated web3 steps)

Verified (dockerized node:24 + pnpm 9 + postgres:16): tsc clean, lint clean on
changed files, unit suite 308 files / 5728 tests green, 0101 applied with live
assertions (legacy row flips and becomes visible in its org list), plugin
codegen byte-identical.
- Add organization to db/schema mocks and leftJoin to select chains for
  execute and webhook routes (routes now gate executability via org join)
- Update org-gate mock default from deactivatedAt to orgDeactivatedAt
- Update workflow-code-route test: owner access now requires org membership
- Update validateWorkflowIntegrations assertions from userId to null
  (create, current, listing routes now use org principal, not user)
Wrap the three re-parent DB updates (workflows, integrations,
workflowExecutions) in a single transaction so a mid-flight failure
cannot leave partially re-parented state. Move the workflowExecutions
update inside the targetMembership guard so execution history is only
re-attributed when there is a resolved target org. Remove the large
commented-out predecessor implementation.
Without a role filter, a user's null-org workflows could be assigned
to any org they are a member of, not just one they own. A user who is
a plain member of another team's org would have their personal
workflows incorrectly placed there. Restrict step 2 to role = 'owner'
to match the intent documented in the migration header.
The claim path moved anonymous null-org workflows into the caller's
org. Since every workflow is org-owned from creation (anonymous
sessions get an org at signup) and migration 0101 normalized
is_anonymous to false, the gate always rejected. Replace the live
DB-write code path with a hard 410 so the behavior is enforced at the
HTTP layer rather than relying on a data invariant.
The webhook auth path relies on the api_keys cascade having cleaned up
deactivated users' keys, but does not independently verify the user is
still active. Add a defense-in-depth check by joining users and
asserting deactivated_at IS NULL. A deactivated member whose key was
not caught by the cascade is now rejected at the membership check
rather than passing through.
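A minimal sketch of the defense-in-depth check described above, written as a pure predicate over already-joined rows. The row shape and the name `canKeyTriggerWebhook` are hypothetical; the commit describes the real check as a join on users asserting deactivated_at IS NULL.

```typescript
// Hypothetical row shape: one org membership joined to its user record.
interface MemberRow {
  userId: string;
  organizationId: string;
  userDeactivatedAt: Date | null; // stands in for users.deactivated_at
}

// True only when the key's user is a current member of the workflow's org
// AND the user account is still active.
function canKeyTriggerWebhook(
  rows: MemberRow[],
  keyUserId: string,
  workflowOrgId: string,
): boolean {
  return rows.some(
    (r) =>
      r.userId === keyUserId &&
      r.organizationId === workflowOrgId &&
      r.userDeactivatedAt === null,
  );
}

const memberRows: MemberRow[] = [
  { userId: "u1", organizationId: "org1", userDeactivatedAt: null },
  // A deactivated member whose key survived the cascade.
  { userId: "u2", organizationId: "org1", userDeactivatedAt: new Date(0) },
];

console.log(canKeyTriggerWebhook(memberRows, "u1", "org1"));
console.log(canKeyTriggerWebhook(memberRows, "u2", "org1"));
```

In this sketch the deactivated member fails the same predicate as a non-member, so the caller does not need a separate code path for keys the cascade missed.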
Two bare console.error(error) calls in the user.create.after org-mint
block and session.create.after active-org block are swapped for
logSystemError with ErrorCategory.AUTH. This routes through the
project's structured logging path (Sentry + Loki JSON) rather than
raw stderr, matching the convention used elsewhere in the system.
Add an explicit orderBy(createdAt) to the owner-membership lookup so
the oldest org is selected deterministically rather than relying on
undefined LIMIT 1 ordering. Add a comment explaining the intent:
the oldest owner membership is the user's personal org minted at
signup, which is the right target for anonymous content re-parenting.
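The deterministic selection above can be illustrated in memory. This is only a sketch: the `Membership` shape and `pickReparentTarget` are hypothetical names, and the real lookup is a database query with an explicit orderBy(createdAt) and LIMIT 1.

```typescript
// Hypothetical membership shape for illustration.
interface Membership {
  organizationId: string;
  role: "owner" | "admin" | "member";
  createdAt: Date;
}

// Mirrors "WHERE role = 'owner' ORDER BY createdAt ASC LIMIT 1":
// the oldest owner membership is assumed to be the personal org minted at signup.
function pickReparentTarget(memberships: Membership[]): Membership | undefined {
  return memberships
    .filter((m) => m.role === "owner") // filter() copies, so sort() does not mutate the input
    .sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime())[0];
}

const sampleMemberships: Membership[] = [
  { organizationId: "team-org", role: "owner", createdAt: new Date("2025-06-01") },
  { organizationId: "personal-org", role: "owner", createdAt: new Date("2024-01-15") },
  { organizationId: "other-team", role: "member", createdAt: new Date("2023-03-10") },
];

console.log(pickReparentTarget(sampleMemberships)?.organizationId);
```

Without the explicit sort, whichever owner row the database happened to return first would win, which is the nondeterminism the commit removes.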
Dedicated bearer-token guard for KeeperHub platform operator
endpoints. Uses KH_ADMIN_SECRET independently of INTERNAL_SERVICE_KEY
so a service credential compromise does not grant admin deactivation
powers. Includes the deactivation capability matrix as documentation.
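A bearer-token guard of this kind might look roughly like the sketch below. The function name, the fail-closed behaviour when the secret is unset, and the constant-time comparison are all assumptions for illustration; the commit only states that the guard checks KH_ADMIN_SECRET independently of INTERNAL_SERVICE_KEY.

```typescript
// Hypothetical guard: validates "Authorization: Bearer <token>" against the admin secret.
function isAdminAuthorized(
  authorizationHeader: string | null,
  adminSecret: string | undefined,
): boolean {
  // Assumption: fail closed when the secret is not configured.
  if (!adminSecret) return false;
  if (!authorizationHeader || !authorizationHeader.startsWith("Bearer ")) return false;

  const token = authorizationHeader.slice("Bearer ".length);
  if (token.length !== adminSecret.length) return false;

  // Assumption: compare every character without an early exit so that
  // response timing does not reveal how much of the token matched.
  let diff = 0;
  for (let i = 0; i < token.length; i++) {
    diff |= token.charCodeAt(i) ^ adminSecret.charCodeAt(i);
  }
  return diff === 0;
}

console.log(isAdminAuthorized("Bearer example-secret", "example-secret"));
console.log(isAdminAuthorized("Bearer wrong-secret!!", "example-secret"));
```

In a Node runtime, `timingSafeEqual` from `node:crypto` would be the more conventional choice for the comparison step; the manual loop here just keeps the sketch dependency-free.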
POST /api/admin/orgs/:orgId/deactivate (KH admin secret required).
Sets organization.deactivated_at and cascades to all non-deleted,
non-already-deactivated workflows in the org in one transaction.
Returns the count of workflows deactivated alongside the org id and
timestamp. Already-deactivated returns 409; not-found returns 404.
DB trigger block_executions_for_inactive_workflows backstops execution
at the database layer immediately after the org row is updated.
POST /api/admin/users/:userId/deactivate (KH admin secret required).
Deactivates the user's account globally: sets users.deactivated_at,
deletes all active sessions, and revokes org API keys created by the
user — all in one transaction. No dual-factor challenge (that is a
self-service safety rail). DB triggers then cascade: mcp tokens and
device codes are deleted (0085), and any org where the user was the
sole active owner is deactivated (0099). Already-deactivated returns
409; not-found returns 404.
POST /api/admin/workflows/:workflowId/deactivate (KH admin secret required).
Sets workflows.deactivated_at and enabled=false in one transaction.
Soft-deleted workflows return 404 (treated as not found). Already-
deactivated returns 409. The enabled=false write ensures the scheduler's
workflowExecutableConditions() SELECT also excludes the workflow
independently of the deactivation gate.
Three test files covering the admin org, user, and workflow deactivation
routes. Each tests: missing/wrong secret (401), not found (404), already
deactivated (409), happy path (200), and database error (500). Org tests
additionally verify the workflowsDeactivated count in the response.
Follow the mock-DB pattern from existing integration test files.
…anization

The deactivation guard added an innerJoin(users) to the membership
check. Four integration test files had a db mock that only handled
bare from().where() — the new chain is from().innerJoin().where().
Add innerJoin to each mock's from() return so the membership lookup
resolves instead of throwing on an undefined method.
The previous select-then-update pattern had a TOCTOU window: two concurrent
requests could both pass the deactivatedAt IS NULL guard on SELECT and then
both execute the UPDATE, with the second overwriting the first deactivatedAt
timestamp.

Replace with a single conditional UPDATE (WHERE deactivatedAt IS NULL AND
deletedAt IS NULL) that atomically sets the state. Only fall back to a SELECT
when the UPDATE returns no rows, to distinguish not_found from
already_deactivated - the same pattern used in the org deactivate endpoint.
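The update-first pattern can be sketched against an in-memory table. Everything here is illustrative: the `WorkflowRow` shape and function names are hypothetical, the Map stands in for the database, and the enabled=false write is borrowed from the workflow deactivate endpoint described earlier.

```typescript
interface WorkflowRow {
  id: string;
  deactivatedAt: Date | null;
  deletedAt: Date | null;
  enabled: boolean;
}

type DeactivateResult = "deactivated" | "already_deactivated" | "not_found";

// In-memory stand-in for a single conditional statement such as:
//   UPDATE ... SET deactivated_at = now, enabled = false
//   WHERE id = ? AND deactivated_at IS NULL AND deleted_at IS NULL
// Returns the number of rows changed.
function conditionalDeactivate(table: Map<string, WorkflowRow>, id: string, now: Date): number {
  const row = table.get(id);
  if (!row || row.deactivatedAt !== null || row.deletedAt !== null) return 0;
  row.deactivatedAt = now;
  row.enabled = false;
  return 1;
}

function deactivateWorkflow(table: Map<string, WorkflowRow>, id: string, now: Date): DeactivateResult {
  if (conditionalDeactivate(table, id, now) === 1) return "deactivated";
  // Fallback read runs only when nothing was updated, to tell 404 from 409.
  const row = table.get(id);
  if (!row || row.deletedAt !== null) return "not_found";
  return "already_deactivated";
}

const workflowTable = new Map<string, WorkflowRow>([
  ["wf1", { id: "wf1", deactivatedAt: null, deletedAt: null, enabled: true }],
  ["wf2", { id: "wf2", deactivatedAt: null, deletedAt: new Date(0), enabled: false }],
]);

console.log(deactivateWorkflow(workflowTable, "wf1", new Date("2026-06-05T00:00:00Z")));
console.log(deactivateWorkflow(workflowTable, "wf1", new Date("2026-06-06T00:00:00Z")));
```

Because the guard lives in the write itself, a second request cannot overwrite the first timestamp: it matches zero rows and falls through to the classification read.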
Step 1 previously minted an org only for users with zero memberships. Step 2
assigns workflows to the user's oldest OWNER membership. If a user had null-org
workflows and some member/admin memberships but no owner role, step 1 would
skip them, step 2 would produce NULL (subquery returns no rows for role='owner'),
and step 3's existence check would abort the migration.

Widen the NOT EXISTS predicate to exclude any user with an owner membership,
so step 1 also mints an owner org for the member/admin-only case.
The try/catch in databaseHooks.user.create.after was logging the error but
allowing signup to complete, leaving users without an organization. Those
users could authenticate but would fail on every subsequent workflow or
integration action since organizationId is now NOT NULL.

Re-throw after logging so signup fails cleanly. The user gets a recoverable
error and can retry, rather than a silently broken account.
If the linking user has no owner membership (e.g. their org-mint failed at
signup), the anonymous content was silently left in the anonymous org with
no indication that re-parenting was skipped. The user would lose access to
all pre-link workflows.

Invert the conditional to log a structured error with both user IDs when
targetMembership is absent, so the incident is visible in the error log.
…tions

The function references the organization table in its WHERE fragment but
cannot enforce that the caller has joined it - omitting the join produces
a runtime SQL error. Expand the JSDoc to explain the requirement, the safe
join pattern, the error symptom, and the fetch-then-gate alternative for
callers that cannot add the join.
isUserMemberOfOrganization gained an innerJoin(users, ...) so it can filter
deactivated members at the DB layer. Four test files mocked @/lib/db/schema
without the users export and had db fluent chains that did not include the
innerJoin().where().limit() path used by the membership check.

- Add users to the schema mocks in all four files
- Update the unit test db mocks (workflow-access, workflow-soft-delete) to
  thread innerJoin into the fluent chain
- Update workflow-code-route db mock (innerJoin chain already correct, users
  export missing)
- Update workflow-listing-route db mock: the innerJoin chain has two callers -
  public-tag queries (await .where() directly) and membership checks (call
  .where().limit()). Make .where() return a Promise augmented with .limit so
  both patterns resolve correctly against their respective mocks.
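The "Promise augmented with .limit" trick from the last bullet can be shown without a test framework. This is a sketch only: `whereResult` and `mockDb` are hypothetical names, and plain functions stand in for whatever mock helpers the real test files use.

```typescript
type Row = { id: string };

// A value that can be awaited directly (public-tag queries) or have
// .limit() chained onto it (membership checks).
function whereResult(rows: Row[]): Promise<Row[]> & { limit: (n: number) => Promise<Row[]> } {
  return Object.assign(Promise.resolve(rows), {
    limit: (n: number) => Promise.resolve(rows.slice(0, n)),
  });
}

// Minimal fluent chain: select().from().innerJoin().where()
const mockDb = {
  select: () => ({
    from: () => ({
      innerJoin: () => ({
        where: () => whereResult([{ id: "a" }, { id: "b" }]),
      }),
    }),
  }),
};

async function demo(): Promise<[number, number]> {
  // Caller 1: awaits .where() directly.
  const direct = await mockDb.select().from().innerJoin().where();
  // Caller 2: chains .limit() before awaiting.
  const limited = await mockDb.select().from().innerJoin().where().limit(1);
  return [direct.length, limited.length];
}

demo().then(([direct, limited]) => console.log(direct, limited));
```

A bare Promise returned from where() has no limit method, which is the TypeError a chained caller would otherwise hit.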
…cipal model

Three pre-existing test failures from the org-ownership semantic change:

1. execute-route-integration-authz.test.ts: route passes null (org principal)
   to validateWorkflowIntegrations; test still expected the caller userId.
   Update assertion and description to reflect org-principal is the gate.

2. webhook-route.test.ts: getWorkflowAccess was called with organizationId: null,
   making all org-owned workflows fail the org-match check and return 404 for
   every downstream gate (trigger type, integrations, rate limit, success).
   Fix: pass workflow.organizationId. Update the two auth tests: under the
   org-member model any member can trigger a webhook, so "403 different user"
   now requires the user to be a non-member (add mockMemberLimit override);
   "404 removed member" becomes 403 since the membership check in validateApiKey
   is the gate.

3. workflow-schedule-validation-route.test.ts: innerJoin().where was wired
   directly to mockSelectFrom; isUserMemberOfOrganization chains .limit() onto
   that result, hitting TypeError on the raw Promise -> 500 on every test.
   Wrap where in a function that augments the promise with .limit, and add
   users to the schema mock.
…ments

Map KH_ADMIN_SECRET from the keeperhub/kh-admin-secret SSM parameter in
the PR template, staging, and prod Helm values so the admin deactivation
endpoints authenticate correctly on deploy.
getWorkflowAccess requires a matching organizationId to grant access.
The internal execution branch passed organizationId: null, which can
never satisfy hasSameOrgContext once migration 0100 enforces NOT NULL
on workflows.organizationId — every internal caller (scheduler, block-
dispatcher, event worker, MCP) returned 404.

Internal service auth is already verified by authenticateInternalService
before this branch runs. Lifecycle gates (deleted, deactivated) are
covered by getWorkflowExecutability further down. Remove the redundant
and broken access check from the internal branch.
workflow-runner.ts and in-process.ts called validateWorkflowIntegrations
with (nodes, workflow.userId, workflow.organizationId). The org-ownership
migration moved all six HTTP-route gates to the org principal (null,
organizationId) but missed these two executor paths.

Effect: private integrations visible to the creator user remained usable
at execution time even after the creator left the org. Change the userId
argument to null so the executor resolves credentials identically to the
HTTP routes and the MCP call route.
ownerId implied authority it never had: it is used only to populate the
owner_id log label in Sentry error contexts and never for credential
resolution. The org-ownership model makes this name actively misleading
since the org is the execution principal, not the creator user.

Rename the TypeScript field to createdBy across WorkflowExecutionInput,
StepContext, executeWorkflowInBackground, and buildExecutorInput. The
external owner_id JSON key in log labels and error contexts is unchanged
so Sentry queries and Grafana dashboards are unaffected.
GET and POST looked up the working draft by userId alone. A user who had
left an org could still find and mutate a draft belonging to that org
because userId confers no access authority under the org-ownership model.

Add organizationId to both WHERE clauses so the draft is only accessible
while the user is acting in the org that owns it. The GET now fetches
orgContext early and short-circuits to an empty draft (rather than 401)
when no org is active, consistent with the POST behaviour. Hoist the
organizationId extraction in POST so it is available before the lookup
instead of after.
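As an in-memory illustration of the tightened lookup, assuming a hypothetical `Draft` shape and a `findCurrentDraft` helper that are not in the codebase:

```typescript
// Hypothetical draft shape for illustration.
interface Draft {
  id: string;
  userId: string;
  organizationId: string;
}

// Returns the caller's working draft only while they act in the owning org.
// null stands for the "empty draft" response.
function findCurrentDraft(
  drafts: Draft[],
  userId: string,
  activeOrgId: string | null,
): Draft | null {
  // No active org: short-circuit to an empty draft rather than an error.
  if (activeOrgId === null) return null;
  return (
    drafts.find((d) => d.userId === userId && d.organizationId === activeOrgId) ?? null
  );
}

const sampleDrafts: Draft[] = [
  { id: "d1", userId: "u1", organizationId: "org-old" },
  { id: "d2", userId: "u1", organizationId: "org-new" },
];

// u1 has left org-old and is now acting in org-new.
console.log(findCurrentDraft(sampleDrafts, "u1", "org-new")?.id);
console.log(findCurrentDraft(sampleDrafts, "u1", null));
```

Matching on userId alone would have returned d1 first in this sample, which is the leak the commit closes.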
Mirrors the semantic demotion already applied to workflows.userId in
migration 0100: the column identifies the credential creator for audit
purposes, not the access owner. The org (organizationId) is the
authorization authority; user_id is creator attribution only.

Changes:
- Migration 0102: ALTER TABLE integrations RENAME COLUMN user_id TO
  created_by, drop/recreate index under new name
- lib/db/schema.ts: field userId -> createdBy, index renamed
- drizzle/relations.ts: FK fields reference updated
- lib/integrations/authorization.ts: IntegrationAuthRow.userId ->
  createdBy; isIntegrationUsable creator shortcut updated;
  filterUnauthorizedIntegrationIds select + ownerIds (renamed
  creatorIds) + deactivation has() call updated
- lib/db/integrations.ts: DecryptedIntegration.userId -> createdBy;
  integrationWithWalletSelect column alias updated; getIntegrations,
  getIntegration, getIntegrationById return objects updated; fallback
  conditions (no-org CRUD path) updated; createIntegration values
  insert uses createdBy: userId mapping
- lib/auth.ts: account-link re-parenting set/where updated
- lib/metrics/db-metrics.ts: countDistinct column updated
- tests: ownerId -> createdBy assertions for executor input chain
suisuss added 13 commits June 5, 2026 14:20
localstack/localstack:latest now requires an auth token even for local
dev (introduced in 4.0). Pin to 3.8.1 (last pre-auth release) and drop
LOCALSTACK_AUTH_TOKEN from the dev/minikube compose config.

deploy.sh sourced .env with bash `set -a / .` which chokes on JSON
values containing parentheses (CHAIN_RPC_CONFIG comment). Replace with
targeted greps for only the two vars the script actually needs.
integration-authorization.test.ts: IntegrationAuthRow.userId was renamed
to createdBy; update the row() helper and all fixtures.integrations entries
so isIntegrationUsable and filterUnauthorizedIntegrationIds read the
correct field. Without this the creator-shortcut check evaluates
principal.userId === undefined (always false) and the deactivation-cascade
test builds creatorIds from undefined values.

execute-api.test.ts: the "validates integrations belong to workflow owner"
test was calling the mock directly with the old two-argument signature
(nodes, userId) and asserting on its own call, which proved nothing.
Rewritten to assert the correct org-principal contract: (nodes, null, orgId).
… rename

drizzle/schema.ts (the generated Drizzle introspection snapshot) still
had userId/user_id for the integrations table after migration 0102
renamed the column to created_by. drizzle/relations.ts references
integrations.createdBy, which caused a TS2339 typecheck failure because
the property did not exist on the generated table type.

drizzle-kit generate cannot be run due to a pre-existing snapshot-parent
collision, so the snapshot is updated manually to mirror what migration
0102 applied to the database.
…tion

IntegrationPrincipal.userId was always null at every call site, making
three code paths permanently unreachable: the creator shortcut in
isIntegrationUsable, the membership check, and the grant lookup in
filterUnauthorizedIntegrationIds.

Remove userId from IntegrationPrincipal and AuthContext (drop
isPrincipalMember and hasGrant). Simplify isIntegrationUsable to only
check org-match for organization-visibility integrations and always deny
private/specific_members. Drop the membership query and grant lookup from
filterUnauthorizedIntegrationIds entirely.

Update validateWorkflowIntegrations signature (drop userId param), all 8
call sites, the fetchCredentials call in database-query step, and all
affected tests.
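The simplified rule reads roughly as follows. This is a sketch under assumptions: the type shapes, the helper names `isUsableByOrg` and `unauthorizedIds`, and their signatures are illustrative and may not match the real isIntegrationUsable and filterUnauthorizedIntegrationIds.

```typescript
type Visibility = "organization" | "private" | "specific_members";

// Assumed row shape; createdBy is kept for audit only and is not consulted.
interface IntegrationRow {
  id: string;
  organizationId: string;
  visibility: Visibility;
  createdBy: string;
}

// After the cleanup the principal carries only the org.
interface OrgPrincipal {
  organizationId: string;
}

// Organization-visibility integrations are usable by their own org;
// private and specific_members integrations are always denied.
function isUsableByOrg(row: IntegrationRow, principal: OrgPrincipal): boolean {
  if (row.visibility !== "organization") return false;
  return row.organizationId === principal.organizationId;
}

// Hypothetical helper: ids the org principal may NOT use.
function unauthorizedIds(rows: IntegrationRow[], principal: OrgPrincipal): string[] {
  return rows.filter((r) => !isUsableByOrg(r, principal)).map((r) => r.id);
}

const sampleIntegrations: IntegrationRow[] = [
  { id: "i1", organizationId: "org1", visibility: "organization", createdBy: "u1" },
  { id: "i2", organizationId: "org1", visibility: "private", createdBy: "u1" },
  { id: "i3", organizationId: "org2", visibility: "organization", createdBy: "u2" },
  { id: "i4", organizationId: "org1", visibility: "specific_members", createdBy: "u1" },
];

console.log(unauthorizedIds(sampleIntegrations, { organizationId: "org1" }));
```

With no user on the principal there is nothing left for a creator shortcut, membership query, or grant lookup to key on, which is why those paths could be deleted outright.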
…havior

After KEEP-696, POST /api/workflows/current requires an active org and
returns 409 when none is present. The test was asserting the old fallback
behavior (pass null to validateWorkflowIntegrations, get 403) which no
longer exists.
The accept-invite page called signUp.email() without a captcha token,
bypassing the Turnstile gate that protects /sign-up/email in production.

- Import Turnstile and TurnstileInstance from @marsidev/react-turnstile
- Add captchaToken state, captchaRef, and resetCaptcha() to AuthFormState
- Show Turnstile widget below password field in signup mode only
- Disable submit button until captcha is solved (signup mode + site key set)
- Pass token via x-captcha-response header in trySignUp, matching dialog.tsx
- Reset captcha on error and on auth mode toggle
Admin deactivation sets deactivatedAt + enabled=false. The sidebar
shouldShowDisabledBadge only checked enabled, so deactivated workflows
showed Disabled — hiding the distinction from users.

- Add deactivatedAt to SavedWorkflow type (lib/api-client.ts)
- Add deactivatedAt to WorkflowEntry type (navigation-sidebar.tsx)
- WorkflowItem checks deactivatedAt first; shows Deactivated over Disabled
- shouldShowDisabledBadge is untouched; its guard skips deactivated rows
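The badge precedence in the list above could be expressed as a single function. This is a simplified sketch: `badgeFor` and the `SidebarWorkflow` shape are hypothetical, the string type for deactivatedAt is an assumption, and the real component keeps shouldShowDisabledBadge as a separate check.

```typescript
// Assumed minimal shape of a sidebar workflow entry.
interface SidebarWorkflow {
  enabled: boolean;
  deactivatedAt: string | null; // assumption: timestamp string from the API, or null
}

type Badge = "Deactivated" | "Disabled" | null;

// Deactivation is checked first so that it wins over the plain disabled state,
// since admin deactivation also sets enabled=false.
function badgeFor(workflow: SidebarWorkflow): Badge {
  if (workflow.deactivatedAt !== null) return "Deactivated";
  if (!workflow.enabled) return "Disabled";
  return null;
}

console.log(badgeFor({ enabled: false, deactivatedAt: "2026-06-05T00:00:00Z" }));
console.log(badgeFor({ enabled: false, deactivatedAt: null }));
console.log(badgeFor({ enabled: true, deactivatedAt: null }));
```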
feat: org-owned workflows and deactivation cascades
@suisuss suisuss added the metrics-db-reviewed Reviewer sign-off: metrics aggregate queries optimised + tables indexed (KEEP-680) label Jun 5, 2026
@suisuss suisuss merged commit 12b1b7e into prod Jun 5, 2026
44 of 46 checks passed

Labels

metrics-db-reviewed Reviewer sign-off: metrics aggregate queries optimised + tables indexed (KEEP-680)


1 participant