perf(frontend): speed up the real-time board hot path and trim the initial bundle #326
Merged
…itial bundle

Index the board's per-frame queries, group execution gate lookups, reconcile hydrate in place, lazy-load rarely-open panels, and reset per-workspace caches on a board switch.
- useBlockQueries: single-pass parentId→children / epicId→members index so tasksOf/modulesOf/childrenOf/allTasksUnder/epicMembers are O(1) lookups; a streamed single-block upsert no longer costs O(frames × N).
- execution store: decisionsByBlock / approvalsByBlock maps; BlockNode reads badges via O(1) lookups and counts merged/PR tasks in one pass.
- board.hydrate: reuse the existing object for unchanged blocks so a full refresh doesn't re-render every frame.
- index.vue: defineAsyncComponent + v-if-gate ~25 heavy/rare panels.
- requirements/clarity/brainstorm/consensus/github stores: reset() on workspace switch (wired in workspace.hydrate) so a switched-to board drops stale state.
- sandbox table single-pass joins, toRaw manifest clone, drop redundant deep settings watchers.
Tests for the index + gate maps + hydrate identity.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01532dao6X1gnpAgHdCRQrqM
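The single-pass index from the useBlockQueries bullet can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the `Block` shape, `buildBlockIndex`, `pushInto`, and this standalone `childrenOf` are hypothetical stand-ins.

```typescript
// Hypothetical, simplified block shape; the real board blocks carry more fields.
interface Block {
  id: string;
  parentId?: string;
  epicId?: string;
}

interface BlockIndex {
  childrenByParent: Map<string, Block[]>;
  membersByEpic: Map<string, Block[]>;
}

// Append `block` to the bucket stored under `key`, creating the bucket on first use.
function pushInto(map: Map<string, Block[]>, key: string, block: Block): void {
  const bucket = map.get(key);
  if (bucket) bucket.push(block);
  else map.set(key, [block]);
}

// One pass over all blocks fills both maps, so later lookups never rescan the list.
function buildBlockIndex(blocks: readonly Block[]): BlockIndex {
  const index: BlockIndex = { childrenByParent: new Map(), membersByEpic: new Map() };
  for (const block of blocks) {
    if (block.parentId !== undefined) pushInto(index.childrenByParent, block.parentId, block);
    if (block.epicId !== undefined) pushInto(index.membersByEpic, block.epicId, block);
  }
  return index;
}

const EMPTY: readonly Block[] = [];

// Map lookup instead of filtering every block for every frame.
function childrenOf(index: BlockIndex, parentId: string): readonly Block[] {
  return index.childrenByParent.get(parentId) ?? EMPTY;
}

const sampleIndex = buildBlockIndex([
  { id: "frame-1" },
  { id: "task-1", parentId: "frame-1", epicId: "epic-1" },
  { id: "task-2", parentId: "frame-1" },
]);
```

In a Vue store, an index like this would typically sit inside a `computed`, so it is rebuilt once per change to the block list rather than once per frame that asks for its children.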
… hot-path helpers
The lazy-panel change gated ~25 heavy panels behind `defineAsyncComponent` +
`v-if="<openFlag>"`. Most of those panels trigger their data fetch from a
non-immediate `watch(open|executionId|kind, …)`, which only fired on the
`false→true` flip. Under `v-if` the component now mounts *after* the flag is
already true, so that flip never occurs within the watcher's lifetime and the
load-on-open never ran — 16 panels opened empty/stale (Observability, Kaizen,
Bootstrap, DocumentImport, AddServiceFromRepo, GitHub, Slack, IntegrationsHub,
ObservabilityConnection, ModelConfiguration, LocalModelEndpoints,
LocalModeSettings, OpenRouterCatalog, VendorCredentials, UserSecrets, Sandbox).
Make each such watcher `{ immediate: true }` (correct now that mount ⇔ open), so
the first open fetches again. Panels that already loaded via an immediate watcher
(ProviderConnection, SpawnPreview, WorkspaceSettings) are unchanged.
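The failure mode can be reproduced with a toy model. The sketch below is not Vue's `watch`; `createFlag` and `mountPanel` are hypothetical helpers that reproduce only one property: a non-immediate watcher ignores the value that is already present when the watcher is created.

```typescript
// Toy model of a watched boolean flag. NOT Vue's implementation; it only
// models "non-immediate watchers skip the value present at creation time".
type Listener = (value: boolean) => void;

function createFlag(initial: boolean) {
  let current = initial;
  const listeners = new Set<Listener>();
  return {
    get: (): boolean => current,
    set(next: boolean): void {
      if (next === current) return; // no change, nothing to notify
      current = next;
      listeners.forEach((listener) => listener(next));
    },
    watch(listener: Listener, options: { immediate?: boolean } = {}): void {
      listeners.add(listener);
      if (options.immediate) listener(current); // run once with the current value
    },
  };
}

// Hypothetical panel: counts how often its load-on-open fetch would run.
function mountPanel(open: ReturnType<typeof createFlag>, immediate: boolean) {
  const panel = { loads: 0 };
  open.watch((isOpen) => {
    if (isOpen) panel.loads += 1;
  }, { immediate });
  return panel;
}

// Under v-if the parent sets the flag first and the panel mounts afterwards,
// so the false→true flip has already happened when the watcher is created.
const openFlag = createFlag(false);
openFlag.set(true);
const nonImmediatePanel = mountPanel(openFlag, false); // never sees the flip
const immediatePanel = mountPanel(openFlag, true); // loads on mount
```

In the real components the equivalent change is passing `{ immediate: true }` as the third argument to Vue's `watch`.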
Also:
- board.hydrate: cache per-block JSON by object identity (WeakMap) so a refresh
stringifies each kept block once instead of twice; self-invalidates on upsert.
- execution store: fold the two identical group-by-block builders into one
`groupByBlock` helper.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VZKnFLdzsUFhkZuYWCCkFm
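The board.hydrate idea (keep the existing object for unchanged blocks, and cache each block's JSON by object identity) might look like the sketch below. It assumes a simplified `Block` shape and that blocks are replaced rather than mutated in place; `jsonOf` and this standalone `hydrate` are illustrative names, not the store's actual code.

```typescript
// Hypothetical, simplified block shape.
interface Block {
  id: string;
  title: string;
}

// JSON text per block object, keyed by identity. A block that is replaced by a
// new object simply has no entry yet, which is what makes the cache
// self-invalidating when an upsert swaps the object.
const jsonCache = new WeakMap<Block, string>();

function jsonOf(block: Block): string {
  let json = jsonCache.get(block);
  if (json === undefined) {
    json = JSON.stringify(block);
    jsonCache.set(block, json);
  }
  return json;
}

// Reconcile a fresh snapshot against the current list, returning the existing
// object wherever the content is unchanged so identity-based checks still hold.
function hydrate(current: readonly Block[], incoming: readonly Block[]): Block[] {
  const byId = new Map<string, Block>();
  for (const block of current) byId.set(block.id, block);
  return incoming.map((next) => {
    const prev = byId.get(next.id);
    return prev !== undefined && jsonOf(prev) === jsonOf(next) ? prev : next;
  });
}

const before: Block[] = [
  { id: "a", title: "A" },
  { id: "b", title: "B" },
];
const refreshed = hydrate(before, [
  { id: "a", title: "A" },
  { id: "b", title: "B2" },
]);
```

Comparing `JSON.stringify` output is sensitive to key order, so this approach fits best when both sides are produced by the same serializer.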
…oads

The board page's panels are now defineAsyncComponent(() => import(...)), so Vite's startup dep scan (static imports only) stops at the dynamic-import boundary and defers discovery of their transitive deps to runtime. Each runtime discovery triggers a dep re-optimization + full page reload; under `nuxt dev` (which the Playwright e2e suite drives) such a mid-test reload aborts an in-flight page.goto with net::ERR_ABORTED, hanging a spec to its 180s timeout and inflating the e2e job from ~75s to ~4.5min.

Pre-bundle the exact set the dev server reports discovering (@vue-flow/*, @vueuse/core, markdown-it, wretch, valibot, the @toad-contracts client) so they're optimized once at startup, keeping dev/e2e deterministic without giving back the production code-splitting win.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TNfzeQZkC1HPAY8YHu2LAb
The optimizeDeps.include pin in the previous commit was silently ignored: nuxt dev runs from the deploy/frontend consumer, where the layer's deps aren't resolvable, so Vite logged "Unresolvable optimizeDeps.include entries" and pre-bundled nothing. The e2e flakiness (a spec hanging ~3min to its timeout, inflating the job to ~4.5min) therefore persisted.

Root cause: nuxt dev pre-bundles deps by crawling static imports only, so the board page's defineAsyncComponent(() => import(...)) panels hide their transitive deps from the startup scan. Vite discovers them at runtime and re-optimizes, each re-optimization forcing a full page reload that aborts an in-flight page.goto with net::ERR_ABORTED.

Fix: point the Playwright frontend webServer at a production build (nuxt build -> nuxt preview) rather than nuxt dev. A prod build emits every chunk ahead of time (no runtime re-optimization, no reloads), which removes the flake entirely and is robust to any future lazy-loaded panel. It also makes the e2e a more faithful test of the shipped artifact.

Revert the ineffective optimizeDeps.include (net-zero change to @cat-factory/app, so the existing perf changeset still applies; @cat-factory/e2e is changeset-ignored).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TNfzeQZkC1HPAY8YHu2LAb
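For orientation, a Playwright `webServer` entry that serves a production build could look like the sketch below. The `npx` invocation, port, and timeout are assumptions for illustration; the actual e2e config in this repo may differ.

```typescript
// Sketch of a `webServer` entry for the frontend in playwright.config.ts.
// Assumed values: the npx invocation, the port, and the timeout.
const frontendWebServer = {
  // Build once, then serve the built output, so no dev-server dep
  // re-optimization (and no forced page reload) can happen mid-test.
  command: "npx nuxt build && npx nuxt preview",
  // Playwright waits for this URL to respond before starting the specs.
  url: "http://localhost:3000",
  // Always start a fresh server so a stale `nuxt dev` is never reused.
  reuseExistingServer: false,
  // A production build takes longer to come up than a dev server.
  timeout: 180_000,
};
```

An object like this would be placed in the config's `webServer` field, next to the entry for the backend if there is one.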
kibertoad added a commit that referenced this pull request on Jun 27, 2026
* Add "forgot my password" reset flow for password logins

Implements self-service password reset for email/password users:
- POST /auth/forgot-password mints a single-use, 1h-expiring token (only its SHA-256 hash stored) and emails a reset link; the request always returns 204 so it can't be used to enumerate accounts.
- POST /auth/reset-password redeems the token, sets a new password (reusing the PBKDF2 PasswordHasher), consumes the token, and supersedes other pending ones.
- New password_reset_tokens table mirrored across D1 and Drizzle/Postgres, with a cross-runtime conformance suite asserting parity, plus retention pruning on both runtimes.
- New deployment-level system email sender (EMAIL_SYSTEM_PROVIDER/FROM/API_KEY) for auth emails, independent of the per-account connections; absent, the reset link is logged for local/dev.
- Frontend: "Forgot password?" entry in the login screen and a public /reset-password page.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012NVz64XyUrRU1ofPseTXiM

* feat(local): warm container pool + checkout reuse, and optional native execution (#298)

* feat(local): warm container pool + checkout reuse, and optional native execution

Two local-mode performance features, both opt-in and default-off so the existing per-run-container path is byte-identical when unused.

Warm pool + persistent checkout (LOCAL_POOL_SIZE): the local runner transport can keep idle harness containers warm and lease one (preferring a member already holding the run's repo) instead of cold-starting per run. A leased member reuses a stable per-repo checkout (reset --hard + keep-list clean sweep preserving dep caches, then fetch + switch branch) via the new persistentCheckout harness job field. Pooling is Docker-family only (new capabilities.pooling); Apple container keeps the per-run path.
Native execution (LOCAL_NATIVE_AGENTS): runs the harness as a host process (LocalProcessRunnerTransport) driving the developer's own installed claude/codex CLI with its ambient login (new harness ambientAuth mode): no Docker, no leased credential, no personal-credential gate (gated, local-facade-only). Tester local-infra is reported unsupported in native mode for now (host-compose + worktrees are a follow-up phase).

Bumps the executor-harness image to 1.16.0 for the new persistentCheckout/ambientAuth handling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(local): address review findings for warm pool + native execution

Correctness:
- Native ambient auth no longer fires for every non-pi harness. It now receives the resolved vendor and engages ONLY for a listed harness whose vendor is the native CLI's own vendor (Anthropic claude / OpenAI codex); a non-native vendor reusing the claude-code harness (GLM/Kimi/DeepSeek) keeps leasing, so its subscriptionBaseUrl is no longer silently dropped and run on the developer's own Anthropic login.
- LOCAL_NATIVE_AGENTS is parsed as the documented harness allow-list (claude-code,codex) instead of a boolean; AppConfig.nativeAmbientAuth carries it.
- Native mode no longer routes EVERY dispatch to the unsandboxed host process. New NativeRoutingRunnerTransport routes per job: ambient-CLI steps to the host process, everything else (proxy/pi, non-native vendor) to the sandboxed per-run container (built lazily).
- Personal-credential gate now drops only the ambient-served vendors rather than skipping wholesale, so a non-native individual vendor still gates.
- Warm pool: claim an idle member synchronously before the health probe so two concurrent runs can't double-lease one container; count in-flight starts toward poolMax so a concurrent cold-start burst can't overshoot.
- pi-workspace dirLocks: store the awaited tail promise so the tail-identity cleanup actually fires (the map no longer grows unbounded).
- prepareExistingCheckout: fetch target + base into tracking refs in one command and check out origin/<fetchRef> (not FETCH_HEAD, which a second base fetch clobbered, resetting a resumed branch to base and losing its commits); base fetch is no longer best-effort-swallowed; checkout is forced so a preserved dep-cache dir colliding with a tracked path can't abort it.
- LocalProcessRunnerTransport: kill the long-lived harness child on parent exit so dev restarts don't orphan it.

Docs/changeset updated for the allow-list + container-path semantics. Tests added: routing transport, double-lease race, resume-with-distinct-base.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(local): move warm-pool + checkout-reuse config from env to DB + UI

Address review: the warm-container-pool sizing and per-repo checkout-reuse knobs no longer come from env vars (LOCAL_POOL_SIZE / LOCAL_POOL_MIN_WARM / LOCAL_POOL_MAX / LOCAL_POOL_IDLE_TTL_MS / HARNESS_WORKSPACE_ROOT / HARNESS_CLEAN_KEEP). They are now a per-deployment singleton stored in the DB and edited through a dedicated "Local mode" settings panel (Integrations → Local mode), reachable under local mode's open auth.
- contracts: localSettings schema (pool + checkout, fully-defaulting).
- kernel: LocalSettingsRepository singleton port.
- integrations: LocalSettingsService (resolve/read/write, short cache).
- server: ServerContainer.localSettings + GET|PUT /local-settings controller (503 on non-local facades).
- node: local_settings table (Drizzle, Postgres-only: local-mode-only, no D1 mirror) + DrizzleLocalSettingsRepository + migration.
- local: build the service, attach it to the container, and resolve the serving transport's pool config from the DB (checkout knobs forwarded into the harness container as HARNESS_* env). The serving transport now owns the boot reap + pre-warm (eager when an image is set), replacing the throwaway boot-reap transport in startLocal.
- frontend: types + api + store + LocalModeSettingsPanel.vue, gated on auth.localMode.enabled.

Native execution (LOCAL_NATIVE_AGENTS / LOCAL_HARNESS_ENTRY) deliberately stays env-only: it is a network-reachable, unsandboxed host-execution opt-in under local mode's open auth, so it should not be a UI toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(local): address review findings: live settings, native-mode parse, pool floor, shared harness HTTP
- LocalSettingsService.write now applies the new config LIVE to the already-built serving transport via an onChange hook (LocalContainerRunnerTransport.applySettings), so warm-pool/checkout edits take effect without a service restart. poll/release/dispatch route a run to the backend it already holds (leased member vs per-run container) rather than the current pool mode, so a live resize (including toggling pooling on/off) never strands an in-flight run. Panel + README text corrected (no longer "restart to resize").
- parseNativeHarnesses: LOCAL_NATIVE_AGENTS=false/0/off/no/none/disabled now means OFF (previously any non-empty unrecognised value enabled BOTH native harnesses, so disabling accidentally turned on the unsandboxed, unmetered mode). Affirmative keywords still enable both; a non-affirmative typo now fails safe (off).
- Clamp poolMinWarm to poolSize (not poolMax): a minWarm above poolSize was pre-warmed at boot only for trimIdle to reap the excess on the first release, silently breaking the warm floor.
- Extract the shared executor-harness HTTP protocol (postJob/pollJob/health-wait + EVICTION_ERROR/SECRET_HEADER/safeText) into harnessHttp.ts; both local transports use it instead of each carrying their own copy of the eviction-marker string and request shape.

Tests: pool live-resize + minWarm clamp + live-enable-no-strand, parseNativeHarnesses fail-safe parsing, and LocalSettingsService onChange (incl. best-effort on throw).
Existing 25 transport tests + 23 process/native/runtime tests still green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(local): address review: image tag, prewarm idle floor, shared ambient predicate
- deploy/backend: bump runner image tag 1.16.0 -> 1.18.0 (matches the executor-harness version this PR's pending minor changeset releases) in both package.json (image:publish) and wrangler.toml, so a deploy actually rolls out the new persistentCheckout/ambientAuth harness instead of reusing a stale tag.
- LocalContainerRunnerTransport.prewarmPool: count only IDLE members (+ in-flight starts), not total members, so a live applySettings/reconcilePool while runs are leased no longer under-warms the idle floor.
- Extract the ambient-native-vendor decision into the shared isAmbientNativeVendor (kernel), used by both personalCredentialGate and the ContainerAgentExecutor wiring, so the two halves of the ambient-auth decision can't drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: release packages (#314)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat: add Human Review gate with Fixer escalation (#300)

* feat: add Human Review gate with Fixer escalation

Add an opt-in `human-review` polling gate (pipeline `pl_pr_review`) that watches a task's PR for a human code review on GitHub and loops the existing `fixer` to address feedback:
- advances once the PR meets GitHub's required approvals (read from branch protection) with no unresolved review threads
- dispatches the fixer on outstanding review threads (immediately when approved; after a per-task grace window otherwise) and resolves each handed thread via the GraphQL review-thread API so the next probe sees it cleared; a reviewer re-opening re-triggers
- waits indefinitely for the human (re-arms, never auto-fails) and raises a `human_review` notification while waiting
- a human can request a freeform fix any time from the gate window (request-fix endpoint), dispatched immediately

Built as a registry gate in @cat-factory/gates (new PullRequestReviewProvider port + GitHubPullRequestReviewProvider, wired in every facade), reusing the generic gate driver, plus small generic engine seams: pollExhaustion 'rearm', GateDefinition.onHelperComplete, and a pendingFix manual-inject path. Adds a per-task humanReviewGraceMinutes merge-preset knob (D1 + Drizzle migration). Cross-runtime conformance asserts the gate on both runtimes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015em4MKnNdmieft8CRkixXM

* fix(human-review): harden the gate against the review findings

Address the PR review findings on the human-review gate:
1. Probe resilience (never auto-fail the wait): the gate's probe now treats a transient GitHub read error as `pending` instead of letting it propagate. The driver's first gate entry runs outside the fault-tolerant poll loop, so a single 502/rate-limit/GraphQL blip would otherwise terminally fail an indefinitely-waiting review (worst on Node, which has no step.do retry).
2. Stuck-thread desync: the provider now RESOLVES before posting the courtesy reply (and skips the reply on an empty string = resolve-only), so a failed resolve can never leave a bot reply as a thread's latest comment (which would hide a still-unresolved thread from the outstanding set forever). The probe also reconciles any bot-latest unresolved thread each poll with a resolve-only retry, so a lagging resolve self-heals instead of stranding the thread.
3. Notification re-push spam: NotificationService.raise only delivers on a new card or one whose user-visible content (title/body/severity/status/payload) changed, so the indefinitely-polling gate stops re-pushing an identical card to the SPA every poll.
4. Sidebar attempt display: the gate window renders a plain "N fix round(s)" for the unbounded human-review budget instead of "0/9007199254740991 attempts".
5. Prompt freeform-fix dispatch: requestHumanReviewFix re-drives the run via workRunner.startRun (idempotent for a live run) so a fix requested after the driver died is picked up promptly instead of waiting for the stale-run sweeper.
6. Post-approval chatter: plain conversation comments only trigger the fixer while the PR is NOT yet approved; once approved only explicit review threads do, so a casual "lgtm/thanks" no longer churns the branch with a pointless fixer round.
7. Long-thread misclassification: listReviewThreads reads comments(last:50) so the true latest comment drives isBot/latestCommentAt (a human re-open past comment #50 is no longer invisible).
8. Documented that the required-approval floor of 1 is intentional for the opt-in gate.

Adds gate tests for the bot-latest reconcile, probe resilience, and the post-approval no-churn behaviour.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(human-review): address PR review findings
- onHelperComplete: only resolve review threads when the fixer SUCCEEDED; a failed fixer left feedback unaddressed, so resolving + posting the "addressed" reply masked unfixed concerns and could advance the gate.
- gate probe: add a no-progress backoff: don't re-dispatch the fixer when the PR head sha is unchanged since the last round (failed / pushed nothing), which otherwise hot-looped a container every poll given the unbounded budget.
- awaiting-review notification: stop telling the user to "assign a reviewer" when a reviewer already approved but more approvals are still required.
- use the exported requiredApprovals() helper instead of inlining Math.max(1, requiredApprovingReviewCount) twice.
- NotificationService: include executionId in the delivery-gating compare so a content-identical card re-raised under a new run isn't suppressed (the inbox would keep deep-linking the stale terminal run).
- drop the dead pendingFix.source ('github' arm was never implemented or read) from the shared gate-state schema + write site + frontend type.
- PR-review provider: read the PR head via the single-ref branchHeadSha() instead of paginating the whole branch history every poll.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(human-review): harden the gate against PR-review findings
- gate required-approval count against the PR's actual base branch (not the repo default) so a stricter protected base isn't silently under-gated to 1
- raise a human_review card when the fixer stalls on an unchanged head, instead of waiting silently/invisibly forever
- carry the run executionId on the awaiting-approval card so the inbox deep-links into the gate window ("request a fix here") rather than just selecting the block
- scope the thread-resolve reconcile strictly to gate-handed threads (retained until confirmed resolved) so a third-party review bot's open thread is never silently closed
- reject requestHumanReviewFix (409) when no review provider / async executor is wired, instead of silently dropping the parked fix
- cache the static branch-protection read on the gate state after the first probe so an indefinite wait doesn't re-read it every poll

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(human-review): address review findings on the gate
- onHelperComplete now only resolves a fixer round's review threads when the PR head actually advanced since dispatch, so a no-op "done" fixer can no longer auto-resolve unaddressed feedback and let an approved PR advance.
- The Node driver releases an unbounded-wait gate after one in-process poll budget (pollExhaustion 'rearm' → DriveOutcome.rearmedGate) instead of holding one pg-boss advance job open past its expire cap (which could let a second worker double-drive); the stale-run sweeper re-drives for the next cycle.
- resolveThreads now throws on a partial failure so onHelperComplete's retain-on-failure path is reachable and the cheap resolve-only reconcile retries instead of re-dispatching a whole fixer round.
- The PR-review provider skips the issue-comment + requested-reviewer reads once the PR is approved (the gate ignores them then), trimming per-poll GitHub reads over a long review wait.
- Drop the derivable requiredApprovals gate-state field; the UI derives it from requiredApprovingReviewCount via the same max(1, …) floor.
- Share a resolve-and-retain helper between the probe reconcile and onHelperComplete.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(node): re-root leaf migration snapshot after merging main

Two branches each added a Drizzle migration (human_review_grace_minutes vs main's new tables). db:check stays commutative (an ALTER vs CREATEs), but the leaf snapshot predated my column. Re-root gorgeous_spirit onto every branch tip so its snapshot reflects the merged schema.ts (migration.sql untouched). db:check: Everything's fine.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>

* chore: release packages (#315)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* docs: require opening a PR for task work done on main (#317)

* docs: require opening a PR for task work done on main

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CRNxKZH2FNapubdWdorwfg

* docs: clarify always finish a task with a PR without being asked

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CRNxKZH2FNapubdWdorwfg

* docs: allow direct commits to main when explicitly asked

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CRNxKZH2FNapubdWdorwfg

---------

Co-authored-by: Claude <noreply@anthropic.com>

* feat: adopt @toad-contracts/* for typed, validated API contracts (#316)

* feat: adopt @toad-contracts/* for typed, validated API contracts

Define each HTTP route once with defineApiContract in @cat-factory/contracts and consume it on both sides: the backend mounts it with @toad-contracts/hono buildHonoRoute (method/path + request validation derived from the contract; the handler c.req.valid(...) inputs and c.json(body, status) return are type-checked against it), and the SPA calls it with @toad-contracts/frontend-http-client sendByApiContract over wretch (runtime-validating every response). The frontend wire-type mirror (frontend/app/app/types/*) now re-exports the inferred types from @cat-factory/contracts instead of redefining them, so the two sides cannot drift.
- Remove jsonBody + the @hono/valibot-validator dependency; request-validation failures still return the same { error: { code: validation, issues } } 400 envelope, mapped centrally in handleError.
- updateBlockSchema now accepts responsibleProductUserId (it was silently dropped on the wire despite the domain block carrying it).
- AuthUser.id corrected to string.
- Internal non-JSON endpoints (WS event stream, LLM/web-search proxies, GitHub webhook, Slack OAuth callback) intentionally stay on plain Hono routing.

* refactor(contracts): single-source the wire shapes flagged in review

Address the PR #316 review findings and remove the duplication they exposed:
- Delete the stray vendored `toadhono/` directory (an unpacked @toad-contracts/hono tarball that nothing consumes; the dep resolves from npm). It sat outside every workspace package location.
- Give `/auth/config` `localMode` a real schema (`localModeConfigSchema`) instead of `v.unknown()`, and derive the server's `AppConfig.localMode` type from it. Drops the hand cast in the SPA auth store.
- Make `@cat-factory/contracts` the single source of truth for the wire-returned shapes the kernel ports also describe: `ProvisionedRepo` (now `provisionedRepoSchema`) and `AgentContextSnapshot`/`AgentContextFile`/`AgentContextFragment` (now schemas in observability.ts). The kernel ports re-export the inferred types and the route contracts reuse the schemas, so the response validator and the port can't drift.
- Dedup `apiKeyListResultSchema` (was defined identically in two route files) into the shared api-keys entity module.
- Dedup the `BrainstormItemStatus`/`ClarityItemStatus` aliases (4 copies across components + stores) into their `~/types/*` modules.

The remaining findings are decisions, not code changes: response validation on the workspace-snapshot refresh is inherent to sendByApiContract (no per-call opt-out; the cost is bounded and debounced) and special-casing it would reintroduce the drift the contracts kill; the ReviewComment optional fields are the correct canonical shape (the old required frontend type was the bug) and the live consumers read from local drafts.
* Restructure Integrations menu for usability (#319)

* Restructure Integrations menu for usability

Split per-user connections out of the workspace Integrations hub into a new user-scoped "My setup" hub (UserMenu → My setup): personal GitHub token, local model runners, and personal subscriptions. A "Personal (only you)" fallback group keeps them reachable in the hub when auth is disabled. The hub itself gains a search filter, explicit per-row state (Connected / amber Disabled / muted Not connected) with connected rows sorted first, a "Get started" cue recommending GitHub + a model provider on an empty workspace, and demotes issue-tracker settings to a quiet footer link. The integration sub-panels' back control now returns to whichever hub the panel was opened from, and the vendor-credentials modal accepts a deep-linked tab so "My subscriptions" opens straight onto the personal tab.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012rhE9cgAwPYpo12rbpshMY

* Fix requirements-review incorporate test for contract body validation

PR #316's @toad-contracts adoption makes a route with a request body schema reject an empty (bodyless) POST with 400 before the handler runs, so the "gates incorporation until every item is settled" worker test (which POSTed with no body) got 400 instead of the domain guard's 422. The real SPA client always sends a JSON body (`{}` when there is no feedback), and the cross-runtime conformance suite already does too; this test was the straggler. Send `{}` so the assertion exercises the 422 guard, not the empty-body rejection.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012rhE9cgAwPYpo12rbpshMY

---------

Co-authored-by: Claude <noreply@anthropic.com>

* chore: release packages (#318)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* docs: add high-impact refactoring candidates reference (#320)

* docs: add high-impact refactoring candidates reference

Captures eight prioritized refactoring opportunities across the backend engine, the cross-runtime facades, and the frontend, with problem, evidence, approach, and impact/effort for each.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015PznWmBk9JHjn6KsRKzmaC

* docs: order refactoring candidates least → most intrusive

Reorders the eight candidates by blast radius / disruption to existing code, lowest first, and notes the intrusiveness rationale per entry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015PznWmBk9JHjn6KsRKzmaC

---------

Co-authored-by: Claude <noreply@anthropic.com>

* refactor: land top-3 refactoring candidates (provider base-URLs, row mappers, store factories) (#321)

Implements the three least-intrusive candidates from docs/refactoring-candidates.md (the recommended "land the contained, low-risk wins first" sequence).

1. Shared OpenAI-compatible base-URL resolution. The env-override→default logic (and the "litellm has no public default" rule) was reconstructed per facade: a NODE_BASE_URLS map + `||` lookup on Node and a provider `switch` on the Worker. Both now route through a single resolveOpenAiCompatibleBaseUrl() in @cat-factory/agents driven by DEFAULT_OPENAI_COMPATIBLE_BASE_URLS, so adding a vendor is a one-line table entry both runtimes pick up. Aligns the Worker's blank-override handling with Node's long-standing fallback semantics.
2. Generic row mappers.
   rowToBlock / blockInsertValues / blockPatchToColumns were three hand-enumerated functions kept in sync by eye. They now derive all three directions from a single blockFields table (scalarField / optField / optJsonField / optBoolIntField builders, snake_case-derived columns), with the genuinely divergent columns spelled out inline. Behaviour unchanged; mapper test suite preserved and extended (tri-state, length-clear, insert-only columns).
3. Store pattern factories (frontend). Extract useUpsertList() (keyed find-by-key upsert/remove/get/hydrate) and useSourceIntegration() (the document/task source integration lifecycle), adopted in the notifications, documents and tasks stores. Standardizes probe-error capture across both integration stores.

Each candidate carries its own changeset. Backend builds, the mapper + frontend unit suites, and nuxt typecheck all pass.

Claude-Session: https://claude.ai/code/session_01SiaJqq5rtRT7WiUSh8C9Ed
Co-authored-by: Claude <noreply@anthropic.com>

* chore: release packages (#322)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* chore(deps): bump pg-boss 12.23.0 + harness Claude Code/Codex CLIs (#324)
- node-server: pg-boss 12.21.0 -> 12.23.0. Dependency-only; the durable execution wiring is unchanged and the API we use is stable across the bump. pg-boss's internal v33/v34 schema migrations apply automatically on boss.start() (v33 slims the job-fetch index + adds the flow-resolver index; v34 adds dead-letter provenance columns, inert for us).
- executor-harness image 1.18.0 -> 1.19.0: Claude Code 2.1.193 -> 2.1.195 and Codex 0.142.2 -> 0.142.3 (routine upstream patches). Matching tag bumped in deploy/backend wrangler.toml + image:publish.
- Add changesets for both versioned packages.
Claude-Session: https://claude.ai/code/session_01NFxXwkp89iqfuBbTz4taey
Co-authored-by: Claude <noreply@anthropic.com>

* chore: release packages (#325)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(frontend): repair API error handling after the toad-contracts migration (#327)

The contract client (`sendByApiContract`) reports a contract-declared non-2xx as a plain `{ statusCode, headers, body }` value (not an Error), with the `{ error: { code, message, details } }` envelope under `body`. The old `$fetch` threw an ofetch `FetchError` with the body under `data`, always an Error. Several handlers still read the old shape:
- `parseCredentialError` returned null for every 428, so the personal-subscription password modal never opened and individual-usage runs (Claude/Codex/GLM) could not be started or retried.
- `parseConflict` returned null for every 409, so run-control conflict toasts lost their tailored guidance (including the providers_unconfigured "Configure AI" jump).
- `instanceof Error` message extraction rendered "[object Object]" for declared 4xx/5xx across many catch blocks, and login/account/tracker-probe handlers dropped the server's message.

Fix at the source: `sendContract` wraps a bare non-2xx into a real `ApiError` (an Error carrying statusCode, the parsed body, and the server message), and a shared `apiErrorEnvelope`/`apiErrorStatus` reads the envelope from either client shape, so the 50+ `instanceof Error` sites recover automatically.

Also:
- provisioning-logs query now validates through the contract schema, returning the standard `{ code: 'validation' }` 400 like every other route (kept the empty-string stripping that the optionals depend on).
- add `singleStringParam` to @cat-factory/contracts and collapse the one-key path-param schemas the route files each re-declared (exact per-key typing preserved).
- unit tests for ApiError / apiErrorEnvelope / apiErrorStatus across both client shapes.
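The "wrap a bare non-2xx into a real Error and read the envelope from either shape" idea from the #327 entry can be sketched as below. Only the names `ApiError`, `apiErrorEnvelope`, and `apiErrorStatus` come from the commit message; the bodies, the `ErrorEnvelope` type, `isRecord`, and `readEnvelope` are assumptions for illustration, not the repo's implementation.

```typescript
// Assumed envelope shape, following `{ error: { code, message, details } }`.
interface ErrorEnvelope {
  code: string;
  message?: string;
  details?: unknown;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Pull the envelope out of a parsed response body, or undefined if absent.
function readEnvelope(body: unknown): ErrorEnvelope | undefined {
  if (!isRecord(body) || !isRecord(body.error)) return undefined;
  const { code, message, details } = body.error;
  if (typeof code !== "string") return undefined;
  return { code, message: typeof message === "string" ? message : undefined, details };
}

// A real Error, so existing `instanceof Error` message extraction keeps working.
class ApiError extends Error {
  readonly statusCode: number;
  readonly body: unknown;

  constructor(statusCode: number, body: unknown) {
    super(readEnvelope(body)?.message ?? `Request failed with status ${statusCode}`);
    this.name = "ApiError";
    this.statusCode = statusCode;
    this.body = body;
  }
}

// Accept either client shape: the contract client keeps the parsed body under
// `body`, the older fetch error kept it under `data`.
function apiErrorEnvelope(err: unknown): ErrorEnvelope | undefined {
  if (!isRecord(err)) return undefined;
  return readEnvelope(err.body) ?? readEnvelope(err.data);
}

function apiErrorStatus(err: unknown): number | undefined {
  if (!isRecord(err)) return undefined;
  return typeof err.statusCode === "number" ? err.statusCode : undefined;
}

// Hypothetical error code and message, for illustration only.
const wrapped = new ApiError(409, { error: { code: "conflict", message: "Run already active" } });
```

With helpers like these, a handler can branch on `apiErrorStatus(err) === 409` and on `apiErrorEnvelope(err)?.code` without caring which client produced the error.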
* chore: release packages (#328) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * perf(frontend): speed up the real-time board hot path and trim the initial bundle (#326) * perf(frontend): speed up the real-time board hot path and trim the initial bundle Index the board's per-frame queries, group execution gate lookups, reconcile hydrate in place, lazy-load rarely-open panels, and reset per-workspace caches on a board switch. - useBlockQueries: single-pass parentId→children / epicId→members index so tasksOf/modulesOf/childrenOf/allTasksUnder/epicMembers are O(1) lookups; a streamed single-block upsert no longer costs O(frames × N). - execution store: decisionsByBlock / approvalsByBlock maps; BlockNode reads badges via O(1) lookups and counts merged/PR tasks in one pass. - board.hydrate: reuse the existing object for unchanged blocks so a full refresh doesn't re-render every frame. - index.vue: defineAsyncComponent + v-if-gate ~25 heavy/rare panels. - requirements/clarity/brainstorm/consensus/github stores: reset() on workspace switch (wired in workspace.hydrate) so a switched-to board drops stale state. - sandbox table single-pass joins, toRaw manifest clone, drop redundant deep settings watchers. Tests for the index + gate maps + hydrate identity. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01532dao6X1gnpAgHdCRQrqM * fix(frontend): run lazy-panel load-on-open watchers immediately; tidy hot-path helpers The lazy-panel change gated ~25 heavy panels behind `defineAsyncComponent` + `v-if="<openFlag>"`. Most of those panels trigger their data fetch from a non-immediate `watch(open|executionId|kind, …)`, which only fired on the `false→true` flip. 
Under `v-if` the component now mounts *after* the flag is already true, so that flip never occurs within the watcher's lifetime and the load-on-open never ran — 16 panels opened empty/stale (Observability, Kaizen, Bootstrap, DocumentImport, AddServiceFromRepo, GitHub, Slack, IntegrationsHub, ObservabilityConnection, ModelConfiguration, LocalModelEndpoints, LocalModeSettings, OpenRouterCatalog, VendorCredentials, UserSecrets, Sandbox). Make each such watcher `{ immediate: true }` (correct now that mount ⇔ open), so the first open fetches again. Panels that already loaded via an immediate watcher (ProviderConnection, SpawnPreview, WorkspaceSettings) are unchanged. Also: - board.hydrate: cache per-block JSON by object identity (WeakMap) so a refresh stringifies each kept block once instead of twice; self-invalidates on upsert. - execution store: fold the two identical group-by-block builders into one `groupByBlock` helper. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01VZKnFLdzsUFhkZuYWCCkFm * perf(frontend): pin vite optimizeDeps to stop mid-test dev-server reloads The board page's panels are now defineAsyncComponent(() => import(...)), so Vite's startup dep scan (static imports only) stops at the dynamic-import boundary and defers discovery of their transitive deps to runtime. Each runtime discovery triggers a dep re-optimization + full page reload; under `nuxt dev` (which the Playwright e2e suite drives) such a mid-test reload aborts an in-flight page.goto with net::ERR_ABORTED, hanging a spec to its 180s timeout and inflating the e2e job from ~75s to ~4.5min. Pre-bundle the exact set the dev server reports discovering (@vue-flow/*, @vueuse/core, markdown-it, wretch, valibot, the @toad-contracts client) so they're optimized once at startup, keeping dev/e2e deterministic without giving back the production code-splitting win. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01TNfzeQZkC1HPAY8YHu2LAb * test(e2e): serve a production build instead of the dev server The optimizeDeps.include pin in the previous commit was silently ignored — nuxt dev runs from the deploy/frontend consumer, where the layer's deps aren't resolvable, so Vite logged "Unresolvable optimizeDeps.include entries" and pre-bundled nothing. The e2e flakiness (a spec hanging ~3min to its timeout, inflating the job to ~4.5min) therefore persisted. Root cause: nuxt dev pre-bundles deps by crawling static imports only, so the board page's defineAsyncComponent(() => import(...)) panels hide their transitive deps from the startup scan. Vite discovers them at runtime and re-optimizes, each re-optimization forcing a full page reload that aborts an in-flight page.goto with net::ERR_ABORTED. Fix: point the Playwright frontend webServer at a production build (nuxt build -> nuxt preview) rather than nuxt dev. A prod build emits every chunk ahead of time — no runtime re-optimization, no reloads — which removes the flake entirely and is robust to any future lazy-loaded panel. It also makes the e2e a more faithful test of the shipped artifact. Revert the ineffective optimizeDeps.include (net-zero change to @cat-factory/app, so the existing perf changeset still applies; @cat-factory/e2e is changeset-ignored). 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01TNfzeQZkC1HPAY8YHu2LAb --------- Co-authored-by: Claude <noreply@anthropic.com> * chore: release packages (#329) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Address review findings on the forgot-password flow - Anti-enumeration: PasswordResetService.request swallows email-provider failures (logs instead), and the controller never lets the registered-only path surface a 500 — so neither timing-to-error nor a 500/204 split can be used to enumerate accounts. Also log when a reset is requested but no appBaseUrl is configured (token minted but unreachable). - Single-use is now atomic: new PasswordResetTokenRepository.consume() flips pending->used conditionally (D1 + Drizzle), and reset() consumes before setting the password, so two concurrent redemptions can't both win. - reset() resolves the password identity from the token's userId (listIdentities) instead of round-tripping through users.email, removing a fragile coupling to email casing. - reset-password throttle keys on client IP, not the token value (a per-token bucket limited nothing against brute force). - Frontend forgot/reset calls use the @toad-contracts send() pattern via new forgotPasswordContract / resetPasswordContract, matching the rest of auth. - Tests: anti-enumeration (provider failure) + single-use unit tests; consume atomicity added to the cross-runtime conformance suite. Documented (not changed): a reset does not revoke already-issued sessions, since sessions are stateless self-expiring tokens; revoking would require a per-user session epoch checked per request — a separate design decision. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01GzKKvUGnFLPLJ9H3bGFYfU * ci: re-trigger checks Re-run CI after a flaky Docker Hub pull timeout (`docker pull postgres:18` context deadline exceeded) failed the "Test DB (node/local)" job on the previous run. No code changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01GzKKvUGnFLPLJ9H3bGFYfU --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
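The atomic single-use redemption described in the forgot-password commit above (consume first, then set the password) can be sketched with an in-memory stand-in. The real repository is described as D1 + Drizzle; the class, field names and the SQL in the comment below are hypothetical and only model the conditional `pending -> used` flip.

```typescript
type TokenStatus = "pending" | "used";

interface ResetToken {
  id: string;
  userId: string;
  status: TokenStatus;
}

// In-memory stand-in for a token repository (illustrative only).
class InMemoryTokenRepository {
  private readonly tokens = new Map<string, ResetToken>();

  add(token: ResetToken): void {
    this.tokens.set(token.id, token);
  }

  // Models a conditional update such as
  //   UPDATE tokens SET status = 'used' WHERE id = ? AND status = 'pending'
  // (table and column names are assumptions) and returns the token only if
  // this call was the one that flipped it.
  consume(id: string): ResetToken | null {
    const token = this.tokens.get(id);
    if (!token || token.status !== "pending") return null;
    token.status = "used";
    return token;
  }
}

// Consume before changing the password, so a second redemption of the same
// token finds nothing to consume and cannot also succeed.
function resetPassword(
  repo: InMemoryTokenRepository,
  tokenId: string,
  setPassword: (userId: string) => void,
): boolean {
  const token = repo.consume(tokenId);
  if (!token) return false;
  setPassword(token.userId);
  return true;
}
```

The point of the ordering is that the check and the state change happen in one step; a separate "read status, then write" pair would leave a window in which two concurrent requests both observe `pending`.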
Context

The SPA (`frontend/app`, Nuxt 3 / Vue 3 / Pinia, `ssr: false`) is a real-time board driven by a per-workspace WebSocket stream. The streaming layer itself is solid (targeted `upsert()` per event, debounced coarse refresh, proper cleanup), but the cost sat downstream of `upsert`: derived board queries that re-scan all blocks per frame, per event; an initial bundle that eagerly mounts ~40 panels/modals; and a few local hot spots. This PR is the analysis plus the fixes.

What changed

P0: real-time render hot path

- `composables/useBlockQueries.ts`: a single-pass `parentId → children` (and `epicId → members`) index, rebuilt once per `blocks` change (same pattern as the existing `byId` map). `tasksOf`/`modulesOf`/`childrenOf`/`allTasksUnder`/`epicMembers` are now O(1) lookups instead of full-array `filter()` scans. A streamed single-block upsert no longer costs O(frames × N).
- `board.hydrate` (`stores/board.ts`): reuses the existing object for any unchanged block, so a coarse full refresh doesn't hand every frame/task a new reference and re-render the whole board. The server stays authoritative.

P2: local hot spots

- `stores/execution.ts`: new `decisionsByBlock`/`approvalsByBlock` maps; `BlockNode.vue` resolves its decision/approval badges via O(1) lookups instead of re-filtering the global lists once per frame, and computes merged/PR task counts in a single pass.
- `SandboxPanel.vue`: pre-joins each run with its grade and fixture name once (`fixtureMap` + `detailRows`) instead of `.find()`-ing the fixture and `.get()`-ing the grade 4× per row.
- `ProviderConnectionPanel.vue`: `structuredClone(toRaw(...))` instead of `JSON.parse(JSON.stringify(...))`.
- Dropped the redundant `deep: true` from the settings watchers in `WorkspaceSettingsPanel.vue`/`IssueTrackerPanel.vue` (those stores reassign `settings` wholesale, so a shallow watch already fires).

P1: bundle size / idle work

- `pages/index.vue`: ~25 heavy, rarely-open panels (settings / integrations / providers / sandbox / kaizen / observability / bootstrap / github / slack / fragments) are now `defineAsyncComponent` + `v-if`-gated on their `ui`-store open flag, so they code-split out of the initial chunk and don't run setup/watchers while closed. Fast-path surfaces (decision/result views, add-task, command bar, etc.) stay eager.

P3: per-workspace cache cleanup

- `requirements`/`clarity`/`brainstorm`/`consensus`/`github` stores gained `reset()`, wired into `workspace.hydrate` to run only on an actual board switch (not on a same-board refresh, which would wipe an open review window). A switched-to board no longer shows the previous workspace's stale reviews/sessions/repos.

Notes / follow-ups

- A `board` refresh in which the server carries the changed block (so the client can `upsert` instead of doing a full refresh) needs backend coordination and is intentionally not in this PR; the in-place reconcile is the frontend-only win.
- `repos`/`pulls`/`issues` are still unpaginated; `reset()` bounds them per session, but true server-side pagination is a separate, larger change.

Verification

- `pnpm test:run` (frontend): 59 passed. Added cases for the new child/epic index, the `board.hydrate` identity reuse, and the execution `decisionsByBlock`/`approvalsByBlock` maps.
- `pnpm lint` (oxlint + `oxfmt --check`): clean.
- `nuxt typecheck`: clean (after building `@cat-factory/contracts`).
- Changeset: `@cat-factory/app`, patch.
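As a rough illustration of the single-pass index described under P0 (and the per-block gate maps under P2), here is a minimal sketch. The `Block` and `Decision` shapes, the `groupBy` and `buildBlockIndex` helpers, and the `blockId` field are assumptions; the real composable presumably wraps this in a Vue `computed` so it rebuilds once per `blocks` change, which the sketch leaves out.

```typescript
// Assumed minimal block shape; only `parentId` and `epicId` are named in the PR.
interface Block {
  id: string;
  parentId: string | null;
  epicId?: string | null;
}

// Generic one-pass group-by: each item is visited once and appended to its bucket.
function groupBy<T>(
  items: readonly T[],
  keyOf: (item: T) => string | null | undefined,
): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const item of items) {
    const key = keyOf(item);
    if (key == null) continue; // roots / items without an epic are not indexed
    const bucket = groups.get(key);
    if (bucket) bucket.push(item);
    else groups.set(key, [item]);
  }
  return groups;
}

// Shared empty result so a miss does not allocate a new array per lookup.
const NO_BLOCKS: readonly Block[] = Object.freeze([]);

interface BlockIndex {
  childrenOf(id: string): readonly Block[];
  epicMembers(id: string): readonly Block[];
}

// Build both indexes up front; every later query is a Map lookup
// instead of a full-array filter().
function buildBlockIndex(blocks: readonly Block[]): BlockIndex {
  const byParent = groupBy(blocks, (b) => b.parentId);
  const byEpic = groupBy(blocks, (b) => b.epicId);
  return {
    childrenOf: (id) => byParent.get(id) ?? NO_BLOCKS,
    epicMembers: (id) => byEpic.get(id) ?? NO_BLOCKS,
  };
}

// The same helper covers a per-block gate map (hypothetical Decision shape).
interface Decision {
  id: string;
  blockId: string;
}

function decisionsByBlock(decisions: readonly Decision[]): Map<string, Decision[]> {
  return groupBy(decisions, (d) => d.blockId);
}
```

With F frames rendering over N blocks, per-frame `filter()` calls cost on the order of F × N per update, whereas building the index costs one pass over N and each frame's lookup is constant time.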
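The in-place reconcile in `board.hydrate` can be sketched along the following lines. This is not the store's actual code: the `reconcile` and `jsonOf` names are hypothetical, equality is approximated by comparing `JSON.stringify` output (which is sensitive to key order), and the `WeakMap` cache assumes that a changed block arrives as a new object rather than being mutated in place.

```typescript
// Assumed minimal block shape: an id plus arbitrary JSON-serializable fields.
interface Block {
  id: string;
  [field: string]: unknown;
}

// Cache each object's JSON by identity so a block kept across refreshes is
// stringified once; entries vanish when the object is garbage-collected.
const jsonCache = new WeakMap<Block, string>();

function jsonOf(block: Block): string {
  let json = jsonCache.get(block);
  if (json === undefined) {
    json = JSON.stringify(block);
    jsonCache.set(block, json);
  }
  return json;
}

// The server list stays authoritative for membership and order; unchanged
// entries keep their previous object reference so consumers that compare by
// identity see "no change".
function reconcile(current: readonly Block[], incoming: readonly Block[]): Block[] {
  const existingById = new Map<string, Block>();
  for (const block of current) existingById.set(block.id, block);

  return incoming.map((next) => {
    const prev = existingById.get(next.id);
    return prev !== undefined && jsonOf(prev) === jsonOf(next) ? prev : next;
  });
}
```

Blocks missing from `incoming` simply drop out of the result, and new or changed blocks take the server's object, so the refresh still converges on server state while unchanged blocks keep stable references.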
Generated by Claude Code