Guard desktop startup against legacy provider state#1027
Guard desktop startup against legacy provider state#1027Amer-alsayed wants to merge 6 commits intopingdotgg:mainfrom
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can generate a title for your PR based on the changes.Add |
| if (normalizedLimit === 0) { | ||
| return Stream.empty; | ||
| } | ||
| const readPage = ( |
There was a problem hiding this comment.
🔴 Critical Layers/OrchestrationEventStore.ts:216
When a row fails decoding, its sequence is excluded from events, so the pagination cursor (events[events.length - 1]!.sequence) advances past only successfully decoded events. The next query fetches from that cursor, re-retrieving the failed row which fails decoding again, creating an infinite loop. For example: rows with sequences [1,2,3] are fetched, row 3 fails decoding, events becomes [{seq:1},{seq:2}], cursor becomes 2, next query refetches row 3, which fails again.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/server/src/persistence/Layers/OrchestrationEventStore.ts around line 216:
When a row fails decoding, its sequence is excluded from `events`, so the pagination cursor (`events[events.length - 1]!.sequence`) advances past only successfully decoded events. The next query fetches from that cursor, re-retrieving the failed row which fails decoding again, creating an infinite loop. For example: rows with sequences [1,2,3] are fetched, row 3 fails decoding, `events` becomes [{seq:1},{seq:2}], cursor becomes 2, next query refetches row 3, which fails again.
Evidence trail:
apps/server/src/persistence/Layers/OrchestrationEventStore.ts lines 230-241 (decode failure returns Option.none), lines 248-261 (pagination logic with cursor based on last successful event's sequence at line 258)
|
Updated pagination to advance using the last row sequence when skipping invalid events, preventing decode loops. Desktop WS URL IPC fix and legacy provider cleanup unchanged. Tests: bun fmt, bun lint, bun typecheck. |
|
did you run into an issue where the state wasn't backwards compatible? i think we just need to make sure we have good migrations in https://github.com/pingdotgg/t3code/tree/main/apps/server/src/persistence/Migrations whenever we do breaking changes that auto-migrates people's state to be compatible? |
|
@juliusmarminge Yes, that’s what I ran into. The failure mode here was older persisted local state containing legacy provider/runtime data and orchestration events that the current startup path no longer decoded successfully, which caused bootstrap to fail and the backend to restart repeatedly. I agree the right long-term fix is to keep persisted state backwards compatible through explicit migrations in |
you know when we broke your state? were you using some fork maybe? we did make some breaking changes in the 0.0.0-alpha releases but since 0.0.1 every change should have been stable |
|
@juliusmarminge Yes, I think that explains it. I had previously been running a fork of From what I saw, the concrete failure was legacy provider/runtime/event data in local state that the current startup path could no longer decode cleanly. So I don’t want to frame this as evidence of a stable migration regression upstream. I think this PR is better understood as a small defensive guard so invalid or fork-produced persisted state does not leave the app unusable. |
|
Yes i think it's good defensive check. Just wanted to track down the cause of your error as we shouldn't have broken anything on our end! Will do a proper review later |
What Changed
gemini) instead of failing startup.Why
Persisted legacy provider data can cause orchestration event decode failures at startup, which restarts the backend and leaves the UI stuck (e.g., Add project hangs). This keeps startup resilient and preserves UX while still logging the issue.
UI Changes
None.
Checklist
bun fmt,bun lint,bun typecheckNote
Guard desktop startup against legacy provider state and undecodable events
readFromSequencein OrchestrationEventStore.ts now skips individual decode failures (logging a warning) rather than failing the stream, and advances pagination based on the last row sequence instead of the last decoded event.getBindingin ProviderSessionDirectory.ts deletes and drops invalid provider sessions instead of propagating a decode error, preventing stale/legacy state from blocking startup.wsUrlsynchronously from the main process via a newdesktop:get-ws-urlIPC channel instead of reading from an environment variable, returningnullon failure.Macroscope summarized c038d0c.