perf(web): isolate streaming text to fix main-thread jank#1111
perf(web): isolate streaming text to fix main-thread jank#1111wbxl2000 wants to merge 5 commits into
Conversation
🦋 Changeset detectedLatest commit: 318e752 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2c0d7f577
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // StreamingBlocks component — instead of cloning all of `rawState` and | ||
| // re-rendering the whole App + sidebar on every token. | ||
| if (event.type === 'assistantDelta') { | ||
| appendStreamingDelta(sessionId, event.messageId, event.contentIndex, event.delta); |
There was a problem hiding this comment.
Clear the streaming store when applying snapshots
Because this writes deltas outside rawState, the snapshot path no longer overwrites them: syncSessionFromSnapshot flushes pending render events before setSessionMessages, so any queued assistantDelta lands in streamingBySession and then survives the authoritative snapshot/seed. On reconnect or delta-gap resync while a reply is streaming, the old live chunks are rendered in addition to the seeded snapshot (or can leak into the next assistant turn) until another clear event arrives. Please clear streamingBySession[sessionId] when installing a snapshot, before seedSnapshot.
Useful? React with 👍 / 👎.
| <div v-else-if="blk.kind === 'text' && blk.text" class="msg"> | ||
| <Markdown :text="blk.text" :streaming="true" :open-file="(target) => emit('openFile', target)" /> |
There was a problem hiding this comment.
Merge live deltas with seeded markdown blocks
When a user refreshes or reconnects mid-reply, seedInFlight puts the already-generated assistantText/thinkingText into turn.blocks, and later deltas for the same contentIndex render here as a separate Markdown/ThinkingBlock appended after the seeded block. Markdown constructs spanning the snapshot boundary, such as an open code fence or list, are parsed as two documents and render incorrectly until messageUpdated commits. Seed the streaming store with the snapshot block or render same-index live text through the existing block instead of a separate Markdown instance.
Useful? React with 👍 / 👎.
Since streaming deltas now live outside rawState, the snapshot install path (which overwrites messagesBySession) no longer clears them. A queued delta flushed right before a snapshot would survive the authoritative seed and render on top of it (or leak into the next turn) after a reconnect or delta-gap resync. Clear the per-session streaming store when installing a snapshot and when tearing down a session. Addresses review feedback on PR #1111.
During a long streaming reply every assistantDelta was routed through the immutable reducer and the coarse rawState graph, which made the whole App and the sidebar computeds re-render on every token and stalled the main thread for hundreds of milliseconds per frame. Keep the live streaming text in a separate fine-grained store outside rawState and let assistantDelta bypass the reducer, so only the actively streaming block re-renders. Also stop cloneState from copying the sessions array for events that do not touch it, so the sidebar computeds stay stable during streaming.
Since streaming deltas now live outside rawState, the snapshot install path (which overwrites messagesBySession) no longer clears them. A queued delta flushed right before a snapshot would survive the authoritative seed and render on top of it (or leak into the next turn) after a reconnect or delta-gap resync. Clear the per-session streaming store when installing a snapshot and when tearing down a session. Addresses review feedback on PR #1111.
8fc414d to
ab17799
Compare
commit: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab177991a2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| :text="blk.text" | ||
| :mobile="mobile" | ||
| :streaming="true" | ||
| @open="emit('openThinking', { turnId, blockIndex: blk.contentIndex })" |
There was a problem hiding this comment.
Resolve streaming thinking from live blocks
When a user clicks a still-streaming thinking block, this emits blockIndex: blk.contentIndex, but the live block is no longer present in client.turns because assistantDelta now bypasses messagesBySession; useDetailPanel still reads turn.blocks?.[blockIndex], so the side panel either stays closed or shows an older committed thinking block. This regresses the existing “click thinking to view full text” behavior for long live thoughts until the final messageUpdated arrives.
Useful? React with 👍 / 👎.
Clicking a still-streaming thinking block emitted a blockIndex (contentIndex) that did not match the turn.blocks index the detail panel reads, and the live block is not in client.turns at all (deltas bypass messagesBySession). So the panel either stayed closed or showed a stale committed block. Tag live thinking targets with a `live` flag and read their text from the streaming store in thinkingPanelText, so the panel shows the growing thinking text while the reply is streaming (restoring the pre-existing "click to view full thinking" behavior). Once the turn settles the store is cleared and the live target returns null, closing the panel rather than indexing turn.blocks with a mismatched contentIndex.
…aries The previous fix read live thinking text from the streaming store, but the store is cleared on every messageUpdated (tool slot / step end / turn end) so the committed content can take over in the chat. The final deltas and that clear land in the same tick and coalesce, so the panel would close *before* rendering the last chunk, and at step boundaries it would stay closed for the next step. That made the stream look unstable and drop the final content. When the store is empty, fall back to the committed thinking block in the turn (which already holds the full text via messageUpdated). The panel now shows the live text while streaming and the complete committed text through each boundary, instead of flickering closed. Also add an integration test that drives the real projector -> store -> reducer pipeline (thinking deltas, step/turn completion, and a mid-stream reconnect) to lock in the live-store lifecycle.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 119eba8e49
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // The committed prefix is NOT duplicated into the live store. | ||
| expect(live!.blocks[0]!.text).not.toContain('prefix'); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Remove the extra test-suite closer
With this extra closing });, the newly added test file has one more closing block than it opens; running the file fails with ERR_INVALID_TYPESCRIPT_SYNTAX before any tests execute. Any kimi-web test job that includes this suite will abort instead of running the regression coverage.
Useful? React with 👍 / 👎.
| // holds the full text — so the panel keeps showing the complete content | ||
| // through the boundary instead of flickering closed. | ||
| const turn = client.turns.value.find((tn) => tn.id === target.turnId); | ||
| const committed = turn?.blocks?.find((b) => b.kind === 'thinking'); |
There was a problem hiding this comment.
Preserve the selected live thinking block
When a user opens a later live thinking block, such as a turn that already had an earlier thinking segment before a tool and then starts thinking again, messageUpdated clears streamingBySession and this fallback ignores the selected block identity by returning the first committed thinking block in the merged turn. The right-side panel then switches to an unrelated earlier thought at the commit boundary instead of the block the user opened.
Useful? React with 👍 / 👎.
Two follow-ups to the streaming-store migration: - "Copy conversation" and "copy final summary" serialized only props.turns, which during streaming stops at the last messageUpdated and drops the live tail. Merge the streaming store's live blocks into the streaming turn before serializing, so a copy mid-stream captures the full in-flight text. - The WS sessionDeleted event flows through the reducer (not forgetSession), so it never cleared the streaming store and left an orphaned entry behind. Clear it in applyEvent, mirroring forgetSession.
|
Nice approach isolating streaming deltas into a separate reactive store — that's a clean way to avoid dirtying the whole computed graph on every token. The |
Related Issue
No linked issue — this came out of a profiling session on the web chat.
Problem
During a long streaming reply the web chat stalls the main thread for hundreds of milliseconds per frame, so the stream looks janky. A Chrome profile showed a single
Animation frametaking ~15 s, dominated byApp.vuere-rendering and the sidebar computeds (sessionsForView/workspaceGroups/mergedWorkspaces) re-evaluating.Root cause: every
assistantDeltawas routed through the immutable reducer and the coarserawStategraph.cloneStatealso copied thesessionsarray on every event, so each token dirtied the sidebar computeds and re-rendered the whole App — even though a token only appends a few characters to one message.What changed
rawState, andassistantDeltabypasses the reducer to append there in O(1). Only the actively streaming block re-renders on each token.cloneStateno longer copies thesessionsarray for events that do not touch it, so the sidebar computeds stay stable during streaming.StreamingBlockscomponent subscribes to the store and renders the live text/thinking blocks; the committed content takes over at turn end via the existingmessageUpdated.This keeps the cost of a token proportional to the size of the change (a few characters) instead of the size of the whole UI.
Checklist
gen-changesetsskill, or this PR needs no changeset.gen-docsskill, or this PR needs no doc update.