
feat: add Text-to-Speech support via Web Speech API#502

Open
x64zx wants to merge 7 commits into siteboon:main from x64zx:feat/tts-web-speech-api

Conversation

@x64zx x64zx commented Mar 8, 2026

Summary

Add browser-native Text-to-Speech (TTS) integration that reads Claude's responses aloud using the Web Speech API. Zero external dependencies — uses the browser's built-in speechSynthesis API.

Features

  • One-click toggle in chat composer (Volume2/VolumeX icons via lucide-react)
  • Full settings panel in Quick Settings: voice selection, rate/pitch sliders, language filter, voice preview
  • Streaming-aware: chunks text at sentence boundaries for responsive reading during streaming responses
  • Graceful degradation: TTS UI hidden when browser has no voices available (e.g. headless Linux)
  • Persistent settings: all preferences saved to localStorage
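At its core, the integration rests on a few speechSynthesis calls. A minimal sketch of the browser API involved (the helper name and guard-first structure here are illustrative assumptions, not code from this PR):

```typescript
// speakOnce is a hypothetical helper, not the PR's useSpeechOutput hook.
// It guards first so environments without speech synthesis (Node, headless
// Linux) silently no-op instead of throwing.
function speakOnce(text: string, rate = 1.0, pitch = 1.0): boolean {
  if (typeof window === "undefined" || !window.speechSynthesis) {
    return false; // graceful degradation: no speech support here
  }
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = rate;               // typical browsers accept roughly 0.1-10
  utterance.pitch = pitch;             // 0-2
  utterance.lang = navigator.language; // locale fallback, as in the PR
  window.speechSynthesis.speak(utterance);
  return true;
}
```

The boolean return makes the "no voices available" case observable to callers, which is the same degradation strategy the PR applies to its UI.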

Files changed

  • src/hooks/useSpeechOutput.ts: New — Web Speech API wrapper hook
  • src/contexts/TtsContext.tsx: New — React context provider
  • src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx: New — Settings UI
  • src/components/chat/view/ChatInterface.tsx: Modified — TtsProvider integration
  • src/components/chat/view/subcomponents/ChatComposer.tsx: Modified — Toggle button
  • src/components/quick-settings-panel/view/QuickSettingsContent.tsx: Modified — Settings section

Design decisions

  • navigator.language fallback instead of hardcoded locale — works for any user language
  • availableVoices.length > 0 guard on both toggle button and settings panel — prevents broken UI on environments without speech synthesis voices
  • lucide-react icons (Volume2, VolumeX, StopCircle) — consistent with existing icon usage in the project
  • No new dependencies — only uses existing lucide-react and browser APIs
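The language filter mentioned above reduces to a small prefix match. A sketch (VoiceLike is a minimal stand-in for the browser's SpeechSynthesisVoice; the filtering logic mirrors the hook's snippet shown later in review, e.g. "en" matches both "en-US" and "en-GB"):

```typescript
// Minimal stand-in for SpeechSynthesisVoice, so the sketch runs anywhere.
type VoiceLike = { lang: string; name: string };

function filterVoicesByLang(voices: VoiceLike[], lang: string): VoiceLike[] {
  if (lang === "") return voices; // empty filter means "All Languages"
  const prefix = lang.split("-")[0]; // "en-GB" -> "en"
  return voices.filter((v) => v.lang.startsWith(prefix));
}
```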

Screenshots

TTS toggle button appears to the left of the send button when voices are available. Settings accessible via Quick Settings panel (scroll to bottom).

Test plan

  • Verify TTS toggle button appears on desktop/mobile browsers with speech synthesis support
  • Verify TTS toggle button is hidden on browsers without voices (e.g. headless Chrome)
  • Toggle TTS on → send a message → verify Claude's response is read aloud
  • Click stop button while speaking → verify speech stops
  • Open Quick Settings → scroll to TTS section → change voice, rate, pitch
  • Click "Test voice" → verify selected voice speaks sample text
  • Change language filter → verify voice list updates
  • Refresh page → verify all TTS settings persist
  • Verify no console errors on browsers without speechSynthesis API

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Summary by CodeRabbit

  • New Features

    • Global Text-to-Speech (TTS) for chat messages with start/stop and speaking state.
    • TTS toggle buttons in the chat composer (near input and mic).
    • Automatic speaking of finalized assistant messages with persisted TTS preferences.
  • New UI

    • Quick Settings TTS panel: enable/disable, language & voice selection, rate, pitch, test/stop controls.
  • Localization

    • TTS strings added for English, Japanese, Korean, and Simplified Chinese.
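For illustration, the added locale entries presumably follow a nested shape along these lines. The actual key names inside src/i18n/locales/*/settings.json are not visible on this page, so every name below is an assumption:

```typescript
// Hypothetical shape of one locale's quickSettings.tts block; the real
// key names in the PR may differ.
const enSettings = {
  quickSettings: {
    tts: {
      title: "Text-to-Speech",
      enabled: "TTS Enabled",
      language: "Language",
      voice: "Voice",
      speed: "Speed",
      pitch: "Pitch",
      test: "Test Voice",
      stop: "Stop",
    },
  },
};
```

Each of the four locales (en, ja, ko, zh-CN) would carry the same key structure with translated values.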

Add browser-native TTS integration that reads Claude's responses aloud:

- useSpeechOutput hook: Web Speech API wrapper with voice selection,
  rate/pitch control, language filtering, and streaming-aware chunking
- TtsContext: React context provider for app-wide TTS state
- QuickSettingsTtsSection: Full TTS configuration panel with voice
  preview, rate/pitch sliders, and language filter
- ChatComposer: Toggle button with lucide-react icons (Volume2/VolumeX/
  StopCircle) and graceful degradation when no voices available
- QuickSettingsContent: TTS section with voices=0 guard

All settings persist to localStorage. Falls back to navigator.language
instead of hardcoded locale. No external dependencies required — uses
the browser's built-in speechSynthesis API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
coderabbitai bot commented Mar 8, 2026

📝 Walkthrough

Walkthrough

Adds Text‑to‑Speech (TTS): new TtsProvider and useTts hook backed by useSpeechOutput; ChatInterface is wrapped with the provider; ChatComposer gains TTS UI controls; a Quick Settings TTS section component is added; and TTS localization keys are added across locales.

Changes

  • Core TTS infra (src/contexts/TtsContext.tsx, src/hooks/useSpeechOutput.ts): New TtsContext/TtsProvider and useTts; useSpeechOutput implements SpeechSynthesis integration, persisted settings, voice lists/filtering, speak/stop/test/toggle controls, and auto-speaking of finalized assistant messages.
  • Chat integration (src/components/chat/view/ChatInterface.tsx, src/components/chat/view/subcomponents/ChatComposer.tsx): Wraps the ChatInterface subtree with <TtsProvider>; ChatComposer consumes the TTS context and adds toggle buttons and controls (enable/toggle, speak/stop, isSpeaking state, localized labels).
  • Quick Settings UI (src/components/quick-settings-panel/view/QuickSettingsContent.tsx, src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx): Adds an internal wrapper consuming the TTS context and a new QuickSettingsTtsSection component with enable/toggle, language/voice selection, rate/pitch sliders, test/stop controls, and voice filtering props.
  • Localization (src/i18n/locales/en/settings.json, src/i18n/locales/ja/settings.json, src/i18n/locales/ko/settings.json, src/i18n/locales/zh-CN/settings.json): Adds quickSettings.tts translation keys across locales (section title, enable/disable, language, voice, speed/pitch labels, test/stop, and button labels).

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant ChatInterface
    participant ChatComposer
    participant TtsProvider
    participant useSpeechOutput
    participant Browser_SpeechSynthesis

    User->>ChatInterface: open chat / send input
    ChatInterface->>TtsProvider: provide chatMessages
    TtsProvider->>useSpeechOutput: init with messages
    useSpeechOutput->>Browser_SpeechSynthesis: load voices (onvoiceschanged)
    Browser_SpeechSynthesis-->>useSpeechOutput: voices available

    rect rgba(100, 150, 255, 0.5)
    Note over ChatInterface,useSpeechOutput: Assistant message finalized
    ChatInterface->>TtsProvider: chatMessages updated
    TtsProvider->>useSpeechOutput: detect new finalized assistant message
    useSpeechOutput->>Browser_SpeechSynthesis: speak(utterance)
    Browser_SpeechSynthesis-->>useSpeechOutput: onend/onerror -> update isSpeaking
    end

    rect rgba(100, 200, 100, 0.5)
    Note over User,ChatComposer: User interacts with TTS controls
    User->>ChatComposer: toggle / adjust rate/pitch / test / stop
    ChatComposer->>TtsProvider: call toggle/setRate/testVoice/stop
    TtsProvider->>useSpeechOutput: execute control
    useSpeechOutput->>Browser_SpeechSynthesis: apply or cancel speech
    end
```

Possibly related PRs

  • Feat:add i18n #303 — overlaps with i18n locale key changes and may intersect with the new quickSettings.tts translation entries.

Suggested reviewers

  • blackmammoth

Poem

🐰
I nibble keys and make the messages sing,
Toggles blink and tiny speakers spring,
Voices hop out from words upon the screen,
Sliders hum soft — a chat that can preen.

🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 18.18%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title accurately and concisely describes the main change, adding Text-to-Speech support via the Web Speech API, and aligns with all modified and new files in the changeset.
  • Description check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/components/chat/view/subcomponents/ChatComposer.tsx`:
- Around line 332-352: The TTS icon-only button lacks a localized accessible
name; update the button in ChatComposer (the element using tts, onClick, and
title) to supply an aria-label that uses the app's localization helper (e.g.,
t(...) or i18n.t(...)) and reflect the same state logic as title (when
tts.isSpeaking: "Stop speaking", when tts.enabled: "TTS ON (click to disable)",
else "TTS OFF (click to enable)"); keep the existing title in sync by setting it
from the same localized strings so both tooltip and screen readers receive
identical, localized text.

In `@src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx`:
- Around line 42-183: QuickSettingsTtsSection currently hardcodes all UI text;
replace those literals with the app i18n settings keys like the rest of Quick
Settings by importing the same translation hook (e.g., useTranslation or
useSettingsTranslations used elsewhere) into QuickSettingsTtsSection, swap
strings such as "TTS Enabled", "Language", "All Languages", "Voice
({filteredVoices.length} available)" (use pluralization/count support), "Auto
(first match)", "Speed", "Pitch", "Test Voice", "Stop", the range labels
("0.5x", "1.0x", "2.0x", "3.0x", "Low", "Normal", "High") and any network suffix
text for voices with corresponding t('settings.xxx') keys, and ensure keys
accept interpolations (count for filteredVoices.length) and provide sensible
fallback text. Use the same translation namespace and patterns as other Quick
Settings components so the component follows the app language settings.
- Around line 44-152: The controls are not programmatically associated with
their labels; add explicit id/htmlFor pairs and switch labeling: give the
visible "TTS Enabled" span a unique id (e.g., tts-enabled-label) and set the
switch button's aria-labelledby to that id (or aria-label if you prefer), add id
attributes to the Language select (lang select bound to lang/onLangChange),
Voice select (voiceURI/onVoiceChange), Rate input (rate/onRateChange) and Pitch
input (pitch/onPitchChange), and update each corresponding <label> to use
htmlFor pointing to those ids; ensure the voice-count label and the numeric
value spans (rate/pitch display) are reachable via aria-labelledby or have ids
referenced by their inputs if needed so screen readers see the association.

In `@src/hooks/useSpeechOutput.ts`:
- Around line 206-236: The effect in useSpeechOutput watches streaming assistant
messages but when lastMsg.isStreaming is true it only caches content
(lastStreamingContentRef.current) and never speaks partial updates; change the
isStreaming branch to compute the incremental delta between
lastStreamingContentRef.current (previous partial) and lastMsg.content (new
partial), update lastStreamingContentRef.current to the new content, and call
speak(delta) only if the delta is non-empty so partial TTS plays during
streaming; keep the existing finalization logic that clears
lastStreamingContentRef and uses lastSpokenIndexRef to avoid duplicate final
speaks.
- Around line 183-199: Add an unmount cleanup in the useSpeechOutput hook (where
stop and toggle are defined) to ensure active speech is canceled when
TtsProvider unmounts: inside a useEffect with no deps, return a cleanup that
checks for typeof window !== 'undefined' && window.speechSynthesis and calls
window.speechSynthesis.cancel() and setIsSpeaking(false). This ensures any
ongoing utterance started via stop/toggle is stopped when the provider/component
is torn down.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 56cb86fb-6cf0-4cb2-a86b-fe5fb1363ba6

📥 Commits

Reviewing files that changed from the base of the PR and between 3950c0e and 09f8fed.

📒 Files selected for processing (6)
  • src/components/chat/view/ChatInterface.tsx
  • src/components/chat/view/subcomponents/ChatComposer.tsx
  • src/components/quick-settings-panel/view/QuickSettingsContent.tsx
  • src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
  • src/contexts/TtsContext.tsx
  • src/hooks/useSpeechOutput.ts

Comment on lines +206 to +236
```ts
// Watch for finalized assistant messages
useEffect(() => {
  if (!enabled || chatMessages.length === 0) {
    return;
  }

  const lastIndex = chatMessages.length - 1;
  const lastMsg = chatMessages[lastIndex];

  if (
    !lastMsg ||
    lastMsg.type !== 'assistant' ||
    lastMsg.isToolUse ||
    lastMsg.isInteractivePrompt ||
    !lastMsg.content
  ) {
    lastStreamingContentRef.current = null;
    return;
  }

  if (lastMsg.isStreaming) {
    lastStreamingContentRef.current = lastMsg.content;
    return;
  }

  if (lastIndex > lastSpokenIndexRef.current) {
    lastSpokenIndexRef.current = lastIndex;
    lastStreamingContentRef.current = null;
    speak(lastMsg.content);
  }
}, [chatMessages, enabled, speak]);
```
@coderabbitai coderabbitai bot Mar 8, 2026


⚠️ Potential issue | 🟠 Major

The streaming TTS path never actually speaks partial output.

When lastMsg.isStreaming is true, this effect only caches the latest content and returns. Because lastStreamingContentRef is never read to compute/speak completed sentence deltas, assistant replies are only spoken after finalization instead of incrementally during streaming.


x64zx (Author) replied:

This is intentional design. Streaming partial TTS has several practical issues:

  1. Sentence fragmentation — partial updates cut mid-sentence, producing unnatural speech
  2. Cancellation overhead — each delta requires cancel+speak, causing stuttering
  3. Race conditions — rapid cancel/speak can silently drop utterances on some browsers
  4. Duplicate content — overlapping deltas can cause repeated speech without careful diffing

The current approach (speak the complete message after streaming finishes) provides a cleaner UX — full sentences read in one pass. This matches how most TTS integrations work (e.g. ChatGPT Read Aloud waits for the full response).

If streaming TTS is desired later, it would need a sentence-boundary chunking buffer — a separate feature.
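One possible shape for that future sentence-boundary buffer (purely illustrative, not part of this PR): given accumulated unspoken text, it yields only the complete sentences and holds back the trailing fragment until more tokens arrive, avoiding the mid-sentence fragmentation described above.

```typescript
// Splits buffered streaming text into speakable complete sentences and a
// pending fragment. A terminator counts only when followed by whitespace
// or end-of-string, so "3.14" is not cut mid-number.
function splitAtSentenceBoundary(text: string): { complete: string; remainder: string } {
  const match = text.match(/^[\s\S]*[.!?](?=\s|$)/);
  if (!match) return { complete: "", remainder: text };
  return { complete: match[0], remainder: text.slice(match[0].length) };
}
```

Each streaming delta would be appended to the buffer, `complete` handed to speechSynthesis, and `remainder` carried over to the next update.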

coderabbitai bot replied:

@x64zx, that's a well-reasoned design decision — the four issues you've identified (fragmentation, cancellation overhead, race conditions, duplicate content) are real pitfalls of naïve streaming TTS. Waiting for finalization before speaking is indeed the cleaner approach. I'll note this for future reviews.

( •ᴗ•) 🐇


✏️ Learnings added
Learnt from: x64zx
Repo: siteboon/claudecodeui PR: 502
File: src/hooks/useSpeechOutput.ts:215-245
Timestamp: 2026-03-08T21:21:53.608Z
Learning: In `src/hooks/useSpeechOutput.ts`, the streaming TTS behavior is intentional: when `lastMsg.isStreaming` is true, the hook only caches content in `lastStreamingContentRef.current` and does NOT speak partial updates. TTS is deliberately triggered only after the full assistant message is finalized (streaming complete). Reasons: sentence fragmentation, cancellation overhead, browser race conditions, and duplicate content risk. Streaming TTS with sentence-boundary chunking is a potential future feature, not the current design.


- Add aria-label to TTS toggle button in ChatComposer (a11y)
- Move all hardcoded TTS UI strings to i18n (en/ja/ko/zh-CN)
- Add htmlFor/id pairs to all TTS form controls (a11y)
- Add aria-labelledby to TTS enabled switch (a11y)
- Cancel active speech on TtsProvider unmount (cleanup)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 84-86: When TTS is enabled it currently treats index -1 as new and
will immediately speak the most recent assistant message; to fix it seed the
tail on initial hydration and on enable transitions by setting
lastSpokenIndexRef.current to messages.length - 1 (and set
lastStreamingContentRef.current to the last assistant message text or null)
whenever the hook mounts with enabled===true or when enabled flips from
false→true so only messages appended afterward are spoken; update the
useEffect(s) that watch enabled/messages (around lastSpokenIndexRef and
lastStreamingContentRef usage) to perform this seeding instead of immediately
queuing the latest reply.
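The fix described above reduces to a small invariant that can be stated as two pure functions (names are illustrative, not the hook's actual code):

```typescript
// When TTS turns on, seed "last spoken" to the current tail so only
// messages appended afterward qualify. An empty history seeds to -1.
function seedLastSpokenIndex(messageCount: number): number {
  return messageCount - 1;
}

// A message index is speakable only if it was appended after the seed.
function shouldSpeakIndex(index: number, lastSpokenIndex: number): boolean {
  return index > lastSpokenIndex;
}
```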

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9888d4e4-f457-4cb5-b2c2-63fb01d5bdce

📥 Commits

Reviewing files that changed from the base of the PR and between 09f8fed and 81d3265.

📒 Files selected for processing (7)
  • src/components/chat/view/subcomponents/ChatComposer.tsx
  • src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
  • src/hooks/useSpeechOutput.ts
  • src/i18n/locales/en/settings.json
  • src/i18n/locales/ja/settings.json
  • src/i18n/locales/ko/settings.json
  • src/i18n/locales/zh-CN/settings.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx

When TTS is enabled on page load or toggled on mid-session,
lastSpokenIndexRef was -1, causing the last assistant message to be
immediately spoken. Now seeded to chatMessages.length - 1 on mount
and on enable toggle so only new messages are spoken.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
src/hooks/useSpeechOutput.ts (2)

128-135: Polling may be redundant with onend/onerror handlers.

The speak function already attaches onend and onerror handlers (lines 175-176) that set isSpeaking(false). The polling interval mainly catches the speaking=true transition, which speak() already sets immediately (line 178). Consider whether polling is needed at all, or whether it can be removed to reduce overhead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 128 - 135, The polling effect that
checks window.speechSynthesis.speaking should be removed to avoid redundant
work; delete the useEffect block that sets up setInterval and clearInterval, and
instead rely on the existing speak function's explicit state transitions (ensure
speak() calls setIsSpeaking(true) when starting and that the utterance onend and
onerror handlers call setIsSpeaking(false)); also verify there are no other code
paths that require polling to detect speaking state and, if any exist, update
them to use the same explicit state updates in speak/utterance handlers.

3-10: Consider filtering out isThinking messages.

The local ChatMessage type omits isThinking from the actual ChatMessage interface (see src/components/chat/types/types.ts:28-50). While the hook filters isToolUse and isInteractivePrompt, thinking indicator messages (isThinking: true) could be spoken. You may want to add this field and filter it similarly.

```diff
 type ChatMessage = {
   type: string;
   content?: string;
   isStreaming?: boolean;
   isToolUse?: boolean;
   isInteractivePrompt?: boolean;
+  isThinking?: boolean;
   [key: string]: unknown;
 };
```

Then at line 233:

```diff
       lastMsg.isToolUse ||
       lastMsg.isInteractivePrompt ||
+      lastMsg.isThinking ||
       !lastMsg.content
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 3 - 10, The ChatMessage type in
useSpeechOutput.ts is missing the isThinking flag which allows
thinking-indicator messages to be spoken; update the ChatMessage type to include
isThinking?: boolean and update the filtering logic inside the useSpeechOutput
hook (where it already filters isToolUse and isInteractivePrompt) to also skip
messages with isThinking === true so thinking indicators are not spoken; ensure
references to isThinking are used in the same places as
isToolUse/isInteractivePrompt checks.
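The suggested filter, with isThinking included, can be condensed into one predicate. A sketch (ChatMessage mirrors the hook's local type; field names come from the review comment, not verified against the repo):

```typescript
// Minimal mirror of the hook's local message type.
type ChatMessage = {
  type: string;
  content?: string;
  isStreaming?: boolean;
  isToolUse?: boolean;
  isInteractivePrompt?: boolean;
  isThinking?: boolean;
};

// True only for finalized, content-bearing assistant replies that are
// neither tool calls, interactive prompts, nor thinking indicators.
function isSpeakableAssistantMessage(msg: ChatMessage | undefined): boolean {
  return Boolean(
    msg &&
      msg.type === "assistant" &&
      !msg.isToolUse &&
      !msg.isInteractivePrompt &&
      !msg.isThinking &&
      msg.content,
  );
}
```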
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 54-61: The readStorageFloat function can return NaN when
localStorage holds a malformed value; update readStorageFloat to validate the
parsed number and return the provided fallback if parseFloat yields NaN or an
infinite value. Specifically, inside readStorageFloat (used for rate/pitch
state), parse the stored string, then check Number.isFinite(result) (or
isFinite) and only return the parsed value when valid; otherwise return the
fallback so rate/pitch never become NaN.

---

Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 128-135: The polling effect that checks
window.speechSynthesis.speaking should be removed to avoid redundant work;
delete the useEffect block that sets up setInterval and clearInterval, and
instead rely on the existing speak function's explicit state transitions (ensure
speak() calls setIsSpeaking(true) when starting and that the utterance onend and
onerror handlers call setIsSpeaking(false)); also verify there are no other code
paths that require polling to detect speaking state and, if any exist, update
them to use the same explicit state updates in speak/utterance handlers.
- Around line 3-10: The ChatMessage type in useSpeechOutput.ts is missing the
isThinking flag which allows thinking-indicator messages to be spoken; update
the ChatMessage type to include isThinking?: boolean and update the filtering
logic inside the useSpeechOutput hook (where it already filters isToolUse and
isInteractivePrompt) to also skip messages with isThinking === true so thinking
indicators are not spoken; ensure references to isThinking are used in the same
places as isToolUse/isInteractivePrompt checks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1ccc3c97-cbd6-41a3-8d26-5a44a2e1d292

📥 Commits

Reviewing files that changed from the base of the PR and between 81d3265 and 1babd63.

📒 Files selected for processing (1)
  • src/hooks/useSpeechOutput.ts

Comment on lines +54 to +61
```ts
function readStorageFloat(key: string, fallback: number): number {
  try {
    const v = localStorage.getItem(key);
    return v ? parseFloat(v) : fallback;
  } catch {
    return fallback;
  }
}
```
coderabbitai bot:

⚠️ Potential issue | 🟡 Minor

parseFloat can return NaN for malformed stored values.

If localStorage contains a non-numeric string (e.g., corrupted data), parseFloat returns NaN, which propagates to rate/pitch state and could cause unexpected speech behavior.

🛡️ Suggested defensive fix
```diff
 function readStorageFloat(key: string, fallback: number): number {
   try {
     const v = localStorage.getItem(key);
-    return v ? parseFloat(v) : fallback;
+    if (!v) return fallback;
+    const parsed = parseFloat(v);
+    return Number.isNaN(parsed) ? fallback : parsed;
   } catch {
     return fallback;
   }
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 54 - 61, The readStorageFloat
function can return NaN when localStorage holds a malformed value; update
readStorageFloat to validate the parsed number and return the provided fallback
if parseFloat yields NaN or an infinite value. Specifically, inside
readStorageFloat (used for rate/pitch state), parse the stored string, then
check Number.isFinite(result) (or isFinite) and only return the parsed value
when valid; otherwise return the fallback so rate/pitch never become NaN.
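A hardened variant of readStorageFloat, sketched with an injected storage object so it runs outside a browser (the StorageLike parameter is an addition for testability; the hook's real function reads localStorage directly). Number.isFinite rejects both NaN and ±Infinity, so rate/pitch always fall back to a sane default on corrupted values:

```typescript
// Minimal slice of the Storage interface, injected for testability.
type StorageLike = { getItem(key: string): string | null };

function readStorageFloat(storage: StorageLike, key: string, fallback: number): number {
  try {
    const v = storage.getItem(key);
    if (!v) return fallback;
    const parsed = parseFloat(v);
    // Reject NaN and infinities so rate/pitch state stays well-formed.
    return Number.isFinite(parsed) ? parsed : fallback;
  } catch {
    return fallback; // storage access can throw (e.g. privacy modes)
  }
}
```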

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)

193-206: Minor: toggle closure captures chatMessages.length at render time.

The dependency on chatMessages.length means the seeding logic (line 198) uses the length from when toggle was last created, not the current length at invocation time. If messages change between re-renders and the toggle is called, the seed index could be off.

This is unlikely to cause issues in practice since users typically toggle TTS during idle states, but using a ref for chatMessages.length would be more robust.

💡 Optional fix using ref
```diff
+  const chatMessagesLengthRef = useRef(chatMessages.length);
+  chatMessagesLengthRef.current = chatMessages.length;
+
   const toggle = useCallback(() => {
     setEnabled((prev) => {
       const next = !prev;
       if (next) {
-        lastSpokenIndexRef.current = chatMessages.length - 1;
+        lastSpokenIndexRef.current = chatMessagesLengthRef.current - 1;
         lastStreamingContentRef.current = null;
       } else if (typeof window !== 'undefined' && window.speechSynthesis) {
         window.speechSynthesis.cancel();
         setIsSpeaking(false);
       }
       return next;
     });
-  }, [chatMessages.length]);
+  }, []);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 193 - 206, The toggle callback
currently captures chatMessages.length at render time which can be stale; update
the hook to track the latest length in a ref (e.g., chatMessagesLengthRef) that
you update whenever chatMessages changes, then inside toggle use
lastSpokenIndexRef.current = chatMessagesLengthRef.current - 1 instead of
chatMessages.length; remove chatMessages.length from toggle's dependency array
so toggle doesn't re-create unnecessarily and ensure the new ref is kept in sync
via an effect that runs on chatMessages updates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 193-206: The toggle callback currently captures
chatMessages.length at render time which can be stale; update the hook to track
the latest length in a ref (e.g., chatMessagesLengthRef) that you update
whenever chatMessages changes, then inside toggle use lastSpokenIndexRef.current
= chatMessagesLengthRef.current - 1 instead of chatMessages.length; remove
chatMessages.length from toggle's dependency array so toggle doesn't re-create
unnecessarily and ensure the new ref is kept in sync via an effect that runs on
chatMessages updates.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 28190615-b1bc-4852-8d6a-8e8d209b2b73

📥 Commits

Reviewing files that changed from the base of the PR and between 1babd63 and 839ca30.

📒 Files selected for processing (1)
  • src/hooks/useSpeechOutput.ts

Prevents stale closure capturing outdated length when toggle is
called between re-renders.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)

132-139: Consider conditionally running the polling interval.

The interval polls speechSynthesis.speaking every 200ms regardless of whether TTS is enabled. For a minor efficiency improvement, you could skip polling when !enabled && !isSpeaking, though the current approach ensures the state is always accurate.

♻️ Optional optimization
```diff
   useEffect(() => {
     if (typeof window === 'undefined' || !window.speechSynthesis) return;
+    if (!enabled && !isSpeaking) return;
     const interval = setInterval(() => {
       setIsSpeaking(window.speechSynthesis.speaking);
     }, 200);
     return () => clearInterval(interval);
-  }, []);
+  }, [enabled, isSpeaking]);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 132 - 139, The current useEffect
always starts a polling interval to setIsSpeaking from
window.speechSynthesis.speaking; change it to only start the interval when TTS
is active by checking the hook state (e.g., enabled || isSpeaking) before
creating the interval, and add those values to the effect dependency array so it
starts/stops correctly; keep the existing cleanup (clearInterval) and the guard
for typeof window === 'undefined' and window.speechSynthesis to avoid side
effects.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7de5de61-a99e-48a9-8f11-d03bd360f01b

📥 Commits

Reviewing files that changed from the base of the PR and between 839ca30 and cf5f795.

📒 Files selected for processing (1)
  • src/hooks/useSpeechOutput.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@x64zx
Author

x64zx commented Mar 9, 2026

@coderabbitai resume

@coderabbitai
Contributor

coderabbitai bot commented Mar 9, 2026

✅ Actions performed

Reviews resumed.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)

142-152: Consider memoizing derived values.

filteredVoices and availableLanguages are computed on every render. While the arrays are typically small, wrapping these in useMemo would prevent unnecessary recomputation and array allocations.

♻️ Optional memoization
+import { useCallback, useEffect, useMemo, useRef, useState } from 'react';
-import { useCallback, useEffect, useRef, useState } from 'react';

   // ...

-  const filteredVoices = availableVoices.filter((v) => {
-    if (lang === '') return true;
-    const langPrefix = lang.split('-')[0];
-    return v.lang.startsWith(langPrefix);
-  });
+  const filteredVoices = useMemo(() => 
+    availableVoices.filter((v) => {
+      if (lang === '') return true;
+      const langPrefix = lang.split('-')[0];
+      return v.lang.startsWith(langPrefix);
+    }),
+    [availableVoices, lang]
+  );

-  const availableLanguages = Array.from(
-    new Set(availableVoices.map((v) => v.lang)),
-  ).sort();
+  const availableLanguages = useMemo(() => 
+    Array.from(new Set(availableVoices.map((v) => v.lang))).sort(),
+    [availableVoices]
+  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSpeechOutput.ts` around lines 142 - 152, Compute-intensive
derived arrays filteredVoices and availableLanguages are recreated on every
render; wrap their calculations in React's useMemo to memoize results based on
dependencies (use availableVoices and lang for filteredVoices, and
availableVoices for availableLanguages). Update the logic inside
useSpeechOutput.ts to replace the direct Array.filter/Array.from+Set
computations with useMemo hooks so allocations only run when their dependencies
change, referencing the existing variables filteredVoices, availableLanguages,
availableVoices, and lang.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 3-10: The auto-speech message filter in useSpeechOutput.ts is
missing exclusion of assistant "thinking" messages; update the filter that
currently checks lastMsg.isToolUse and lastMsg.isInteractivePrompt (the
exclusion condition near where lastMsg is evaluated) to also include
lastMsg.isThinking (i.e., add lastMsg.isThinking || to the OR chain) so
ChatMessage thinking messages are not sent to TTS, keeping behavior consistent
with the UI; reference the ChatMessage type and the lastMsg filter in your
change.
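The exclusion chain this prompt describes can be sketched as a pure predicate. The `ChatMessage` shape and the `shouldSpeak` helper below are illustrative assumptions (the real hook likely filters inline); only the flags the review names are modeled:

```typescript
// Hypothetical minimal shape of ChatMessage, limited to the flags
// referenced in the review comment.
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
  isToolUse?: boolean;
  isInteractivePrompt?: boolean;
  isThinking?: boolean;
}

// True only for plain assistant text: tool-use, interactive-prompt,
// and thinking messages are all excluded from TTS, matching the UI.
function shouldSpeak(lastMsg: ChatMessage): boolean {
  if (lastMsg.role !== 'assistant') return false;
  if (lastMsg.isToolUse || lastMsg.isInteractivePrompt || lastMsg.isThinking) {
    return false;
  }
  return lastMsg.content.trim().length > 0;
}
```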


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4b743bbb-f0ce-4a8c-964a-645c0b6d40f0

📥 Commits

Reviewing files that changed from the base of the PR and between cf5f795 and 81b538b.

📒 Files selected for processing (1)
  • src/hooks/useSpeechOutput.ts

Filter out isThinking assistant messages to avoid reading aloud
Claude's internal reasoning process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2 participants