feat: add Text-to-Speech support via Web Speech API #502
x64zx wants to merge 7 commits into siteboon:main from
Conversation
Add browser-native TTS integration that reads Claude's responses aloud:
- useSpeechOutput hook: Web Speech API wrapper with voice selection, rate/pitch control, language filtering, and streaming-aware chunking
- TtsContext: React context provider for app-wide TTS state
- QuickSettingsTtsSection: full TTS configuration panel with voice preview, rate/pitch sliders, and language filter
- ChatComposer: toggle button with lucide-react icons (Volume2/VolumeX/StopCircle) and graceful degradation when no voices are available
- QuickSettingsContent: TTS section with voices=0 guard

All settings persist to localStorage. Falls back to navigator.language instead of a hardcoded locale. No external dependencies required — uses the browser's built-in speechSynthesis API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
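The rate/pitch values described above persist to localStorage, and before being applied to an utterance they have to fall inside the ranges the Web Speech API accepts (rate 0.1–10, pitch 0–2). A minimal sketch of that clamping — the helper name and factoring are illustrative, not taken from this PR:

```typescript
// Clamp persisted rate/pitch into the ranges SpeechSynthesisUtterance
// accepts (rate: 0.1-10, pitch: 0-2). Out-of-range stored values could
// otherwise be rejected or silently ignored by some browsers.
function clampUtteranceParams(
  rate: number,
  pitch: number,
): { rate: number; pitch: number } {
  const clamp = (v: number, min: number, max: number) =>
    Math.min(max, Math.max(min, v));
  return {
    rate: clamp(rate, 0.1, 10),
    pitch: clamp(pitch, 0, 2),
  };
}
```

In-range values pass through unchanged; anything persisted outside the spec range is pulled back to the nearest bound before being assigned to an utterance.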
📝 Walkthrough

Adds Text‑to‑Speech (TTS): new TtsProvider and useTts hook backed by useSpeechOutput; ChatInterface is wrapped with the provider; ChatComposer gains TTS UI controls; a Quick Settings TTS section component is added; and TTS localization keys are added across locales.
Sequence Diagram(s)

sequenceDiagram
participant User
participant ChatInterface
participant ChatComposer
participant TtsProvider
participant useSpeechOutput
participant Browser_SpeechSynthesis
User->>ChatInterface: open chat / send input
ChatInterface->>TtsProvider: provide chatMessages
TtsProvider->>useSpeechOutput: init with messages
useSpeechOutput->>Browser_SpeechSynthesis: load voices (onvoiceschanged)
Browser_SpeechSynthesis-->>useSpeechOutput: voices available
rect rgba(100, 150, 255, 0.5)
Note over ChatInterface,useSpeechOutput: Assistant message finalized
ChatInterface->>TtsProvider: chatMessages updated
TtsProvider->>useSpeechOutput: detect new finalized assistant message
useSpeechOutput->>Browser_SpeechSynthesis: speak(utterance)
Browser_SpeechSynthesis-->>useSpeechOutput: onend/onerror -> update isSpeaking
end
rect rgba(100, 200, 100, 0.5)
Note over User,ChatComposer: User interacts with TTS controls
User->>ChatComposer: toggle / adjust rate/pitch / test / stop
ChatComposer->>TtsProvider: call toggle/setRate/testVoice/stop
TtsProvider->>useSpeechOutput: execute control
useSpeechOutput->>Browser_SpeechSynthesis: apply or cancel speech
end
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/components/chat/view/subcomponents/ChatComposer.tsx`:
- Around line 332-352: The TTS icon-only button lacks a localized accessible
name; update the button in ChatComposer (the element using tts, onClick, and
title) to supply an aria-label that uses the app's localization helper (e.g.,
t(...) or i18n.t(...)) and reflect the same state logic as title (when
tts.isSpeaking: "Stop speaking", when tts.enabled: "TTS ON (click to disable)",
else "TTS OFF (click to enable)"); keep the existing title in sync by setting it
from the same localized strings so both tooltip and screen readers receive
identical, localized text.
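The state-to-label mapping this comment describes can be isolated into one pure helper so the tooltip and aria-label never diverge. A sketch — the helper and the i18n keys are illustrative, and the real component would pass the returned key through the app's t(...) function:

```typescript
// Derive a single localization key for both `title` and `aria-label`,
// so tooltip and screen-reader text stay in sync by construction.
type TtsButtonState = { isSpeaking: boolean; enabled: boolean };

function ttsButtonLabelKey(state: TtsButtonState): string {
  if (state.isSpeaking) return "settings.tts.stopSpeaking";
  return state.enabled ? "settings.tts.turnOff" : "settings.tts.turnOn";
}
```

The button then sets both attributes from the same call, e.g. `aria-label={t(ttsButtonLabelKey(tts))}` and `title={t(ttsButtonLabelKey(tts))}`.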
In `@src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx`:
- Around line 42-183: QuickSettingsTtsSection currently hardcodes all UI text;
replace those literals with the app i18n settings keys like the rest of Quick
Settings by importing the same translation hook (e.g., useTranslation or
useSettingsTranslations used elsewhere) into QuickSettingsTtsSection, swap
strings such as "TTS Enabled", "Language", "All Languages", "Voice
({filteredVoices.length} available)" (use pluralization/count support), "Auto
(first match)", "Speed", "Pitch", "Test Voice", "Stop", the range labels
("0.5x", "1.0x", "2.0x", "3.0x", "Low", "Normal", "High") and any network suffix
text for voices with corresponding t('settings.xxx') keys, and ensure keys
accept interpolations (count for filteredVoices.length) and provide sensible
fallback text. Use the same translation namespace and patterns as other Quick
Settings components so the component follows the app language settings.
- Around line 44-152: The controls are not programmatically associated with
their labels; add explicit id/htmlFor pairs and switch labeling: give the
visible "TTS Enabled" span a unique id (e.g., tts-enabled-label) and set the
switch button's aria-labelledby to that id (or aria-label if you prefer), add id
attributes to the Language select (lang select bound to lang/onLangChange),
Voice select (voiceURI/onVoiceChange), Rate input (rate/onRateChange) and Pitch
input (pitch/onPitchChange), and update each corresponding <label> to use
htmlFor pointing to those ids; ensure the voice-count label and the numeric
value spans (rate/pitch display) are reachable via aria-labelledby or have ids
referenced by their inputs if needed so screen readers see the association.
In `@src/hooks/useSpeechOutput.ts`:
- Around line 206-236: The effect in useSpeechOutput watches streaming assistant
messages but when lastMsg.isStreaming is true it only caches content
(lastStreamingContentRef.current) and never speaks partial updates; change the
isStreaming branch to compute the incremental delta between
lastStreamingContentRef.current (previous partial) and lastMsg.content (new
partial), update lastStreamingContentRef.current to the new content, and call
speak(delta) only if the delta is non-empty so partial TTS plays during
streaming; keep the existing finalization logic that clears
lastStreamingContentRef and uses lastSpokenIndexRef to avoid duplicate final
speaks.
- Around line 183-199: Add an unmount cleanup in the useSpeechOutput hook (where
stop and toggle are defined) to ensure active speech is canceled when
TtsProvider unmounts: inside a useEffect with no deps, return a cleanup that
checks for typeof window !== 'undefined' && window.speechSynthesis and calls
window.speechSynthesis.cancel() and setIsSpeaking(false). This ensures any
ongoing utterance started via stop/toggle is stopped when the provider/component
is torn down.
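The unmount cleanup requested here can be factored so it is testable without a browser by injecting the synthesis object. This factoring is illustrative — the PR wires the logic directly into a useEffect:

```typescript
// Returns a cleanup function suitable for `useEffect(() => makeTtsCleanup(...), [])`.
// Injecting the synthesis object keeps the logic testable outside a browser,
// where window.speechSynthesis does not exist.
interface CancellableSynthesis {
  cancel(): void;
}

function makeTtsCleanup(
  synth: CancellableSynthesis | undefined,
  setIsSpeaking: (v: boolean) => void,
): () => void {
  return () => {
    if (synth) {
      synth.cancel(); // stop any in-flight utterance
      setIsSpeaking(false); // reset UI state so the button does not stay "speaking"
    }
  };
}
```

In the hook this would be invoked as `useEffect(() => makeTtsCleanup(typeof window !== 'undefined' ? window.speechSynthesis : undefined, setIsSpeaking), [])`.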
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 56cb86fb-6cf0-4cb2-a86b-fe5fb1363ba6
📒 Files selected for processing (6)
src/components/chat/view/ChatInterface.tsx
src/components/chat/view/subcomponents/ChatComposer.tsx
src/components/quick-settings-panel/view/QuickSettingsContent.tsx
src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
src/contexts/TtsContext.tsx
src/hooks/useSpeechOutput.ts
// Watch for finalized assistant messages
useEffect(() => {
  if (!enabled || chatMessages.length === 0) {
    return;
  }

  const lastIndex = chatMessages.length - 1;
  const lastMsg = chatMessages[lastIndex];

  if (
    !lastMsg ||
    lastMsg.type !== 'assistant' ||
    lastMsg.isToolUse ||
    lastMsg.isInteractivePrompt ||
    !lastMsg.content
  ) {
    lastStreamingContentRef.current = null;
    return;
  }

  if (lastMsg.isStreaming) {
    lastStreamingContentRef.current = lastMsg.content;
    return;
  }

  if (lastIndex > lastSpokenIndexRef.current) {
    lastSpokenIndexRef.current = lastIndex;
    lastStreamingContentRef.current = null;
    speak(lastMsg.content);
  }
}, [chatMessages, enabled, speak]);
The streaming TTS path never actually speaks partial output.
When lastMsg.isStreaming is true, this effect only caches the latest content and returns. Because lastStreamingContentRef is never read to compute/speak completed sentence deltas, assistant replies are only spoken after finalization instead of incrementally during streaming.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/hooks/useSpeechOutput.ts` around lines 206 - 236, The effect in
useSpeechOutput watches streaming assistant messages but when
lastMsg.isStreaming is true it only caches content
(lastStreamingContentRef.current) and never speaks partial updates; change the
isStreaming branch to compute the incremental delta between
lastStreamingContentRef.current (previous partial) and lastMsg.content (new
partial), update lastStreamingContentRef.current to the new content, and call
speak(delta) only if the delta is non-empty so partial TTS plays during
streaming; keep the existing finalization logic that clears
lastStreamingContentRef and uses lastSpokenIndexRef to avoid duplicate final
speaks.
This is intentional design. Streaming partial TTS has several practical issues:
- Sentence fragmentation — partial updates cut mid-sentence, producing unnatural speech
- Cancellation overhead — each delta requires cancel+speak, causing stuttering
- Race conditions — rapid cancel/speak can silently drop utterances on some browsers
- Duplicate content — overlapping deltas can cause repeated speech without careful diffing
The current approach (speak the complete message after streaming finishes) provides a cleaner UX — full sentences read in one pass. This matches how most TTS integrations work (e.g. ChatGPT Read Aloud waits for the full response).
If streaming TTS is desired later, it would need a sentence-boundary chunking buffer — a separate feature.
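The sentence-boundary chunking buffer mentioned here could look roughly like this: append each streamed delta, emit only completed sentences, and flush the remainder when the message finalizes. A sketch under the stated assumptions, not code from this PR:

```typescript
// Buffers streamed text and releases it one complete sentence at a time,
// so a TTS engine never receives a mid-sentence fragment.
class SentenceChunker {
  private buffer = "";

  // Append a streaming delta; returns any newly completed sentences.
  push(delta: string): string[] {
    this.buffer += delta;
    const sentences: string[] = [];
    // Match runs ending in sentence punctuation followed by whitespace or end.
    const re = /[^.!?]*[.!?]+(?:\s+|$)/g;
    let consumed = 0;
    let m: RegExpExecArray | null;
    while ((m = re.exec(this.buffer)) !== null) {
      // Only emit when the sentence is definitely complete (text or
      // whitespace follows); a trailing "." at the very end of the buffer
      // might still grow on the next delta (e.g. "3." -> "3.14").
      if (m.index + m[0].length < this.buffer.length || /\s$/.test(m[0])) {
        sentences.push(m[0].trim());
        consumed = m.index + m[0].length;
      }
    }
    this.buffer = this.buffer.slice(consumed);
    return sentences;
  }

  // Call when the message is finalized to flush any trailing fragment.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}
```

Each emitted sentence would be passed to speak() individually, which sidesteps the fragmentation and cancel/speak race issues described above at the cost of slightly delayed speech.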
@x64zx, that's a well-reasoned design decision — the four issues you've identified (fragmentation, cancellation overhead, race conditions, duplicate content) are real pitfalls of naïve streaming TTS. Waiting for finalization before speaking is indeed the cleaner approach. I'll note this for future reviews.
( •ᴗ•) 🐇
✏️ Learnings added
Learnt from: x64zx
Repo: siteboon/claudecodeui PR: 502
File: src/hooks/useSpeechOutput.ts:215-245
Timestamp: 2026-03-08T21:21:53.608Z
Learning: In `src/hooks/useSpeechOutput.ts`, the streaming TTS behavior is intentional: when `lastMsg.isStreaming` is true, the hook only caches content in `lastStreamingContentRef.current` and does NOT speak partial updates. TTS is deliberately triggered only after the full assistant message is finalized (streaming complete). Reasons: sentence fragmentation, cancellation overhead, browser race conditions, and duplicate content risk. Streaming TTS with sentence-boundary chunking is a potential future feature, not the current design.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
- Add aria-label to TTS toggle button in ChatComposer (a11y)
- Move all hardcoded TTS UI strings to i18n (en/ja/ko/zh-CN)
- Add htmlFor/id pairs to all TTS form controls (a11y)
- Add aria-labelledby to TTS enabled switch (a11y)
- Cancel active speech on TtsProvider unmount (cleanup)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 84-86: When TTS is enabled it currently treats index -1 as new and
will immediately speak the most recent assistant message; to fix it seed the
tail on initial hydration and on enable transitions by setting
lastSpokenIndexRef.current to messages.length - 1 (and set
lastStreamingContentRef.current to the last assistant message text or null)
whenever the hook mounts with enabled===true or when enabled flips from
false→true so only messages appended afterward are spoken; update the
useEffect(s) that watch enabled/messages (around lastSpokenIndexRef and
lastStreamingContentRef usage) to perform this seeding instead of immediately
queuing the latest reply.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9888d4e4-f457-4cb5-b2c2-63fb01d5bdce
📒 Files selected for processing (7)
src/components/chat/view/subcomponents/ChatComposer.tsx
src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
src/hooks/useSpeechOutput.ts
src/i18n/locales/en/settings.json
src/i18n/locales/ja/settings.json
src/i18n/locales/ko/settings.json
src/i18n/locales/zh-CN/settings.json
🚧 Files skipped from review as they are similar to previous changes (1)
- src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
When TTS is enabled on page load or toggled on mid-session, lastSpokenIndexRef was -1, causing the last assistant message to be immediately spoken. Now seeded to chatMessages.length - 1 on mount and on enable toggle so only new messages are spoken.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/hooks/useSpeechOutput.ts (2)
128-135: Polling may be redundant with onend/onerror handlers.

The speak function already attaches onend and onerror handlers (lines 175-176) that set isSpeaking(false). The polling interval mainly catches the speaking=true transition, which is immediately set by speak() anyway (line 178). Consider whether polling is needed, or whether it could be removed to reduce overhead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/hooks/useSpeechOutput.ts` around lines 128 - 135, The polling effect that checks window.speechSynthesis.speaking should be removed to avoid redundant work; delete the useEffect block that sets up setInterval and clearInterval, and instead rely on the existing speak function's explicit state transitions (ensure speak() calls setIsSpeaking(true) when starting and that the utterance onend and onerror handlers call setIsSpeaking(false)); also verify there are no other code paths that require polling to detect speaking state and, if any exist, update them to use the same explicit state updates in speak/utterance handlers.
3-10: Consider filtering out isThinking messages.

The local ChatMessage type omits isThinking from the actual ChatMessage interface (see src/components/chat/types/types.ts:28-50). While the hook filters isToolUse and isInteractivePrompt, thinking indicator messages (isThinking: true) could be spoken. You may want to add this field and filter it similarly:

 type ChatMessage = {
   type: string;
   content?: string;
   isStreaming?: boolean;
   isToolUse?: boolean;
   isInteractivePrompt?: boolean;
+  isThinking?: boolean;
   [key: string]: unknown;
 };

Then at line 233:

   lastMsg.isToolUse ||
   lastMsg.isInteractivePrompt ||
+  lastMsg.isThinking ||
   !lastMsg.content

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/hooks/useSpeechOutput.ts` around lines 3 - 10, The ChatMessage type in useSpeechOutput.ts is missing the isThinking flag which allows thinking-indicator messages to be spoken; update the ChatMessage type to include isThinking?: boolean and update the filtering logic inside the useSpeechOutput hook (where it already filters isToolUse and isInteractivePrompt) to also skip messages with isThinking === true so thinking indicators are not spoken; ensure references to isThinking are used in the same places as isToolUse/isInteractivePrompt checks.
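The full set of exclusions discussed in this thread — tool use, interactive prompts, thinking indicators, streaming, empty content — collapses naturally into one predicate. A sketch mirroring the hook's filter (the ChatMessage shape follows the PR; the standalone helper itself is illustrative):

```typescript
// Message shape used by the hook, per the PR's local ChatMessage type.
type ChatMessage = {
  type: string;
  content?: string;
  isStreaming?: boolean;
  isToolUse?: boolean;
  isInteractivePrompt?: boolean;
  isThinking?: boolean;
  [key: string]: unknown;
};

// True only for finalized assistant messages with speakable content.
function shouldSpeak(msg: ChatMessage | undefined): boolean {
  return Boolean(
    msg &&
      msg.type === "assistant" &&
      !msg.isToolUse &&
      !msg.isInteractivePrompt &&
      !msg.isThinking &&
      !msg.isStreaming &&
      msg.content,
  );
}
```

Centralizing the filter this way means the UI and the auto-speech effect cannot drift apart when a new message flag is added.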
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 54-61: The readStorageFloat function can return NaN when
localStorage holds a malformed value; update readStorageFloat to validate the
parsed number and return the provided fallback if parseFloat yields NaN or an
infinite value. Specifically, inside readStorageFloat (used for rate/pitch
state), parse the stored string, then check Number.isFinite(result) (or
isFinite) and only return the parsed value when valid; otherwise return the
fallback so rate/pitch never become NaN.
---
Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 128-135: The polling effect that checks
window.speechSynthesis.speaking should be removed to avoid redundant work;
delete the useEffect block that sets up setInterval and clearInterval, and
instead rely on the existing speak function's explicit state transitions (ensure
speak() calls setIsSpeaking(true) when starting and that the utterance onend and
onerror handlers call setIsSpeaking(false)); also verify there are no other code
paths that require polling to detect speaking state and, if any exist, update
them to use the same explicit state updates in speak/utterance handlers.
- Around line 3-10: The ChatMessage type in useSpeechOutput.ts is missing the
isThinking flag which allows thinking-indicator messages to be spoken; update
the ChatMessage type to include isThinking?: boolean and update the filtering
logic inside the useSpeechOutput hook (where it already filters isToolUse and
isInteractivePrompt) to also skip messages with isThinking === true so thinking
indicators are not spoken; ensure references to isThinking are used in the same
places as isToolUse/isInteractivePrompt checks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1ccc3c97-cbd6-41a3-8d26-5a44a2e1d292
📒 Files selected for processing (1)
src/hooks/useSpeechOutput.ts
function readStorageFloat(key: string, fallback: number): number {
  try {
    const v = localStorage.getItem(key);
    return v ? parseFloat(v) : fallback;
  } catch {
    return fallback;
  }
}
parseFloat can return NaN for malformed stored values.
If localStorage contains a non-numeric string (e.g., corrupted data), parseFloat returns NaN, which propagates to rate/pitch state and could cause unexpected speech behavior.
🛡️ Suggested defensive fix
function readStorageFloat(key: string, fallback: number): number {
try {
const v = localStorage.getItem(key);
- return v ? parseFloat(v) : fallback;
+ if (!v) return fallback;
+ const parsed = parseFloat(v);
+ return Number.isNaN(parsed) ? fallback : parsed;
} catch {
return fallback;
}
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/hooks/useSpeechOutput.ts` around lines 54 - 61, The readStorageFloat
function can return NaN when localStorage holds a malformed value; update
readStorageFloat to validate the parsed number and return the provided fallback
if parseFloat yields NaN or an infinite value. Specifically, inside
readStorageFloat (used for rate/pitch state), parse the stored string, then
check Number.isFinite(result) (or isFinite) and only return the parsed value
when valid; otherwise return the fallback so rate/pitch never become NaN.
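The validated parse the comment describes can be factored to take the raw string, so it is exercisable without localStorage. The factoring and name are illustrative — the PR keeps this inline in readStorageFloat:

```typescript
// Parse a persisted float, falling back on missing, malformed, or
// non-finite values so rate/pitch can never become NaN or Infinity.
function parseStoredFloat(raw: string | null, fallback: number): number {
  if (!raw) return fallback;
  const parsed = parseFloat(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}
```

readStorageFloat would then reduce to `parseStoredFloat(localStorage.getItem(key), fallback)` inside the existing try/catch.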
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)
193-206: Minor: toggle closure captures chatMessages.length at render time.

The dependency on chatMessages.length means the seeding logic (line 198) uses the length from when toggle was last created, not the current length at invocation time. If messages change between re-renders and the toggle is called, the seed index could be off.

This is unlikely to cause issues in practice since users typically toggle TTS during idle states, but using a ref for chatMessages.length would be more robust.

💡 Optional fix using ref

+  const chatMessagesLengthRef = useRef(chatMessages.length);
+  chatMessagesLengthRef.current = chatMessages.length;
+
   const toggle = useCallback(() => {
     setEnabled((prev) => {
       const next = !prev;
       if (next) {
-        lastSpokenIndexRef.current = chatMessages.length - 1;
+        lastSpokenIndexRef.current = chatMessagesLengthRef.current - 1;
         lastStreamingContentRef.current = null;
       } else if (typeof window !== 'undefined' && window.speechSynthesis) {
         window.speechSynthesis.cancel();
         setIsSpeaking(false);
       }
       return next;
     });
-  }, [chatMessages.length]);
+  }, []);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/hooks/useSpeechOutput.ts` around lines 193 - 206, The toggle callback currently captures chatMessages.length at render time which can be stale; update the hook to track the latest length in a ref (e.g., chatMessagesLengthRef) that you update whenever chatMessages changes, then inside toggle use lastSpokenIndexRef.current = chatMessagesLengthRef.current - 1 instead of chatMessages.length; remove chatMessages.length from toggle's dependency array so toggle doesn't re-create unnecessarily and ensure the new ref is kept in sync via an effect that runs on chatMessages updates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 193-206: The toggle callback currently captures
chatMessages.length at render time which can be stale; update the hook to track
the latest length in a ref (e.g., chatMessagesLengthRef) that you update
whenever chatMessages changes, then inside toggle use lastSpokenIndexRef.current
= chatMessagesLengthRef.current - 1 instead of chatMessages.length; remove
chatMessages.length from toggle's dependency array so toggle doesn't re-create
unnecessarily and ensure the new ref is kept in sync via an effect that runs on
chatMessages updates.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 28190615-b1bc-4852-8d6a-8e8d209b2b73
📒 Files selected for processing (1)
src/hooks/useSpeechOutput.ts
Prevents stale closure capturing outdated length when toggle is called between re-renders.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)
132-139: Consider conditionally running the polling interval.

The interval polls speechSynthesis.speaking every 200ms regardless of whether TTS is enabled. For a minor efficiency improvement, you could skip polling when !enabled && !isSpeaking, though the current approach ensures the state is always accurate.

♻️ Optional optimization

   useEffect(() => {
     if (typeof window === 'undefined' || !window.speechSynthesis) return;
+    if (!enabled && !isSpeaking) return;
     const interval = setInterval(() => {
       setIsSpeaking(window.speechSynthesis.speaking);
     }, 200);
     return () => clearInterval(interval);
-  }, []);
+  }, [enabled, isSpeaking]);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/hooks/useSpeechOutput.ts` around lines 132 - 139, The current useEffect always starts a polling interval to setIsSpeaking from window.speechSynthesis.speaking; change it to only start the interval when TTS is active by checking the hook state (e.g., enabled || isSpeaking) before creating the interval, and add those values to the effect dependency array so it starts/stops correctly; keep the existing cleanup (clearInterval) and the guard for typeof window === 'undefined' and window.speechSynthesis to avoid side effects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 132-139: The current useEffect always starts a polling interval to
setIsSpeaking from window.speechSynthesis.speaking; change it to only start the
interval when TTS is active by checking the hook state (e.g., enabled ||
isSpeaking) before creating the interval, and add those values to the effect
dependency array so it starts/stops correctly; keep the existing cleanup
(clearInterval) and the guard for typeof window === 'undefined' and
window.speechSynthesis to avoid side effects.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7de5de61-a99e-48a9-8f11-d03bd360f01b
📒 Files selected for processing (1)
src/hooks/useSpeechOutput.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai resume

✅ Actions performed: Reviews resumed.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/hooks/useSpeechOutput.ts (1)
142-152: Consider memoizing derived values.
filteredVoices and availableLanguages are computed on every render. While the arrays are typically small, wrapping these in useMemo would prevent unnecessary recomputation and array allocations.

♻️ Optional memoization

-import { useCallback, useEffect, useRef, useState } from 'react';
+import { useCallback, useEffect, useMemo, useRef, useState } from 'react';

 // ...

-  const filteredVoices = availableVoices.filter((v) => {
-    if (lang === '') return true;
-    const langPrefix = lang.split('-')[0];
-    return v.lang.startsWith(langPrefix);
-  });
+  const filteredVoices = useMemo(() =>
+    availableVoices.filter((v) => {
+      if (lang === '') return true;
+      const langPrefix = lang.split('-')[0];
+      return v.lang.startsWith(langPrefix);
+    }),
+    [availableVoices, lang]
+  );

-  const availableLanguages = Array.from(
-    new Set(availableVoices.map((v) => v.lang)),
-  ).sort();
+  const availableLanguages = useMemo(() =>
+    Array.from(new Set(availableVoices.map((v) => v.lang))).sort(),
+    [availableVoices]
+  );

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/hooks/useSpeechOutput.ts` around lines 142 - 152, Compute-intensive derived arrays filteredVoices and availableLanguages are recreated on every render; wrap their calculations in React's useMemo to memoize results based on dependencies (use availableVoices and lang for filteredVoices, and availableVoices for availableLanguages). Update the logic inside useSpeechOutput.ts to replace the direct Array.filter/Array.from+Set computations with useMemo hooks so allocations only run when their dependencies change, referencing the existing variables filteredVoices, availableLanguages, availableVoices, and lang.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 3-10: The auto-speech message filter in useSpeechOutput.ts is
missing exclusion of assistant "thinking" messages; update the filter that
currently checks lastMsg.isToolUse and lastMsg.isInteractivePrompt (the
exclusion condition near where lastMsg is evaluated) to also include
lastMsg.isThinking (i.e., add lastMsg.isThinking || to the OR chain) so
ChatMessage thinking messages are not sent to TTS, keeping behavior consistent
with the UI; reference the ChatMessage type and the lastMsg filter in your
change.
---
Nitpick comments:
In `@src/hooks/useSpeechOutput.ts`:
- Around line 142-152: Compute-intensive derived arrays filteredVoices and
availableLanguages are recreated on every render; wrap their calculations in
React's useMemo to memoize results based on dependencies (use availableVoices
and lang for filteredVoices, and availableVoices for availableLanguages). Update
the logic inside useSpeechOutput.ts to replace the direct
Array.filter/Array.from+Set computations with useMemo hooks so allocations only
run when their dependencies change, referencing the existing variables
filteredVoices, availableLanguages, availableVoices, and lang.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4b743bbb-f0ce-4a8c-964a-645c0b6d40f0
📒 Files selected for processing (1)
src/hooks/useSpeechOutput.ts
Filter out isThinking assistant messages to avoid reading aloud Claude's internal reasoning process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Add browser-native Text-to-Speech (TTS) integration that reads Claude's responses aloud using the Web Speech API. Zero external dependencies — uses the browser's built-in speechSynthesis API.

Features
Files changed
src/hooks/useSpeechOutput.ts
src/contexts/TtsContext.tsx
src/components/quick-settings-panel/view/QuickSettingsTtsSection.tsx
src/components/chat/view/ChatInterface.tsx
src/components/chat/view/subcomponents/ChatComposer.tsx
src/components/quick-settings-panel/view/QuickSettingsContent.tsx

Design decisions
- navigator.language fallback instead of a hardcoded locale — works for any user language
- availableVoices.length > 0 guard on both toggle button and settings panel — prevents broken UI on environments without speech synthesis voices
- Uses only the existing lucide-react dependency and browser APIs

Screenshots
TTS toggle button appears to the left of the send button when voices are available. Settings accessible via Quick Settings panel (scroll to bottom).
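The language filter and the settings panel's "Auto (first match)" voice selection can be sketched as pure helpers. Names are illustrative; the PR implements this logic inside useSpeechOutput:

```typescript
// Minimal shape shared with SpeechSynthesisVoice for testability.
type VoiceLike = { voiceURI: string; lang: string };

// Filter voices by BCP 47 primary-language subtag, so a filter of
// "en-US" also matches "en-GB" voices. Empty lang means "All Languages".
function filterVoicesByLang(voices: VoiceLike[], lang: string): VoiceLike[] {
  if (lang === "") return voices;
  const prefix = lang.split("-")[0];
  return voices.filter((v) => v.lang.startsWith(prefix));
}

// Resolve the voice to speak with: the explicitly chosen URI if it is
// still present, otherwise the first match ("Auto (first match)").
function pickVoice(
  voices: VoiceLike[],
  lang: string,
  voiceURI: string | null,
): VoiceLike | undefined {
  const filtered = filterVoicesByLang(voices, lang);
  if (voiceURI) {
    const exact = filtered.find((v) => v.voiceURI === voiceURI);
    if (exact) return exact;
  }
  return filtered[0];
}
```

Returning undefined when no voices survive the filter matches the availableVoices.length > 0 guard above: with no usable voice, the TTS UI simply does not render.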
Test plan
speechSynthesis API

🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary by CodeRabbit
New Features
New UI
Localization