feat(tts): add Doubao TTS 2.0 (Volcengine Seed-TTS 2.0) provider #283
Add Doubao TTS as a new TTS provider with 17 voices (14 Chinese, 3 English), streaming response parsing for Volcengine's chunked JSON format, rate-limit error handling, and full integration across types, constants, provider implementation, server config, settings store, i18n, and UI components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
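The PR text does not show the exact wire format, so the following is only a minimal sketch of how chunked JSON could be reassembled, assuming one JSON object per line and a base64 `data` field (the review later refers to `chunk.data`). `TtsChunk` and `JsonLineParser` are hypothetical names, not the PR's actual implementation.

```typescript
// Hypothetical chunk shape; the real Volcengine payload fields are not shown in this PR.
interface TtsChunk {
  code?: number;
  data?: string; // assumed to carry base64-encoded audio
}

// Accumulates partial text and returns one parsed object per complete line.
class JsonLineParser {
  private pending = '';

  push(text: string): TtsChunk[] {
    this.pending += text;
    const lines = this.pending.split('\n');
    // The last element is either '' or an incomplete line; keep it for the next call.
    this.pending = lines.pop() ?? '';
    const chunks: TtsChunk[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue;
      chunks.push(JSON.parse(trimmed) as TtsChunk);
    }
    return chunks;
  }

  // Parses whatever remains once the stream has ended.
  flush(): TtsChunk[] {
    const rest = this.pending.trim();
    this.pending = '';
    return rest.length > 0 ? [JSON.parse(rest) as TtsChunk] : [];
  }
}
```

Buffering the trailing partial line matters because a network read can end in the middle of a JSON object; parsing each read in isolation would throw on the fragment.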
Show two input fields (App ID + Access Key) for Doubao TTS in settings, instead of requiring the compound "appId:accessKey" format. The values are combined internally so no changes are needed to the API layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
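As an illustration of the "combined internally" step, here is a minimal sketch of joining and splitting the compound value. The helper names `combineCredentials` and `splitCredentials` are hypothetical, and splitting on the first colon assumes the App ID itself never contains one.

```typescript
interface DoubaoCredentials {
  appId: string;
  accessKey: string;
}

// Joins the two settings fields into the compound "appId:accessKey" string.
function combineCredentials(appId: string, accessKey: string): string {
  return `${appId.trim()}:${accessKey.trim()}`;
}

// Splits on the FIRST colon so an access key containing ':' survives intact.
// Assumption: the App ID never contains a colon.
function splitCredentials(compound: string): DoubaoCredentials {
  const index = compound.indexOf(':');
  if (index <= 0 || index === compound.length - 1) {
    throw new Error('Expected credentials in the form "appId:accessKey"');
  }
  return {
    appId: compound.slice(0, index),
    accessKey: compound.slice(index + 1),
  };
}
```

Keeping the compound string as the stored value is what lets the API layer stay unchanged; only the settings UI needs to know about the two separate fields.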
cosarah left a comment
Issues
Important
- `lib/audio/tts-providers.ts:109`: `TTSRateLimitError` is exported and thrown but never caught distinctly from `Error` anywhere. If retry/backoff logic is planned, add a note or TODO; otherwise this is dead code.
- `lib/audio/tts-providers.ts:484`: `atob()` + manual byte copy is less robust and less efficient than `Buffer.from(chunk.data, 'base64')` for server-side code. Consider: `const bytes = Buffer.from(chunk.data, 'base64'); audioChunks.push(new Uint8Array(bytes));`
- `tts-settings.tsx:121,149`: "App ID" and "Access Key" labels are hard-coded English strings. Per project convention, UI text needs i18n: use `t('settings.doubaoAppId')` / `t('settings.doubaoAccessKey')` with entries in `lib/i18n/settings.ts`.
- `lib/audio/tts-providers.ts:452` + `constants.ts`: `supportedFormats` declares `['mp3', 'ogg_opus', 'pcm']` but the request always hard-codes `format: 'mp3'`. Either use the format from config or trim `supportedFormats` to `['mp3']` to avoid misleading callers.
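To make the first point concrete, here is a rough sketch of what catching the rate-limit error distinctly could look like if retry/backoff is added later. The `TTSRateLimitError` class below is a stand-in (the PR's real constructor is not shown), and `withRateLimitRetry` with its default values is a hypothetical helper, not part of this PR.

```typescript
// Stand-in for the PR's error class; the real constructor signature may differ.
class TTSRateLimitError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'TTSRateLimitError';
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, TTSRateLimitError.prototype);
  }
}

// Retries only on rate-limit errors, doubling the delay after each failed attempt.
async function withRateLimitRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const retryable = error instanceof TTSRateLimitError && attempt < maxAttempts;
      if (!retryable) throw error; // other errors, or the final attempt, propagate
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller would wrap the provider call, for example `withRateLimitRetry(() => generateDoubaoTTS(...))`, so that only rate-limit failures are retried and everything else fails fast.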
Suggestions
- `getTTSProviderName` is duplicated across 3 files (pre-existing, but this PR deepens it). Worth extracting to a shared utility.
- Access Key input field lacks the show/hide toggle that App ID has; minor inconsistency.
Summary
Well-structured PR that follows existing TTS provider patterns. The i18n gap and atob → Buffer.from fix should be addressed before merge; the rest are minor.
- Add TODO note explaining TTSRateLimitError's future use for retry/backoff
- Replace atob() + manual byte loop with Buffer.from() for server-side code
- Trim supportedFormats to ['mp3'] since only mp3 is used in requests
- Add i18n keys for Doubao App ID / Access Key labels (zh-CN + en-US)
- Add show/hide toggle to Access Key input field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cosarah left a comment
Previous review items are all properly addressed in aa6b28b ✓
- `TTSRateLimitError`: TODO comment added, acceptable
- `atob()` → `Buffer.from()`: fixed
- i18n labels: fixed, both locales updated
- `supportedFormats`: trimmed to `['mp3']`
- Bonus: Access Key show/hide toggle added
No new blocking issues. The duplicated getTTSProviderName across 3 files is pre-existing debt worth a follow-up issue.
LGTM — ready to merge.
Summary
Closes #282
Changed files
- `lib/audio/types.ts`: add `doubao-tts` to `TTSProviderId` union
- `lib/audio/constants.ts`
- `lib/audio/tts-providers.ts`: `TTSRateLimitError` + `generateDoubaoTTS()`
- `lib/server/provider-config.ts`: `TTS_DOUBAO` env var mapping
- `lib/store/settings.ts`: `doubao-tts` default config entry
- `lib/i18n/settings.ts`: `doubao-tts` in provider name maps

Test plan
- `pnpm check` / `pnpm lint` / `npx tsc --noEmit` pass
- … `appId:accessKey` and verify voice preview works

🤖 Generated with Claude Code