5 changes: 5 additions & 0 deletions .changeset/elevenlabs-stream-flags.md
@@ -0,0 +1,5 @@
+---
+'@livekit/agents-plugin-elevenlabs': patch
+---
+
+Add missing ElevenLabs streaming request flags for normalization and logging.
13 changes: 11 additions & 2 deletions plugins/elevenlabs/src/tts.ts
@@ -136,8 +136,8 @@ function sampleRateFromFormat(encoding: TTSEncoding): number {
}

function synthesizeUrl(opts: ResolvedTTSOptions): string {
-const { baseURL, voiceId, model, encoding, streamingLatency } = opts;
-let url = `${baseURL}/text-to-speech/${voiceId}/stream?model_id=${model}&output_format=${encoding}`;
+const { baseURL, voiceId, encoding, streamingLatency } = opts;
+let url = `${baseURL}/text-to-speech/${voiceId}/stream?output_format=${encoding}&enable_logging=${String(opts.enableLogging).toLowerCase()}`;
if (streamingLatency !== undefined) {
url += `&optimize_streaming_latency=${streamingLatency}`;
}
@@ -837,6 +837,13 @@ export class ChunkedStream extends tts.ChunkedStream {
const voiceSettings = this.#opts.voiceSettings
? stripUndefined(this.#opts.voiceSettings)
: undefined;
+const extraParams: Record<string, string | boolean> = {};
+if (this.#opts.language) {
+extraParams.language_code = getBaseLanguage(this.#opts.language);
+}
+if (this.#opts.applyLanguageTextNormalization !== undefined) {
+extraParams.apply_language_text_normalization = this.#opts.applyLanguageTextNormalization;
+}

const requestId = shortuuid();
const bstream = new AudioByteStream(this.#opts.sampleRate, 1);
@@ -852,6 +859,8 @@ export class ChunkedStream extends tts.ChunkedStream {
text: this.inputText,
model_id: this.#opts.model,
voice_settings: voiceSettings,
+apply_text_normalization: this.#opts.applyTextNormalization,
+...extraParams,
Comment on lines 859 to +863

🚩 ChunkedStream body params are a subset of what multiStreamUrl sends

The multiStreamUrl function (lines 147-168) sends several parameters that the ChunkedStream request body omits: enable_ssml_parsing, inactivity_timeout, sync_alignment, auto_mode, and pronunciation_dictionary_locators. Some of these are WebSocket-specific (e.g. inactivity_timeout), but others, such as enable_ssml_parsing and pronunciation_dictionary_locators, may also apply to the REST endpoint. This is pre-existing behavior, not introduced by this PR, but it is worth noting because the PR's stated intent is to add "missing" flags: the chunked stream path may still be missing others.

(Refers to lines 858-863)
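
A minimal sketch of one way the chunked-stream body could carry further optional flags without sending unset ones. `ChunkedBodyOptions`, `omitUndefined`, and `buildChunkedBody` are hypothetical names for illustration, not the plugin's actual helpers, and whether the REST endpoint accepts each field (in particular `pronunciation_dictionary_locators`) is an assumption that should be checked against the ElevenLabs API reference.

```typescript
// Hypothetical option shape; the field names mirror the flags discussed above.
interface ChunkedBodyOptions {
  text: string;
  model: string;
  applyTextNormalization?: string;
  applyLanguageTextNormalization?: boolean;
  languageCode?: string;
  // Assumption: locators are passed through as opaque objects.
  pronunciationDictionaryLocators?: ReadonlyArray<Record<string, string>>;
}

// Drop keys whose value is undefined so optional flags are only sent when set.
function omitUndefined(obj: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(obj).filter(([, value]) => value !== undefined),
  );
}

// Build the JSON body for the REST request from the resolved options.
function buildChunkedBody(opts: ChunkedBodyOptions): Record<string, unknown> {
  return omitUndefined({
    text: opts.text,
    model_id: opts.model,
    apply_text_normalization: opts.applyTextNormalization,
    apply_language_text_normalization: opts.applyLanguageTextNormalization,
    language_code: opts.languageCode,
    pronunciation_dictionary_locators: opts.pronunciationDictionaryLocators,
  });
}
```

Filtering on `undefined` rather than on truthiness keeps an explicit `false` (for example `applyLanguageTextNormalization: false`) in the body, which matches the `!== undefined` check in the diff.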

}),
signal: this.abortSignal,
});
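
For the synthesizeUrl change in the first hunk, a hedged sketch of the same query string built with `URLSearchParams`, which percent-encodes values. `UrlOptions` and `buildSynthesizeUrl` are hypothetical names, and the option fields are only the subset visible in the diff; this is an illustration, not the plugin's implementation.

```typescript
// Hypothetical subset of the resolved options used by synthesizeUrl in the diff.
interface UrlOptions {
  baseURL: string;
  voiceId: string;
  encoding: string;
  enableLogging: boolean;
  streamingLatency?: number;
}

function buildSynthesizeUrl(opts: UrlOptions): string {
  const params = new URLSearchParams({
    output_format: opts.encoding,
    // String(true) / String(false) already yield lowercase "true" / "false".
    enable_logging: String(opts.enableLogging),
  });
  // Only append the latency hint when the caller set it.
  if (opts.streamingLatency !== undefined) {
    params.set("optimize_streaming_latency", String(opts.streamingLatency));
  }
  const voice = encodeURIComponent(opts.voiceId);
  return `${opts.baseURL}/text-to-speech/${voice}/stream?${params.toString()}`;
}
```

Because `enableLogging` is typed as a required boolean here, the sketch cannot emit `enable_logging=undefined`; if the real option can be unset, a default would need to be applied first.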