feat(elevenlabs): add noVerbatim STT option #1769
rosetta-livekit-bot[bot] wants to merge 1 commit into
🦋 Changeset detected
Latest commit: 4655ac8
The changes in this PR will be included in the next version bump. This PR includes changesets to release 34 packages.
    if (this.#opts.noVerbatim) {
      params.push('no_verbatim=true');
    }
🚩 keyterms not propagated to streaming WebSocket — pre-existing asymmetry
The keyterms option is only used in batch mode (#recognizeImpl, lines 348-351) and is NOT added to the WebSocket URL params in #connectWs. In contrast, noVerbatim is correctly sent in both batch and streaming modes. If ElevenLabs' realtime API supports keyterms, this is a pre-existing gap unrelated to this PR. The updateOptions method also does not propagate keyterms changes to streams, which is consistent with keyterms being batch-only.
Summary
Testing
Notes
Ported from livekit/agents#6032
Original PR description
Summary
Exposes ElevenLabs' no_verbatim speech-to-text option in the plugin. When enabled, the model removes filler words, false starts and disfluencies from the transcript, producing cleaner output.

Both Scribe v2 (batch) and Scribe v2 realtime support this flag (it's a documented STT parameter: a form field for batch and a websocket query parameter for realtime), but the plugin previously had no way to set it.
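The two transports described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `SketchSTTOptions` interface, the helper names `buildRealtimeParams` and `buildBatchFields`, and the `languageCode` / `language_code` field are assumptions introduced only for the example. Only the `noVerbatim` option and the `no_verbatim=true` query parameter come from this PR.

```typescript
// Hypothetical option shape for illustration; only `noVerbatim` is taken from the PR.
interface SketchSTTOptions {
  languageCode?: string; // assumed extra option, shown only to give the flag some company
  noVerbatim: boolean;
}

// Streaming: the flag is appended to the websocket query parameters when enabled.
function buildRealtimeParams(opts: SketchSTTOptions): string[] {
  const params: string[] = [];
  if (opts.languageCode) {
    params.push(`language_code=${encodeURIComponent(opts.languageCode)}`);
  }
  if (opts.noVerbatim) {
    params.push('no_verbatim=true');
  }
  return params;
}

// Batch: the flag is added as a form field when enabled.
function buildBatchFields(opts: SketchSTTOptions): Record<string, string> {
  const fields: Record<string, string> = {};
  if (opts.languageCode) {
    fields['language_code'] = opts.languageCode;
  }
  if (opts.noVerbatim) {
    fields['no_verbatim'] = 'true';
  }
  return fields;
}
```

In both helpers the flag is omitted entirely when it is off, which matches the stated goal that the default leaves existing requests unchanged.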
Changes
- Add no_verbatim: bool = False to STT.__init__ and STTOptions (documented).
- Batch (_recognize_impl): add the no_verbatim form field when enabled.
- Streaming (_connect_ws): append no_verbatim=true to the websocket query params when enabled.
- update_options: allow toggling no_verbatim at runtime.
- Tests (tests/test_plugin_elevenlabs_stt.py) covering default, enabling, and update_options.

Why
Spontaneous speech transcripts carry heavy disfluency ("eh", "mmm", false starts) that degrades downstream LLM consumers (e.g. classification/scoring over the transcript). no_verbatim lets the STT clean this at the source. Default remains False, so behavior is unchanged unless opted in.

Notes
- ruff check and ruff format pass (per CONTRIBUTING).
- Default is False → no behavior change for existing users.

🤖 Generated with Claude Code