Describe the bug
The Silero adapter cuts off the prefixPaddingSamples before calling _recognize when not streaming.
This causes the following:
- We pass prefixPaddingDuration as 500ms so that we receive the audio right before the VAD fires
- Right before _recognize gets called in vad.ts, the adapter cuts off the prefix.
The culprit is line 306 on 77a8355: this.#speechBuffer.subarray(this.#prefixPaddingSamples, speechBufferIndex),
which should be this.#speechBuffer.subarray(0, speechBufferIndex) instead.
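To illustrate the difference between the two slices, here is a minimal standalone sketch. It does not use the adapter's code: the 16 kHz sample rate, the one second of speech, and the local variables standing in for the private fields #speechBuffer and #prefixPaddingSamples are all assumptions made for the example.

```typescript
// Hypothetical stand-ins for the adapter's private state (assumptions, not the real fields).
const sampleRate = 16000; // assumed sample rate in Hz
const prefixPaddingDurationMs = 500;
const prefixPaddingSamples = Math.floor((prefixPaddingDurationMs / 1000) * sampleRate);

// Build a buffer laid out as [prefix padding | speech], marking each region
// with a distinct value so the slices are easy to tell apart.
const speechSamples = sampleRate; // one second of "speech", chosen arbitrarily
const speechBuffer = new Int16Array(prefixPaddingSamples + speechSamples);
speechBuffer.fill(1, 0, prefixPaddingSamples); // 1 = audio before the VAD fired
speechBuffer.fill(2, prefixPaddingSamples); // 2 = audio after the VAD fired
const speechBufferIndex = speechBuffer.length;

// Slice as described in the report: starts after the prefix, so the padding is dropped.
const currentSlice = speechBuffer.subarray(prefixPaddingSamples, speechBufferIndex);

// Proposed slice: starts at 0, so the padding is kept.
const proposedSlice = speechBuffer.subarray(0, speechBufferIndex);

console.log(currentSlice.length, proposedSlice.length);
```

Under these assumptions the proposed slice is longer than the current one by exactly prefixPaddingSamples, which is the 500ms of audio that never reaches _recognize.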
Relevant log output
No response
Describe your environment
We're running agents framework 1.2.7 and using the Silero adapter version 1.2.7, but we verified the same issue exists on the latest version of Silero.
Minimal reproducible example
- Set up the most basic LiveKit pipeline
- Create a custom STT handler by extending the stt.STT class
- Implement the _recognize method
- Pass prefixPaddingDuration: 500 to the Silero settings
- The audio buffer passed to _recognize does not contain the 500ms of audio before the VAD fires
Additional information
No response