minEndpointingDelay is nullified in VAD turn-detection mode (swallowed by Silero minSilenceDuration), breaking multi-segment turn grouping #1741

### Summary
In `turnDetection: 'vad'` mode, `minEndpointingDelay` has effectively **no effect** when it is ≤ the VAD's `minSilenceDuration`. The end-of-utterance grouping window collapses to ~0, so the turn commits the instant `END_OF_SPEECH` fires. This is the same root cause as Python issue https://github.com/livekit/agents/issues/4325 (closed without a code fix), and it is still present in the latest `@livekit/agents@1.4.5`.

### Root cause
`agents/src/voice/audio_recognition.ts`, in `bounceEOUTask` (compiled `dist/voice/audio_recognition.js`, 1.4.5 line ~818):

```js
let extraSleep = endpointingDelay;            // = endpointing.minDelay (default 500ms)
if (lastSpeakingTime !== void 0) {
  extraSleep += lastSpeakingTime - Date.now(); // subtracts silence already elapsed
}
if (extraSleep > 0) {
  await delay(Math.max(extraSleep, 0), { signal: controller.signal });
}
```

`lastSpeakingTime` is stamped on `INFERENCE_DONE` (≈ when the user stops speaking), but `bounceEOUTask` only runs at `END_OF_SPEECH`, which Silero emits **`minSilenceDuration` (~550ms) later**. At that point `lastSpeakingTime - Date.now() ≈ -550ms`, so `extraSleep ≈ minDelay - minSilenceDuration`. With the defaults (`minDelay=500`, `minSilenceDuration=550`) that is **negative → no wait → immediate commit**. Measured from the end of speech, the effective delay is ≈ `max(minSilenceDuration, minDelay)`, so `minDelay` is silently ignored unless it exceeds `minSilenceDuration`.
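To make the arithmetic concrete, here is a minimal sketch of the calculation. The helper names are hypothetical (they are not part of `@livekit/agents`), and it assumes `END_OF_SPEECH` fires exactly `minSilenceDuration` after the last speaking timestamp.

```js
// Hypothetical helpers for illustration only; all values are in milliseconds.

// At END_OF_SPEECH, lastSpeakingTime - Date.now() is roughly -minSilenceDuration,
// so the snippet above computes approximately this value.
function extraSleepAtEndOfSpeech(minDelay, minSilenceDuration) {
  return minDelay - minSilenceDuration;
}

// Total silence between the user stopping and the turn being committed.
function effectiveDelayFromSpeechEnd(minDelay, minSilenceDuration) {
  const wait = Math.max(extraSleepAtEndOfSpeech(minDelay, minSilenceDuration), 0);
  return minSilenceDuration + wait; // same as max(minSilenceDuration, minDelay)
}

console.log(extraSleepAtEndOfSpeech(500, 550));      // -50  (no wait)
console.log(effectiveDelayFromSpeechEnd(500, 550));  // 550  (minDelay ignored)
console.log(effectiveDelayFromSpeechEnd(1000, 550)); // 1000 (only 450ms after END_OF_SPEECH)
```

Under these assumptions, getting a grouping window of `W` ms after `END_OF_SPEECH` with the current code would require setting `minDelay` to roughly `minSilenceDuration + W`.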

### Why it matters (worse than latency in realtime/manual-activity mode)
With a realtime model using manual activity detection (e.g. `@livekit/agents-plugin-google` with `automaticActivityDetection.disabled`), the missing grouping window means a **natural mid-sentence pause** ("No, that's okay. … just use Alex") splits into two VAD segments:
1. Segment 1 commits a turn immediately (generation starts).
2. Segment 2 begins while segment 1 is still generating, so a second `userTurnCompleted`/`generateReply` never fires for it.
3. The activity window opened for segment 2 is never closed → the model waits indefinitely → **the agent never responds (dead call).**

Reproduced consistently in low-concurrency local runs: any caller utterance containing a ~1s pause stalls the turn.
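The split can be illustrated with a simplified timeline model. This is a sketch, not the library's logic: `countCommittedTurns` and its parameters are hypothetical, and it assumes `START_OF_SPEECH` fires exactly when speech resumes (ignoring VAD onset latency).

```js
// Hypothetical model: two speech segments separated by pauseMs of silence.
// Time 0 is the moment the user stops speaking after segment 1.
function countCommittedTurns({ pauseMs, minSilenceDuration, minDelay, measureFromEndOfSpeech }) {
  // END_OF_SPEECH for segment 1 fires minSilenceDuration after the user stops.
  const endOfSpeechAt = minSilenceDuration;
  if (pauseMs <= endOfSpeechAt) {
    return 1; // pause too short for the VAD to split the utterance
  }
  // Current behavior subtracts the silence already elapsed; the proposed
  // behavior waits the full minDelay after END_OF_SPEECH.
  const wait = measureFromEndOfSpeech
    ? minDelay
    : Math.max(minDelay - minSilenceDuration, 0);
  const commitAt = endOfSpeechAt + wait;
  // If speech resumes before the commit, START_OF_SPEECH cancels it.
  return pauseMs < commitAt ? 1 : 2;
}

const base = { pauseMs: 1000, minSilenceDuration: 550, minDelay: 500 };
console.log(countCommittedTurns({ ...base, measureFromEndOfSpeech: false })); // 2
console.log(countCommittedTurns({ ...base, measureFromEndOfSpeech: true }));  // 1
```

With the settings from this report, a ~1s pause yields two committed turns today, and a single grouped turn if the delay is measured from `END_OF_SPEECH`.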

### Expected
`minDelay` should provide a real grouping window *after* `END_OF_SPEECH` (so `START_OF_SPEECH` can cancel the pending commit), independent of how long silence detection took.

### Suggested fix
Per #4325's proposed solution #2: in VAD-based turn detection, measure the endpointing delay from `END_OF_SPEECH` rather than from `lastSpeakingTime` — e.g. skip the `lastSpeakingTime - Date.now()` adjustment when `vadBaseTurnDetection` is true. STT mode (where the adjustment compensates for transcription latency) is unaffected.
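A minimal sketch of what that change could look like, written as a standalone function so it can be tested in isolation. `computeExtraSleep` and the `now` parameter are hypothetical; only `endpointingDelay`, `lastSpeakingTime`, and `vadBaseTurnDetection` come from the existing code, and the real patch would live inside `bounceEOUTask`.

```js
// Hypothetical extraction of the delay calculation; values are in milliseconds.
function computeExtraSleep({ endpointingDelay, lastSpeakingTime, now, vadBaseTurnDetection }) {
  let extraSleep = endpointingDelay;
  // Only compensate for elapsed silence outside VAD-based turn detection,
  // where the adjustment offsets transcription latency.
  if (!vadBaseTurnDetection && lastSpeakingTime !== undefined) {
    extraSleep += lastSpeakingTime - now;
  }
  return Math.max(extraSleep, 0);
}

const args = { endpointingDelay: 500, lastSpeakingTime: 9450, now: 10000 };
console.log(computeExtraSleep({ ...args, vadBaseTurnDetection: true }));  // 500
console.log(computeExtraSleep({ ...args, vadBaseTurnDetection: false })); // 0
```

In VAD mode the full `minDelay` is then available after `END_OF_SPEECH`, while the non-VAD path keeps its current behavior.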

### Environment
- `@livekit/agents` 1.3.4 (verified identical in 1.4.5)
- `@livekit/agents-plugin-silero` 1.3.4, `@livekit/agents-plugin-google` 1.3.4 (realtime, manual activity)
- `turnHandling: { turnDetection: 'vad', endpointing: { minDelay: 500, maxDelay: 1500 } }`, Silero `minSilenceDuration` 550ms
- Node.js, Linux/macOS

Related: #926 (unnecessary delay in manual mode — opposite direction).

