feat(speaker): same-voice consolidation to cut over-segmented speakers #1140

Draft
r3dbars wants to merge 2 commits into main from feat/embedding-clusterer-same-voice-consolidation

@r3dbars r3dbars commented Jun 16, 2026

Owner

Resurrected from uncommitted WIP that was sitting in the local main checkout (staged but never committed). Opened as a draft for review.

What it does

Adds a third EmbeddingClusterer post-process pass — same-voice consolidation — that fixes over-segmentation: one real voice (often a remote participant under offline VBx clustering) split across several clusters that each accumulate enough speech to survive absorbSmallClusters. The symptom is a one-on-one call surfacing 4–7 "speakers" to name for a single person.

  • New consolidateSameVoiceClusters() agglomeratively merges clusters whose mean embeddings clear the SpeakerNamingPolicy auto-accept bar (0.88), recomputing centroids after each merge so the combined centroid must still clear the bar before the next cluster joins — avoiding the transitive A≈B, B≈C → A+B+C collapse that made broad pairwise merging unsafe on VBx output.
  • Wired into postProcess via an optional consolidationThreshold (default 0.88, pass nil to skip).
  • Centroids are computed from quality-filtered embeddings (qualityScore ≥ 0.3, duration ≥ 1.0), with an all-sample fallback when nothing passes the filter, so every cluster has a centroid.
  • Test coverage added in EmbeddingClustererTests.
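
A minimal sketch of the merge loop described in the bullets above, not the PR's actual code. It assumes cosine similarity between mean embeddings, a best-pair-first merge order, and a `>=` comparison against the threshold; this description confirms none of those details. `Sample`, `Cluster`, `centroid(of:)`, and `consolidate(_:threshold:)` are hypothetical names used only for illustration.

```swift
import Foundation

// Hypothetical stand-ins; the real EmbeddingClusterer types are not shown in this PR text.
struct Sample {
    let embedding: [Double]
    let qualityScore: Double
    let duration: Double
}

struct Cluster {
    var samples: [Sample]
}

/// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
/// (Assumed metric: the description only says embeddings must "clear the bar".)
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).reduce(0.0) { $0 + $1.0 * $1.1 }
    let normA = a.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(0.0) { $0 + $1 * $1 }.squareRoot()
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

/// Mean embedding over quality-filtered samples, falling back to all samples
/// when none pass the filter, so every non-empty cluster has a centroid.
func centroid(of cluster: Cluster) -> [Double] {
    let good = cluster.samples.filter { $0.qualityScore >= 0.3 && $0.duration >= 1.0 }
    let used = good.isEmpty ? cluster.samples : good
    guard let dim = used.first?.embedding.count else { return [] }
    var sum = [Double](repeating: 0, count: dim)
    for sample in used {
        for i in 0..<dim { sum[i] += sample.embedding[i] }
    }
    return sum.map { $0 / Double(used.count) }
}

/// Repeatedly merges the most similar pair of clusters whose centroids meet the
/// threshold. Centroids are recomputed on every pass, so a merged cluster must
/// itself clear the bar before another cluster can join it.
func consolidate(_ clusters: [Cluster], threshold: Double?) -> [Cluster] {
    guard let threshold = threshold else { return clusters }  // nil skips the pass
    var result = clusters
    while result.count > 1 {
        let centroids = result.map(centroid(of:))
        var best: (i: Int, j: Int, sim: Double)? = nil
        for i in 0..<result.count {
            for j in (i + 1)..<result.count {
                let sim = cosineSimilarity(centroids[i], centroids[j])
                if sim >= threshold && sim > (best?.sim ?? -Double.infinity) {
                    best = (i, j, sim)
                }
            }
        }
        guard let pair = best else { break }  // no remaining pair clears the bar
        result[pair.i].samples += result[pair.j].samples
        result.remove(at: pair.j)
    }
    return result
}

// Usage: three single-sample clusters at 0°, 25°, and 52° on the unit circle.
// A≈B (cos 25° ≈ 0.906) and B≈C (cos 27° ≈ 0.891) both clear 0.88, but A vs C does not.
func unit(_ degrees: Double) -> Sample {
    let r = degrees * Double.pi / 180
    return Sample(embedding: [cos(r), sin(r)], qualityScore: 0.9, duration: 2.0)
}

let clusters = [0.0, 25.0, 52.0].map { Cluster(samples: [unit($0)]) }
let merged = consolidate(clusters, threshold: 0.88)
print(merged.count)  // 2
```

In this toy input, A and B merge first. Their combined centroid sits at roughly 12.5°, about 39.5° away from C (cosine ≈ 0.77), so C stays separate. A purely pairwise transitive merge would have collapsed all three into one speaker.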

2 files, +233/-4. Branches cleanly off current origin/main.

⚠️ Overlaps open PR #1114

#1114 ("Speaker-naming simulator + same-voice consolidation to cut speakers-to-name") tackles the same problem with a different implementation of EmbeddingClusterer. These two are not identical and will conflict. Decide which approach wins (or reconcile) before merging either — don't land both blindly.

🤖 Generated with Claude Code

Codex CI fix / review (2026-06-16)

  • Pushed commit 886d6e28 to fix the stale simulator negative-control fixture that failed Swift CI.
  • Reproduced the failing SpeakerNamingSimulationRunnerTests/testSimulationReportFlagsConfusionFalseMergeAndFalseSplit locally after bash build-deps.sh --force; it now passes.
  • Mapped local checks passed: bash build-deps.sh --force, bash build.sh --no-open, bash run-tests.sh, bash run-integration-smoke.sh, swift test, plus focused SpeakerNamingSimulationRunnerTests and EmbeddingClustererTests.
  • ~/.codex/skills/codex-review/scripts/codex-review --mode branch reported no accepted/actionable findings.
  • Do not merge as-is: #1114 ("Speaker-naming simulator + same-voice consolidation to cut speakers-to-name") has since merged the same consolidation surface, and this PR is now DIRTY against current origin/main, with conflicts in EmbeddingClusterer.swift and EmbeddingClustererTests.swift. #1144 ("eval: multi-corpus speaker-naming harness + AMI scale-up") only scales up the eval harness; it does not duplicate app behavior.

r3dbars and others added 2 commits June 16, 2026 05:44
Adds a third EmbeddingClusterer post-process pass that agglomeratively
merges large clusters whose mean embeddings clear the SpeakerNamingPolicy
auto-accept bar (0.88), recomputing centroids after each merge so distinct
speakers in crowded meetings don't chain-collapse. Fixes one remote voice
surfacing as 4-7 "speakers" to name on a one-on-one call.

- New consolidateSameVoiceClusters() wired into postProcess via optional
  consolidationThreshold (default 0.88; nil to skip)
- Quality-filtered embeddings with all-sample fallback per cluster
- Test coverage in EmbeddingClustererTests

Resurrected from uncommitted WIP in the local checkout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>