SentenceBoundaryDetector: O(n) heuristic without indexOf #159

@ysdede

Location: src/lib/transcription/SentenceBoundaryDetector.ts (around lines 287-301)

Finding: In detectSentenceEndingsHeuristic, the code calls words.indexOf(word) inside a .map() callback. Each word that matches the sentence-ending pattern triggers a linear search of words, so the cost is O(n·k) for k matching words, which is O(n^2) in the worst case where most words end a sentence.

Suggested fix: Preserve the original index by iterating with the index from the source array. For example: use words.reduce, words.forEach, or words.flatMap with a (word, idx) callback; when /[.?!]$/.test(word.text) is true, push a SentenceEndingWord with wordIndex set to idx and sentenceMetadata set as before. Return a SentenceEndingWord[] built in O(n) without using indexOf.
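A minimal sketch of the suggested fix, using forEach. The Word and SentenceEndingWord shapes here are hypothetical stand-ins, since the real types in SentenceBoundaryDetector.ts are not shown in this issue, and sentenceMetadata is left as an empty placeholder where the real code would set it "as before".

```typescript
// Hypothetical shapes: the real types live in SentenceBoundaryDetector.ts
// and may carry more fields.
interface Word {
  text: string;
}

interface SentenceEndingWord extends Word {
  wordIndex: number;
  // Placeholder: the issue only says this is "set as before".
  sentenceMetadata: Record<string, unknown>;
}

function detectSentenceEndingsHeuristic(words: Word[]): SentenceEndingWord[] {
  const endings: SentenceEndingWord[] = [];
  // forEach passes the index into the original array, so no indexOf lookup
  // is needed and the whole pass is a single O(n) scan.
  words.forEach((word, idx) => {
    if (/[.?!]$/.test(word.text)) {
      endings.push({ ...word, wordIndex: idx, sentenceMetadata: {} });
    }
  });
  return endings;
}
```

With this shape, wordIndex always refers to the position in the source array, which matches what indexOf returned for arrays of distinct word objects.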

Verification: Current implementation at 288-300: .filter(...).map((word, _idx) => { const wordIndex = words.indexOf(word); ... }).
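Note that the _idx argument in the current .map() callback cannot simply replace the indexOf call: after .filter(), it counts positions in the filtered array, not in words. The following sketch illustrates the difference with a hypothetical Word shape and sample data. It is not the project's code.

```typescript
// Hypothetical shape and sample data, for illustration only.
interface Word {
  text: string;
}

const words: Word[] = [
  { text: "Hi" },
  { text: "there." },
  { text: "How" },
  { text: "are" },
  { text: "you?" },
];

const isSentenceEnd = (w: Word): boolean => /[.?!]$/.test(w.text);

// Index seen by map() after filter(): positions in the filtered array.
const filteredPositions = words.filter(isSentenceEnd).map((_word, idx) => idx);
// → [0, 1]

// Index captured before filtering: positions in the original array.
const originalPositions = words
  .map((word, idx) => ({ word, idx }))
  .filter((entry) => isSentenceEnd(entry.word))
  .map((entry) => entry.idx);
// → [1, 4]
```

Capturing the index before filtering is what lets the O(n) version drop indexOf while still reporting positions in the original array.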

Labels: low-hanging-fruit (Small scoped, fast-to-ship improvements)
