Background
parseData in js/hyperaudio-lite-editor-deepgram.js builds the editor's transcript HTML from a Deepgram response. It currently has a few fragile spots that bit a downstream pipeline using the same logic; the same issues apply in HLE.
What's fragile today
- transcript.split(' ') alignment with the words array. Word text comes from punctuatedWords[index], where punctuatedWords = alt.transcript.split(' '). That holds for clean transcripts, but smart_format=true can break the 1:1 alignment on formatted numbers, hyphenations, etc., and words then silently shift.
- No detection of missing diarization. If a request goes out without diarize=true, every word's speaker is undefined and the output gets [speaker-undefined] markup throughout.
- .toFixed(2) * 1000 for ms conversion. Mildly buggy (toFixed returns a string, which the multiplication then coerces back to a number, and the value has already been rounded to 10 ms by that point) and uglier than Math.round(s * 1000).
- Sentence-end check reads the split-transcript array (punctuatedWords[index - 1]), inheriting the same alignment fragility.
- No early guard for empty responses. A silent/truncated response fails cryptically deep in the loop instead of with a clear message.
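The first three problems can be reproduced without a network call. The sketch below uses a hand-written object shaped like one Deepgram alternative. The sample values are assumptions made up for illustration, not captured API output, and the transcript was deliberately written with a space inside one formatted token so that the token count and the word count differ.

```javascript
// Hypothetical, hand-written data shaped like one Deepgram alternative.
// The values are illustrative assumptions, not real API output.
const alt = {
  transcript: "Call 555 0100 now.",
  words: [
    { word: "call", punctuated_word: "Call", start: 0.0, end: 0.31 },
    { word: "5550100", punctuated_word: "555 0100", start: 0.4, end: 1.2378 },
    { word: "now", punctuated_word: "now.", start: 1.3, end: 1.6 },
  ],
};

// Current approach: index into the space-split transcript.
const punctuatedWords = alt.transcript.split(" ");
const bySplit = alt.words.map((w, i) => punctuatedWords[i]);

// Proposed approach: read the text from the word object itself.
const byWord = alt.words.map((w) => w.punctuated_word || w.word);

console.log(punctuatedWords.length, alt.words.length); // 4 tokens vs 3 words
console.log(bySplit); // the third word is paired with "0100"; "now." is never used
console.log(byWord);

// Without diarization there is no speaker field, so the label reads "undefined".
const label = `[speaker-${alt.words[0].speaker}]`;
console.log(label);

// toFixed returns a string and rounds to two decimals (10 ms) before the multiply.
const viaToFixed = (1.2378).toFixed(2);
const viaRound = Math.round(1.2378 * 1000);
console.log(typeof viaToFixed, viaToFixed * 1000, viaRound);
```

Once the two arrays differ in length, every word after the first mismatch is paired with the wrong text, and nothing in the loop reports it.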
Proposed changes
- Read each word's text from element.punctuated_word || element.word directly. Drop the transcript.split(' ') array. Word data is canonical.
- Add const showDiarization = wordData.some(w => w.speaker !== undefined); and gate speaker markup on that. No diarization → clean transcript with no speaker labels, instead of [speaker-undefined].
- Add a small ms = s => Math.round(s * 1000) helper and use it everywhere ms appears in attributes. Replaces every .toFixed(2) * 1000.
- Sentence-end check reads wordText(wordData[index - 1]) (using the same helper) so it can never go out of sync.
- Early throw if alt.words is missing or empty, with a clear message identifying the failure.
Reference implementation
A hardened version of the loop body (slightly adapted from an n8n node that does the same work):
const alt = dg?.results?.channels?.[0]?.alternatives?.[0];
if (!alt || !Array.isArray(alt.words) || alt.words.length === 0) {
  throw new Error(`No transcribed words in Deepgram response`);
}

const wordData = alt.words;
const maxWordsInPara = 100;
const significantGapInSeconds = 4.0;
const speakerReassignGap = 0.3;

const ms = (s) => Math.round(s * 1000);
const wordText = (w) => (w.punctuated_word || w.word || '');
const showDiarization = wordData.some((w) => w.speaker !== undefined);

// Diarization edge-case fix (unchanged from current code)
for (let i = 1; i < wordData.length - 1; i++) {
  const prev = wordData[i - 1];
  const cur = wordData[i];
  const next = wordData[i + 1];
  if (cur.speaker !== prev.speaker && next.speaker === cur.speaker) {
    const gapBefore = cur.start - prev.end;
    const gapAfter = next.start - cur.end;
    if (gapBefore < speakerReassignGap && gapAfter > speakerReassignGap) {
      cur.speaker = prev.speaker;
    }
  }
}

let hyperTranscript = "<article>\n <section>\n <p>\n ";
let previousElementEnd = 0;
let wordsInPara = 0;

wordData.forEach((element, index) => {
  const currentWord = wordText(element);
  wordsInPara++;

  // Paragraph break on a long pause or an over-long paragraph, but only at a sentence end
  if ((previousElementEnd !== 0 && (element.start - previousElementEnd) > significantGapInSeconds) || wordsInPara > maxWordsInPara) {
    const previousWord = wordText(wordData[index - 1]);
    const lastChar = previousWord.charAt(previousWord.length - 1);
    if (lastChar === '.' || lastChar === '?' || lastChar === '!') {
      hyperTranscript += "\n </p>\n <p>\n ";
      wordsInPara = 0;
    }
  }

  // Paragraph break on speaker change (only when diarization data is present)
  if (showDiarization && index > 0 && element.speaker !== wordData[index - 1].speaker) {
    hyperTranscript += "\n </p>\n <p>\n ";
    wordsInPara = 0;
  }

  // Speaker label at the start of each speaker run
  if (showDiarization && (index === 0 || element.speaker !== wordData[index - 1].speaker)) {
    hyperTranscript += `<span class="speaker" data-m='${ms(element.start)}' data-d='0'>[speaker-${element.speaker}] </span>`;
  }

  hyperTranscript += `<span data-m='${ms(element.start)}' data-d='${ms(element.end - element.start)}'>${currentWord} </span>`;
  previousElementEnd = element.end;
});

hyperTranscript += "\n </p> \n </section>\n</article>\n ";
hyperTranscript = hyperTranscript.replace(/<p>\s*<\/p>\s*/g, '');
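This is structurally identical to today's parseData body (same paragraph-break rules, same diarization edge-case fix, same empty-<p> cleanup), so output is unchanged on the cases that already work. The one visible difference is that timing attributes may shift by a few milliseconds where a timestamp has more than two decimal places, because values are now rounded to the nearest millisecond instead of to the nearest 10 ms. Beyond that, it just no longer relies on transcript.split(' '), no longer emits [speaker-undefined] when diarization is missing, uses a clean ms() helper, and fails loudly on empty responses.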
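The diarization edge-case fix is easier to reason about when run in isolation. The sketch below lifts that loop into a standalone function; the name reassignStraySpeakers and the sample words are hypothetical and exist only for this illustration.

```javascript
// Hypothetical wrapper around the diarization edge-case loop.
// A word that opens a new speaker run is handed back to the previous speaker
// when the pause before it is shorter than `gap` and the pause after it is
// longer than `gap`. The array is mutated in place and also returned.
function reassignStraySpeakers(words, gap = 0.3) {
  for (let i = 1; i < words.length - 1; i++) {
    const prev = words[i - 1];
    const cur = words[i];
    const next = words[i + 1];
    if (cur.speaker !== prev.speaker && next.speaker === cur.speaker) {
      const gapBefore = cur.start - prev.end;
      const gapAfter = next.start - cur.end;
      if (gapBefore < gap && gapAfter > gap) {
        cur.speaker = prev.speaker;
      }
    }
  }
  return words;
}

// Made-up sample: "right" follows speaker 0 closely, then a 0.7 s pause.
const sample = [
  { word: "that's", start: 0.0, end: 0.4, speaker: 0 },
  { word: "all", start: 0.5, end: 0.9, speaker: 0 },
  { word: "right", start: 1.0, end: 1.3, speaker: 1 },
  { word: "so", start: 2.0, end: 2.3, speaker: 1 },
  { word: "next", start: 2.4, end: 2.8, speaker: 1 },
];

const speakers = reassignStraySpeakers(sample).map((w) => w.speaker);
console.log(speakers); // [0, 0, 0, 1, 1]
```

Because the pass mutates the word objects, it has to run before the main loop so that the paragraph-break and speaker-label checks see the corrected speaker values.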
Acceptance criteria
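- parseData in js/hyperaudio-lite-editor-deepgram.js no longer reads alt.transcript for per-word text; it uses punctuated_word / word directly.
- The punctuatedWords array is removed from parseData.
- No speaker markup is emitted when no word has a speaker field.
- All (...).toFixed(2) * 1000 ms conversions in parseData are replaced with Math.round(seconds * 1000).
- A missing or empty alt.words throws a clear error before the loop runs.
- Existing behavior is preserved (paragraph-break rules, the diarization edge-case fix, and the empty-<p> cleanup all still work).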
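Several of these criteria can be checked on the generated HTML alone. The sketch below is one possible way to do that; checkTranscriptHtml is a hypothetical helper, not part of HLE, and the two sample strings are hand-written rather than real parseData output.

```javascript
// Hypothetical helper: returns a list of problems found in transcript HTML.
function checkTranscriptHtml(html) {
  const problems = [];
  if (html.includes("speaker-undefined")) {
    problems.push("speaker-undefined label present");
  }
  if (/<p>\s*<\/p>/.test(html)) {
    problems.push("empty <p> present");
  }
  // data-m and data-d should be non-negative whole milliseconds.
  for (const match of html.matchAll(/data-[md]='([^']*)'/g)) {
    if (!/^\d+$/.test(match[1])) {
      problems.push(`non-integer timing value: ${match[1]}`);
    }
  }
  return problems;
}

// Hand-written samples for illustration only.
const good =
  "<article><section><p><span data-m='0' data-d='310'>Call </span></p></section></article>";
const bad =
  "<p></p><p><span class=\"speaker\" data-m='NaN' data-d='0'>[speaker-undefined] </span></p>";

console.log(checkTranscriptHtml(good)); // []
console.log(checkTranscriptHtml(bad).length); // 3
```

The empty-response criterion is not covered here, since it concerns an exception thrown before any HTML exists and needs a call into parseData itself.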