feat(capi): segment-timestamp support (frame_sec + streaming JSON, ABI v4) #16
Open
localai-bot wants to merge 1 commit into
…SON)
Add the data LocalAI needs to build NeMo-faithful segment timestamps:
- Offline JSON (transcribe_*_json) now carries "frame_sec", the encoder
frame stride in seconds, so a consumer can convert NeMo's frame-unit
segment_gap_threshold into the seconds gap between words.
- New streaming JSON entry points parakeet_capi_stream_feed_json /
parakeet_capi_stream_finalize_json return {text, eou, frame_sec, words}
by surfacing the streaming session's existing drain_words() per-word
start/end/conf alongside the newly-finalized text and EOU flag.
Bumps PARAKEET_CAPI_ABI_VERSION to 4. All existing entry points are
unchanged; the new symbols are additive (consumers probe for them).
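A consumer that loads the library at runtime could probe for the additive symbols roughly as follows. This is a minimal sketch assuming a POSIX platform with dlopen/dlsym; StreamJsonSupport, has_symbol and probe_stream_json are hypothetical helper names, not part of the C API. Because the signatures of the new entry points are not shown in this message, the sketch only checks whether the symbols are present and never calls them.

```cpp
#include <dlfcn.h>

// Hypothetical summary of which streaming JSON entry points a loaded
// library exposes. Not part of the parakeet C API.
struct StreamJsonSupport {
    bool feed_json;
    bool finalize_json;
};

// True when `name` resolves to a non-null address in `handle`.
static bool has_symbol(void* handle, const char* name) {
    return dlsym(handle, name) != nullptr;
}

// Probe a handle returned by dlopen() for the two symbols named in this
// commit. An older library simply reports false for both.
StreamJsonSupport probe_stream_json(void* handle) {
    return StreamJsonSupport{
        has_symbol(handle, "parakeet_capi_stream_feed_json"),
        has_symbol(handle, "parakeet_capi_stream_finalize_json"),
    };
}
```

A caller would fall back to the existing text-only streaming entry points when either flag is false.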
tests/test_capi_stream_json.cpp drives the new streaming JSON path on the
EOU model (skips with 77 when PARAKEET_TEST_GGUF_EOU is unset, like the
sibling streaming tests).
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Consumed by LocalAI PR: mudler/LocalAI#10207
What
Adds the C-ABI surface LocalAI needs to emit NeMo-faithful segment timestamps from the parakeet-cpp backend.
- Offline JSON (parakeet_capi_transcribe_*_json) now includes a top-level "frame_sec" field (the encoder frame stride in seconds, hop * subsampling / sample_rate). Consumers multiply NeMo's frame-unit segment_gap_threshold by it to get the gap in seconds between words when forming segments.
- New streaming JSON entry points parakeet_capi_stream_feed_json / parakeet_capi_stream_finalize_json return {text, eou, frame_sec, words}, surfacing the streaming session's already-existing drain_words() per-word start/end/conf alongside the newly-finalized text and EOU flag, so callers can build timestamped per-utterance segments.
Bumps PARAKEET_CAPI_ABI_VERSION to 4. All existing entry points are unchanged; the new symbols are additive (consumers probe for them, so older libparakeet.so still works).
Why
The LocalAI parakeet-cpp backend currently collapses everything into one synthetic whole-clip segment offline, and emits untimestamped text-only segments while streaming. This PR exposes the per-word timestamps + frame_sec the LocalAI side needs to replicate NeMo's get_segment_offsets (punctuation-only by default, optional frame-gap split) and attach real start/end times to segments.
Tests
tests/test_capi_stream_json.cpp drives the new streaming JSON path on the cache-aware EOU model and asserts the documents carry frame_sec + per-word timestamps. Skips (exit 77) when PARAKEET_TEST_GGUF_EOU is unset, like the sibling streaming tests. Library + test build and link clean.
Consumed by the companion LocalAI PR (real segment timestamps for parakeet-cpp).
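For illustration, here is how a consumer might use frame_sec and the per-word timestamps to form segments. With illustrative values hop = 160, subsampling = 8 and sample_rate = 16000 (assumptions, not taken from this PR), frame_sec = 160 * 8 / 16000 = 0.08 s, so a segment_gap_threshold of 10 frames corresponds to a 0.8 s gap. The sketch below is loosely modelled on the behaviour described above (split on sentence-final punctuation, optionally also on a frame gap); it is not NeMo's or LocalAI's code. Word, Segment and split_segments are hypothetical names, the punctuation set and the >= comparison are assumptions, and words are assumed to be already parsed from the JSON "words" array.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory form of one entry of the JSON "words" array.
struct Word {
    std::string text;
    double start;  // seconds
    double end;    // seconds
};

// Hypothetical output segment with real start/end times.
struct Segment {
    std::string text;
    double start;
    double end;
};

// Assumed sentence-final punctuation set.
static bool ends_sentence(const std::string& w) {
    if (w.empty()) return false;
    const char c = w.back();
    return c == '.' || c == '?' || c == '!';
}

// Splits words into segments. A segment closes after a word ending in
// sentence-final punctuation. When gap_threshold_frames > 0, a segment
// also closes before a word whose silence gap to the previous word is at
// least gap_threshold_frames * frame_sec seconds.
std::vector<Segment> split_segments(const std::vector<Word>& words,
                                    double frame_sec,
                                    int gap_threshold_frames) {
    assert(frame_sec > 0.0);
    const bool use_gap = gap_threshold_frames > 0;
    const double gap_sec = gap_threshold_frames * frame_sec;

    std::vector<Segment> out;
    Segment cur{};
    bool open = false;      // true while `cur` holds at least one word
    double prev_end = 0.0;  // end time of the previous word

    for (const Word& w : words) {
        if (open && use_gap && (w.start - prev_end) >= gap_sec) {
            out.push_back(cur);  // gap split: close before this word
            open = false;
        }
        if (!open) {
            cur = Segment{w.text, w.start, w.end};
            open = true;
        } else {
            cur.text += " " + w.text;
            cur.end = w.end;
        }
        prev_end = w.end;
        if (ends_sentence(w.text)) {
            out.push_back(cur);  // punctuation split: close after this word
            open = false;
        }
    }
    if (open) out.push_back(cur);  // flush trailing words
    return out;
}
```

Passing gap_threshold_frames = 0 gives the punctuation-only behaviour; a positive value enables the frame-gap split on top of it.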