fix: measure token speed from streaming phase only (exclude TTFB) #227
Merged
Token speed (TOK/S) was calculated using total wall-clock latency including TTFB wait time, producing misleadingly low averages. Now captures the first streaming chunk timestamp and uses streaming-only duration for the TPS formula. A 200ms minimum threshold prevents inflated numbers from imprecise timing on fast responses.
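As a rough illustration of the calculation described above, here is a minimal sketch. Only `_streamStartTime` and `RequestContext` come from this PR; the `startTime` field, the helper names `markFirstChunk` and `tokensPerSecond`, and the constant `MIN_STREAM_MS` are hypothetical. The sketch also assumes the 200ms threshold clamps short streaming durations up to 200ms; the actual change in `src/server.ts` may handle that case differently (for example, by falling back to total latency).

```typescript
// Hypothetical shape: only `_streamStartTime` is named in the PR.
interface RequestContext {
  startTime: number;         // assumed field: request start, in ms
  _streamStartTime?: number; // set when the first streaming chunk arrives
}

// Assumed name for the 200ms minimum threshold.
const MIN_STREAM_MS = 200;

// Record the streaming start once, on the first chunk only.
function markFirstChunk(ctx: RequestContext, now: number): void {
  if (ctx._streamStartTime === undefined) {
    ctx._streamStartTime = now;
  }
}

// Tokens per second over the streaming phase, excluding TTFB.
// Falls back to the request start if no chunk timestamp was recorded.
function tokensPerSecond(
  ctx: RequestContext,
  outputTokens: number,
  endTime: number,
): number {
  const streamStart = ctx._streamStartTime ?? ctx.startTime;
  // Clamp very short durations so timing jitter cannot inflate the result.
  const durationMs = Math.max(endTime - streamStart, MIN_STREAM_MS);
  return (outputTokens * 1000) / durationMs;
}
```

For a request that starts at 0ms, receives its first chunk at 1500ms, and finishes at 3500ms with 100 output tokens, this yields 50 tok/s, whereas dividing by the full 3500ms of wall-clock latency would report about 28.6 tok/s.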
Summary
- Captures `_streamStartTime` when the first streaming chunk arrives, and computes TPS using streaming-only duration

Changes
- `src/types.ts`: added `_streamStartTime?: number` to `RequestContext`
- `src/server.ts`: capture streaming start on first chunk (SSE + JSON paths), use streaming-only duration in `recordMetrics` with 200ms floor

Test plan
- `npx tsc --noEmit`: passes
- `npm run build`: succeeds
- `npx vitest run`: 323/323 tests passing