fix(harness-llm): surface OpenAI streaming token usage before Finish #59
Draft
TYRMars wants to merge 1 commit into
OpenAI Chat Completions streaming ships the token-usage payload in a separate SSE chunk (`choices: []`) that arrives *after* the `finish_reason` chunk. The agent loop breaks out of the stream the moment it sees `Finish`, so the trailing `Usage` chunk was never consumed and usage accounting read zero on the default provider.

Buffer the terminal `Finish` in `StreamAccumulator` and release it either when the trailing usage-only chunk is ingested (emitting `Usage` first) or when the stream closes. This matches the ordering the other three providers already produce.

Adds a regression test and updates the existing accumulator tests to flush the buffered `Finish` on close.

Closes #48
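The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only `StreamAccumulator`, `pending_finish`, `flush()`, `Usage`, and `Finish` are names from the description, while `StreamEvent`, `Chunk`, `ingest`, and every field shape are assumptions made for the example.

```rust
// Hypothetical, simplified event type; the real harness-llm definition is not shown here.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    Usage { prompt_tokens: u32, completion_tokens: u32 },
    Finish { reason: String },
}

/// Assumed shape of one already-parsed SSE chunk.
pub struct Chunk {
    pub finish_reason: Option<String>,
    /// (prompt_tokens, completion_tokens), present only on the usage chunk.
    pub usage: Option<(u32, u32)>,
}

#[derive(Default)]
pub struct StreamAccumulator {
    /// Terminal Finish held back until usage arrives or the stream closes.
    pending_finish: Option<StreamEvent>,
}

impl StreamAccumulator {
    /// Hypothetical ingest method: returns the events to emit for this chunk.
    pub fn ingest(&mut self, chunk: Chunk) -> Vec<StreamEvent> {
        let mut out = Vec::new();
        if let Some(reason) = chunk.finish_reason {
            // Buffer the Finish instead of emitting it inline.
            self.pending_finish = Some(StreamEvent::Finish { reason });
        }
        if let Some((prompt_tokens, completion_tokens)) = chunk.usage {
            // Emit Usage first, then release any buffered Finish.
            out.push(StreamEvent::Usage { prompt_tokens, completion_tokens });
            out.extend(self.pending_finish.take());
        }
        out
    }

    /// Called on stream close: releases a Finish that no usage chunk followed.
    pub fn flush(&mut self) -> Vec<StreamEvent> {
        self.pending_finish.take().into_iter().collect()
    }
}
```

In this sketch, a caller would invoke `flush()` once after the SSE body ends, so a stream that never sends a usage chunk still yields its `Finish`.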
Summary
Fixes #48: on the default OpenAI Chat Completions streaming path, token usage was silently dropped.

OpenAI ships the usage payload in a separate final SSE chunk (`choices: []`) that arrives after the `finish_reason` chunk. The agent loop breaks out of the stream the moment it sees `Finish` (`crates/harness-core/src/agent.rs:636`), so the trailing `Usage` chunk was never consumed and usage accounting read zero.

Fix

Buffer the terminal `Finish` in `StreamAccumulator` (a new `pending_finish` field) instead of emitting it inline. It's released either:

- when the trailing usage-only chunk is ingested, emitting `Usage` first, then `Finish`, or
- when the stream closes (via the `flush()` helper, which also covers gateways that close the body without a `finish_reason`).

This matches the `Usage`-before-`Finish` ordering the other three providers (Anthropic / Responses / Google) already produce, so the agent loop no longer drops usage on the default provider.

Tests

- `usage_emitted_before_finish_when_usage_trails` asserts the `[Usage, Finish]` ordering across the two-chunk sequence.
- The existing accumulator tests now expect the buffered `Finish` from `flush()` on stream close.
- `cargo test -p harness-llm` and `cargo clippy -p harness-llm --all-targets -- -D warnings` both pass.

https://claude.ai/code/session_01E28FLiYKcDuos5wiA8vVUC
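To illustrate why the `[Usage, Finish]` ordering matters to a consumer that stops at `Finish`, here is a small self-contained sketch. It is not the agent loop from `agent.rs`; the `Event` enum and `consume_until_finish` function are hypothetical stand-ins that only mimic the break-on-`Finish` behavior described above.

```rust
// Hypothetical stand-in for the stream events seen by the agent loop.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Text(String),
    Usage { total_tokens: u32 },
    Finish,
}

/// Consumes events until the first Finish and returns the tokens recorded so far.
/// Anything after Finish is never read, mirroring a loop that breaks on Finish.
pub fn consume_until_finish<I: IntoIterator<Item = Event>>(events: I) -> u32 {
    let mut total = 0;
    for event in events {
        match event {
            Event::Usage { total_tokens } => total += total_tokens,
            Event::Finish => break,
            Event::Text(_) => {}
        }
    }
    total
}
```

With the old ordering (`Finish` followed by `Usage`) this consumer records zero tokens; with `Usage` emitted before `Finish` it records the reported count.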
Generated by Claude Code