fix: handle streaming usage chunk and soft-deleted provider filtering #76
Conversation
Some OpenAI-compatible providers send token usage in a separate chunk (with choices=[]) after the finish_reason chunk, rather than bundling them together. Previously, this usage-only chunk was silently skipped because data.choices[0] was undefined, causing token counts to always be recorded as -1 for streaming requests. This broke TPM rate limiting and usage tracking.

Now both patterns are supported:
- Usage bundled with finish_reason in the same chunk
- Usage in a separate subsequent chunk (choices=[])

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
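For reviewers, a minimal TypeScript sketch of the two patterns described above. The chunk shape follows OpenAI's public streaming format; the yielded event names (content_delta, message_delta, inputTokens, outputTokens) are illustrative assumptions, not this adapter's actual types.

```typescript
// Minimal sketch, not the repo's actual adapter code. Chunk shape follows
// OpenAI's streaming format; yielded event names are hypothetical.
interface ChatChunk {
  // Empty for usage-only chunks sent after the finish_reason chunk.
  choices: { delta?: { content?: string }; finish_reason?: string | null }[];
  usage?: { prompt_tokens: number; completion_tokens: number } | null;
}

function* handleChunk(data: ChatChunk) {
  const choice = data.choices[0]; // undefined when choices === []

  if (choice?.delta?.content) {
    yield { type: "content_delta", text: choice.delta.content };
  }
  if (choice?.finish_reason) {
    yield { type: "message_delta", stopReason: choice.finish_reason };
  }
  // Check data.usage independently of choices: usage may be bundled with
  // finish_reason, or arrive later in a chunk whose choices array is empty.
  // Gating this on data.choices[0] is what silently dropped usage before.
  if (data.usage) {
    yield {
      type: "message_delta",
      usage: {
        inputTokens: data.usage.prompt_tokens,
        outputTokens: data.usage.completion_tokens,
      },
    };
  }
}
```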
listUniqueSystemNames() only filtered NOT models.deleted but did not join the providers table or check NOT providers.deleted. When a provider was soft-deleted, its models (still deleted=false) would appear in the global model registry, while getModelsWithProviderBySystemName() correctly filtered them out, causing the UI to show models with no providers.

Add an innerJoin on ProvidersTable and a NOT providers.deleted filter to match the behavior of getModelsWithProviderBySystemName().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
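The shape of the fix, as a hedged sketch: the repo's schema isn't visible in this thread, so the table and column definitions below are hypothetical stand-ins inferred from the commit message (Drizzle-style, assuming Postgres).

```typescript
// Hypothetical Drizzle sketch of the fix; table/column names are inferred
// from the commit message, not copied from backend/src/db/index.ts.
import { and, eq } from "drizzle-orm";
import { boolean, integer, pgTable, text } from "drizzle-orm/pg-core";
import type { NodePgDatabase } from "drizzle-orm/node-postgres";

const ProvidersTable = pgTable("providers", {
  id: integer("id").primaryKey(),
  deleted: boolean("deleted").notNull().default(false),
});

const ModelsTable = pgTable("models", {
  id: integer("id").primaryKey(),
  systemName: text("system_name").notNull(),
  providerId: integer("provider_id").notNull(),
  deleted: boolean("deleted").notNull().default(false),
});

async function listUniqueSystemNames(db: NodePgDatabase) {
  return db
    .selectDistinct({ systemName: ModelsTable.systemName })
    .from(ModelsTable)
    // New: join providers so soft-deleted providers can be excluded,
    // mirroring getModelsWithProviderBySystemName().
    .innerJoin(ProvidersTable, eq(ModelsTable.providerId, ProvidersTable.id))
    .where(
      and(eq(ModelsTable.deleted, false), eq(ProvidersTable.deleted, false)),
    );
}
```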
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses two distinct issues: accurately tracking token usage for streaming responses from certain OpenAI-compatible APIs, and ensuring that models linked to soft-deleted providers are correctly excluded from the system's unique model name listings. These changes improve the reliability of usage reporting and the accuracy of the model registry.

Highlights
Changelog
Activity
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

Changed the usage-emission logic for streaming responses in the OpenAI upstream adapter: a usage event is now sent whenever the chunk carries a usage field, rather than only alongside finish_reason.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~15 minutes
🚥 Pre-merge checks: ✅ 3 passed
Code Review
This pull request introduces two important fixes. First, it correctly handles streaming usage data from OpenAI-compatible providers that send it in a separate chunk, which resolves an issue with incorrect token counts. Second, it filters out models belonging to soft-deleted providers from the global model list, preventing them from appearing in the UI. The changes are well-implemented and address the described issues effectively. I have one suggestion to improve consistency in the adapter implementation for better maintainability.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/src/adapters/upstream/openai.ts`:
- Around line 512-523: Remove the explicit stopReason: null in the usage-only
branch inside the generator that yields message deltas for OpenAI streaming (the
block checking data.usage). Instead of yielding a message_delta with
messageDelta: { stopReason: null }, omit the messageDelta field entirely and
yield only the usage object (so the yielded object contains type:
"message_delta" and usage: { inputTokens: data.usage.prompt_tokens,
outputTokens: data.usage.completion_tokens }). This matches the behavior in
openai-responses.ts and prevents overwriting a previously set stopReason.
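In concrete terms, the suggestion amounts to the following sketch; the event shape is inferred from the review comment above, not copied from the repo.

```typescript
// Sketch of the suggested change; event shape inferred from the review
// comment, not copied from backend/src/adapters/upstream/openai.ts.
type Usage = { prompt_tokens: number; completion_tokens: number };

function* emitUsageOnly(data: { usage: Usage }) {
  // Before: messageDelta: { stopReason: null } could clobber a stopReason
  // already set by the earlier finish_reason chunk.
  //
  //   yield {
  //     type: "message_delta",
  //     messageDelta: { stopReason: null },
  //     usage: { ... },
  //   };

  // After: omit messageDelta entirely and yield only the usage payload,
  // matching the usage-only handling in openai-responses.ts.
  yield {
    type: "message_delta",
    usage: {
      inputTokens: data.usage.prompt_tokens,
      outputTokens: data.usage.completion_tokens,
    },
  };
}
```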
📒 Files selected for processing (2)
backend/src/adapters/upstream/openai.ts
backend/src/db/index.ts
…ting stopReason

Remove explicit stopReason: null from the usage-only message_delta yield to prevent overwriting a previously set stopReason. This aligns with the pattern used in openai-responses.ts for usage-only events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Some OpenAI-compatible providers send token usage in a separate chunk (with choices=[]) after the finish_reason chunk, rather than bundling them together. Previously this chunk was silently skipped, causing streaming token counts to always be -1. Now both patterns are supported.
- listUniqueSystemNames() did not join/filter the providers table for soft-delete, causing models from deleted providers to appear in the global model registry with "no providers" shown in the UI.

Test plan
- Streaming requests now record promptTokens/completionTokens in the database (no longer -1)

🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
Bug Fixes
Chores