
fix: handle streaming usage chunk and soft-deleted provider filtering#76

Merged
pescn merged 3 commits into main from fix/streaming-usage-and-soft-delete
Feb 28, 2026
Conversation

@pescn
Contributor

@pescn pescn commented Feb 28, 2026

Summary

  • fix(adapters): handle separate usage chunk in OpenAI streaming responses: some OpenAI-compatible providers send token usage in a separate chunk (choices=[]) after the finish_reason chunk, rather than bundling them together. Previously this chunk was silently skipped, so streaming token counts were always recorded as -1. Both patterns are now supported.
  • fix(db): filter soft-deleted providers in listUniqueSystemNames: listUniqueSystemNames() did not join the providers table or filter on its soft-delete flag, so models from deleted providers still appeared in the global model registry, with "no providers" shown in the UI.

Test plan

  • Verify streaming requests to OpenAI-compatible APIs record correct promptTokens / completionTokens in the database (no longer -1)
  • Verify TPM rate limiting works correctly for streaming requests
  • Delete a provider, create a new provider with the same model name, confirm the global models page shows the model correctly with the new provider
  • Confirm non-streaming requests are unaffected

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Refined handling of usage data in streaming responses, emitting it only when actually available to avoid empty fields.
    • Improved the system-name query to consider only active provider records, for more consistent and accurate results.
  • Chores

    • Adjusted public type exports, with no runtime impact.

pescn and others added 2 commits February 28, 2026 12:27
Some OpenAI-compatible providers send token usage in a separate chunk
(with choices=[]) after the finish_reason chunk, rather than bundling
them together. Previously, this usage-only chunk was silently skipped
because data.choices[0] was undefined, causing token counts to always
be recorded as -1 for streaming requests. This broke TPM rate limiting
and usage tracking.

Now both patterns are supported:
- Usage bundled with finish_reason in the same chunk
- Usage in a separate subsequent chunk (choices=[])
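The two chunk patterns described above can be sketched as a small extraction routine. This is a self-contained illustration, not the adapter's actual code: the chunk and usage shapes below are assumptions loosely based on the OpenAI streaming schema, and the real adapter yields deltas incrementally rather than scanning a completed array.

```typescript
// Hypothetical chunk shape; only choices/finish_reason/usage are taken
// from the OpenAI streaming schema, the rest is omitted for brevity.
interface StreamChunk {
  choices: { finish_reason?: string | null }[];
  usage?: { prompt_tokens: number; completion_tokens: number } | null;
}

interface Usage { inputTokens: number; outputTokens: number }

// Extract token usage from a stream, supporting both provider patterns.
function extractUsage(chunks: StreamChunk[]): Usage {
  // -1 is the sentinel the PR description says was previously always recorded.
  let usage: Usage = { inputTokens: -1, outputTokens: -1 };
  for (const data of chunks) {
    // Pattern B: usage-only chunk with an empty choices array, sent
    // after the finish_reason chunk. Previously data.choices[0] was
    // undefined here and the chunk was skipped.
    if (data.choices.length === 0 && data.usage) {
      usage = {
        inputTokens: data.usage.prompt_tokens,
        outputTokens: data.usage.completion_tokens,
      };
      continue;
    }
    // Pattern A: usage bundled with finish_reason in the same chunk.
    const choice = data.choices[0];
    if (choice?.finish_reason && data.usage) {
      usage = {
        inputTokens: data.usage.prompt_tokens,
        outputTokens: data.usage.completion_tokens,
      };
    }
  }
  return usage;
}
```

Either pattern yields the same recorded counts, so downstream TPM rate limiting and usage tracking see real numbers instead of -1.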

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
listUniqueSystemNames() only filtered NOT models.deleted but did not
join the providers table or check NOT providers.deleted. When a provider
was soft-deleted, its models (still deleted=false) would appear in the
global model registry, but getModelsWithProviderBySystemName() correctly
filtered them out, causing the UI to show models with no providers.

Add innerJoin on ProvidersTable and NOT providers.deleted filter to
match the behavior of getModelsWithProviderBySystemName().
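As a sketch of the corrected query semantics, here is an in-memory analogue of the inner join plus soft-delete filter (the real code uses Drizzle ORM against ModelsTable/ProvidersTable; the row shapes below are illustrative assumptions, not the actual schema):

```typescript
// Illustrative row shapes; the real tables have more columns.
interface ModelRow { systemName: string; providerId: number; deleted: boolean }
interface ProviderRow { id: number; deleted: boolean }

// Mirrors the fixed query: inner-join models to providers and require
// NOT models.deleted AND NOT providers.deleted before collecting names.
function listUniqueSystemNames(models: ModelRow[], providers: ProviderRow[]): string[] {
  const activeProviders = new Set(
    providers.filter((p) => !p.deleted).map((p) => p.id),
  );
  const names = models
    // The inner join drops models whose provider row is missing or deleted,
    // which is exactly what the old query failed to do.
    .filter((m) => !m.deleted && activeProviders.has(m.providerId))
    .map((m) => m.systemName);
  return [...new Set(names)];
}
```

With this semantics, a model whose only provider is soft-deleted no longer surfaces in the registry, matching getModelsWithProviderBySystemName().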

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses two distinct issues: accurately tracking token usage for streaming responses from certain OpenAI-compatible APIs and ensuring that models linked to soft-deleted providers are correctly excluded from the system's unique model name listings. These changes improve the reliability of usage reporting and the accuracy of the model registry.

Highlights

  • OpenAI Streaming Usage: Implemented support for OpenAI-compatible providers that send token usage in a separate chunk (with an empty choices array) after the finish_reason chunk during streaming. Previously, these usage chunks were ignored, leading to incorrect token counts.
  • Soft-Deleted Providers Filtering: Modified the listUniqueSystemNames() database function to filter out models associated with soft-deleted providers. This prevents models from deleted providers from appearing in the global model registry.
Changelog
  • backend/src/adapters/upstream/openai.ts
    • Removed the InternalUsage type import as it's no longer directly instantiated in the adapter.
    • Added logic to process usage information from streaming chunks where the choices array is empty, specifically for providers that send usage data separately.
    • Updated the handling of usage data when a finish_reason is encountered, ensuring usage is only included if present in that specific chunk, preventing default -1 values.
  • backend/src/db/index.ts
    • Modified the listUniqueSystemNames function to include an innerJoin with the ProvidersTable.
    • Added a WHERE clause condition to exclude providers that have been soft-deleted, ensuring only active providers' models are listed.
Activity
  • The pull request was generated with Claude Code, indicating an AI-assisted development process.

@coderabbitai

coderabbitai bot commented Feb 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36e6b04 and ffb6c2c.

📒 Files selected for processing (1)
  • backend/src/adapters/upstream/openai.ts

📝 Walkthrough

Walkthrough

Changed the usage-emission logic for streaming responses in the OpenAI upstream adapter so that usage is sent only when a chunk's data.usage is present; also, when listing unique system names, added an inner join on the Providers table and filtered out soft-deleted providers.

Changes

Cohort / File(s) Summary
OpenAI usage data handling
backend/src/adapters/upstream/openai.ts
Adjusted streaming chunk handling: emit a usage delta when choices is empty but data.usage is present; in the finish_reason branch, attach usage only when data.usage exists; removed the export/import of the public InternalUsage type.
Database query optimization
backend/src/db/index.ts
listUniqueSystemNames now inner-joins ProvidersTable and adds a filter for non-deleted providers to the where clause.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

A rabbit hops through morning frost,
Usage sent only when it's there,
Deleted providers filtered out,
Queries steadier down the road,
Code taps lightly, hearts delight 🐇

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: fixing streaming usage chunk handling and soft-deleted provider filtering, matching the core objectives of improving token usage tracking for streaming requests and preventing deleted providers from appearing in the model registry.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.




@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces two important fixes. First, it correctly handles streaming usage data from OpenAI-compatible providers that send it in a separate chunk, which resolves an issue with incorrect token counts. Second, it filters out models belonging to soft-deleted providers from the global model list, preventing them from appearing in the UI. The changes are well-implemented and address the described issues effectively. I have one suggestion to improve consistency in the adapter implementation for better maintainability.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/src/adapters/upstream/openai.ts`:
- Around line 512-523: Remove the explicit stopReason: null in the usage-only
branch inside the generator that yields message deltas for OpenAI streaming (the
block checking data.usage). Instead of yielding a message_delta with
messageDelta: { stopReason: null }, omit the messageDelta field entirely and
yield only the usage object (so the yielded object contains type:
"message_delta" and usage: { inputTokens: data.usage.prompt_tokens,
outputTokens: data.usage.completion_tokens }). This matches the behavior in
openai-responses.ts and prevents overwriting a previously set stopReason.
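The suggested change can be sketched as follows. The event shape here is an assumption reconstructed from the review comment, not the project's actual types: the point is that the usage-only event omits messageDelta entirely instead of carrying stopReason: null.

```typescript
// Assumed internal event shape, based on the fields named in the review
// comment (type, messageDelta.stopReason, usage).
interface MessageDeltaEvent {
  type: "message_delta";
  messageDelta?: { stopReason: string | null };
  usage?: { inputTokens: number; outputTokens: number };
}

// Build the event for a usage-only chunk (choices=[] with data.usage set).
function usageOnlyEvent(usage: {
  prompt_tokens: number;
  completion_tokens: number;
}): MessageDeltaEvent {
  return {
    type: "message_delta",
    // messageDelta is deliberately omitted: yielding stopReason: null here
    // would overwrite the stopReason emitted with the earlier
    // finish_reason chunk.
    usage: {
      inputTokens: usage.prompt_tokens,
      outputTokens: usage.completion_tokens,
    },
  };
}
```

A consumer that merges message_delta events field by field then keeps the previously set stopReason while still picking up the late-arriving token counts.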

ℹ️ Review info

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d4016f and 36e6b04.

📒 Files selected for processing (2)
  • backend/src/adapters/upstream/openai.ts
  • backend/src/db/index.ts

…ting stopReason

Remove explicit stopReason: null from the usage-only message_delta
yield to prevent overwriting a previously set stopReason. This aligns
with the pattern used in openai-responses.ts for usage-only events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pescn pescn merged commit 3121a7b into main Feb 28, 2026
2 checks passed
@pescn pescn deleted the fix/streaming-usage-and-soft-delete branch February 28, 2026 04:38