Preserve OpenAI cached_tokens + Wan 2.7 R2V media types by duanbing · Pull Request #10 · RouterBase/tensorzero

duanbing · 2026-05-28T23:54:27Z

RouterBase fork patches for the chat-LLM + media catalogue work upstream in RouterBase/RouterBase.

1. Preserve OpenAI `cached_tokens` through `Usage` (`6498b88`)

RouterBase routes chat LLMs (OpenAI / Anthropic / Gemini, all via Novita's OpenAI-compatible endpoint) through this gateway and needs the prompt-cache read count to bill cache reads at the discounted rate and show users their savings. Stock TensorZero (TZ) normalizes provider usage to `{input_tokens, output_tokens}` and drops `prompt_tokens_details.cached_tokens`.

- `Usage` gains `cached_tokens: Option<u32>` (ts-bindings + `skip_serializing_if`/`default`).
- The OpenAI provider parses `prompt_tokens_details.cached_tokens` in the `OpenAIUsage` → `Usage` conversion.
- Threaded through `Usage::zero()` and the streaming / cross-inference aggregators (sum, treating `None` as 0).
- ~38 files touched (mostly mechanical constructor updates). Anthropic/Bedrock/Vertex native paths leave it `None` (out of scope; our Anthropic models use the openai-compat path).
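The changes listed above can be sketched roughly as follows. This is a simplified, dependency-free illustration, not the code from the commit: the struct layouts are reduced to the fields discussed here, the serde attributes are only mentioned in a comment, and the method name `combine` is hypothetical. It also assumes that summing two usages with no cache information yields `None` rather than `Some(0)`; the PR text only says `None` is treated as 0 when summing.

```rust
/// Simplified stand-in for the gateway's normalized usage type.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u32,
    pub output_tokens: u32,
    /// Prompt-cache read count; `None` when the provider does not report it.
    /// (The PR describes serde `skip_serializing_if`/`default` on this field;
    /// they are omitted here so the sketch needs no external crates.)
    pub cached_tokens: Option<u32>,
}

/// Simplified stand-in for the OpenAI `usage.prompt_tokens_details` object.
pub struct OpenAIPromptTokensDetails {
    pub cached_tokens: Option<u32>,
}

/// Simplified stand-in for the OpenAI `usage` object.
pub struct OpenAIUsage {
    pub prompt_tokens: u32,
    pub completion_tokens: u32,
    pub prompt_tokens_details: Option<OpenAIPromptTokensDetails>,
}

impl From<OpenAIUsage> for Usage {
    fn from(u: OpenAIUsage) -> Self {
        Usage {
            input_tokens: u.prompt_tokens,
            output_tokens: u.completion_tokens,
            // Keep the cache-read count instead of dropping it.
            cached_tokens: u.prompt_tokens_details.and_then(|d| d.cached_tokens),
        }
    }
}

impl Usage {
    /// Hypothetical aggregator: sums two usages, treating a missing
    /// `cached_tokens` as 0 when the other side has a value.
    /// Assumption: if neither side reports it, the result stays `None`.
    pub fn combine(self, other: Usage) -> Usage {
        let cached_tokens = match (self.cached_tokens, other.cached_tokens) {
            (None, None) => None,
            (a, b) => Some(a.unwrap_or(0).saturating_add(b.unwrap_or(0))),
        };
        Usage {
            input_tokens: self.input_tokens.saturating_add(other.input_tokens),
            output_tokens: self.output_tokens.saturating_add(other.output_tokens),
            cached_tokens,
        }
    }
}
```

Keeping the field as `Option<u32>` rather than a plain `u32` lets callers distinguish "the provider reported zero cache reads" from "the provider reported nothing", which matters for the native paths that leave it `None`.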

`cargo check --package tensorzero-core` (lib) clean; no new clippy warnings.

2. Wan 2.7 R2V media item types (`54277f0`)

Pre-existing fork commit (carried on this branch): fixes the Wan 2.7 reference-to-video request shape to match the upstream enum.

Notes

- Consumed by RouterBase/RouterBase PR #103 (submodule pointer bumped to `6498b88`).
- The chat path forwards a per-user `prompt_cache_key` via `extra_body`; this PR only handles surfacing the cache usage back to the caller.

🤖 Generated with Claude Code

The Wan 2.7 R2V (`/v3/async/wan2.7-r2v`) endpoint requires each item in the `media` array to carry a `type` value from the enum:

- `reference_image`
- `reference_video`
- `first_frame`

We were sending `image` and `video`, which Novita rejects with the generic "failed to exec task" 500. Every R2V submission via the playground / legacy `image_urls`+`video_urls` shape was failing silently for that reason.

Two changes in `build_body`:

1. Repack each `image_urls[]` URL as `{type: "reference_image", url}` and each `video_urls[]` URL as `{type: "reference_video", url}`. There is no way to express `first_frame` or per-item `reference_voice` from the legacy flat shape; callers who want those use the new pass-through path below.
2. Pass `media` through the allowed-fields whitelist for the R2V shape so direct API callers / a future media-editor UI can submit the rich shape (`[{type, url, reference_voice?}, ...]`) verbatim. The `!body.contains_key("media")` guard in the repack block ensures the pass-through wins when both shapes are present.

Also cap the synthesised `media` array at 5 items to match Novita's documented ceiling (combined images+videos ≤ 5), so users who upload more get a deterministic truncate-from-front rather than a 422.
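A minimal sketch of the repack, pass-through, and cap logic described in this commit message, written against plain Rust types rather than the gateway's JSON body. The names `MediaItem`, `build_media`, and `MAX_MEDIA_ITEMS` are hypothetical, `reference_voice` is left out, and the sketch assumes "truncate-from-front" means keeping the first five synthesised items (images first, then videos); the source does not spell out which end is kept.

```rust
/// One entry of the R2V `media` array. `kind` stands in for the JSON
/// field `type` (a reserved word in Rust).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MediaItem {
    pub kind: String,
    pub url: String,
}

/// Documented ceiling on combined images + videos.
pub const MAX_MEDIA_ITEMS: usize = 5;

/// Hypothetical helper mirroring the two changes described for `build_body`.
pub fn build_media(
    passthrough: Option<Vec<MediaItem>>,
    image_urls: &[&str],
    video_urls: &[&str],
) -> Vec<MediaItem> {
    // Pass-through wins: a caller-supplied `media` array is forwarded
    // verbatim and the legacy flat fields are ignored.
    if let Some(media) = passthrough {
        return media;
    }

    let images = image_urls.iter().map(|u| MediaItem {
        kind: "reference_image".to_string(),
        url: (*u).to_string(),
    });
    let videos = video_urls.iter().map(|u| MediaItem {
        kind: "reference_video".to_string(),
        url: (*u).to_string(),
    });

    // Only the synthesised array is capped.
    // Assumption: the first five items are the ones kept.
    images.chain(videos).take(MAX_MEDIA_ITEMS).collect()
}
```

For example, four image URLs plus two video URLs would produce five items under this assumption: the four images followed by the first video.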

TensorZero normalized provider usage to `{input_tokens, output_tokens}` and dropped OpenAI's `prompt_tokens_details.cached_tokens`. RouterBase routes chat LLMs (incl. Claude/Gemini via Novita's OpenAI-compat endpoint) through this gateway and needs the prompt-cache read count to bill cache reads at the discounted rate and show users their savings.

- Add `cached_tokens: Option<u32>` to `Usage` (ts-bindings + skip-if-none).
- Parse `prompt_tokens_details.cached_tokens` in the OpenAI provider's `OpenAIUsage` → `Usage` conversion.
- Thread through `Usage::zero()` and the streaming/cross-inference aggregators (sum, treating `None` as 0).

Anthropic/Bedrock/Vertex native paths leave it `None` (out of scope; our Anthropic models use the openai-compat path).

`cargo check --package tensorzero-core` (lib) clean; no new clippy warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
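To illustrate why the cache-read count matters downstream, here is a hedged sketch of how a consumer might price the input side of a request once `cached_tokens` is available. Nothing here comes from the PR: the type `InputPricing`, both function names, and the rates used in any example are hypothetical. It relies on the OpenAI convention that `cached_tokens` is a subset of the prompt token count, so the uncached portion is the difference.

```rust
/// Hypothetical per-model input prices, in micro-dollars per one million tokens.
pub struct InputPricing {
    pub uncached_per_mtok: u64,
    pub cached_per_mtok: u64,
}

/// Input-side cost of one request in micro-dollars (integer division truncates).
/// A missing `cached_tokens` is billed as if nothing was read from cache.
pub fn input_cost_micros(
    input_tokens: u32,
    cached_tokens: Option<u32>,
    pricing: &InputPricing,
) -> u64 {
    // Clamp so a malformed report can never exceed the prompt size.
    let cached = u64::from(cached_tokens.unwrap_or(0).min(input_tokens));
    let uncached = u64::from(input_tokens) - cached;
    (uncached * pricing.uncached_per_mtok + cached * pricing.cached_per_mtok) / 1_000_000
}

/// Savings relative to billing every input token at the uncached rate.
pub fn cache_savings_micros(
    input_tokens: u32,
    cached_tokens: Option<u32>,
    pricing: &InputPricing,
) -> u64 {
    let full = input_cost_micros(input_tokens, None, pricing);
    full.saturating_sub(input_cost_micros(input_tokens, cached_tokens, pricing))
}
```

Integer micro-dollars are used instead of floats so that totals add up exactly across many requests.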

github-actions · 2026-05-28T23:54:37Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.

I have read the Contributor License Agreement (CLA) and hereby sign the CLA.

_You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot._

duanbing and others added 2 commits May 20, 2026 13:37

duanbing merged commit 4a182b3 into main May 29, 2026
6 of 7 checks passed

duanbing deleted the novita/wan-r2v-media-types branch May 29, 2026 00:03
