fix: apply cache_control for Claude 4.x models (not just claude-3) #55
sumleo wants to merge 2 commits into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44be866308
- # convert to anthropic multipart format if model is claude-3 and cache_at is set
- if model.to_s.include?("anthropic/claude-3") && cache_at && msg[:content].to_s.length > cache_at.to_i
+ # convert to anthropic multipart format if model is an anthropic claude and cache_at is set
+ if model.to_s.include?("anthropic/claude") && cache_at && msg[:content].to_s.length > cache_at.to_i
Preserve multimodal content before adding cache_control
When a Claude 4 request uses abbreviated multimodal content (for example { user: [{ type: "text", ... }, { type: "image_url", ... }] }) and cache_at is below the array's stringified length, this broadened guard now wraps the entire array as the text: value of a single cache-control block. That prevents MultimodalContentAdapter.translate in chat_completion from seeing the original image_url parts, so the image is dropped/serialized instead of sent; the old claude-3 guard did not affect Claude 4 vision requests.
Good catch - fixed in 805d8b9. The wrapping is now limited to plain String content, so multimodal arrays (image_url parts) pass through to MultimodalContentAdapter untouched. Added a spec asserting an array-valued Claude 4 message is left as-is.
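For readers following the thread, here is a minimal sketch of what a String-only guard could look like. `wrap_for_cache` is a hypothetical helper written for illustration, not the gem's actual method, and the `{ type: "ephemeral" }` cache_control value is an assumption about the block shape rather than something shown in this PR.

```ruby
# Hypothetical helper (not the gem's code): wraps content in a single
# cache-control text block only when it is a plain String that is long enough.
def wrap_for_cache(model, content, cache_at)
  cacheable = model.to_s.include?("anthropic/claude") &&
              cache_at &&
              content.is_a?(String) &&
              content.length > cache_at.to_i
  return content unless cacheable

  # Assumed block shape; the real adapter may build this differently.
  [{ type: "text", text: content, cache_control: { type: "ephemeral" } }]
end

long_text = "x" * 50
parts = [
  { type: "text", text: "Describe this image" },
  { type: "image_url", image_url: { url: "https://example.com/cat.png" } }
]

# Plain string above the threshold: wrapped in one cache-control block.
wrapped = wrap_for_cache("anthropic/claude-sonnet-4", long_text, 10)

# Multimodal array: returned as-is so the image_url part survives.
untouched = wrap_for_cache("anthropic/claude-sonnet-4", parts, 10)
```

Without the `is_a?(String)` check, the array would be stringified into the `text:` value, which is the failure the review comment describes.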
Hi @obie, gentle nudge on this when you have a moment. It's a small, self-contained prompt-caching fix, and I'm happy to rebase or tweak anything if that would make review easier. Thanks for the project and your time!
Problem
Raix::MessageAdapters::Base#content only emits the Anthropic cache_control multipart shape when the model id contains the literal anthropic/claude-3.
The current Claude 4.x ids (anthropic/claude-sonnet-4, anthropic/claude-opus-4-7, anthropic/claude-haiku-4-5, …) don't contain the substring claude-3, so the guard never matches them. Setting cache_at on a Claude 4.x model silently does nothing: the content is sent as a plain string, no cache_control block is attached, and the prompt-cache discount is never applied. The gem already targets Claude 4.x elsewhere (e.g. lib/raix/configuration.rb references the Claude 4.7 family), so the caching path is the only place still pinned to claude-3.
Fix
Broaden the guard in lib/raix/message_adapters/base.rb to match any Anthropic Claude model.
anthropic/claude is a prefix of every Anthropic Claude id, so this covers claude-3 and the *-4 family alike. Non-Claude models still fall through untouched: behavior is unchanged for anything that isn't an Anthropic Claude.
Tests
Added an example in spec/raix/message_adapters/base_spec.rb that stubs a Claude 4 id (anthropic/claude-sonnet-4) and asserts large content is wrapped with cache_control. It fails against the old claude-3-only guard (content stays a bare string) and passes with the fix. The existing claude-3 example is retained.
CHANGELOG entry added under ## [Unreleased].
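The difference between the two guards can also be checked in plain Ruby, outside the spec suite. This is a rough sketch only: `guard_matches?` is a hypothetical helper, and `anthropic/claude-3-haiku` and `openai/gpt-4o` are illustrative ids chosen for the comparison, not taken from the gem's specs.

```ruby
OLD_NEEDLE = "anthropic/claude-3" # substring checked before this PR
NEW_NEEDLE = "anthropic/claude"   # substring checked after this PR

# Hypothetical helper: the model-id half of the guard, isolated.
def guard_matches?(model, needle)
  model.to_s.include?(needle)
end

models = %w[
  anthropic/claude-3-haiku
  anthropic/claude-sonnet-4
  anthropic/claude-haiku-4-5
  openai/gpt-4o
]

# Map each model id to [old guard result, new guard result].
results = models.to_h do |model|
  [model, [guard_matches?(model, OLD_NEEDLE), guard_matches?(model, NEW_NEEDLE)]]
end

results.each do |model, (old, new)|
  puts format("%-28s old=%-5s new=%s", model, old, new)
end
```

Only the Claude 4 ids change from non-matching to matching; the claude-3 id matches under both guards and the non-Anthropic id matches under neither.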