fix: apply cache_control for Claude 4.x models (not just claude-3) #55

Open
sumleo wants to merge 2 commits into OlympiaAI:main from sumleo:fix/cache-control-claude-4

@sumleo commented Jun 17, 2026

Problem

Raix::MessageAdapters::Base#content only emits the Anthropic cache_control multipart shape when the model id contains the literal anthropic/claude-3:

if model.to_s.include?("anthropic/claude-3") && cache_at && msg[:content].to_s.length > cache_at.to_i

The current Claude 4.x ids (anthropic/claude-sonnet-4, anthropic/claude-opus-4-7, anthropic/claude-haiku-4-5, …) don't contain the substring claude-3, so the guard never matches them. Setting cache_at on a Claude 4.x model silently does nothing: the content is sent as a plain string, no cache_control block is attached, and the prompt-cache discount is never applied. The gem already targets Claude 4.x elsewhere (e.g. lib/raix/configuration.rb references the Claude 4.7 family), so the caching path is the only place still pinned to claude-3.

Fix

Broaden the guard in lib/raix/message_adapters/base.rb to match any Anthropic Claude model:

if model.to_s.include?("anthropic/claude") && cache_at && msg[:content].to_s.length > cache_at.to_i

Every Anthropic Claude id starts with anthropic/claude, so the substring check covers claude-3 and the *-4 family alike. Non-Claude models still fall through untouched: behavior is unchanged for anything that isn't an Anthropic Claude model.
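
To make the difference between the two guards concrete, here is a small standalone sketch of the model half of the check. It is not Raix code: `matches_guard?` is a hypothetical helper, and `anthropic/claude-3-haiku` and `openai/gpt-4o` are illustrative ids assumed for the comparison. The other ids are the ones quoted above.

```ruby
# Hypothetical helper (not part of Raix): only the model-id half of the guard.
def matches_guard?(model, needle)
  model.to_s.include?(needle)
end

MODEL_IDS = %w[
  anthropic/claude-3-haiku
  anthropic/claude-sonnet-4
  anthropic/claude-opus-4-7
  anthropic/claude-haiku-4-5
  openai/gpt-4o
].freeze

# Old guard: substring "anthropic/claude-3"; new guard: substring "anthropic/claude".
OLD_MATCHES = MODEL_IDS.select { |id| matches_guard?(id, "anthropic/claude-3") }
NEW_MATCHES = MODEL_IDS.select { |id| matches_guard?(id, "anthropic/claude") }

puts "old guard matches: #{OLD_MATCHES.inspect}"
puts "new guard matches: #{NEW_MATCHES.inspect}"
```

Under these assumed ids, only the claude-3 entry passes the old check, while every Anthropic entry passes the new one and the non-Anthropic id passes neither.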

Tests

Added an example in spec/raix/message_adapters/base_spec.rb that stubs a Claude 4 id (anthropic/claude-sonnet-4) and asserts large content is wrapped with cache_control. It fails against the old claude-3-only guard (content stays a bare string) and passes with the fix. The existing claude-3 example is retained.

CHANGELOG entry added under ## [Unreleased].

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44be866308

Comment thread lib/raix/message_adapters/base.rb Outdated
- # convert to anthropic multipart format if model is claude-3 and cache_at is set
- if model.to_s.include?("anthropic/claude-3") && cache_at && msg[:content].to_s.length > cache_at.to_i
+ # convert to anthropic multipart format if model is an anthropic claude and cache_at is set
+ if model.to_s.include?("anthropic/claude") && cache_at && msg[:content].to_s.length > cache_at.to_i

[P2] Preserve multimodal content before adding cache_control

When a Claude 4 request uses abbreviated multimodal content (for example { user: [{ type: "text", ... }, { type: "image_url", ... }] }) and cache_at is below the array's stringified length, this broadened guard now wraps the entire array as the text: value of a single cache-control block. That prevents MultimodalContentAdapter.translate in chat_completion from seeing the original image_url parts, so the image is dropped/serialized instead of sent; the old claude-3 guard did not affect Claude 4 vision requests.
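
The failure mode described here can be reproduced in isolation. The following is a rough sketch, assuming the wrapping produces the usual Anthropic text block with `cache_control: { type: "ephemeral" }`. `wrap_without_type_check` is a hypothetical name, not a Raix method, and the example URL is a placeholder.

```ruby
# Hypothetical reproduction (not Raix code): a guard that only compares the
# stringified length and never checks whether the content is a String.
def wrap_without_type_check(msg, cache_at:)
  return msg unless msg[:content].to_s.length > cache_at.to_i

  msg.merge(content: [{ type: "text", text: msg[:content], cache_control: { type: "ephemeral" } }])
end

# Abbreviated multimodal content: one text part and one image part.
PARTS = [
  { type: "text", text: "What is in this image?" },
  { type: "image_url", image_url: { url: "https://example.com/cat.png" } }
].freeze

RESULT = wrap_without_type_check({ role: "user", content: PARTS }, cache_at: 10)

# The whole parts array is now nested as the text of a single block, so a
# later multimodal translation step no longer sees a top-level image_url part.
puts RESULT[:content].length
puts RESULT[:content].first[:text].equal?(PARTS)
puts RESULT[:content].any? { |part| part[:type] == "image_url" }
```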

@sumleo (Author) replied:

Good catch - fixed in 805d8b9. The wrapping is now limited to plain String content, so multimodal arrays (image_url parts) pass through to MultimodalContentAdapter untouched. Added a spec asserting an array-valued Claude 4 message is left as-is.
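
For anyone skimming the thread, the String-only behavior can be sketched roughly as below. This is a standalone illustration under stated assumptions, not the code in 805d8b9: `wrap_for_cache` is a hypothetical name, the message hashes are simplified, and the example URL is a placeholder.

```ruby
# Hypothetical standalone sketch (not the Raix implementation).
def wrap_for_cache(msg, model:, cache_at:)
  content = msg[:content]

  # Only Anthropic Claude models with cache_at set are candidates.
  return msg unless model.to_s.include?("anthropic/claude") && cache_at
  # Arrays (multimodal parts) and short strings pass through unchanged.
  return msg unless content.is_a?(String) && content.length > cache_at.to_i

  msg.merge(content: [{ type: "text", text: content, cache_control: { type: "ephemeral" } }])
end

long_text = { role: "user", content: "x" * 50 }
multimodal = {
  role: "user",
  content: [
    { type: "text", text: "describe this image " * 5 },
    { type: "image_url", image_url: { url: "https://example.com/cat.png" } }
  ]
}

wrapped = wrap_for_cache(long_text, model: "anthropic/claude-sonnet-4", cache_at: 10)
untouched = wrap_for_cache(multimodal, model: "anthropic/claude-sonnet-4", cache_at: 10)

puts wrapped[:content].first[:cache_control].inspect
puts untouched.equal?(multimodal)
```

Checking `is_a?(String)` before comparing lengths keeps the multimodal array intact for whatever translation step runs afterwards, at the cost of not caching array-valued content at all.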

@sumleo (Author) commented Jun 18, 2026

Hi @obie, gentle nudge on this when you have a moment. It's a small, self-contained prompt-caching fix, and I'm happy to rebase or tweak anything if that would make review easier. Thanks for the project and your time!
