feat: inject anthropic prompt cache control #65

Merged
shudonglin merged 4 commits into main from feat/anthropic-cache-control-injection on Jun 27, 2026
Conversation

@shudonglin

⚠️ PR Notice

Important

  • This description was written and checked for this PR. The code changes were AI-assisted and reviewed locally before submission.

📝 Description

Adds gateway-side Anthropic prompt cache control injection for Claude relay requests.

The relay now adds a top-level cache_control field only when the client request does not already define Anthropic cache control. The default TTL comes from ANTHROPIC_PROMPT_CACHE_TTL, and callers can override it per request with the x-anthropic-prompt-cache-ttl header. The value auto selects a 1h TTL for evaluation, benchmark, batch, pipeline, and long-running workloads, and 5m otherwise.
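
The rule above can be illustrated with a minimal sketch. It is not the code from this PR: the function names resolveTTL, hasCacheControl, and injectCacheControl are hypothetical, the longRunning flag stands in for however the relay classifies a workload, and treating an empty TTL as "do not inject" and reading ANTHROPIC_PROMPT_CACHE_TTL from the environment are both assumptions. Only the setting name, the header name, and the {"type":"ephemeral","ttl":...} shape come from this PR's description and comments.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// resolveTTL returns the effective TTL. The per-request override wins over the
// configured default; "auto" maps to "1h" for long-running workloads, else "5m".
// An empty result means "do not inject" (an assumption of this sketch).
func resolveTTL(override, configured string, longRunning bool) string {
	ttl := strings.TrimSpace(override)
	if ttl == "" {
		ttl = strings.TrimSpace(configured)
	}
	if ttl == "auto" {
		if longRunning {
			return "1h"
		}
		return "5m"
	}
	return ttl
}

// hasCacheControl reports whether a decoded JSON value contains a
// "cache_control" key at any depth. Checking every depth is an assumption;
// the real relay may only inspect specific fields.
func hasCacheControl(v any) bool {
	switch t := v.(type) {
	case map[string]any:
		if _, ok := t["cache_control"]; ok {
			return true
		}
		for _, child := range t {
			if hasCacheControl(child) {
				return true
			}
		}
	case []any:
		for _, child := range t {
			if hasCacheControl(child) {
				return true
			}
		}
	}
	return false
}

// injectCacheControl adds a top-level cache_control object unless the client
// already supplied one. The body is returned untouched when nothing is injected.
func injectCacheControl(body []byte, ttl string) ([]byte, error) {
	if ttl == "" {
		return body, nil
	}
	var req map[string]any
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	if hasCacheControl(req) {
		return body, nil
	}
	req["cache_control"] = map[string]any{"type": "ephemeral", "ttl": ttl}
	return json.Marshal(req)
}

func main() {
	// Assumes ANTHROPIC_PROMPT_CACHE_TTL is read from the environment.
	// "claude-example" is a placeholder model name.
	ttl := resolveTTL("auto", os.Getenv("ANTHROPIC_PROMPT_CACHE_TTL"), true)
	out, err := injectCacheControl([]byte(`{"model":"claude-example","messages":[]}`), ttl)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Returning the original bytes when the client already set cache control keeps the request byte-for-byte identical in that case, which is one simple way to guarantee client-supplied settings are preserved.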

Gateway-only cache policy headers are stripped before upstream conversion so they do not leak to providers.
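
As a rough illustration of the stripping step, the sketch below copies a header set while dropping gateway-only names. The helper stripGatewayHeaders and the gatewayOnlyHeaders set are hypothetical; the only header name taken from this PR is x-anthropic-prompt-cache-ttl, and the actual list in the relay may be longer.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// gatewayOnlyHeaders lists headers that configure the gateway and must not be
// forwarded. Only the TTL override header is known from the PR description.
var gatewayOnlyHeaders = map[string]bool{
	"x-anthropic-prompt-cache-ttl": true,
}

// stripGatewayHeaders returns a copy of in without gateway-only headers.
// Names are compared case-insensitively, so non-canonical keys are caught too.
func stripGatewayHeaders(in http.Header) http.Header {
	out := make(http.Header, len(in))
	for name, values := range in {
		if gatewayOnlyHeaders[strings.ToLower(name)] {
			continue
		}
		out[name] = append([]string(nil), values...)
	}
	return out
}

func main() {
	in := http.Header{}
	in.Set("x-anthropic-prompt-cache-ttl", "1h")
	in.Set("anthropic-version", "2023-06-01")
	fmt.Println(stripGatewayHeaders(in))
}
```

Building a filtered copy rather than deleting in place leaves the incoming request headers intact for any later gateway logic that still needs to read the policy.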

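🚀 Type of change

  • 🐛 Bug fix - please link the corresponding Issue, and avoid classifying design trade-offs, misunderstandings, or mismatched expectations as bugs
  • ✨ New feature - for major features, discuss in an Issue first
  • ⚡ Performance optimization / Refactor
  • 📝 Documentation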

🔗 Related Issue

  • Closes # (none)

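✅ Checklist

  • Human-confirmed: I organized and wrote this description myself and did not paste unprocessed AI output.
  • Not a duplicate: I searched existing Issues and PRs and confirmed this is not a duplicate submission.
  • Bug fix note: If this PR is marked as a Bug fix, I have filed or linked the corresponding Issue, and I am not classifying design trade-offs, mismatched expectations, or misunderstandings as bugs.
  • Understanding of changes: I understand how these changes work and what they may affect.
  • Focused scope: This PR contains no code changes unrelated to the current task.
  • Local verification: Tests or manual verification were run and passed locally, so maintainers can reproduce the result.
  • Security and compliance: The code contains no sensitive credentials and follows the project's coding standards.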

📸 Proof of Work

Local verification passed:

go test ./relay ./relay/channel ./relay/channel/claude -run 'TestApplyAnthropicPromptCacheControl|TestProcessHeaderOverride_PassthroughSkipsAnthropicPromptCachePolicyHeaders|TestConvertOpenAIRequest.*PromptCacheControl'

@shudonglin shudonglin marked this pull request as ready for review June 27, 2026 16:57
@shudonglin

Author

E2E cache validation

Ran on 2026-06-28 from branch feat/anthropic-cache-control-injection:

go test -count=1 ./relay ./relay/channel ./relay/channel/claude -run 'TestApplyAnthropicPromptCacheControl|TestProcessHeaderOverride_PassthroughSkipsAnthropicPromptCachePolicyHeaders|TestConvertOpenAIRequest.*PromptCacheControl|TestClaudeAdaptorE2EInjectsPromptCacheControlAndForwardsUsage'

Result: passed (ok github.com/QuantumNous/new-api/relay, ok github.com/QuantumNous/new-api/relay/channel, ok github.com/QuantumNous/new-api/relay/channel/claude).

The E2E test uses a mock Anthropic upstream, so it validates gateway request/response behavior without spending real Anthropic tokens. It verifies top-level cache_control: {"type":"ephemeral","ttl":"1h"} is sent for long/eval workloads, gateway-only cache policy headers are stripped before upstream, client-supplied cache control is preserved, and Anthropic cache usage fields flow back through response handling.
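
For readers unfamiliar with the mock-upstream pattern, here is a small sketch using net/http/httptest. It is not the TestClaudeAdaptorE2EInjectsPromptCacheControlAndForwardsUsage test itself: the roundTrip helper is hypothetical, the client posts directly to the mock instead of going through the gateway, and the usage numbers are made-up fixture values. The usage field names follow Anthropic's cache_creation_input_tokens and cache_read_input_tokens.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// capture holds the part of the upstream request the mock inspects.
type capture struct {
	CacheControl map[string]string `json:"cache_control"`
}

// usageReply models only the cache usage fields of the mock response.
type usageReply struct {
	Usage struct {
		CacheCreationInputTokens int `json:"cache_creation_input_tokens"`
		CacheReadInputTokens     int `json:"cache_read_input_tokens"`
	} `json:"usage"`
}

// roundTrip sends body to a throwaway mock upstream and returns the TTL the
// mock saw plus the cache-read token count the mock reported back.
func roundTrip(body string) (string, int, error) {
	seen := make(chan capture, 1)
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var c capture
		_ = json.NewDecoder(r.Body).Decode(&c)
		seen <- c
		w.Header().Set("Content-Type", "application/json")
		// Fixture values, not real Anthropic measurements.
		fmt.Fprint(w, `{"usage":{"cache_creation_input_tokens":0,"cache_read_input_tokens":128}}`)
	}))
	defer upstream.Close()

	resp, err := http.Post(upstream.URL+"/v1/messages", "application/json", strings.NewReader(body))
	if err != nil {
		return "", 0, err
	}
	defer resp.Body.Close()

	var reply usageReply
	if err := json.NewDecoder(resp.Body).Decode(&reply); err != nil {
		return "", 0, err
	}
	c := <-seen
	return c.CacheControl["ttl"], reply.Usage.CacheReadInputTokens, nil
}

func main() {
	ttl, cacheRead, err := roundTrip(`{"cache_control":{"type":"ephemeral","ttl":"1h"},"messages":[]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ttl, cacheRead)
}
```

Because the mock runs on a loopback listener, both directions can be asserted in one test: what the gateway sent upstream and what usage data came back.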

@shudonglin shudonglin merged commit eaaa70f into main Jun 27, 2026
4 checks passed
@shudonglin shudonglin deleted the feat/anthropic-cache-control-injection branch June 27, 2026 23:29