feat: recognize provider cost data from usage.cost_details block #439

Merged: mcowger merged 1 commit into main from cost-devpass on May 19, 2026
@mcowger mcowger commented May 19, 2026

Summary

Some providers include detailed cost breakdowns directly in the response usage object rather than in SSE `: cost` comment lines. This PR adds support for recognizing and using that cost data when it is available.

The new format

Providers may return a usage block like:

"usage": {
  "prompt_tokens": 23,
  "total_tokens": 66,
  "completion_tokens": 43,
  "cost": 0.00017465,
  "cost_details": {
    "total_cost": 0.00017465,
    "input_cost": 0.00002415,
    "output_cost": 0.0001505,
    "cached_input_cost": 0,
    "cache_write_input_cost": 0,
    "upstream_inference_cost": 0.00017465,
    "request_cost": 0,
    "web_search_cost": 0,
    "data_storage_cost": 0.00000106
  }
}

When cost_details is present, we use the provider's actual per-bucket breakdown (input_cost, output_cost, cached_input_cost, cache_write_input_cost) directly instead of proportionally distributing a total.
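
As an illustration of the defensive extraction described above, here is a minimal sketch. It is not the code from utils/usage-normalizer.ts: the name extractUsageCostDetailsSketch, the UsageCostDetails shape, and the helper functions are hypothetical, and the fallback order (total_cost, then usage.cost, then usage.estimated_cost) is an assumption based on the test plan.

```typescript
// Hypothetical result shape; the real return type may differ.
interface UsageCostDetails {
  totalCost: number;
  inputCost: number | null;
  outputCost: number | null;
  cachedInputCost: number | null;
  cacheWriteInputCost: number | null;
}

// Treat only finite, non-negative numbers as valid costs.
function toCost(value: unknown): number | null {
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : null;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns null when usage.cost_details is absent or unusable, so providers
// that do not send this block are unaffected.
function extractUsageCostDetailsSketch(usage: unknown): UsageCostDetails | null {
  if (!isRecord(usage)) return null;
  const details = usage["cost_details"];
  if (!isRecord(details)) return null;

  // Assumed fallback order for the total; the real ordering is not shown here.
  const totalCost =
    toCost(details["total_cost"]) ??
    toCost(usage["cost"]) ??
    toCost(usage["estimated_cost"]);
  if (totalCost === null) return null;

  return {
    totalCost,
    inputCost: toCost(details["input_cost"]),
    outputCost: toCost(details["output_cost"]),
    cachedInputCost: toCost(details["cached_input_cost"]),
    cacheWriteInputCost: toCost(details["cache_write_input_cost"]),
  };
}
```

Returning null for every malformed case keeps the caller's check to a single null test before falling through to the existing cost calculation.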

Changes

  • utils/usage-normalizer.ts: Add extractUsageCostDetails(), which safely extracts cost_details from provider usage blocks and returns null when it is absent, so providers that don't use this format are unaffected. Also update normalizeOpenAIChatUsage() to extract cache_write_tokens from prompt_tokens_details (previously always 0).
  • utils/provider-cost.ts: Add applyUsageCostDetails() — applies per-bucket breakdown when available, falls back to proportional distribution otherwise.
  • services/inspectors/usage-logging.ts: Wire cost_details extraction into the streaming cost path (only applies if no SSE-reported cost was found).
  • services/response-handler.ts: Same for the non-streaming (unary) path.
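
The per-bucket versus proportional behavior listed for applyUsageCostDetails() can be sketched roughly as follows. This is an assumption-laden illustration rather than the merged code: the function name, both interfaces, and the choice to scale locally estimated bucket costs are hypothetical, since the PR text does not say what the proportional split is weighted by.

```typescript
// Hypothetical per-bucket cost record.
interface CostBuckets {
  input: number;
  output: number;
  cachedInput: number;
  cacheWriteInput: number;
}

// Hypothetical extracted provider data; null means "not reported".
interface ProviderCostDetails {
  totalCost: number;
  inputCost: number | null;
  outputCost: number | null;
  cachedInputCost: number | null;
  cacheWriteInputCost: number | null;
}

function applyUsageCostDetailsSketch(
  details: ProviderCostDetails,
  estimated: CostBuckets, // assumed: costs computed locally from token counts
): CostBuckets {
  const reported = [
    details.inputCost,
    details.outputCost,
    details.cachedInputCost,
    details.cacheWriteInputCost,
  ];

  // Preferred path: the provider gave at least one explicit bucket value.
  if (reported.some((value) => value !== null)) {
    return {
      input: details.inputCost ?? 0,
      output: details.outputCost ?? 0,
      cachedInput: details.cachedInputCost ?? 0,
      cacheWriteInput: details.cacheWriteInputCost ?? 0,
    };
  }

  // Fallback: spread the provider total in proportion to the local estimate.
  const sum =
    estimated.input +
    estimated.output +
    estimated.cachedInput +
    estimated.cacheWriteInput;
  if (sum <= 0) {
    // Arbitrary choice for this sketch: with no weights, book it all as input.
    return { input: details.totalCost, output: 0, cachedInput: 0, cacheWriteInput: 0 };
  }
  const scale = details.totalCost / sum;
  return {
    input: estimated.input * scale,
    output: estimated.output * scale,
    cachedInput: estimated.cachedInput * scale,
    cacheWriteInput: estimated.cacheWriteInput * scale,
  };
}
```

In the fallback path the buckets always sum to the provider's total, which is the property a proportional split is meant to preserve.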

Key design decisions

  • Fully optional/defensive: extractUsageCostDetails() returns null when usage.cost_details doesn't exist — providers that don't use this format are completely unaffected
  • SSE `: cost` comments take precedence: The !providerReportedCost guard ensures cost_details is applied only when no SSE-reported cost was found
  • Per-bucket breakdown preferred: When the provider gives explicit input_cost/output_cost/etc., we use those directly instead of proportional splitting
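
The precedence these decisions imply can be written as a small resolver. This is a sketch only: resolveCostSketch, CostSource, and ResolvedCost are hypothetical names, and the real code wires the same ordering through the !providerReportedCost guard rather than through a standalone function.

```typescript
type CostSource = "sse_comment" | "usage_cost_details" | "calculated";

interface ResolvedCost {
  source: CostSource;
  cost: number;
}

// null means "this source did not report a cost".
function resolveCostSketch(
  sseCost: number | null,
  costDetailsTotal: number | null,
  calculatedCost: number,
): ResolvedCost {
  // 1. A cost from an SSE `: cost` comment always wins.
  if (sseCost !== null) {
    return { source: "sse_comment", cost: sseCost };
  }
  // 2. Reached only when no provider-reported SSE cost exists,
  //    mirroring the `!providerReportedCost` guard.
  if (costDetailsTotal !== null) {
    return { source: "usage_cost_details", cost: costDetailsTotal };
  }
  // 3. Otherwise keep the locally calculated cost.
  return { source: "calculated", cost: calculatedCost };
}
```

Note that a reported cost of 0 still counts as reported here, because the checks compare against null rather than testing truthiness.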

Test plan

  • extractUsageCostDetails — extracts from the new format, falls back to usage.cost/usage.estimated_cost, returns null for missing/invalid data
  • applyUsageCostDetails — uses per-bucket breakdown, falls back to proportional, handles zero/null costs
  • normalizeOpenAIChatUsage — extracts cache_write_tokens from prompt_tokens_details
  • Precedence: SSE `: cost` comments > cost_details > calculated costs
  • All 1367 existing tests pass
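
For the normalizeOpenAIChatUsage item in the test plan, the cache-write extraction might look like the sketch below. The output shape and function name are hypothetical, and the assumption that the provider field inside prompt_tokens_details is literally named cache_write_tokens is inferred from the PR text, not confirmed by it. cached_tokens is the field OpenAI-style responses use for cache reads.

```typescript
// Hypothetical normalized shape; the real one in usage-normalizer.ts may differ.
interface NormalizedUsageSketch {
  inputTokens: number;
  outputTokens: number;
  cachedTokens: number;
  cacheWriteTokens: number;
}

// Missing or invalid counts become 0.
function tokenCount(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? Math.floor(value)
    : 0;
}

function normalizeOpenAIChatUsageSketch(
  usage: Record<string, unknown>,
): NormalizedUsageSketch {
  const raw = usage["prompt_tokens_details"];
  const details: Record<string, unknown> =
    typeof raw === "object" && raw !== null
      ? (raw as Record<string, unknown>)
      : {};

  return {
    inputTokens: tokenCount(usage["prompt_tokens"]),
    outputTokens: tokenCount(usage["completion_tokens"]),
    cachedTokens: tokenCount(details["cached_tokens"]),
    // Assumed field name; before this PR the value was always 0.
    cacheWriteTokens: tokenCount(details["cache_write_tokens"]),
  };
}
```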

Some providers include detailed cost breakdowns directly in the
response usage object (e.g. usage.cost_details.input_cost,
usage.cost_details.output_cost, etc.) rather than via SSE `: cost`
comment lines. When present, these per-bucket breakdowns are more
accurate than our proportional distribution from a total.

Changes:
- Add extractUsageCostDetails() to usage-normalizer for safe extraction
  of cost_details from provider usage blocks; returns null when absent
- Add applyUsageCostDetails() to provider-cost for applying the
  per-bucket breakdown when available, with proportional fallback
- Update normalizeOpenAIChatUsage() to extract cache_write_tokens from
  prompt_tokens_details (previously always 0)
- Wire cost_details extraction into both streaming (UsageInspector)
  and non-streaming (finalizeUsage) cost paths
- SSE `: cost` comments still take precedence over cost_details
- Comprehensive test coverage for all new functions and edge cases
@mcowger mcowger merged commit 3317d14 into main May 19, 2026
1 check passed
@mcowger mcowger deleted the cost-devpass branch May 19, 2026 05:11