Skip to content

feat: capture provider cost details with three-tier dispatch#443

Open
harrisony wants to merge 4 commits into
mcowger:mainfrom
harrisony:main
Open

feat: capture provider cost details with three-tier dispatch#443
harrisony wants to merge 4 commits into
mcowger:mainfrom
harrisony:main

Conversation

@harrisony
Copy link
Copy Markdown
Contributor

@harrisony harrisony commented May 20, 2026

Summary

  • fix: add upstream_inference_cost fallback for total cost so BYOK requests (where usage.cost=0) report the real spend
  • refactor: stop aliasing upstream_inference_prompt_costinput_cost; keep upstream fields separate since upstream_inference_prompt_cost = input_cost + cached_input_cost (aliasing silently zeroed out costCached)
  • feat: add three-tier cost dispatch in applyUsageCostDetails — superset (per-bucket gateway fields) → normal (upstream prompt/completions split with cache ratio) → minimal (proportional fallback)
  • test: expand coverage with edge cases — zero-cost non-BYOK, OpenRouter markup (cost >> upstream sum), zero prompt tokens, heavy-cache-hit ratio split, end-to-end BYOK and non-BYOK flows

Test plan

  • All 1378 backend tests pass (bun run test:force-all)
  • Biome lint clean (bunx biome lint .)
  • Each commit independently passes lint + tests (verified via git rebase --exec)

🤖 Generated with Claude Code

harrisony and others added 4 commits May 19, 2026 21:54
…r total cost

The previous implementation used a single safeCost() call wrapping a ??
chain: safeCost(details.total_cost ?? usage?.cost ?? usage?.estimated_cost).
This had two problems:

- upstream_inference_cost was not considered at all
- The entire ?? chain was validated as one, so safeCost() could not
  distinguish which source provided the value

Restructure into an explicit three-step fallback, each validated
independently:
1. cost_details.total_cost (LLM gateway, most detailed)
2. usage.cost (direct cost for OpenRouter-style providers)
3. cost_details.upstream_inference_cost (upstream OpenRouter-style cost)

Steps 2→3 use || (falsy coalescing) to handle the OpenRouter quirk where
usage.cost and/or upstream_inference_cost may be populated depending on
the upstream provider and key type used (e.g. BYOK keys report 0 for
usage.cost, but upstream_inference_cost carries the actual provider cost).
… from standard fields

Previously upstream_inference_prompt_cost was aliased directly into
input_cost. However, these fields have different semantics:

  upstream_inference_prompt_cost = input_cost + cached_input_cost
  (i.e., the combined prompt cost including cached tokens)

Aliasing it into input_cost caused applyUsageCostDetails to zero out
costCached, silently merging the cached portion into costInput.

Changes:
- Stop aliasing upstream_inference_prompt_cost → input_cost and
  upstream_inference_completions_cost → output_cost
- Add same-tier aliasing: upstream_inference_input_cost →
  upstream_inference_prompt_cost and upstream_inference_output_cost →
  upstream_inference_completions_cost (same semantics, different
  provider naming conventions)
- Update extraction tests to assert upstream fields stay separate
- Add tests for BYOK fallback, Responses API input/output variants,
  and LLM Gateway field priority
Previously applyUsageCostDetails only had two branches: superset
(per-bucket breakdown) and minimal (proportional distribution).
Now that extractUsageCostDetails no longer aliases normal-tier
upstream_inference_prompt_cost into input_cost, the normal-tier case
needs explicit handling.

Three tiers:
1. Superset: input_cost/cached_input_cost/cache_write_input_cost present
   → use per-bucket breakdown directly
2. Normal: upstream_inference_prompt_cost/completions_cost present but
   no input-side superset fields → use upstream prompt/completions split,
   then distribute the prompt portion by Plexus's own cache ratio
3. Minimal: no breakdown at all → proportional distribution from
   previously calculated costs
Add tests for: zero-cost non-BYOK requests, OpenRouter markup (cost >> upstream
sum), zero prompt tokens, heavy-cache-hit ratio split, end-to-end BYOK and
non-BYOK extract+apply flows. Also normalise scientific notation literals and
add missing upstream_inference_* fields to existing superset fixtures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant