Summary
ccx web currently shows session-level aggregate token counts (Input / Output / Cache Read / Cache Write) in the info panel. That's not enough anymore. Users burning through Claude Code quota want to know which turn cost them, not "the whole conversation used 1.2M tokens."
Ship per-turn usage, sidechain-aware cache accounting, and cost attribution inside the web UI — and do it better than ccusage does from the CLI.
Why now
Quota is easier to burn in Claude Code than it used to be. Users are asking "where did my $40 go" and have no way to see it inside a conversation.
ccx already parses the full tree per-message. The data is there — we just don't surface it.
ccusage has proven the demand but is CLI-only, has no per-turn/per-tool granularity, and has accuracy bugs around sidechains and streaming partials (#913, #938). We can leapfrog.
Current state
internal/parser/types.go:63-78 — SessionStats holds only aggregate InputTokens, OutputTokens, CacheReadTokens, CacheCreateTokens. No per-message usage stored.
internal/parser/session.go:42-52 — full parser accumulates token stats but discards the per-message value after aggregation.
internal/parser/session.go:424-476 — quick parser does the same aggregate-only walk.
internal/web/templates.go:470-488 — info panel renders session-level totals only.
internal/web/templates.go:236-240 — session list shows totals per session.
Cost calculation was explicitly removed in a prior pass (CLAUDE.md line 196: "We removed cost estimation and lines-changed tracking"). Time to put it back — better this time.
The raw usage field is in every assistant message: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens. We already unmarshal it. We just throw it away.
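As a minimal sketch of what keeping that value could look like, the Go snippet below decodes the four usage counters from a single JSONL record and returns them instead of folding them into a total. The envelope shape (a top-level type plus a nested message carrying model and usage) and the names Usage, line and ParseUsage are assumptions made for illustration; they are not ccx's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage holds the four token counters named in the text above.
type Usage struct {
	InputTokens              int `json:"input_tokens"`
	OutputTokens             int `json:"output_tokens"`
	CacheCreationInputTokens int `json:"cache_creation_input_tokens"`
	CacheReadInputTokens     int `json:"cache_read_input_tokens"`
}

// line is an ASSUMED envelope for one JSONL record; only the fields
// this sketch needs are declared.
type line struct {
	Type    string `json:"type"`
	Message struct {
		Model string `json:"model"`
		Usage *Usage `json:"usage"`
	} `json:"message"`
}

// ParseUsage returns the usage block and model of an assistant record.
// It returns a nil *Usage when the record is not an assistant message
// or carries no usage field.
func ParseUsage(raw []byte) (*Usage, string, error) {
	var l line
	if err := json.Unmarshal(raw, &l); err != nil {
		return nil, "", err
	}
	if l.Type != "assistant" || l.Message.Usage == nil {
		return nil, "", nil
	}
	return l.Message.Usage, l.Message.Model, nil
}

func main() {
	// Hypothetical record with a placeholder model name.
	raw := []byte(`{"type":"assistant","message":{"model":"example-model","usage":{"input_tokens":12,"output_tokens":340,"cache_creation_input_tokens":2000,"cache_read_input_tokens":15000}}}`)
	u, model, err := ParseUsage(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %+v\n", model, *u)
}
```

Returning a pointer lets the caller distinguish "no usage on this record" from "usage with all-zero counters", which matters once values are stored per message rather than summed.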
What ccusage does (and where it falls short)
[...] (--breakdown)
[...] (--instances)
Cost model: LiteLLMPricingFetcher (runtime fetch or --offline snapshot). Cached reads billed at the discounted rate. Not hardcoded.
Accuracy bugs ccx can avoid by construction:
/btw asides double-count cache_read because ccusage doesn't know what a sidechain is
[...] output_tokens, undercounting ~2.7×
gpt-5.4-mini mispriced as gpt-5 via fuzzy match → 5× overcharge on Codex
[...] /usage
All four bug classes come from blind JSONL summation without tree semantics. ccx already has the tree model and per-message parser, so we can be accurate on day one.
What we ship
Data layer
MessageUsage to persist per-message: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens, model, timestamp
Store per-message usage on the Message tree node (not just session-level aggregate)
Add TurnStats: aggregation per user turn (user message + all assistant responses/tool calls until the next user message)
Sidechain-aware accounting: when a sidechain replays cache_read from parent context, do NOT double-count. Subtract overlap via tree structure.
Embedded pricing table: Go map keyed by model name with {input, output, cache_create, cache_read} USD-per-million. Versioned, updatable, always-embedded, no runtime fetch. Start with Claude 3.5 Sonnet / 4 Sonnet / 4 Opus / Haiku 4.5.
Flag Codex schema divergence up front; don't conflate until a second pass lands Codex pricing.
Web UI
Info panel: new "Per-turn breakdown" expandable section
[...] (u key or toolbar button)
CLI
ccx usage command mirroring ccusage's axes (daily/session/project/model) on top of ccx's tree-aware model
--format json|table
--breakdown turn: the thing ccusage doesn't have
Docs
[...]
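The Data layer items above (per-message usage, per-turn aggregation, an embedded pricing map) can be sketched roughly as follows. Everything here is illustrative: Message is a flattened stand-in for the real tree node, the type and function names are this sketch's own, and the prices are placeholder numbers rather than real rates. Sidechain overlap is not handled. The sketch also assumes every user-role record opens a new turn; if tool results are stored as user-role records, they would need to be filtered out first.

```go
package main

import "fmt"

// Usage is the per-message token record (field names are this sketch's own).
type Usage struct {
	Input, Output, CacheCreate, CacheRead int
}

// Message is a HYPOTHETICAL flattened stand-in for ccx's tree node.
type Message struct {
	Role  string // "user" or "assistant"
	Model string
	Usage *Usage // nil when the message carries no usage
}

// Price is USD per million tokens for one model.
type Price struct {
	Input, Output, CacheCreate, CacheRead float64
}

// pricing holds PLACEHOLDER numbers, not real prices.
var pricing = map[string]Price{
	"example-model": {Input: 2, Output: 8, CacheCreate: 4, CacheRead: 1},
}

// TurnStats sums usage and cost from one user message up to the next.
type TurnStats struct {
	Usage    Usage
	CostUSD  float64
	Unpriced int // assistant messages whose model has no table entry
}

// Cost prices one usage record. The lookup is exact-match only, so an
// unknown model is reported as unpriced instead of being fuzzy-matched.
func Cost(model string, u Usage) (float64, bool) {
	p, ok := pricing[model]
	if !ok {
		return 0, false
	}
	usd := (float64(u.Input)*p.Input + float64(u.Output)*p.Output +
		float64(u.CacheCreate)*p.CacheCreate + float64(u.CacheRead)*p.CacheRead) / 1e6
	return usd, true
}

// GroupTurns walks messages in order and opens a new turn at each user message.
func GroupTurns(msgs []Message) []TurnStats {
	var turns []TurnStats
	for _, m := range msgs {
		if m.Role == "user" {
			turns = append(turns, TurnStats{})
			continue
		}
		if m.Usage == nil || len(turns) == 0 {
			continue // no usage, or usage before the first user message
		}
		t := &turns[len(turns)-1]
		t.Usage.Input += m.Usage.Input
		t.Usage.Output += m.Usage.Output
		t.Usage.CacheCreate += m.Usage.CacheCreate
		t.Usage.CacheRead += m.Usage.CacheRead
		if usd, ok := Cost(m.Model, *m.Usage); ok {
			t.CostUSD += usd
		} else {
			t.Unpriced++
		}
	}
	return turns
}

func main() {
	msgs := []Message{
		{Role: "user"},
		{Role: "assistant", Model: "example-model", Usage: &Usage{Input: 1000, Output: 500, CacheRead: 20000}},
		{Role: "user"},
		{Role: "assistant", Model: "example-model", Usage: &Usage{Input: 200, Output: 100}},
	}
	for i, t := range GroupTurns(msgs) {
		fmt.Printf("turn %d: %+v\n", i+1, t)
	}
}
```

Pricing each message with its own model before summing means a turn that mixes models is still attributed correctly, and counting unpriced messages separately keeps a missing table entry visible rather than silently costing zero.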
Acceptance criteria
[...] sum(turn costs) within rounding; does NOT double-count cache_read
[...] /usage for the same session within 5%; any gap documented under "Known deltas"
[...] --no-network
Out of scope (explicit)
[...] (ccx usage blocks later)
Priority
P0 — crown jewel of v0.next. This is the one users will thank us for.
Prior art: https://github.com/ryoppippi/ccusage