fix(token-meter): dedup repeated msg_id — token accounting was ~2x inflated #165

Merged: azalio merged 4 commits into main from fix/token-meter-msgid-dedup on Jun 9, 2026

azalio (Owner) commented Jun 9, 2026

Problem

MAP's token_accounting.json / est_cost_usd over-reported cost by ~2.1–2.4×.

Claude Code writes a single assistant turn as several JSONL lines (one per content / tool_use block), all sharing the same message.id and the same cumulative usage. _iter_new_usage deduped new turns only against the persisted seen_ids, so every repeated line in a read window was logged as a separate token event.
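The failure mode can be reproduced in a few lines. This is a minimal sketch, not the project's code: the JSONL layout (`message.id`, `message.usage.input_tokens`, `message.usage.output_tokens`) and the function names `naive_total` and `per_message_total` are assumptions made for illustration.

```python
import json

# Hypothetical transcript: one assistant turn written as three JSONL lines
# (one per content block), each carrying the same message.id and usage.
TRANSCRIPT = "\n".join(
    json.dumps(
        {
            "type": "assistant",
            "message": {
                "id": "msg_A",
                "usage": {"input_tokens": 100, "output_tokens": 50},
            },
        }
    )
    for _ in range(3)
)


def naive_total(jsonl_text: str) -> int:
    """Pre-fix behaviour (sketch): every line counts as its own token event."""
    total = 0
    for line in jsonl_text.splitlines():
        usage = json.loads(line)["message"]["usage"]
        total += usage["input_tokens"] + usage["output_tokens"]
    return total


def per_message_total(jsonl_text: str) -> int:
    """Count each message.id once, keeping the largest total seen for it."""
    best = {}
    for line in jsonl_text.splitlines():
        message = json.loads(line)["message"]
        usage = message["usage"]
        tokens = usage["input_tokens"] + usage["output_tokens"]
        best[message["id"]] = max(best.get(message["id"], 0), tokens)
    return sum(best.values())


print(naive_total(TRANSCRIPT))        # 450
print(per_message_total(TRANSCRIPT))  # 150
```

With three lines per turn the naive sum is 3× the real figure. Real sessions averaged roughly two lines per turn, which matches the row-to-distinct-id ratios reported here.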

Confirmed on real sessions:

  • ai-sre .map/rca: token_log 2822 rows / 1409 distinct msg_id → reported $993.43 vs $477.80 fixed.
  • gecko-ristra: token_log 1260 rows / 613 distinct msg_id → reported $868.27 vs $422.76 fixed.

Independent ground-truth from the raw Claude Code transcript (keep-max per msg_id) corroborates the corrected figures.

Fix

  1. Write path (_iter_new_usage): dedup new_usages by msg_id within the read window, keeping the copy with the most total tokens (the figure the API bills) when a streaming partial and the final line disagree. seen_ids remains the cross-call safety net.
  2. Rollup path (_rebuild_token_accounting): dedup by msg_id at rollup too, so logs already written by the pre-fix runner self-heal on the next rebuild. event_count now reports distinct turns.
  3. Extracts _coerce_token_int helper (shared with _usage_token_total).
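The three changes above can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: the row shape (`msg_id`, `usage`), the token field names, and the function names `coerce_token_int`, `usage_token_total`, and `dedup_by_msg_id` are hypothetical stand-ins for the private helpers named in the list. Dropping rows that have no `msg_id` is also a choice of this sketch.

```python
from typing import Any, Iterable

TOKEN_FIELDS = ("input_tokens", "output_tokens")  # assumed field names


def coerce_token_int(value: Any) -> int:
    """Best-effort conversion of a usage field to a non-negative int."""
    try:
        return max(int(value), 0)
    except (TypeError, ValueError, OverflowError):
        return 0


def usage_token_total(usage: dict) -> int:
    """Total tokens for one usage record, tolerating missing or odd fields."""
    return sum(coerce_token_int(usage.get(field)) for field in TOKEN_FIELDS)


def dedup_by_msg_id(rows: Iterable[dict], seen_ids: frozenset = frozenset()) -> list:
    """Keep one row per msg_id: the copy with the most total tokens.

    Rows whose msg_id is already in seen_ids (persisted by an earlier call)
    are skipped. Rows without a msg_id are dropped in this sketch.
    """
    best = {}
    for row in rows:
        msg_id = row.get("msg_id")
        if msg_id is None or msg_id in seen_ids:
            continue
        current = best.get(msg_id)
        if current is None or (
            usage_token_total(row.get("usage", {}))
            > usage_token_total(current.get("usage", {}))
        ):
            best[msg_id] = row
    return list(best.values())


window = [
    {"msg_id": "m1", "usage": {"input_tokens": 100, "output_tokens": 5}},   # streaming partial
    {"msg_id": "m1", "usage": {"input_tokens": 100, "output_tokens": 50}},  # final line
    {"msg_id": "m1", "usage": {"input_tokens": 100, "output_tokens": 50}},  # repeated line
    {"msg_id": "m0", "usage": {"input_tokens": 10, "output_tokens": 1}},    # already persisted
]

# Write path: dedup within the read window, with seen_ids as the cross-call net.
new_rows = dedup_by_msg_id(window, seen_ids=frozenset({"m0"}))
print(len(new_rows), usage_token_total(new_rows[0]["usage"]))  # 1 150

# Rollup path: the same dedup over a whole log that already contains duplicates.
rolled = dedup_by_msg_id(window)
event_count = len(rolled)
print(event_count)  # 2
```

Using one dedup routine for both paths means a log written before the fix rolls up to the same totals as one written after it.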

Prices in MODEL_TOKEN_PRICES were already correct — this is purely a token-count dedup bug.

Tests

5 regression tests added (write-path repeat counted once, keep-most-complete-copy, rollup of a dup-containing log). Negative-proof: NEW=1 row vs OLD=3 rows on a 3×-repeated turn.

  • make render-templates + make check-render clean (propagated to all generated trees)
  • full suite: 2264 passed, lint 0/0/0

🤖 Generated with Claude Code

azalio and others added 4 commits June 6, 2026 09:08
Claude Code writes one assistant turn as several JSONL lines (one per
content/tool_use block), all sharing the same message.id and the same
cumulative usage. _iter_new_usage deduped new turns only against the
persisted seen_ids, so every repeated line in a read window was logged
as a separate token event — roughly doubling est_cost on real sessions
(observed: ai-sre token_log 2822 rows / 1409 distinct msg_id -> $993
reported vs ~$415 deduped; gecko-ristra 1260/613 -> $868 vs ~$422).

Dedup new_usages by msg_id within the read window, keeping the copy with
the most total tokens (the figure the API bills) when a streaming partial
and the final line disagree. seen_ids stays as the cross-call safety net.

Adds _usage_token_total helper + 3 regression tests (counted-once,
keep-most-complete-copy). Rendered to all generated trees.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

_rebuild_token_accounting summed every token_log row, so logs already
written by the pre-fix runner (one turn split across rows) still rolled
up to ~2x cost even after the write-time dedup. Dedup by msg_id at rollup
as well (keep the most-complete copy per turn); event_count now reports
distinct turns. Extracts _coerce_token_int helper (shared with
_usage_token_total). Adds 2 regression tests (write-path repeat, rollup
of a dup-containing log).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@azalio azalio merged commit 9cbb0a8 into main Jun 9, 2026
6 checks passed
@azalio azalio deleted the fix/token-meter-msgid-dedup branch June 9, 2026 20:01