fix(#44): retry grouping — hide startup errors from real turns by lis186 · Pull Request #61 · lis186/ccxray

lis186 · 2026-06-09T11:58:34Z

繁中摘要

Codex session 把啟動失敗的 502/429/499 retry（0.1 秒、無 output）跟真正的 API 對話混在同一個 turn list，使用者看到 66% failure rate 但實際 100% 成功。

這個 PR 在 client 端把 retry 從 real turn 分開：

Session card 顯示 2t 2r（2 real turns, 2 retries）而非 6t
Turn list 隱藏 retry card（drill-down 仍可查看）
全 retry session 顯示 No turns — 1 failed request (504) 而非空白沉默
Gap timing、compression detection、keyboard nav、cost efficiency 都跳過 retry

只改 client-side（2 個 production 檔 + 1 個測試檔），server 完全不動。Heuristic 在 27,219 筆真實 log 驗證 99.9% 準確。

Problem

Codex sessions mix startup 502/429/499 retries (0.1s, no output) with real API turns (85–249s, has output). Users see inflated failure rates and a cluttered turn list.

Expert consensus (Tufte/Norman/Charity Majors roundtable):

Tufte: proportional ink violation — 0.1s noise gets same visual weight as 249s real turn
Norman: conceptual model mismatch — user's "turn" ≠ system's "turn"
Charity Majors: alert fatigue — red !http badges lose meaning when 4/6 turns show them

Solution

Heuristic — isRetry = !isHttpStatusOk(status) && !(output_tokens > 0)

Validated against 27,219 real log entries with independent adversarial verification (6 auditors + 1 judge, 13 claims cross-checked). 99.9% accuracy; 26 edge cases deferred to Phase 2.

Session card — shows real turn count + retry badge:

BEFORE: gpt-5.5 · 6t          (4 of 6 are retries — misleading)
AFTER:  gpt-5.5 · 2t 2r       (2 real turns, 2 retries)

Turn list — retry entries hidden (still in allEntries for drill-down):

BEFORE                        AFTER
#1  ● !http   0.1s   —       #1  ○ 200   85s   $0.42
#2  ● !http   0.1s   —       #2  ○ 200  249s   $1.87
#3  ● !http   0.1s   —
#4  ○ 200     85s   $0.42    (2 retries accessible via session card badge)
#5  ● !http   0.1s   —
#6  ○ 200    249s   $1.87

Empty state — all-retry sessions explain instead of silence:

No turns — 3 failed requests (502 × 2, 429)

Downstream filters fixed:

Gap timing backward scan skips retries (no more 0.1s noise as gap baseline)
Compression detection backward scan skips retries (no false compaction from partial billing)
getVisibleTurnIndices excludes retries (keyboard nav skips hidden entries)
renderCostEfficiencyPanel filter aligned with sparkline (input_tokens > 0)

Files changed (3, client-side only — server untouched)

File	Change
`public/entry-rendering.js`	isRetry computation, retryCount counter, allEntries flag, gap timing + compression scan exclusion, early return to skip card rendering
`public/miller-columns.js`	Session card retry badge (`Nr`), getVisibleTurnIndices filter, cost efficiency filter alignment, `updateRetryEmptyState()` for all-retry sessions
`test/retry-grouping.test.js`	15 difference tests covering classification, counters, session card, keyboard nav, gap timing, compression, Claude regression, empty state

Verification

Layer	Result
Red→green TDD	12 tests written before implementation, confirmed red on old code, green after
Adversarial audit	6 independent auditors verified heuristic counts, blind spots, blast radius; 3/13 claims corrected
Full test suite	836 tests pass (0 fail), including all e2e
Browser smoke	Real Codex session `019e9225` (1t 1r) — retry hidden, badge visible
Claude regression	Session `ee89450f` (521t) — zero retries, completely unaffected
Keyboard nav	ArrowUp/Down confirmed to skip retry indices
Empty state	Session `019e929d` (0t 1r) — shows "No turns — 1 failed request (504)"

Known limitations (Phase 2)

15 entries with status=101 (WS upgrade shells + interrupted turns) bypass the heuristic — needs transport-aware classification
21 entries with status=200 + SSE error bypass — needs SSE event type detection
Specimen 2 entries (6 total, input_tokens > 0 but output_tokens = 0) leak through sparkline filter

Closes #44

🤖 Generated with Claude Code

Codex sessions mix 502/429/499 retries (0.1s, no output) with real API turns (85–249s, has output). Users see inflated failure rates and noisy turn lists. This separates retries from real turns client-side: - isRetry heuristic: !isHttpStatusOk(status) && !(output_tokens > 0) - Session card shows retry count badge (e.g. "4t 2r") - Turn list hides retry cards (still in allEntries for drill-down) - Gap timing and compression scans skip retries - Cost efficiency filter aligned with sparkline (input_tokens > 0) - Claude sessions completely unaffected (zero retry entries) Verified: 836 tests pass, browser smoke with real Codex sessions, keyboard nav confirmed to skip hidden retries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

lis186 merged commit 0c26e83 into main Jun 9, 2026
4 checks passed

lis186 deleted the fix/44-retry-grouping branch June 9, 2026 12:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(#44): retry grouping — hide startup errors from real turns#61

fix(#44): retry grouping — hide startup errors from real turns#61
lis186 merged 1 commit into
mainfrom
fix/44-retry-grouping

lis186 commented Jun 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lis186 commented Jun 9, 2026

繁中摘要

Problem

Solution

Files changed (3, client-side only — server untouched)

Verification

Known limitations (Phase 2)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant