feat: Rate-limit detection for OpenAI/Codex #47

## Problem

ccxray detects Claude rate-limit headers and shows them in the quota ticker, but it has no equivalent for Codex. Codex traffic is primarily WebSocket, so rate-limit detection likely needs `ws-proxy` frame parsing or WS upgrade response header extraction, not just the HTTP-path `ratelimit-log.js`.

## Scope

- Identify where OpenAI rate-limit info appears (WS upgrade response headers? per-frame metadata?)
- Extend rate-limit detection to cover WS transport
- Feed into quota-ticker UI

## Before / After UI

```
BEFORE:
┌─────────────────────────────────────────────────────────┐
│  ccxray dashboard — quota ticker                        │
│                                                         │
│  Claude (opus-4):                                       │
│  ████████████░░░░░░░░  60,000 / 80,000 tokens           │
│  resets 2026-06-05T10:00:00Z                            │
│  ✓ "Can parallelize — 75% remaining"                    │
│                                                         │
│  Codex (o3):                                            │
│  ░░░░░░░░░░░░░░░░░░░░  — / — tokens                    │
│  (no rate-limit info available)                         │
│  ⚠ No data                                             │
└─────────────────────────────────────────────────────────┘

AFTER:
┌─────────────────────────────────────────────────────────┐
│  ccxray dashboard — quota ticker                        │
│                                                         │
│  Claude (opus-4):                                       │
│  ████████████░░░░░░░░  60,000 / 80,000 tokens           │
│  resets 2026-06-05T10:00:00Z                            │
│  ✓ "Can parallelize — 75% remaining"                    │
│                                                         │
│  Codex (o3):                                            │
│  ██████████████░░░░░░  7,200 / 10,000 RPM              │
│  resets 2026-06-05T09:01:00Z                            │
│  ✓ "Healthy — 72% remaining"                           │
└─────────────────────────────────────────────────────────┘
```

## Architecture

### Detection surface by provider

**Claude (HTTP) — working today:**
```
Claude Code
  → POST /v1/messages
  → Anthropic API responds with HTTP headers:
      anthropic-ratelimit-tokens-limit: 80000
      anthropic-ratelimit-tokens-remaining: 60000
      anthropic-ratelimit-tokens-reset: 2026-06-05T10:00:00Z
  → server/ratelimit-log.js captures headers from proxyRes
  → SSE broadcast to dashboard
  → public/quota-ticker.js renders progress bar
```
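
For reference, the header-to-sample step in the flow above could look roughly like the sketch below. This is not the actual `ratelimit-log.js` code: `parseAnthropicTokenLimit` and the shape of the returned object are hypothetical, and `headers` is assumed to be a plain object with lower-cased names such as `proxyRes.headers`.

```javascript
// Hypothetical helper, not the real ratelimit-log.js implementation.
// `headers` is assumed to be a plain object with lower-cased header names,
// which is what Node's http module exposes on a response (proxyRes.headers).
function parseAnthropicTokenLimit(headers) {
  const limit = Number(headers["anthropic-ratelimit-tokens-limit"]);
  const remaining = Number(headers["anthropic-ratelimit-tokens-remaining"]);
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return null; // headers missing or malformed: nothing to sample
  }
  const resetMs = Date.parse(headers["anthropic-ratelimit-tokens-reset"]);
  return {
    provider: "anthropic",
    unit: "tokens",
    limit,
    remaining,
    // Keep the reset as an ISO timestamp, or null if it could not be parsed.
    resetAt: Number.isNaN(resetMs) ? null : new Date(resetMs).toISOString(),
    remainingPct: Math.round((remaining / limit) * 100),
  };
}
```

With the example headers listed above, this yields `remainingPct: 75`, which matches the ticker mockup.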

**Codex (WS) — needs investigation:**
```
Codex CLI
  → GET /v1/responses (Upgrade: websocket)
  → OpenAI API responds with 101 Switching Protocols
      ┌─────────────────────────────────────────────────┐
      │ WS upgrade response headers?                    │
      │   x-ratelimit-limit-requests: 10000             │
      │   x-ratelimit-remaining-requests: 7200          │
      │   x-ratelimit-reset-requests: 1s                │
      │   (unconfirmed — needs wire capture)             │
      └─────────────────────────────────────────────────┘
  → Per-frame metadata in WS messages?
      ┌─────────────────────────────────────────────────┐
      │ response.usage.rate_limit_info? (unconfirmed)   │
      │ response.done event metadata? (unconfirmed)     │
      └─────────────────────────────────────────────────┘
  → server/ws-proxy.js would need to extract from one or both
  → Feed into server/ratelimit-log.js (same capture/sample pattern)
  → public/quota-ticker.js renders (same UI, different labels)
```
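
If the 101 response does carry `x-ratelimit-*` headers (unconfirmed), the extraction could be sketched as below. The function names are hypothetical, and the duration format for the reset header (`1s`, `6m0s`) is assumed to match what OpenAI documents for its REST responses. Because the reset value is relative, it has to be converted to an absolute time before it can reuse the same sample shape as Claude.

```javascript
// Hypothetical helpers. Whether these headers appear on the WS upgrade
// response at all is still an open question in this issue.

// Parses durations such as "1s", "6m0s" or "120ms" into milliseconds.
// Returns null when the text is not entirely made of <number><unit> parts.
function parseDurationMs(text) {
  if (typeof text !== "string") return null;
  const unitMs = { h: 3600000, m: 60000, s: 1000, ms: 1 };
  const part = /(\d+(?:\.\d+)?)(ms|h|m|s)/g;
  let total = 0;
  let consumed = 0;
  for (const m of text.matchAll(part)) {
    if (m.index !== consumed) return null; // stray characters between parts
    total += Number(m[1]) * unitMs[m[2]];
    consumed = m.index + m[0].length;
  }
  if (consumed === 0 || consumed !== text.length) return null;
  return total;
}

// `headers` is assumed to be a lower-cased plain object; `nowMs` is
// injectable so the relative reset can be resolved deterministically.
function parseOpenAIRequestLimit(headers, nowMs = Date.now()) {
  const limit = Number(headers["x-ratelimit-limit-requests"]);
  const remaining = Number(headers["x-ratelimit-remaining-requests"]);
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return null;
  }
  const resetInMs = parseDurationMs(headers["x-ratelimit-reset-requests"]);
  return {
    provider: "openai",
    unit: "requests",
    limit,
    remaining,
    resetAt: resetInMs === null ? null : new Date(nowMs + resetInMs).toISOString(),
    remainingPct: Math.round((remaining / limit) * 100),
  };
}
```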

**Key question: WHERE does OpenAI expose rate-limit info for WebSocket connections?**
- WS upgrade (101) response headers?
- Per-response metadata in WS frames?
- Separate REST endpoint (e.g. `/v1/rate_limits`)?

**Files involved:**
| File | Role |
|------|------|
| `server/ratelimit-log.js` | Capture + sample rate-limit data (currently HTTP-only) |
| `server/ws-proxy.js` | WS transport proxy — extraction point for Codex |
| `public/quota-ticker.js` | Dashboard UI rendering |
| `server/config.js` | `UPSTREAM_PROFILES` — could gain `rateLimitSource` field |

## Value

### For users
- Know when approaching Codex rate limits before hitting them
- Pace adjustment recommendations ("Can parallelize" / "Slow down") work for Codex too
- Unified rate-limit visibility across both providers in one dashboard
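
A provider-agnostic pace label could be derived from `remaining / limit` alone, which would let the same recommendation logic serve token-based and request-based limits. The sketch below is illustrative only: `paceLabel` is a hypothetical name, and the 75% / 25% thresholds are assumptions picked to agree with the mockup, not values taken from `quota-ticker.js`.

```javascript
// Thresholds are assumptions for illustration, not values from quota-ticker.js.
const PACE_THRESHOLDS = [
  { minPct: 75, label: "Can parallelize" },
  { minPct: 25, label: "Healthy" },
  { minPct: 0, label: "Slow down" },
];

// Works on any unit (tokens, requests) because it only looks at the ratio.
function paceLabel(remaining, limit) {
  if (!Number.isFinite(remaining) || !Number.isFinite(limit) || limit <= 0) {
    return "No data";
  }
  const pct = Math.round((Math.max(0, remaining) / limit) * 100);
  // Tiers are ordered from highest to lowest, so the first match wins.
  const tier = PACE_THRESHOLDS.find((t) => pct >= t.minPct);
  return `${tier.label} (${pct}% remaining)`;
}
```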

### For developers
- `UPSTREAM_PROFILES` could gain a `rateLimitSource` field per provider family
- `ratelimit-log.js` already has the capture/sample pattern — extend to WS frames
- Clean separation: detection (server) vs rendering (client) already exists
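
One possible shape for the `rateLimitSource` field is sketched below. The real `UPSTREAM_PROFILES` structure in `server/config.js` may look different; the profile keys, the `transport` field, the set of source values, and `rateLimitSourceFor` are all hypothetical.

```javascript
// Hypothetical shape: the real UPSTREAM_PROFILES in server/config.js may differ.
const UPSTREAM_PROFILES = {
  anthropic: { transport: "http", rateLimitSource: "http-headers" },
  // "unknown" until a wire capture shows where (or whether) OpenAI exposes limits.
  openai: { transport: "ws", rateLimitSource: "unknown" },
};

// Assumed set of extraction points, mirroring the candidates listed in this issue.
const RATE_LIMIT_SOURCES = new Set([
  "http-headers",
  "ws-upgrade-headers",
  "ws-frames",
  "unknown",
]);

// Returns the configured source, falling back to "unknown" for missing
// profiles or unrecognized values so callers can skip extraction safely.
function rateLimitSourceFor(profileName) {
  const profile = UPSTREAM_PROFILES[profileName];
  const source = profile ? profile.rateLimitSource : undefined;
  return RATE_LIMIT_SOURCES.has(source) ? source : "unknown";
}
```

Keeping an explicit `"unknown"` value would let `ws-proxy.js` ship without extraction logic and have it switched on per provider once the investigation settles.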

## Side Effects
- WS upgrade headers may not contain rate-limit info (needs wire capture investigation first)
- OpenAI rate-limit semantics may differ from Anthropic:
  - Per-minute (RPM/TPM) vs Anthropic's per-day with rolling window
  - Org-level vs project-level vs user-level limits
  - Separate limits for requests vs tokens vs images
- `quota-ticker.js` assumptions about token windows (5h rolling) may not apply to OpenAI
- ChatGPT-OAuth vs API-key Codex users may have different rate-limit visibility

## Open Questions
- Does Codex CLI itself surface rate-limit info anywhere (env var, stderr, config)?
- Is there a `/v1/rate_limits` endpoint for OpenAI API key users?
- For ChatGPT-OAuth Codex users, are rate limits even exposed via headers/frames?
- Do WS upgrade 101 responses carry the same `x-ratelimit-*` headers as regular REST responses?
- Should we do a wire capture (`CCXRAY_WS_DEBUG=1`) to observe actual headers/frames?
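
For the wire capture itself, a small scanner that flags anything rate-limit-looking may be more useful than guessing field names up front. The sketch below makes no assumption about OpenAI's schema: it walks whatever object it is given (upgrade response headers or a parsed WS frame) and reports keys matching `rate-limit`, `rate_limit`, or `ratelimit`. `findRateLimitKeys` and `scanFrame` are hypothetical names.

```javascript
// Collects every path in an object whose key looks rate-limit related.
// Works for header objects and for parsed JSON frames alike.
function findRateLimitKeys(value, path = "", hits = []) {
  if (value === null || typeof value !== "object") return hits;
  for (const [key, child] of Object.entries(value)) {
    const childPath = path ? `${path}.${key}` : key;
    if (/rate[-_]?limit/i.test(key)) {
      hits.push({ path: childPath, value: child });
    }
    findRateLimitKeys(child, childPath, hits); // keep descending into nested data
  }
  return hits;
}

// Scans one text frame; non-JSON frames are ignored rather than treated as errors.
function scanFrame(rawText) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return [];
  }
  return findRateLimitKeys(parsed);
}
```

Running both functions over a captured session and logging the non-empty results would answer the headers-versus-frames question directly.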
