59 changes: 59 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,65 @@ All notable changes to the Claude Code OpenAI Wrapper project will be documented
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.9.7] - 2026-05-12

### Added

- Active CLI-auth health probe. When `CLAUDE_AUTH_METHOD=claude_cli`,
the application lifespan schedules a periodic background coroutine
that runs the existing `claude_cli.verify_cli()` (a 1-turn
`query(prompt="Hello", max_turns=1)`) and updates a shared
`cli_health` state. This bounds the stale window between the bundled
CLI losing its session and a real chat request discovering it.
- Interval is configurable via `CLI_AUTH_PROBE_INTERVAL_SECONDS`
(default 600s / 10 min). Set to 0 to disable. Skipped entirely for
non-cli auth methods (API key / Bedrock / Vertex), which surface
upstream auth failures via the existing
`assistant_authentication_failed` -> 401 mapping.
- Probe results visible at `GET /v1/auth/status` under a new
`cli_health` block: `ok`, `last_probed_at`, `last_ok_at`,
`error_kind` (`auth_failure` | `unknown` | `null`),
`error_message`.
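
The scheduling described in this entry could look roughly like the sketch below. It is not the wrapper's actual lifespan code: `run_probe_loop`, its `probe` parameter, and the `stop` event are hypothetical names chosen for illustration, and the only behaviours taken from the changelog are the fixed interval and the "0 disables the probe" rule.

```python
import asyncio
from typing import Awaitable, Callable


async def run_probe_loop(
    probe: Callable[[], Awaitable[bool]],
    interval_seconds: float,
    stop: asyncio.Event,
) -> int:
    """Call `probe` once per interval until `stop` is set.

    Returns the number of probes that ran. An interval of 0 (or less)
    disables the loop, mirroring the documented "set to 0" rule.
    """
    if interval_seconds <= 0:
        return 0

    runs = 0
    while not stop.is_set():
        await probe()  # assumed never to raise, like probe_cli_auth()
        runs += 1
        try:
            # Sleep for one interval, but wake early on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass  # interval elapsed; probe again
    return runs
```

In a FastAPI lifespan handler, a loop of this kind would typically be started with `asyncio.create_task(...)` on startup and stopped by setting the event on shutdown.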

### Changed

- `POST /v1/chat/completions` and `POST /v1/messages` now return
**HTTP 401** with `error.type=authentication_error` and
`error.code=claude_cli_not_authenticated` when the latest CLI probe
failed, instead of letting the request fall through to a generic
502 from the SDK or 503 from the config check. OpenAI / Anthropic
client libraries route 401 as `AuthenticationError`, giving callers a
durable signal to roll keys or re-`/login` rather than retrying a
doomed request.
- `_build_sdk_error_response` (the
`ClaudeResultError.subtype=error_during_execution` path) now scans
`error_message` + `stderr_tail` for the same CLI-auth-failure markers
the probe uses (`not logged in`, `please run /login`,
`invalid api key`, `authentication_error`, `401`). On a match, the
response is a 401 with `error.type=authentication_error`, and
`cli_health` is marked failed so the next request fails fast without
an SDK round-trip.
- Auth-failure responses now bypass the global `http_exception_handler`
(which previously rewrote the body as `error.type=api_error`) by
returning `JSONResponse` directly. Required for OpenAI / Anthropic
clients to read the authentication signal.
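
The marker-based mapping described in these entries can be sketched as follows. This is a hedged illustration, not the body of `_build_sdk_error_response`: `map_sdk_error` is a hypothetical helper, the message strings are invented, and reusing `claude_cli_not_authenticated` as the error code on this path is an assumption (the changelog only names that code for the probe-failed gate). The 502 body shape is likewise assumed.

```python
from typing import Any, Dict, Tuple

# Marker list copied from the changelog entry; matching is a
# case-insensitive substring search.
AUTH_MARKERS = (
    "not logged in",
    "please run /login",
    "invalid api key",
    "authentication_error",
    "401",
)


def map_sdk_error(error_message: str, stderr_tail: str) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, body) for a failed SDK result (hypothetical helper)."""
    blob = f"{error_message or ''}\n{stderr_tail or ''}".lower()
    if any(marker in blob for marker in AUTH_MARKERS):
        return 401, {
            "error": {
                "type": "authentication_error",
                # Assumption: same code as the probe-failed gate.
                "code": "claude_cli_not_authenticated",
                # Illustrative wording only.
                "message": "Claude CLI is not authenticated",
            }
        }
    # Anything else keeps the generic upstream-failure status (assumed body).
    return 502, {"error": {"type": "api_error", "message": "Upstream SDK error"}}
```

Returning the status and body as plain data keeps the sketch free of framework imports; in the wrapper itself the changelog says the body is sent via `JSONResponse` directly so the global exception handler does not rewrite it.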

### Tests

- `tests/test_auth_unit.py::TestProbeCliAuth` - three async unit tests
covering `probe_cli_auth()`: success (`mark_ok`), `Not logged in`
stderr (`auth_failure`), generic exception (`unknown`).
- `tests/test_endpoints.py::TestChatCompletionsCliHealthGate` and
`tests/test_anthropic_messages.py::TestAnthropicMessagesCliHealthGate` -
in-process TestClient assertions that both endpoints return 401
with `authentication_error` when `cli_health.ok=False`.
- `tests/test_error_path_unit.py::TestCliAuthFailureToFourOhOne` -
four tests for the stderr-marker mapping: 401 on `Not logged in`,
401 on `Invalid API key`, 502 regression guard on `connection
refused`, and a seeding test confirming a real request flips
`cli_health.ok` to False.
- Suite total: 673 passed, 31 skipped (was 664/31 on v2.9.6; +9 new
tests).

## [2.9.6] - 2026-05-11

### Changed
8 changes: 5 additions & 3 deletions README.md
@@ -4,10 +4,11 @@ OpenAI API-compatible wrapper for Claude Code. Drop it in front of any OpenAI cl

## Version

**Current:** 2.9.6
**Current:** 2.9.7

Highlights of recent releases (full history in [CHANGELOG.md](./CHANGELOG.md)):

- **2.9.7** - Active Claude-CLI auth health probe (10-minute default, configurable via `CLI_AUTH_PROBE_INTERVAL_SECONDS`). `/v1/chat/completions` and `/v1/messages` now return **HTTP 401** with `error.type=authentication_error` when the bundled CLI loses its session, so OpenAI / Anthropic client libraries route the failure as `AuthenticationError` instead of a transient 502/503. `/v1/auth/status` exposes the new `cli_health` block. Defense-in-depth: `error_during_execution` results whose stderr matches `Not logged in / Please run /login / Invalid API key` also map to 401 and seed `cli_health` failed.
- **2.9.6** - `claude-agent-sdk` 0.1.68 -> 0.1.81. urllib3 floor raised to 2.7.0 and `python-multipart` to 0.0.27 to close three HIGH Dependabot alerts. Pulled in upstream `RichardAtCT#46` so `/v1/models` returns Anthropic's live catalogue when `ANTHROPIC_API_KEY` is set (cached, with a short error TTL so transient outages do not stick for an hour). `check-sdk-version.yml` now opens a draft bump PR on drift instead of writing only to the job summary.
- **2.9.x** (earlier) - CodeQL hardening: sanitised error responses (no more `str(e)` to clients), `filter_content` rewrite against polynomial ReDoS, `/v1/debug/request` gated behind `DEBUG_MODE`/`VERBOSE`, workflow permissions pinned. Image trimmed via `poetry install --only main` and a real `.dockerignore`.
- **2.8.x** - Security dep bumps, breaker defaults loosened, CLI stderr capture, structured-log state unmasked.
@@ -17,7 +18,7 @@ Highlights of recent releases (full history in [CHANGELOG.md](./CHANGELOG.md)):

## Status

Production ready. **664 tests passing (31 skipped)**. Streaming works. Sessions work. JSON mode works. Function calling works. Tools are off by default for speed - pass `enable_tools: true` to turn them on. Auth supports API key, Bedrock, Vertex AI, and CLI.
Production ready. **673 tests passing (31 skipped)**. Streaming works. Sessions work. JSON mode works. Function calling works. Tools are off by default for speed - pass `enable_tools: true` to turn them on. Auth supports API key, Bedrock, Vertex AI, and CLI.

## Quick Start

@@ -189,6 +190,7 @@ Listed in roughly the order you will reach for them.
| `ANTHROPIC_MODELS_URL` | Override the live models endpoint. Point at a proxy or staging URL during testing. | `https://api.anthropic.com/v1/models` |
| `ANTHROPIC_VERSION` | `anthropic-version` header sent to the Models API. | `2023-06-01` |
| `ANTHROPIC_BETA` / `ANTHROPIC_BETA_HEADER` | Optional `anthropic-beta` header forwarded to the Models API for beta-gated features. | - |
| `CLI_AUTH_PROBE_INTERVAL_SECONDS` | Background CLI-auth probe cadence when `CLAUDE_AUTH_METHOD=claude_cli`. Each probe is a 1-turn `query` (~$0.001 at Sonnet pricing); a failure flips `cli_health.ok` so `/v1/chat/completions` and `/v1/messages` return 401 instead of letting the SDK fail loudly. Set `0` to disable. Ignored for non-cli auth methods. | `600` (10 min) |
| `DEBUG_MODE` | Enable debug logging and unlock `/v1/debug/request` | `false` |
| `VERBOSE` | Same unlock effect on `/v1/debug/request` | `false` |
| `CORS_ORIGINS` | Allowed CORS origins (JSON array) | `["*"]` |
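
As a rough illustration of how the `CLI_AUTH_PROBE_INTERVAL_SECONDS` row might be interpreted at startup, the sketch below resolves the setting to either an interval or "do not probe". `probe_interval_seconds` is a hypothetical helper, not a function from the wrapper. Falling back to the default on unparseable input and treating negative values like `0` are assumptions; the table only documents the default, the `0` case, and the non-cli case.

```python
import os
from typing import Mapping, Optional

DEFAULT_INTERVAL_SECONDS = 600  # documented default (10 min)


def probe_interval_seconds(
    env: Mapping[str, str] = os.environ,
    auth_method: str = "claude_cli",
) -> Optional[int]:
    """Return the probe interval in seconds, or None when no probe should run."""
    if auth_method != "claude_cli":
        return None  # documented: ignored for non-cli auth methods
    raw = env.get("CLI_AUTH_PROBE_INTERVAL_SECONDS", "").strip()
    if not raw:
        return DEFAULT_INTERVAL_SECONDS
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unparseable input falls back to the default.
        return DEFAULT_INTERVAL_SECONDS
    # Documented: 0 disables. Assumption: negatives are treated the same way.
    return value if value > 0 else None
```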
@@ -451,7 +453,7 @@ With `json_object` mode, the wrapper adds system prompt instructions for JSON ou
## Testing

```bash
# Run the full test suite (664 tests, ~3 s on a laptop)
# Run the full test suite (673 tests, ~3 s on a laptop)
poetry run pytest tests/

# Quick endpoint test (server must be running)
```
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "claude-code-openai-wrapper"
version = "2.9.6"
version = "2.9.7"
description = "OpenAI API-compatible wrapper for Claude Code"
authors = ["Richard Atkinson <richardatk01@gmail.com>"]
readme = "README.md"
2 changes: 1 addition & 1 deletion src/__init__.py
@@ -1,3 +1,3 @@
"""Claude Code OpenAI Wrapper - A FastAPI-based OpenAI-compatible API for Claude Code."""

__version__ = "2.9.6"
__version__ = "2.9.7"
105 changes: 105 additions & 0 deletions src/auth.py
@@ -1,5 +1,7 @@
import os
import logging
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Dict, Any, Tuple
from fastapi import HTTPException, Request
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
@@ -284,3 +286,106 @@ def get_claude_code_auth_info() -> Dict[str, Any]:
        "status": auth_manager.auth_status,
        "environment_variables": list(auth_manager.get_claude_code_env_vars().keys()),
    }


# Markers the Claude CLI emits to stderr (or wraps in SDK exceptions) when its
# stored session is missing or expired. Compared case-insensitively against the
# concatenation of an exception's str() and any captured stderr_tail.
_CLI_AUTH_FAILURE_MARKERS = (
    "not logged in",
    "please run /login",
    "invalid api key",
    "authentication_error",
    "401",
)


def _classify_probe_error(blob: str) -> str:
    lowered = (blob or "").lower()
    if any(marker in lowered for marker in _CLI_AUTH_FAILURE_MARKERS):
        return "auth_failure"
    return "unknown"


@dataclass
class CliHealth:
    """Latest observed health of the Claude CLI auth path.

    The probe loop (run only when auth_method == 'claude_cli') refreshes this
    on an interval; the chat / messages handlers consult `ok` to short-circuit
    with HTTP 401 before round-tripping through the SDK.
    """

    ok: bool = True
    last_probed_at: Optional[datetime] = None
    last_ok_at: Optional[datetime] = None
    error_kind: Optional[str] = None
    error_message: Optional[str] = None

    def mark_ok(self) -> None:
        now = datetime.now(timezone.utc)
        self.ok = True
        self.last_probed_at = now
        self.last_ok_at = now
        self.error_kind = None
        self.error_message = None

    def mark_failed(self, kind: str, message: str) -> None:
        self.ok = False
        self.last_probed_at = datetime.now(timezone.utc)
        self.error_kind = kind
        # Trim to keep logs and /v1/auth/status compact.
        self.error_message = (message or "")[:500]

    def as_dict(self) -> Dict[str, Any]:
        return {
            "ok": self.ok,
            "last_probed_at": self.last_probed_at.isoformat() if self.last_probed_at else None,
            "last_ok_at": self.last_ok_at.isoformat() if self.last_ok_at else None,
            "error_kind": self.error_kind,
            "error_message": self.error_message,
        }


cli_health = CliHealth()


async def probe_cli_auth(cli=None) -> bool:
    """Run a 1-turn CLI probe and update `cli_health`.

    Reuses `claude_cli.verify_cli()` (which already issues a short
    `query(prompt="Hello", max_turns=1)`); on any exception, classifies the
    failure as auth_failure if the marker set matches, else unknown.

    `cli` is the ClaudeCodeCLI instance to probe. When omitted, lazy-resolves
    the module-level singleton from `src.main` so a periodic probe exercises
    exactly the same instance that real requests use. The parameter is
    primarily there for tests, which inject a mock.

    Returns True when the probe succeeded, False otherwise. Never raises.
    """
    if cli is None:
        # Lazy import - src.main imports src.auth at module load.
        from src import main as _main  # noqa: WPS433 - intentional lazy import

        cli = _main.claude_cli

    try:
        ok = await cli.verify_cli()
        if ok:
            cli_health.mark_ok()
            logger.info("cli_auth_probe_ok")
            return True
        cli_health.mark_failed("unknown", "verify_cli returned False")
        logger.warning("cli_auth_probe_failed kind=unknown reason=verify_cli_returned_false")
        return False
    except Exception as exc:  # noqa: BLE001 - the probe must never propagate
        message = str(exc)
        kind = _classify_probe_error(message)
        cli_health.mark_failed(kind, message)
        logger.warning(
            "cli_auth_probe_failed kind=%s error=%s",
            kind,
            message[:200].replace("\n", " "),
        )
        return False
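
The docstring notes that the `cli` parameter exists mainly so tests can inject a mock. A test along those lines might look like the sketch below. To stay self-contained it does not import `src.auth`: `probe` is a simplified stand-in for `probe_cli_auth()`, a plain dict stands in for `cli_health`, and only one marker is checked. All of these are illustrative assumptions, not the project's test code.

```python
import asyncio
from unittest.mock import AsyncMock


async def probe(cli, health: dict) -> bool:
    """Simplified stand-in for probe_cli_auth(); `health` replaces cli_health."""
    try:
        ok = await cli.verify_cli()
    except Exception as exc:  # the probe must never propagate
        text = str(exc)
        kind = "auth_failure" if "not logged in" in text.lower() else "unknown"
        health.update(ok=False, error_kind=kind, error_message=text[:500])
        return False
    health.update(ok=bool(ok), error_kind=None if ok else "unknown")
    return bool(ok)


async def demo() -> dict:
    health = {"ok": True}
    # AsyncMock makes `cli.verify_cli()` awaitable; side_effect makes it raise.
    cli = AsyncMock()
    cli.verify_cli.side_effect = RuntimeError("Not logged in")
    await probe(cli, health)
    return health
```

Running `asyncio.run(demo())` leaves `ok` set to `False` and `error_kind` set to `auth_failure`, which is the same shape of assertion the `TestProbeCliAuth` cases are described as making.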