fix(v2.9.7): active Claude-CLI auth probe + 401 on cli auth failure#18

Merged
ttlequals0 merged 3 commits into main from fix/v2.9.7-cli-auth-probe on May 13, 2026

Conversation

@ttlequals0
Owner

Summary

  • Active CLI-auth health probe: when CLAUDE_AUTH_METHOD=claude_cli, the lifespan now runs a periodic background coroutine that calls the existing claude_cli.verify_cli() (a 1-turn query(prompt="Hello", max_turns=1)) and updates a shared cli_health state. The default interval is 600s, configurable via CLI_AUTH_PROBE_INTERVAL_SECONDS; set it to 0 to disable the probe. The probe is skipped for non-cli auth methods.
  • POST /v1/chat/completions and POST /v1/messages now return HTTP 401 with error.type=authentication_error and error.code=claude_cli_not_authenticated when the most recent probe failed. OpenAI / Anthropic client libraries route 401 as AuthenticationError, giving callers a durable signal instead of a transient 502/503.
  • /v1/auth/status exposes the new cli_health block: ok, last_probed_at, last_ok_at, error_kind (auth_failure | unknown | null), error_message.
  • Defense-in-depth: _build_sdk_error_response scans stderr_tail + error_message for known CLI-auth markers (not logged in, please run /login, invalid api key, authentication_error, 401). On a match it returns 401 instead of 502 and marks cli_health as failed so the next request fails fast.
  • Auth-failure responses bypass the global http_exception_handler (which would rewrite bodies to error.type=api_error) by returning JSONResponse directly.
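
To illustrate the probe described above, here is a minimal sketch of a periodic health loop using only asyncio. It is not the code from this PR: the CliHealth dataclass, probe_once, and probe_loop are hypothetical names, and the sketch assumes the verify callable returns a bool (the real return type of verify_cli() is not shown here). Classifying a raised exception as "unknown" and a False result as "auth_failure" is also an assumption.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class CliHealth:
    """Hypothetical state object mirroring the cli_health fields."""
    ok: Optional[bool] = None            # None until the first probe completes
    last_probed_at: Optional[float] = None
    last_ok_at: Optional[float] = None
    error_kind: Optional[str] = None     # "auth_failure" | "unknown" | None
    error_message: Optional[str] = None


async def probe_once(health: CliHealth, verify: Callable[[], Awaitable[bool]]) -> None:
    """Run a single probe and record the outcome on the shared state."""
    now = time.time()
    health.last_probed_at = now
    try:
        ok = await verify()
    except Exception as exc:
        # Assumption: an unexpected crash is recorded as "unknown", not raised.
        health.ok = False
        health.error_kind = "unknown"
        health.error_message = str(exc)
        return
    health.ok = ok
    if ok:
        health.last_ok_at = now
        health.error_kind = None
        health.error_message = None
    else:
        health.error_kind = "auth_failure"
        health.error_message = "CLI verification failed"


async def probe_loop(
    health: CliHealth,
    verify: Callable[[], Awaitable[bool]],
    interval_seconds: float,
) -> None:
    """Probe forever at the given interval; an interval of 0 disables the loop."""
    if interval_seconds <= 0:
        return
    while True:
        await probe_once(health, verify)
        await asyncio.sleep(interval_seconds)
```

In an application lifespan, a loop like this would typically be started with asyncio.create_task(...) at startup and cancelled at shutdown, so a slow or hanging probe never blocks request handling.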

Version

2.9.7 (bumped in src/__init__.py and pyproject.toml).

Test plan

  • Full suite: 673 passed, 31 skipped (was 664/31 on v2.9.6; +9 new tests).
  • Manual repro of broken auth (CLAUDE_CONFIG_DIR=$(mktemp -d)): both /v1/chat/completions and /v1/messages return HTTP 401 with the OpenAI-shaped body before any SDK round-trip; /v1/auth/status shows cli_health.ok=false.
  • Healthy path: /v1/auth/status returns cli_health.ok=true with timestamps populated.

Docker

Image ttlequals0/claude-code-openai-wrapper:2.9.7 already pushed (sha256:39fc12f1dd5fa15b8752a384f03b839f89684d356bc5de40ab05f477975ca22f); :latest repointed. Trivy: 14 HIGH/CRITICAL CVEs, all unfixed Debian base packages, identical set to v2.9.6 per prior triage. Portainer webhook fired.

Note: future builds should run PR -> CodeQL green -> build/push, not the other way around. The image was published ahead of CodeQL this time; saved as feedback so the next release waits on the scanner.

Periodic background coroutine probes claude-agent-sdk via existing
verify_cli() (1-turn Hello query) when CLAUDE_AUTH_METHOD=claude_cli.
Default interval 600s, configurable via CLI_AUTH_PROBE_INTERVAL_SECONDS,
0 to disable. /v1/auth/status exposes new cli_health block (ok,
last_probed_at, last_ok_at, error_kind, error_message).
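
A small sketch of how the interval setting might be read, assuming the values come straight from environment variables. The helper names probe_interval_seconds and probe_enabled are hypothetical, and falling back to the default on an unparsable value is an assumption rather than documented behaviour.

```python
import os
from typing import Mapping, Optional

DEFAULT_PROBE_INTERVAL_SECONDS = 600.0


def probe_interval_seconds(env: Optional[Mapping[str, str]] = None) -> float:
    """Return the probe interval in seconds; 0 means the probe is disabled."""
    env = os.environ if env is None else env
    raw = env.get("CLI_AUTH_PROBE_INTERVAL_SECONDS", "")
    if raw.strip() == "":
        return DEFAULT_PROBE_INTERVAL_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_PROBE_INTERVAL_SECONDS
    return max(value, 0.0)


def probe_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """The probe only runs for claude_cli auth with a positive interval."""
    env = os.environ if env is None else env
    if env.get("CLAUDE_AUTH_METHOD") != "claude_cli":
        return False
    return probe_interval_seconds(env) > 0
```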

POST /v1/chat/completions and POST /v1/messages now return HTTP 401 with
error.type=authentication_error and code=claude_cli_not_authenticated
when the most recent probe failed, instead of letting the request reach
the SDK and surface as 502 or fall through to the 503 config check.
OpenAI / Anthropic client libraries route 401 as AuthenticationError.
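
The gate can be pictured as a pure function that runs before any SDK call. This is a framework-free sketch, not the handler from this PR: cli_auth_gate is a hypothetical name and the message text is invented for illustration. Only the status code, error.type, and error.code come from the description above. The sketch also assumes that a probe that has not run yet (None) does not block requests, since only a failed probe is described as triggering the 401.

```python
from typing import Any, Dict, Optional, Tuple


def cli_auth_gate(cli_health_ok: Optional[bool]) -> Optional[Tuple[int, Dict[str, Any]]]:
    """Return (status, body) when the request should be rejected, else None."""
    if cli_health_ok is not False:
        # Healthy, or no probe result yet: let the request proceed.
        return None
    body = {
        "error": {
            # Hypothetical wording; the real message is not shown in this PR.
            "message": "Claude CLI is not authenticated.",
            "type": "authentication_error",
            "code": "claude_cli_not_authenticated",
        }
    }
    return 401, body
```

A route handler would call this first and, on a non-None result, return the body with the given status instead of forwarding the request to the SDK.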

Defense-in-depth: _build_sdk_error_response now scans stderr_tail +
error_message for known CLI-auth markers (not logged in, please run
/login, invalid api key, authentication_error, 401). On a match it
returns 401 instead of 502 and marks cli_health as failed so the next
request fails fast.
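
The marker scan can be sketched as a substring check over the two fields. The marker strings are the ones listed above; the function names and the case-insensitive comparison are assumptions, and the real _build_sdk_error_response may match differently.

```python
from typing import Optional

# Marker strings taken from the description above.
CLI_AUTH_MARKERS = (
    "not logged in",
    "please run /login",
    "invalid api key",
    "authentication_error",
    "401",
)


def looks_like_cli_auth_failure(
    stderr_tail: Optional[str], error_message: Optional[str]
) -> bool:
    """True if either field contains a known CLI-auth marker."""
    # Assumption: matching is case-insensitive.
    haystack = f"{stderr_tail or ''}\n{error_message or ''}".lower()
    return any(marker in haystack for marker in CLI_AUTH_MARKERS)


def status_for_sdk_error(
    stderr_tail: Optional[str], error_message: Optional[str]
) -> int:
    """Map an SDK failure to 401 for auth problems and 502 otherwise."""
    return 401 if looks_like_cli_auth_failure(stderr_tail, error_message) else 502
```

A bare "401" substring is a broad marker: it would also match unrelated text such as a port or request ID containing those digits, which is a trade-off to keep in mind when extending the list.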

Auth-failure responses bypass the global http_exception_handler (which
rewrites bodies to error.type=api_error) by returning JSONResponse
directly, so the authentication_error literal reaches clients.

Tests: 673 passing, 31 skipped (+9 from v2.9.6 baseline of 664/31).
- TestProbeCliAuth: 3 async tests for probe classification
- TestChatCompletionsCliHealthGate + TestAnthropicMessagesCliHealthGate:
  in-process TestClient assertions on the 401 surface
- TestCliAuthFailureToFourOhOne: 4 stderr-mapping tests including a
  502 regression guard and a cli_health-seeding test
@ttlequals0 ttlequals0 merged commit 4d7f8b4 into main May 13, 2026
6 checks passed
@ttlequals0 ttlequals0 deleted the fix/v2.9.7-cli-auth-probe branch May 13, 2026 02:38