fix(v2.9.7): active Claude-CLI auth probe + 401 on cli auth failure#48
Closed
ttlequals0 wants to merge 49 commits into
Closed
fix(v2.9.7): active Claude-CLI auth probe + 401 on cli auth failure#48ttlequals0 wants to merge 49 commits into
ttlequals0 wants to merge 49 commits into
Conversation
- Add response_format parameter for OpenAI-compatible JSON mode - Add ModelService for dynamic model fetching from Anthropic API - Add claude-opus-4-5-20251101 model to supported models - Add JSON extraction and enforcement methods to MessageAdapter - Update docker-compose.yml to use published image - Bump version to 2.3.0
Claude Code SDK was ignoring JSON_MODE_INSTRUCTION in the system prompt and returning conversational text instead of JSON. Added JSON_PROMPT_SUFFIX constant that is now appended to the user prompt alongside the system prompt instruction, ensuring the model follows JSON output requirements. Changes: - Add JSON_PROMPT_SUFFIX constant to message_adapter.py - Append suffix to user prompt in both streaming and non-streaming paths - Update log messages to reflect dual-prompt approach - Bump version to 2.3.1
- Updated JSON_MODE_INSTRUCTION with explicit first/last character rules - Added explicit prohibition of markdown code blocks in instructions - Updated JSON_PROMPT_SUFFIX with more concise output format - Added log_json_structure() helper for debugging JSON responses - Added boundary and structure logging in streaming/non-streaming paths
…-models Add JSON response format support and dynamic model fetching
- Improve JSON mode instructions with numbered rules and explicit prohibition of preambles - Add COMMON_PREAMBLES constant with 19 common Claude preambles - Implement balanced brace/bracket matching algorithm that handles escaped quotes and braces inside strings correctly - Add JsonExtractionResult dataclass and extract_json_with_metadata() for detailed extraction tracking - Add enforce_json_format_with_metadata() for metadata-enabled JSON enforcement - Add _log_extraction_diagnostics() for debugging extraction failures - Create optional request deduplication cache with LRU eviction and TTL - Add cache management endpoints: GET /v1/cache/stats, POST /v1/cache/clear - Update version to 2.4.0 - Add comprehensive unit tests for all new functionality The JSON extraction priority order is now: 1. Pure JSON (fast path) 2. Preamble removal + parse 3. Markdown code block extraction 4. Balanced brace/bracket matching 5. First-to-last fallback
- Add POST /v1/models/refresh to refresh models from Anthropic API at runtime - Add GET /v1/models/status for service observability (source, count, last refresh) - Track model source (api/fallback) and last refresh timestamp in ModelService - Add comprehensive unit tests for refresh functionality Version 2.4.1
- Model refresh now respects CLAUDE_AUTH_METHOD configuration - Only 'anthropic' auth supports dynamic API fetch; others use static fallback - Added auth_method field to /v1/models/refresh and /v1/models/status responses - Updated CLAUDE_MODELS: added claude-opus-4-6, removed claude-opus-4-5-20250929 - Added model status/refresh endpoint cards to landing page UI - Comprehensive unit tests for all auth methods
feat: JSON extraction improvements, request cache, and dynamic model refresh
…5.0) - Add model metadata (context windows, output limits) and pricing from source - Add claude-sonnet-4-6 and re-enable 3.x models confirmed supported - Expand tool registry from 15 to 33 tools matching actual inventory - Add retry module with exponential backoff and Opus-to-Sonnet fallback - Add cost tracker with per-session accumulation and auto-cleanup - Add X-Claude-Effort and X-Claude-Thinking header support - Add model-specific max_tokens validation - Extract shared options-building helper for streaming/non-streaming paths - Rewrite README, trim historical migration docs
feat: v2.5.0 - models, tools, pricing from open-sourced Claude Code
- Replace generic landing page with clean utilitarian design - Fix GitHub URL to ttlequals0/claude-code-openai-wrapper - Fix OpenAPI docs version (was hardcoded 1.0.0, now dynamic) - Add all 25 endpoints to landing page grouped by category - Drop Pico CSS, use DM Sans + JetBrains Mono typography - Bump version to 2.5.1
…date feat: redesign landing page and update API docs (v2.5.1)
- Add JSON response mode documentation with usage example - Expand API endpoints table from 14 to 25 entries, grouped by category - Fix Installation git clone URL (was RichardAtCT, now ttlequals0) - Bump version reference to 2.5.1
- Fix SDK version reference (removed pinned version, installed is 0.1.26) - Fix production command (main.py does not exist, use claude-wrapper) - Fix test command path (tests/test_endpoints.py not test_endpoints.py) - Fix MAX_TIMEOUT units in Docker table (ms not seconds, 600000 not 300) - Add missing env vars to config table (DEBUG_MODE, CORS_ORIGINS, etc.) - Update temperature/top_p limitation (now applied via system prompt) - Tighten prose, remove AI-ish phrasing - Sync pyproject.toml version to 2.5.1
….5.2) - Remove BashOutput, KillShell, SlashCommand (not in Claude Code registry) - Add Brief, Config, ListPeers, REPL, Sleep, Monitor, SendUserFile, PushNotification, ListMcpResources, ReadMcpResource, VerifyPlanExecution - Tool count: 33 -> 41, verified against Claude Code src/tools.ts
fix: remove fake tools, add missing real tools (v2.5.2)
…2.6.0) - OpenAI function calling simulation via system prompt injection and response parsing (tools/tool_choice parameters, multi-turn support) - JSON schema in response_format (type=json_schema with schema definition) - Real-time streaming markdown fence stripping (JsonFenceStripper) - CPU watchdog for Docker/Linux (WATCHDOG_ENABLED=true to enable) - New models: ToolCall, FunctionCall, ToolDefinition, JsonSchema - Message model extended with tool role, tool_calls, tool_call_id
- Extract duplicated JSON schema instructions to MessageAdapter.JSON_SCHEMA_TEMPLATE - Remove no-op fence_str=fence assignments in JsonFenceStripper - Fix filter_content(None) to return "" instead of None (type safety) - Fix greedy bare JSON regex in parse_tool_calls (use json.loads validation) - Add log when tools + json_mode both active in streaming - Add precise return type annotation to parse_tool_calls - Add tests: json_schema model, dict message conversion, nested array parsing
…-json-schema feat: function calling, JSON schema, fence stripping, watchdog (v2.6.0)
…87;187;187m �[39msupported�[38;2;187;187;187m �[39mmodel�[38;2;187;187;187m �[39mlist�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39madd�[38;2;187;187;187m �[39mClaude�[38;2;187;187;187m �[39mOpus�[38;2;187;187;187m �[39m�[38;2;102;102;102m4.7�[39m�[38;2;187;187;187m �[39m�[38;2;102;102;102m(�[39mv2�[38;2;102;102;102m.�[39m�[38;2;102;102;102m7.0�[39m�[38;2;102;102;102m)�[39m Align�[38;2;187;187;187m �[39mCLAUDE_MODELS�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39mMODEL_METADATA�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39mMODEL_PRICING�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39mand�[38;2;187;187;187m �[39mMODEL_FALLBACK_MAP �[38;2;170;34;255;01mwith�[39;00m�[38;2;187;187;187m �[39mthe�[38;2;187;187;187m �[39mAnthropic�[38;2;187;187;187m �[39mmodels�[38;2;187;187;187m �[39mdocs�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mas�[39;00m�[38;2;187;187;187m �[39mof�[38;2;187;187;187m �[39m�[38;2;102;102;102m2026�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m04�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m16�[39m�[38;2;102;102;102m.�[39m�[38;2;187;187;187m �[39mRemove�[38;2;187;187;187m �[39mthree�[38;2;187;187;187m �[39mmodels�[38;2;187;187;187m �[39malready retired�[38;2;187;187;187m �[39mat�[38;2;187;187;187m �[39mthe�[38;2;187;187;187m �[39mAPI�[38;2;187;187;187m �[39mand�[38;2;187;187;187m �[39madd�[38;2;187;187;187m �[39mthe�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mnew�[39;00m�[38;2;187;187;187m �[39mflagship�[38;2;187;187;187m �[39mOpus�[38;2;187;187;187m �[39m�[38;2;102;102;102m4.7�[39m�[38;2;102;102;102m.�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mAdd�[38;2;187;187;187m �[39mclaude�[38;2;102;102;102m-�[39mopus�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m7�[39m�[38;2;187;187;187m �[39m�[38;2;102;102;102m(�[39m�[38;2;102;102;102m1�[39mM�[38;2;187;187;187m �[39mcontext�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39m�[38;2;102;102;102m128�[39mK�[38;2;187;187;187m �[39mmax�[38;2;187;187;187m �[39moutput�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39m$5�[38;2;102;102;102m/�[39m$25�[38;2;187;187;187m �[39mper�[38;2;187;187;187m �[39mMTok�[38;2;102;102;102m)�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mRemove�[38;2;187;187;187m �[39mretired�[38;2;102;102;102m:�[39m�[38;2;187;187;187m �[39mclaude�[38;2;102;102;102m-�[39m�[38;2;102;102;102m3�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m7�[39m�[38;2;102;102;102m-�[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m20250219�[39m�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39mclaude�[38;2;102;102;102m-�[39m�[38;2;102;102;102m3�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m5�[39m�[38;2;102;102;102m-�[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m20241022�[39m�[38;2;102;102;102m,�[39m �[38;2;187;187;187m �[39mclaude�[38;2;102;102;102m-�[39m�[38;2;102;102;102m3�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m5�[39m�[38;2;102;102;102m-�[39mhaiku�[38;2;102;102;102m-�[39m�[38;2;102;102;102m20241022�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mFix�[38;2;187;187;187m �[39mcontext�[38;2;187;187;187m �[39mwindow�[38;2;187;187;187m �[39mto�[38;2;187;187;187m �[39m�[38;2;102;102;102m1�[39mM�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mfor�[39;00m�[38;2;187;187;187m �[39mopus�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m7�[39m�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39mopus�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m6�[39m�[38;2;102;102;102m,�[39m�[38;2;187;187;187m �[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m6�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mFix�[38;2;187;187;187m �[39mmax�[38;2;187;187;187m �[39moutput�[38;2;187;187;187m �[39mto�[38;2;187;187;187m �[39m�[38;2;102;102;102m32�[39mK�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mfor�[39;00m�[38;2;187;187;187m �[39mopus�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m1�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m20250805�[39m�[38;2;187;187;187m �[39mand�[38;2;187;187;187m �[39mopus�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m20250514�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mFix�[38;2;187;187;187m �[39mmax�[38;2;187;187;187m �[39moutput�[38;2;187;187;187m �[39mto�[38;2;187;187;187m �[39m�[38;2;102;102;102m64�[39mK�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mfor�[39;00m�[38;2;187;187;187m �[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m6�[39m�[38;2;187;187;187m �[39m�[38;2;102;102;102m(�[39msynchronous�[38;2;187;187;187m �[39mMessages�[38;2;187;187;187m �[39mAPI�[38;2;102;102;102m)�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mSync�[38;2;187;187;187m �[39m�[38;2;102;102;102m.�[39m�[38;2;187;68;68menv�[39m�[38;2;102;102;102m.�[39m�[38;2;187;68;68mexample�[39m�[38;2;187;187;187m �[39mDEFAULT_MODEL�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mwith�[39;00m�[38;2;187;187;187m �[39mcode�[38;2;187;187;187m �[39m�[38;2;170;34;255;01mdefault�[39;00m�[38;2;187;187;187m �[39m�[38;2;102;102;102m(�[39mclaude�[38;2;102;102;102m-�[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m6�[39m�[38;2;102;102;102m)�[39m �[38;2;102;102;102m-�[39m�[38;2;187;187;187m �[39mUpdate�[38;2;187;187;187m �[39mlanding�[38;2;102;102;102m-�[39mpage�[38;2;187;187;187m �[39mquickstart�[38;2;187;187;187m �[39mand�[38;2;187;187;187m �[39mdebug�[38;2;187;187;187m �[39mexample�[38;2;187;187;187m �[39mto�[38;2;187;187;187m �[39mclaude�[38;2;102;102;102m-�[39msonnet�[38;2;102;102;102m-�[39m�[38;2;102;102;102m4�[39m�[38;2;102;102;102m-�[39m�[38;2;102;102;102m6�[39m
The existing rule only ignored test_roocode_compatibility.py.hypothesis/. Pytest-hypothesis regenerates .hypothesis/ at the repo root on every run, making it repeatedly show up as untracked.
Refresh supported model list, add Claude Opus 4.7 (v2.7.0)
…y user] as response content
Fixes the class of SDK failures where ResultMessage.subtype != "success"
fell through parse_claude_message and the synthetic
UserMessage('[Request interrupted by user]') was returned verbatim as the
assistant response. Raises ClaudeResultError instead, translated by the
HTTP layer to finish_reason="length" (error_max_turns) or a status-coded
error body (assistant rate_limit/billing/auth, generic SDK failure).
Also raises the OpenAI-compat max_turns default from 1 to 3 (env-configurable),
drops the max_tokens -> max_thinking_tokens misleading remap (opt-in via
env var), pins the SDK exactly, adds a circuit breaker and /healthz/deep
end-to-end probe, structured completion_result log line, multi-stage
Dockerfile dev/prod targets, BUILD_INFO stamp, and 13 new regression tests.
Upstream consumer affected was MinusPod; see that project's 2.0.12 notes
for the parallel defensive changes.
…ture CLI stderr 2.8.0 surfaced error_during_execution correctly but opened a tighter secondary problem: the breaker tripped on a 5/10 intra-episode failure burst, cascading into 503s for verification windows. Also discovered the R5 structured-log extras were being dropped by the plain-text formatter so circuit_breaker_open and completion_result logs shipped to Loki empty. Three fixes, no new behavior: - Inline key=value fields into log message strings via a _kv helper so default-formatter installs see the structured data. - Raise CircuitBreakerConfig defaults to min_requests_for_trip=20 and failure_ratio_threshold=0.75. Add env-var overrides for every knob plus WRAPPER_CIRCUIT_BREAKER_ENABLED kill switch. - Install stderr callback on ClaudeAgentOptions, ring-buffer 40 lines, emit + attach to ResultMessage dict on non-success. Propagates through ClaudeResultError.stderr_tail so the HTTP error path logs the real subprocess failure reason instead of just "num_turns=2". 12 new unit tests. Full suite 640 passed, 31 skipped.
Adds security-floor pins for starlette, urllib3, cryptography, pyjwt, authlib, mcp, and nltk -- each is transitive via fastapi or claude-agent-sdk but required a newer version than the parent's ceiling allowed. Widened fastapi to >=0.119 to admit starlette 0.49.x (for CVE-2025-62727). Clears 2 CRITICAL + 18 HIGH from the trivy scan against 2.8.1. Remaining findings are nltk XML CVEs with no published fix and Debian base-image packages that need a debian:13 rebase. No code change. 640 tests pass on the new deps.
47 patch releases worth of CLI and subprocess-handling fixes. Direct motivation is the silent `error_during_execution` rate observed against 2.8.2 in production (num_turns=2, usage.input_tokens=0, stderr empty — CLI dying before reaching Claude). Notable fixes in the range: - 0.1.52 control_cancel_request handling for hook callbacks - 0.1.53 string-prompt deadlock fix - 0.1.57 thinking-config serialization (direct vs max_thinking_tokens) - 0.1.60 setting_sources=[] no longer dropped - 0.1.51 ResultMessage.errors field now populated on failure - Bundled Claude CLI 2.0.72 -> 2.1.118 (46 versions) Full suite 640 passed, 31 skipped. No test changes required.
…parsing feat(2.8.0): stop error_max_turns from leaking interrupt sentinel as response content
Closes all 10 code-scanning alerts open on main:
Workflows
- Remove .github/workflows/claude-code-review.yml; the pull_request_target
+ checkout(head.sha) shape was flagged untrusted-checkout/high.
- Pin .github/workflows/ci.yml to permissions: {contents: read}.
Error responses (py/stack-trace-exposure)
- _build_assistant_error_response returns static subtype-keyed messages
via new _safe_assistant_error_message helper; raw err.errors/str(err)
stay in server logs only.
- generate_streaming_response error chunk is now a generic
"Streaming failed" string.
- Chat-completions and Anthropic-messages 500 HTTPException details
are generic strings; the exception is already logged.
- /v1/debug/request is gated behind DEBUG_MODE or VERBOSE and emits
only type(e).__name__ for the json-parse and outer-except paths.
filter_content (py/polynomial-redos)
- Replace <tag>.*?</tag> regex stripping with a linear str.find-based
helper so unterminated tags cannot trigger quadratic scanning even
before adding a length guard.
- Add 1MB input length guard for defence in depth.
- Rewrite the image/base64 regex with fixed upper bounds instead of
lazy quantifiers + lookahead.
Tests
- tests/test_redos_safety.py: six pathological inputs each complete in
under 1s (previously seconds-to-minutes), plus behavioural
regression coverage for the tag-stripping and image replacement.
Full suite: 650 passed, 31 skipped.
…repo Dockerfile + .dockerignore - poetry install now --only main so dev deps (black, bandit, pytest, mypy, safety) don't ship in the runtime image. Removes the one fixable Trivy HIGH (CVE-2026-32274 black < 26.3.1). - .dockerignore excludes .git, .venv, .hypothesis, .pytest_cache, tests, docs, .env*, editor cruft. Image drops from 1.18 GB to 775 MB and BUILD_INFO stamp now succeeds at build time. - 7 remaining Trivy HIGHs are in the Debian 13.4 base (ncurses, nghttp2, systemd); all have no upstream fix. Accepted risk until python:3.12-slim rebases. .github/workflows/ci.yml - timeout-minutes: 15 and fail-fast: false on the test matrix. - poetry check --lock step catches lockfile drift pre-merge (the exact failure mode that produced the 2.9.0 SDK bump). - Replaced deprecated `safety check` with pip-audit (non-blocking). - Added a Docker smoke-build job on every PR so Dockerfile regressions surface before release. .github/workflows/claude.yml - Repo-specific tool allowlist: read-only gh/git + poetry run pytest / black --check / bandit. No write commands (no gh pr create, no gh pr merge, no git push, no editor invocations). - Documented why the contains() gate on user-controlled event body fields is safe.
CI's `poetry run black --check src tests` was failing on 18 files (16 pre-existing plus src/main.py and src/message_adapter.py touched on this branch). Running `black` with the repo config (line-length=100) to bring everything in line so the linting step gates going forward. No behavioural changes; full pytest still 650 passed, 31 skipped.
Additional commits landed on this branch after the initial 2.9.1 commit (Docker --only main, .dockerignore, ci.yml hardening, claude.yml allowlist, black reformat). Bumping version so the deployed image surfaces 2.9.2 on /version and the landing page, and CI's lockfile / docker smoke gates ship under that tag.
CI has no push/deploy role for Docker - images are built and pushed locally, so a smoke build inside CI just burns runner minutes without gating anything the local flow doesn't already cover.
Without this, docker compose up (what Portainer runs on webhook redeploy) reuses the locally cached :latest layer and the updated image sits on Docker Hub unused. The Portainer stack's own compose config needs to match for the running stack to pick this up.
After 2.9.2 switched the Docker image to poetry install --only main,
the first chat completion raised at SDK connect:
File ".../claude_agent_sdk/_internal/transport/subprocess_cli.py",
line 413, in connect
from opentelemetry import propagate
ModuleNotFoundError: No module named 'opentelemetry'
The SDK does an unconditional opentelemetry.propagate import, but
opentelemetry-api is declared on PyPI only as an optional [otel]
extra. Previous images accidentally had it via dev-group transitives;
--only main correctly dropped it.
Pinning claude-agent-sdk = {version = "0.1.65", extras = ["otel"]} so
opentelemetry-api 1.41.1 resolves into the runtime image.
fix(2.9.1): close CodeQL alerts for stack-trace exposure and ReDoS
README - Version 2.7.0 -> 2.9.3 with 2.8.x and 2.9.x highlights. - Test count 566 -> 650 passing (31 skipped). - Add missing env vars: VERBOSE, WRAPPER_DEFAULT_MAX_TURNS, WRAPPER_MAP_MAX_TOKENS_TO_THINKING, MAX_REQUEST_SIZE, REQUEST_CACHE_TTL/MAX_SIZE, CLAUDE_WRAPPER_HOST, UVICORN_WORKERS, WATCHDOG_*, explicit API_KEY row, Bedrock/Vertex vars. - Add /healthz/deep endpoint row. - Note /v1/debug/request is gated behind DEBUG_MODE/VERBOSE. - Add /v1/sessions/* rate-limit row with env var. - Docker Compose example now matches docker-compose.yml (pull_policy, build target, container_name, healthcheck). - Drop stale limitation claim that OpenAI-style function calling is unsupported - 2.6.0 added it and the section below documents it. - X-Enable-Cache header row added. docs/ - Delete docs/MIGRATION_STATUS.md and docs/UPGRADE_PLAN.md. Both described work that shipped; no current reader needs them.
Two complementary mechanisms to catch claude-agent-sdk drift. .github/dependabot.yml - Weekly pip (Poetry) and github-actions scans, grouped minor/patch so the review queue stays short. - commit-message prefixes 'chore(deps)' and 'chore(ci)' for clean history. Release notes surface in the PR body automatically. .github/workflows/check-sdk-version.yml - Cron (Mondays 14:00 UTC) plus workflow_dispatch. - Reads the claude-agent-sdk pin from pyproject.toml, fetches the latest PyPI version, and opens (or updates) an issue if the pin lags. Catches the case where Dependabot PRs pile up unreviewed. - No event-payload interpolation; only schedule/dispatch triggers and explicit step outputs piped through env vars.
docs: audit README against current state, drop stale docs/
chore: monitor claude-agent-sdk drift via Dependabot + weekly sentinel
* fix(2.9.4): close all seven open Dependabot alerts Bumps: - black 24.10.0 -> 26.3.1 CVE-2026-32274 (high, dev only) - filelock 3.20.1 -> 3.29.0 CVE-2026-22701 (medium, dev only) - requests 2.32.4 -> 2.33.1 CVE-2026-25645 (medium, runtime) - pytest 8.4.1 -> 9.0.3 CVE-2025-71176 (medium, dev only) - python-multipart 0.0.22 -> 0.0.26 CVE-2026-40347 (medium, runtime) - python-dotenv 1.1.1 -> 1.2.2 CVE-2026-28684 (medium, runtime) - pygments 2.19.2 -> 2.20.0 CVE-2026-4539 (low, transitive) Secondary: - pytest-asyncio ^0.23 -> ^1.3.0 (pytest 9 requires it). - 3 test files reformatted by black 26 so the lint gate passes: tests/test_redos_safety.py, tests/test_function_calling_unit.py, tests/test_session_complete.py. Full suite: 650 passed, 31 skipped under pytest 9.0.3. Supersedes PR #10 (Dependabot's grouped bump) - that PR's CI was red on black 26 formatting; this consolidates the fix plus adds the Pygments transitive that Dependabot did not surface. * chore: refresh retired model reference in compat report The /v1/compatibility response suggested claude-3-5-haiku as a 'more focused response' alternative when temperature is passed - that model was retired in 2.7.0. Point at claude-haiku-4-5-20251001, the current FAST_MODEL.
…#14) Issues are disabled on this repo, so the weekly check-sdk-version workflow has been failing at `gh issue create` whenever the pin falls behind PyPI (run 25001796671). Replace the issue step with a GITHUB_STEP_SUMMARY write; the existing `::warning::` annotation still surfaces drift on the run page. Drop the now-unused `issues: write` permission.
Per the weekly SDK drift check (now passing after PR #14). The 0.1.68 release adds an explicit `sniffio >= 1.0.0` dependency, which the lock picks up as 1.3.1; otherwise no transitive movement. The `[otel]` extra stays on the pin because the SDK still imports `opentelemetry.propagate` unconditionally. Bumps version to 2.9.5 and rolls in the prior CI-only change from PR #14. Tests: 650 passed, 31 skipped (unchanged from v2.9.4).
…models from upstream, SDK-drift auto-PR (#17) * feat: dynamically refresh Anthropic model list (RichardAtCT#46) * feat: dynamically refresh Anthropic model list * fix: harden /v1/models cache and resolve default model live - Lock + double-check refresh path so concurrent requests at TTL expiry don't stampede the Anthropic Models API. - Use a short MODEL_LIST_ERROR_TTL_SECONDS (default 60s) for the fallback cache so transient outages don't suppress live discovery for a full hour. - Populate `created` (unix timestamp) on both live and fallback /v1/models entries to match OpenAI's model object schema. - Resolve DEFAULT_MODEL at startup by picking the latest Sonnet from the live Models API; honor explicit DEFAULT_MODEL env override. * docs: clarify ANTHROPIC_API_KEY is optional for live model discovery - README: expand env vars table with ANTHROPIC_API_KEY (optional), DEFAULT_MODEL, FAST_MODEL, CLAUDE_MODELS_OVERRIDE, and the model list cache/timeout knobs. Rewrite the Supported Models section to explain the live-vs-static behavior and refresh the catalog around Claude 4.6 family. Bump model examples to claude-sonnet-4-6. - .env.example: add a Model Discovery (optional) block documenting ANTHROPIC_API_KEY, CLAUDE_MODELS_OVERRIDE, and the cache TTLs; comment out DEFAULT_MODEL so live resolution drives it by default. - main.py: log a single explicit info line at startup when live discovery is disabled (no ANTHROPIC_API_KEY) so operators see whether the dynamic path activated. - tests: cover the new disabled-path log and update the env-key gate in the existing resolve_default_model test. * chore(v2.9.6): SDK 0.1.81 bump, urllib3/python-multipart sec fixes, SDK-drift workflow auto-PR - claude-agent-sdk 0.1.68 -> 0.1.81 (13 patch releases since v2.9.5). - python-multipart ^0.0.26 -> ^0.0.27 (GHSA-pp6c-gr5w-3c5g, supersedes Dependabot PR #16). - urllib3 security floor >=2.6.3 -> >=2.7.0 (GHSA-qccp-gfcp-xxvc, GHSA-mf9v-mfxr-j63j). - check-sdk-version.yml opens a draft chore/sdk-bump-<latest> PR on drift instead of only writing to the run summary. Permissions widened to contents: write + pull-requests: write; idempotent by head branch; fallback summary still fires. Lockfile regenerated locally with Poetry 2.3.4. Full suite at 664 passed, 31 skipped (+14 from upstream test_dynamic_models.py picked up in the prior cherry-pick). * docs(readme): bump to v2.9.6, document new model-discovery env vars, tighten supported-models intro - Version 2.9.3 -> 2.9.6 in header and docker pin example - Test count 650 -> 664 in Status and Testing sections - Add 2.9.6 highlight bullet covering SDK 0.1.81, urllib3/python-multipart sec fixes, upstream PR RichardAtCT#46 dynamic-models sync, and check-sdk-version auto-PR - Add ANTHROPIC_MODELS_URL, ANTHROPIC_VERSION, ANTHROPIC_BETA/ANTHROPIC_BETA_HEADER rows to the env var table (advanced overrides for the new live-discovery path) - Tighten the Supported Models intro paragraph (was 3 dense sentences) --------- Co-authored-by: Richard A <richardatk01@gmail.com>
Periodic background coroutine probes claude-agent-sdk via existing verify_cli() (1-turn Hello query) when CLAUDE_AUTH_METHOD=claude_cli. Default interval 600s, configurable via CLI_AUTH_PROBE_INTERVAL_SECONDS, 0 to disable. /v1/auth/status exposes new cli_health block (ok, last_probed_at, last_ok_at, error_kind, error_message). POST /v1/chat/completions and POST /v1/messages now return HTTP 401 with error.type=authentication_error and code=claude_cli_not_authenticated when the most recent probe failed, instead of letting the request reach the SDK and surface as 502 or fall through to the 503 config check. OpenAI / Anthropic client libraries route 401 as AuthenticationError. Defense-in-depth: _build_sdk_error_response now scans stderr_tail + error_message for known CLI-auth markers (not logged in, please run /login, invalid api key, authentication_error, 401). On a match it returns 401 instead of 502 and seeds cli_health failed so the next request fails fast. Auth-failure responses bypass the global http_exception_handler (which rewrites bodies to error.type=api_error) by returning JSONResponse directly, so the authentication_error literal reaches clients. Tests: 673 passing, 31 skipped (+9 from v2.9.6 baseline of 664/31). - TestProbeCliAuth: 3 async tests for probe classification - TestChatCompletionsCliHealthGate + TestAnthropicMessagesCliHealthGate: in-process TestClient assertions on the 401 surface - TestCliAuthFailureToFourOhOne: 4 stderr-mapping tests including a 502 regression guard and a cli_health-seeding test
Author
|
Opened against the wrong fork; will reopen against ttlequals0/claude-code-openai-wrapper. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
CLAUDE_AUTH_METHOD=claude_cli, the lifespan now runs a periodic background coroutine that calls the existingclaude_cli.verify_cli()(a 1-turnquery(prompt="Hello", max_turns=1)) and updates a sharedcli_healthstate. Default interval 600s, configurable viaCLI_AUTH_PROBE_INTERVAL_SECONDS, set 0 to disable. Skipped for non-cli auth methods.POST /v1/chat/completionsandPOST /v1/messagesnow return HTTP 401 witherror.type=authentication_erroranderror.code=claude_cli_not_authenticatedwhen the most recent probe failed. OpenAI / Anthropic client libraries route 401 asAuthenticationError, giving callers a durable signal instead of a transient 502/503./v1/auth/statusexposes the newcli_healthblock:ok,last_probed_at,last_ok_at,error_kind(auth_failure|unknown|null),error_message._build_sdk_error_responsescansstderr_tail+error_messagefor known CLI-auth markers (not logged in,please run /login,invalid api key,authentication_error,401). On a match it returns 401 instead of 502 and seedscli_healthfailed so the next request fails fast.http_exception_handler(which would rewrite bodies toerror.type=api_error) by returningJSONResponsedirectly.Version
2.9.7(bumped insrc/__init__.pyandpyproject.toml).Test plan
TestProbeCliAuth(3): success /Not logged instderr / generic exception classification.TestChatCompletionsCliHealthGate+TestAnthropicMessagesCliHealthGate(2): in-process TestClient assertions on the 401 surface.TestCliAuthFailureToFourOhOne(4): stderr-mapping including a 502 regression guard and a real request seedscli_health.CLAUDE_CONFIG_DIR=$(mktemp -d) HOME=$EMPTY CLAUDE_AUTH_METHOD=claude_cli: both/v1/chat/completionsand/v1/messagesreturn HTTP 401 with the OpenAI-shaped body before any SDK round-trip;/v1/auth/statusshowscli_health.ok=false./v1/auth/statusreturnscli_health.ok=truewithlast_ok_atpopulated.Docker
Image
ttlequals0/claude-code-openai-wrapper:2.9.7already pushed (sha256:39fc12f1dd5fa15b8752a384f03b839f89684d356bc5de40ab05f477975ca22f);:latestrepointed to the same digest. Trivy: 14 HIGH/CRITICAL CVEs, all unfixed Debian base-OS packages, identical to v2.9.6's set per prior triage. Portainer webhook fired.Order-of-operations note: the build-and-push doc puts PR creation as Step 10 (after image push); for this repo the better sequence is PR -> CodeQL green -> build/push, so CodeQL gates the image. Saved as feedback for future runs.