feat: API timeout config, Retry-After support, and configurable retry #2816
Open
TheArchitectit wants to merge 10 commits into
…e retry
- Add TimeoutConfig to HTTP client builder with connect_timeout (30s) and request_timeout (5min) defaults, configurable via CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit backoff hints through the retry pipeline
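As a rough illustration of the Retry-After behaviour described in this commit, here is a minimal sketch rather than the PR's code. The PR names `parse_retry_after()`, but its signature here is assumed, and `backoff_delay`, `next_delay`, the 1s base and the 60s cap are hypothetical. Only the delay-in-seconds form of the header is handled; the HTTP-date form is left out.

```rust
use std::time::Duration;

/// Parse a `Retry-After` value given as a whole number of seconds.
/// Assumption: the HTTP-date form is ignored and yields `None`.
fn parse_retry_after(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Exponential backoff: `base * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Prefer the provider's hint when one was parsed, otherwise back off.
/// The 1s base and 60s cap are illustrative values, not the PR's.
fn next_delay(retry_after: Option<Duration>, attempt: u32) -> Duration {
    retry_after.unwrap_or_else(|| {
        backoff_delay(attempt, Duration::from_secs(1), Duration::from_secs(60))
    })
}
```

A retry loop would call `next_delay(parse_retry_after(header), attempt)` after a 429 and sleep for the result before the next attempt.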
Some providers/proxies return HTTP 400 with bodies like "no parseable body" or "connection reset" during transient network blips. These are not real bad requests — they're gateway errors wearing a 400 mask. Detect known gateway error phrases in 400 response bodies and mark them as retryable so the existing exponential backoff handles them.
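The classification described here might look roughly like the following sketch. The PR names `is_retryable_400()`, but the signature, the `GATEWAY_ERROR_PHRASES` constant, and the case-insensitive matching are assumptions; the phrase list contains only the two examples quoted in the commit message.

```rust
/// Hypothetical phrase list: only the two examples from the commit message.
const GATEWAY_ERROR_PHRASES: [&str; 2] = ["no parseable body", "connection reset"];

/// Returns true when a 400 response body looks like a transient gateway
/// error rather than a genuinely malformed request.
fn is_retryable_400(status: u16, body: &str) -> bool {
    if status != 400 {
        return false;
    }
    // Assumption: match case-insensitively, since proxies vary in casing.
    let body = body.to_ascii_lowercase();
    GATEWAY_ERROR_PHRASES.iter().any(|phrase| body.contains(*phrase))
}
```

Substring matching keeps the check cheap, at the cost of false positives if a real validation error happens to quote one of these phrases.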
Some OpenAI-compat backends (e.g. glm-5.1-fast) return 400 with "no parseable body" when the request payload is too large to parse, rather than a proper context_length_exceeded error. Without this marker, is_context_window_error() returns false and the auto-compact retry loop never triggers — the user just sees an opaque 400 error. 💘 Generated with Crush Assisted-by: GLM 5.1 FP8 via Crush <crush@charm.land>
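To show why the marker matters, here is a small sketch of a marker check feeding a compaction loop. `CONTEXT_WINDOW_ERROR_MARKERS` and `is_context_window_error()` are named in the PR, but the list contents beyond the two strings mentioned, the function signature, and the whole `send_with_auto_compact` helper (including its drop-the-oldest-half strategy) are hypothetical stand-ins for the real auto-compact logic.

```rust
/// Illustrative marker list: the PR adds "no parseable body" to the real one.
const CONTEXT_WINDOW_ERROR_MARKERS: [&str; 2] =
    ["context_length_exceeded", "no parseable body"];

fn is_context_window_error(message: &str) -> bool {
    let message = message.to_ascii_lowercase();
    CONTEXT_WINDOW_ERROR_MARKERS
        .iter()
        .any(|marker| message.contains(*marker))
}

/// Hypothetical retry loop: on a context-window error, drop the oldest half
/// of the messages and try again; any other outcome is returned as-is.
fn send_with_auto_compact<F>(mut messages: Vec<String>, mut send: F) -> Result<String, String>
where
    F: FnMut(&[String]) -> Result<String, String>,
{
    loop {
        match send(&messages) {
            Err(e) if is_context_window_error(&e) && messages.len() > 1 => {
                let drop_count = messages.len() / 2;
                messages.drain(..drop_count);
            }
            other => return other,
        }
    }
}
```

Without the extra marker, the first arm never matches a "no parseable body" failure and the error is returned to the user unchanged.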
Some OpenAI-compatible providers (e.g., GLM-5) omit the `id` field in streaming and non-streaming responses. Adding #[serde(default)] allows the parser to accept these responses instead of failing with "missing field `id`". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds scripts/install.sh that builds the release binary and links it to ~/.local/bin/claw. Run after code changes to update the CLI. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns HTML (e.g., error page, wrong endpoint) instead of JSON in an SSE stream, provide a clear error message instead of hanging or failing with a cryptic parse error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns a JSON error (e.g., {"error":{"message":"..."}})
without SSE framing (no "data:" prefix), the SSE parser was silently
ignoring it and hanging. Now detects and surfaces these errors.
Also handles HTML responses that lack SSE framing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
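The detection described in this commit and the previous one can be pictured with a per-line classifier like the one below. This is a sketch under assumptions: `StreamLine` and `classify_line` are hypothetical names, and the JSON check is a plain substring test rather than a real parse.

```rust
#[derive(Debug, PartialEq)]
enum StreamLine<'a> {
    /// Properly framed SSE payload (text after the `data:` prefix).
    Data(&'a str),
    /// Bare JSON object that appears to carry an `error` key.
    JsonError(&'a str),
    /// HTML document, e.g. an error page or a wrong endpoint.
    Html,
    /// Anything else (comments, blank lines, other SSE fields).
    Other,
}

fn classify_line(line: &str) -> StreamLine<'_> {
    let trimmed = line.trim();
    if let Some(payload) = trimmed.strip_prefix("data:") {
        return StreamLine::Data(payload.trim_start());
    }
    let lower = trimmed.to_ascii_lowercase();
    if lower.starts_with("<!doctype html") || lower.starts_with("<html") {
        return StreamLine::Html;
    }
    // Assumption: an unframed line starting with '{' and mentioning "error"
    // is a provider error body; a real implementation would parse the JSON.
    if trimmed.starts_with('{') && trimmed.contains("\"error\"") {
        return StreamLine::JsonError(trimmed);
    }
    StreamLine::Other
}
```

A stream reader could turn `JsonError` and `Html` into immediate errors instead of waiting for a `data:` frame that never arrives.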
Some providers (GLM, DeepSeek) emit reasoning tokens in `reasoning_content` or nested `thinking.content` fields instead of `content`. Added support for these fields so reasoning models work correctly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The final streaming chunk from some providers contains only finish_reason and usage, with no delta field. Made it optional to prevent parse errors. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When preserve_recent_messages == 0, raw_keep_from equals messages.len(), causing index out of bounds when accessing session.messages[k]. Added k >= session.messages.len() check to prevent panic. Reason: Compaction with preserve_recent_messages=0 triggered OOB access when checking for tool-use/tool-result pair preservation at boundary. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
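The boundary condition can be reproduced in a few lines. The sketch below is not the project's compaction code: `Message`, `keep_from`, and the pair-preservation rule are simplified, hypothetical stand-ins, and only the `k >= messages.len()` guard mirrors the fix described above.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Text,
    ToolUse,
    ToolResult,
}

/// Index of the first message to keep after compaction.
fn keep_from(messages: &[Message], preserve_recent_messages: usize) -> usize {
    let mut k = messages.len().saturating_sub(preserve_recent_messages);
    // Guard from the fix: with preserve_recent_messages == 0, k equals
    // messages.len(), so indexing messages[k] below would panic.
    if k >= messages.len() {
        return k;
    }
    // Simplified pair preservation: do not keep a tool result while
    // dropping the tool use that produced it.
    if k > 0 && messages[k] == Message::ToolResult && messages[k - 1] == Message::ToolUse {
        k -= 1;
    }
    k
}
```

With the guard removed, `keep_from(&msgs, 0)` would index one past the end of the slice, which is the out-of-bounds access the commit describes.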
9ab2ecb to 1c54c0d
Summary
- Add `TimeoutConfig` to HTTP client builder with `connect_timeout` (30s default) and `request_timeout` (5min default)
- Configurable via `CLAW_API_CONNECT_TIMEOUT` and `CLAW_API_REQUEST_TIMEOUT` env vars
- Add `with_timeout()` builder to both `AnthropicClient` and `OpenAiCompatClient`
- Parse `Retry-After` header on 429 responses and respect it over exponential backoff
- Add `apiTimeout` config block to `~/.claw/settings.json` with `connectTimeout`, `requestTimeout`, and `maxRetries` fields
- Add `retry_after` field to `ApiError::Api` for propagating rate-limit backoff hints
- Add `is_retryable_400()` to detect transient gateway 400 errors (not real bad requests)
- Add `"no parseable body"` to `CONTEXT_WINDOW_ERROR_MARKERS`
Problem this solves
Hung API calls
Before: a request to a slow/unresponsive API endpoint would block indefinitely. No timeout, no way to configure one.
Fix: `TimeoutConfig` with sensible defaults (30s connect, 5min request) and env var / `settings.json` overrides.
Ignored Retry-After headers
Before: on 429 (rate-limited) responses, the client always used exponential backoff, ignoring the provider's `Retry-After` header. This caused unnecessary delays or premature retries.
Fix: `parse_retry_after()` extracts the header value, and the retry loop respects it over exponential backoff when present.
Transient gateway 400 errors treated as fatal
Before: some providers (especially OpenAI-compat backends like glm-5.1-fast) return 400 with bodies like `"HTTP 400 from backend (no parseable body)"` or `"connection reset by peer"`. These are not real bad requests; they are transient gateway errors caused by the backend being overwhelmed or unable to parse an oversized payload. The client treated all 400s as fatal and failed the request immediately.
Fix: `is_retryable_400()` checks 400 response bodies for transient gateway error signatures and marks them as retryable, so the retry loop can attempt the request again.
Context window overflow disguised as 400
Before: when a request exceeds the model's context window, some OpenAI-compat backends can't even parse the oversized payload and return 400 `"no parseable body"` instead of a proper `context_length_exceeded` error. Because this is not recognized as a context overflow, `is_context_window_error()` returns false and the auto-compact retry loop (#2808) never triggers; the user sees an opaque 400 with no recovery path.
Fix: Added `"no parseable body"` to `CONTEXT_WINDOW_ERROR_MARKERS` so `is_context_window_error()` correctly identifies these disguised context overflow errors. This enables the progressive auto-compact retry loop (PR #2808) to kick in and shrink the session until it fits.
After
- 429 responses with a `Retry-After` header use the provider's suggested delay
- `settings.json`: `{ "apiTimeout": { "connectTimeout": 30, "requestTimeout": 300, "maxRetries": 8 } }`
Test plan
- `cargo test --workspace`: all tests pass (1 pre-existing env-specific failure in `lsp_discovery`)
- `cargo build --release`: clean build
- `Retry-After` header is respected on 429
- `settings.json` apiTimeout overrides defaults
- `is_retryable_400()` correctly classifies transient gateway 400s
💘 Generated with Crush
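For readers who want a concrete picture of the timeout defaults and environment overrides summarized above, here is a minimal sketch. The struct name and the two variable names come from the PR. The field layout, the `from_lookup` and `from_env` helpers, and the choice to read the variables as whole seconds are assumptions; precedence relative to `settings.json` is not shown because the description does not specify it.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
struct TimeoutConfig {
    connect_timeout: Duration,
    request_timeout: Duration,
}

impl Default for TimeoutConfig {
    /// Defaults stated in the PR: 30s connect, 5min request.
    fn default() -> Self {
        Self {
            connect_timeout: Duration::from_secs(30),
            request_timeout: Duration::from_secs(300),
        }
    }
}

impl TimeoutConfig {
    /// Hypothetical helper: resolve overrides through any key lookup, which
    /// keeps the logic testable without touching the process environment.
    /// Assumption: values are whole seconds; unparsable values are ignored.
    fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let secs = |key: &str| {
            lookup(key)
                .and_then(|v| v.trim().parse::<u64>().ok())
                .map(Duration::from_secs)
        };
        let defaults = Self::default();
        Self {
            connect_timeout: secs("CLAW_API_CONNECT_TIMEOUT").unwrap_or(defaults.connect_timeout),
            request_timeout: secs("CLAW_API_REQUEST_TIMEOUT").unwrap_or(defaults.request_timeout),
        }
    }

    /// Read the overrides from the real environment.
    fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Falling back to the default on a malformed value is one possible policy; rejecting the value with an error would be an equally reasonable choice.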