Skip to content

feat: add per-credential response-header-timeout and treat 524 as transient error#2060

Open
shellus wants to merge 3 commits intorouter-for-me:mainfrom
shellus:feat/response-header-timeout
Open

feat: add per-credential response-header-timeout and treat 524 as transient error#2060
shellus wants to merge 3 commits intorouter-for-me:mainfrom
shellus:feat/response-header-timeout

Conversation

@shellus
Copy link

@shellus shellus commented Mar 11, 2026

Summary

  • Add response-header-timeout config field to claude-api-key entries (per-credential, in seconds)
  • Only limits wait time for the upstream to start responding — streaming responses are not affected
  • Add HTTP 524 (Cloudflare timeout) to the transient error list in conductor
  • Wrap Go net.Error timeout as statusErr{code: 504} so conductor properly applies 1-minute cooldown on timeout failures

Motivation

Upstreams behind Cloudflare CDN have a ~100s default 524 timeout. Without per-credential timeout control, each failing credential wastes 2+ minutes before failover triggers. With response-header-timeout: 15, failover triggers within ~15 seconds. The 504 wrapping ensures the failing credential enters cooldown, so subsequent requests skip it immediately.

Config example

claude-api-key:
  - api-key: sk-xxx
    base-url: https://example.com
    response-header-timeout: 15  # 15s to start responding, then no limit

Files changed

File Change
internal/config/config.go Add ResponseHeaderTimeout field to ClaudeKey
internal/watcher/synthesizer/config.go Pass timeout to Auth.Attributes
internal/runtime/executor/proxy_helpers.go Apply ResponseHeaderTimeout to http.Transport
sdk/cliproxy/auth/conductor.go Add 524 to transient error list
internal/runtime/executor/claude_executor.go Wrap net.Error timeout as statusErr{code: 504}

Test plan

  • Configure response-header-timeout: 15 on an upstream behind Cloudflare
  • Verify failover triggers at ~15s instead of ~100s when upstream is unresponsive
  • Verify the timed-out credential enters 1-minute cooldown (next request skips it)
  • Verify streaming responses are not interrupted by this timeout
  • Verify upstreams without response-header-timeout behave as before (no timeout)

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the resilience and responsiveness of the proxy when dealing with slow or unresponsive upstream services, particularly those behind Cloudflare. By allowing per-credential configuration of response header timeouts and treating Cloudflare's 524 timeout as a transient error, the system can detect and recover from issues much faster, reducing overall request latency during credential failures from several minutes to under a minute.

Highlights

  • Per-credential response header timeout: Introduced a response-header-timeout configuration field for claude-api-key entries, allowing users to specify how long to wait for the first response header from upstream HTTP clients. This timeout does not apply once streaming begins.
  • HTTP 524 as transient error: Added HTTP status code 524 (Cloudflare timeout) to the list of transient errors, enabling standard cooldown and retry mechanisms for these failures.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • internal/config/config.go
    • Added ResponseHeaderTimeout field to the ClaudeKey struct.
  • internal/runtime/executor/proxy_helpers.go
    • Imported strconv package.
    • Implemented logic to read response_header_timeout from authentication attributes.
    • Applied the ResponseHeaderTimeout to the http.Transport when a proxy is used.
    • Applied the ResponseHeaderTimeout to a new http.Transport when no proxy transport is set.
  • internal/watcher/synthesizer/config.go
    • Added logic to convert ResponseHeaderTimeout from ClaudeKey to a string and store it in authentication attributes if greater than 0.
  • sdk/cliproxy/auth/conductor.go
    • Included HTTP status code 524 in the list of transient upstream errors.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@shellus shellus force-pushed the feat/response-header-timeout branch from 977edce to b5b601a Compare March 11, 2026 08:52
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a per-credential response-header-timeout and adds HTTP 524 to the list of transient errors. The changes are generally well-implemented, but I found a scenario where the new timeout is not applied correctly. Specifically, if an http.Transport is provided via context, the response-header-timeout is ignored. I've suggested a fix to address this. The addition of HTTP 524 is correct.

Note: Security Review did not run due to the size of the PR.

Comment on lines +77 to +82
// Apply ResponseHeaderTimeout when no proxy transport was set.
if respHeaderTimeout > 0 && httpClient.Transport == nil {
httpClient.Transport = &http.Transport{
ResponseHeaderTimeout: respHeaderTimeout,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current logic for applying ResponseHeaderTimeout doesn't handle the case where an http.Transport is already set on the httpClient from the context. The timeout is only applied if httpClient.Transport is nil.

This means if a transport is provided via ctx.Value("cliproxy.roundtripper"), the per-credential timeout will be ignored, which is likely not the intended behavior.

I suggest modifying this block to also handle cases where an *http.Transport is already present. It's important to clone the existing transport before modifying it to avoid side effects on shared transports.

	// Apply ResponseHeaderTimeout if set.
	if respHeaderTimeout > 0 {
		if transport, ok := httpClient.Transport.(*http.Transport); ok {
			// Clone the transport to avoid modifying a shared instance.
			clonedTransport := transport.Clone()
			clonedTransport.ResponseHeaderTimeout = respHeaderTimeout
			httpClient.Transport = clonedTransport
		} else if httpClient.Transport == nil {
			httpClient.Transport = &http.Transport{
				ResponseHeaderTimeout: respHeaderTimeout,
			}
		}
	}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 6aeb2aa. Now the code clones the existing transport (if it's *http.Transport) and applies ResponseHeaderTimeout to the clone, avoiding modification of shared instances. Thanks for catching this!

@shellus shellus force-pushed the feat/response-header-timeout branch from b5b601a to 6aeb2aa Compare March 11, 2026 09:25
shellus added 2 commits March 11, 2026 17:25
…nsient error

- Add ResponseHeaderTimeout config field to ClaudeKey for per-upstream
  timeout control (in seconds). Only limits wait for first response
  header; streaming responses are unaffected.
- Pass timeout value through Auth.Attributes to executor layer.
- Apply ResponseHeaderTimeout to http.Transport in proxy_helpers.go,
  covering proxy, context-RoundTripper, and bare-transport code paths.
- Add 524 (Cloudflare timeout) to transient error list in conductor.go.
- Wrap Go net.Error timeout as statusErr{code: 504} in claude_executor.go
  so conductor properly applies 1-minute cooldown on timeout failures,
  preventing repeated 15s waits on the same failing upstream.
Copy link
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvement here. The intent makes sense, but I think this still has two blocking issues:

  • Blocking: the new transient handling is only wired into applyAuthFailureState, while normal request execution records failures through the model-scoped branch in MarkResult. That branch still only treats 408/500/502/503/504 as transient and still hard-codes a 1 minute cooldown. In practice, 524 will not trigger cooldown for normal Claude requests, and transient-error-cooldown will be ignored as well.
  • Blocking: in newProxyAwareHTTPClient, when response_header_timeout is set and there is no proxy/context transport, the code creates a zero-value http.Transport instead of cloning the default transport. That drops existing default behavior such as ProxyFromEnvironment, dial/TLS timeouts, HTTP/2, and idle connection settings.

I’d recommend fixing those two paths and adding targeted tests for model-scoped cooldown handling plus the no-proxy transport construction.

@xkonjin
Copy link

xkonjin commented Mar 11, 2026

Code Review

Good intent on per-credential timeouts and treating 524 as transient. Found two blocking issues.

Blocking Issues

1. Transient 524 handling not wired for normal request path

The new transient error logic only applies to applyAuthFailureState:

// new transient error added here
if statusCode == 524 || ... {
    ...
}

But normal Claude requests record failures through the model-scoped branch in MarkResult, which still hardcodes:

if statusCode == 408 || statusCode == 500 || ... || statusCode == 504 {
    // 524 is NOT in this list
    ...
}

In practice:

  • Normal Claude requests with 524 will NOT trigger cooldown
  • The configured transient-error-cooldown setting is ignored for normal execution

2. Zero-value Transport drops defaults

In newProxyAwareHTTPClient:

if responseHeaderTimeout > 0 {
    return &http.Client{
        Timeout: totalTimeout,
        Transport: &http.Transport{
            ResponseHeaderTimeout: responseHeaderTimeout,
        },
    }
}

When no proxy/context transport exists, this creates a zero-value http.Transport, dropping:

  • ProxyFromEnvironment environment proxy
  • Default dial/TLS timeouts
  • HTTP/2 support
  • Idle connection pooling

Instead, clone http.DefaultTransport and set the field:

if responseHeaderTimeout > 0 {
    transport := http.DefaultTransport.(*http.Transport).Clone()
    transport.ResponseHeaderTimeout = responseHeaderTimeout
    return &http.Client{
        Timeout:   totalTimeout,
        Transport: transport,
    }
}

Tests

Add:

  • Test that model-scoped MarkResult treats 524 as transient and applies configured cooldown
  • Test that custom response-header-timeout with no proxy preserves ProxyFromEnvironment

Summary

Request changes before merge. The transient handling needs full coverage, and the transport construction should preserve default behavior.

- Add 524 to transient error list in MarkResult model-scoped branch
- Add transient_error_cooldown config support in model-scoped branch
- Use http.DefaultTransport.Clone() instead of zero-value Transport

Fixes code review feedback from PR router-for-me#2060
@shellus shellus force-pushed the feat/response-header-timeout branch from e0de114 to 0c70af4 Compare March 12, 2026 01:58
@shellus
Copy link
Author

shellus commented Mar 12, 2026

Thanks for the detailed review! Both issues have been fixed in the latest commit.

Issue A: 524 not covered in model-scoped path

Fixed in conductor.go:1668

Added 524 to the transient error list and implemented transient_error_cooldown config support in the model-scoped branch, matching the behavior in applyAuthFailureState.

case 408, 500, 502, 503, 504, 524:
    if quotaCooldownDisabledForAuth(auth) {
        state.NextRetryAfter = time.Time{}
    } else {
        cooldown := 1 * time.Minute
        if auth.Attributes != nil {
            if v, ok := auth.Attributes["transient_error_cooldown"]; ok {
                if secs, err := strconv.Atoi(v); err == nil && secs > 0 {
                    cooldown = time.Duration(secs) * time.Second
                }
            }
        }
        next := now.Add(cooldown)
        state.NextRetryAfter = next
    }

Now both normal Claude requests (99%) and auth requests (1%) properly handle 524 with configurable cooldown.

Issue B: Zero-value Transport drops defaults

Fixed in proxy_helpers.go:84

Replaced zero-value http.Transport with http.DefaultTransport.Clone() to preserve default configurations:

} else if httpClient.Transport == nil {
    transport := http.DefaultTransport.(*http.Transport).Clone()
    transport.ResponseHeaderTimeout = respHeaderTimeout
    httpClient.Transport = transport
}

This ensures ProxyFromEnvironment, connection pooling, HTTP/2, and default timeouts are preserved.


Both fixes are now live in the PR branch. Ready for re-review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants