fix(web_fetch): switch to wreq Firefox emulation to bypass TLS bot detection #428

Merged: khj809 merged 3 commits into develop from feat/web-fetch-wreq-emulation on Jun 11, 2026


khj809 (Contributor) commented Jun 11, 2026
Summary

  • Replace reqwest::Client with wreq::Client using Emulation::Firefox135 in web_fetch
  • Add stream feature to wreq dependency to enable bytes_stream() support
  • Update network integration test URL to AccuWeather and fix the corresponding body assertion

Background

The reqwest crate's rustls backend produces a distinctive TLS ClientHello fingerprint that Akamai and similar WAFs detect and block at the handshake level, before any HTTP-layer headers are inspected. As a result, web_fetch failed with a transport-level error (HTTP/2 INTERNAL_ERROR or ECONNRESET) on Akamai-protected sites, even when it sent a browser-like User-Agent string.

The web_search Bing engine already solved the same problem using wreq with Emulation::Firefox135, which replicates the Firefox TLS handshake (cipher suites, extensions, GREASE values) to pass fingerprint-based bot detection. This PR applies the same fix to web_fetch.
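To illustrate why a browser-like User-Agent does not help at this layer, here is a minimal sketch that builds a JA3-style string, one publicly documented way of summarizing a ClientHello. This PR does not say which scheme Akamai uses, so JA3 is an assumption chosen for illustration, and the numeric values in `main` are invented rather than captured from reqwest or Firefox. The struct and function names are hypothetical.

```rust
/// The ClientHello fields that a JA3-style fingerprint summarizes.
struct ClientHello {
    version: u16,
    cipher_suites: Vec<u16>,
    extensions: Vec<u16>,
    supported_groups: Vec<u16>,
    ec_point_formats: Vec<u8>,
}

/// GREASE values (RFC 8701) look like 0x0A0A, 0x1A1A, ..., 0xFAFA.
fn is_grease(v: u16) -> bool {
    (v & 0x0f0f) == 0x0a0a && (v >> 8) == (v & 0x00ff)
}

/// Joins numbers as decimal strings separated by '-'.
fn join<T: ToString>(values: impl Iterator<Item = T>) -> String {
    values.map(|v| v.to_string()).collect::<Vec<_>>().join("-")
}

/// Same as `join`, but drops GREASE values first, as JA3 does.
fn join_without_grease(values: &[u16]) -> String {
    join(values.iter().copied().filter(|v| !is_grease(*v)))
}

/// Builds "version,ciphers,extensions,groups,point_formats".
/// A real JA3 fingerprint is the MD5 of this string; hashing is omitted here.
fn ja3_string(hello: &ClientHello) -> String {
    format!(
        "{},{},{},{},{}",
        hello.version,
        join_without_grease(&hello.cipher_suites),
        join_without_grease(&hello.extensions),
        join_without_grease(&hello.supported_groups),
        join(hello.ec_point_formats.iter().copied()),
    )
}

fn main() {
    // Invented values for illustration only, not real captures.
    let client_a = ClientHello {
        version: 0x0303,
        cipher_suites: vec![0x1301, 0x1302, 0x1303],
        extensions: vec![0, 10, 11, 13, 43, 51],
        supported_groups: vec![29, 23, 24],
        ec_point_formats: vec![0],
    };
    let client_b = ClientHello {
        version: 0x0303,
        cipher_suites: vec![0x1301, 0x1303, 0x1302],
        extensions: vec![0, 23, 65281, 10, 11, 16, 43, 13, 51],
        supported_groups: vec![29, 23, 24, 25],
        ec_point_formats: vec![0],
    };

    // Both clients could send an identical User-Agent header later on,
    // yet the strings below differ because the handshake contents differ.
    println!("client A: {}", ja3_string(&client_a));
    println!("client B: {}", ja3_string(&client_b));
}
```

Nothing in this string comes from HTTP headers, which is consistent with the observation above that changing the User-Agent alone did not get past the block.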

Changes

| File | Change |
| --- | --- |
| Cargo.toml | Add features = ["stream"] to wreq for bytes_stream() |
| src/tool/impl/builtins/web_fetch.rs | Swap reqwest::Client → wreq::Client with Firefox emulation; remove dead USER_AGENT constant |
| src/tool/impl/builtins/web_fetch.rs (tests) | Update network test URL to AccuWeather (분당구) and fix body assertion |

Test plan

  • All 14 unit tests in web_fetch pass (cargo test -p ailoy web_fetch)
  • Network integration test network_single_fetch_returns_200_and_body passes against AccuWeather with --include-ignored
  • Network integration test network_single_fetch_html_returns_raw_markup still passes

🤖 Generated with Claude Code

khj809 and others added 2 commits June 11, 2026 14:05
…tection

reqwest's rustls backend has a distinctive TLS fingerprint that Akamai and
similar WAFs block at the handshake level. Switching to wreq with
Emulation::Firefox135 replicates the Firefox TLS ClientHello, matching
the approach already used by the Bing web_search engine.

Also updates the network integration test URL to AccuWeather (분당구) and
adds the `stream` feature to wreq for bytes_stream() support.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
khj809 self-assigned this Jun 11, 2026
khj809 requested a review from nuri-yoo June 11, 2026 05:07
Comment thread src/tool/impl/builtins/web_fetch.rs
wreq's ClientBuilder defaults to redirect::Policy::none(), unlike
reqwest which follows up to 10 redirects by default. Add explicit
Policy::limited(10) to match the previous behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
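
The difference between the two redirect defaults can be sketched with a small standard-library model. This is not the wreq or reqwest API: `Policy`, `Fetched`, and `fetch` are hypothetical names, the URLs are placeholders, and the exact counting rules of the real crates are not modeled. The sketch only shows that a "none" policy hands the 3xx response back to the caller, while a limited policy follows the chain until it ends or the limit is hit.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a client's redirect policy.
#[derive(Clone, Copy)]
enum Policy {
    None,
    Limited(usize),
}

/// What the caller ends up with after a fetch.
#[derive(Debug, PartialEq)]
enum Fetched {
    /// A non-redirect response from `url`.
    Body { url: String },
    /// A 3xx response that was returned without being followed.
    UnfollowedRedirect { from: String, location: String },
}

/// `redirects` maps a URL to the Location it redirects to.
/// URLs missing from the map are treated as final pages.
fn fetch(
    start: &str,
    policy: Policy,
    redirects: &HashMap<&str, &str>,
) -> Result<Fetched, String> {
    let mut url = start.to_string();
    let mut hops = 0usize;
    loop {
        let next = match redirects.get(url.as_str()) {
            Some(location) => location.to_string(),
            None => return Ok(Fetched::Body { url }),
        };
        match policy {
            // No following: the redirect response itself is the result.
            Policy::None => {
                return Ok(Fetched::UnfollowedRedirect { from: url, location: next })
            }
            // Limit reached: give up instead of following further.
            Policy::Limited(max) if hops >= max => {
                return Err(format!("too many redirects (limit {max})"))
            }
            // Still under the limit: follow the Location.
            Policy::Limited(_) => {
                hops += 1;
                url = next;
            }
        }
    }
}

fn main() {
    // Placeholder redirect chain: http -> https -> www.
    let mut redirects = HashMap::new();
    redirects.insert("http://example.com/", "https://example.com/");
    redirects.insert("https://example.com/", "https://www.example.com/");

    println!("{:?}", fetch("http://example.com/", Policy::None, &redirects));
    println!("{:?}", fetch("http://example.com/", Policy::Limited(10), &redirects));
}
```

Under this model, a caller that expects page content would instead receive the first redirect response when the policy is `None`, which is the behavior change the commit above guards against.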

nuri-yoo left a comment

LGTM!

khj809 merged commit 6bb4254 into develop Jun 11, 2026
khj809 deleted the feat/web-fetch-wreq-emulation branch June 11, 2026 09:50