Add DataDome server-side validation and multi-backend routing#470
Add DataDome server-side validation and multi-backend routing#470
Conversation
|
|
||
| // DataDome server-side validation (if enabled) | ||
| // This validates requests at the edge before they reach the origin | ||
| match crate::integrations::datadome::validate_request_server_side(settings, &req) { |
There was a problem hiding this comment.
❓ This runs on every proxied request — CSS, JS, images, fonts all trigger a DataDome HTTP call (200ms timeout). Most bot protection validates only document/navigation requests. Is this intentional? If so, worth a comment explaining why.
|
|
||
| // DataDome server-side validation (if enabled) | ||
| // This validates requests at the edge before they reach the origin | ||
| match crate::integrations::datadome::validate_request_server_side(settings, &req) { |
There was a problem hiding this comment.
🔧 P1 — Validation runs on every request including static assets
validate_request_server_side is called unconditionally — images, CSS, JS, fonts all trigger a blocking HTTP call to DataDome (200ms default timeout). This:
- Adds latency to every static asset
- Multiplies DataDome API usage/cost
- Could hit DataDome rate limits
DataDome validation is typically needed only for document/HTML requests.
Fix: Filter by request path or accept header to skip known static assets, or at minimum only validate navigational requests.
c9e3ce1 to
5cace16
Compare
|
@ChristianPavilonis To fix and get ready for production |
| }) | ||
| }) | ||
| }) | ||
| .transpose()?; |
There was a problem hiding this comment.
🔧 wrench — Invalid path_regex is detected too late and then silently disables routing.
validate_path_pattern (settings.rs) only checks path_prefix/path_regex mutual exclusivity. Regex syntax is compiled here lazily, and BackendRouter::new is invoked per-request in handle_publisher_request with .ok() swallowing errors. A typo in a customer's backends.toml will:
- Pass
build.rsparsing - Pass runtime
Settings::validate() - Log an error on every request
- Silently fall through to default origin → routing broken, app keeps serving
This violates CLAUDE.md: "config-derived regex/pattern compilation must not use panic-prone expect()/unwrap(); invalid enabled config should surface as startup/config errors" and "Invalid enabled integrations/providers must not be silently logged-and-disabled during startup or registration."
Fix: validate regex syntax inside validate_path_pattern (try Regex::new(r).map_err(...)), and/or construct a BackendRouter inside Settings::validate() so bad config → startup failure. Then in handle_publisher_request the router build becomes infallible (or unwraps an already-validated router).
|
|
||
| // DataDome server-side validation (if enabled) | ||
| // This validates requests at the edge before they reach the origin | ||
| match crate::integrations::datadome::validate_request_server_side(settings, &req) { |
There was a problem hiding this comment.
🔧 wrench — DataDome validation runs on every publisher request, including static assets.
handle_publisher_request is invoked for every proxied path — HTML, CSS, JS, images, fonts. With server_side_enabled=true every one of those enters validate_request_server_side. Unsampled requests still pay JSON-deserialize + validate + Arc-alloc; sampled requests pay a 200 ms round-trip to api-fastly.datadome.co per asset.
This is the same finding from the 2026-03-20 and 2026-04-14 reviews at this exact line — still not addressed.
Fix: filter to navigational/document requests before invoking (skip common static extensions, or use Sec-Fetch-Dest/Accept: image/*). Document the selected filter so operators know what is and isn't protected.
| req: &Request, | ||
| ) -> Result<bool, Report<TrustedServerError>> { | ||
| match build(settings) { | ||
| Some(integration) => integration.validate_request(req), |
There was a problem hiding this comment.
🔧 wrench — Per-request DataDome config deserialization + Arc allocation.
Every validate_request_server_side call goes through build(settings) → serde_json::from_value::<DataDomeConfig>(raw.clone()) + config.validate() + Arc::new(...). Prior OnceLock caching was removed (rightly, for test isolation) but nothing replaced it. Combined with the finding in publisher.rs (validation running on every asset), each asset request pays the full deserialize/validate cost.
Fix: build the integration once during get_settings() (or at handler init if you introduce a HandlerContext) and pass &DataDomeIntegration into validate_request_server_side instead of &Settings.
| log::warn!("[datadome] server_side_enabled = true has no effect when enabled = false; server-side validation is inactive"); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🔧 wrench — The misconfiguration warning re-deserializes on every disabled-but-configured request.
When enabled=false and a raw [integrations.datadome] section exists, this Ok(None) branch still runs serde_json::from_value::<DataDomeConfig>(raw.clone()) per request to decide whether to log a warning. Move this to startup (one-time pass in get_settings() or Settings::validate()). A warning logged once at boot is what operators want; logging it per-request floods the log and keeps paying deserialize cost.
| validation_req.set_header( | ||
| "x-datadome-params:clientid", | ||
| urlencoding::encode(datadome_cookie).as_ref(), | ||
| ); |
There was a problem hiding this comment.
❓ question — x-datadome-params:clientid is still URL-encoded while ip is raw.
Commit 01e02e2c changed ip to raw per DataDome's contract, but left clientid wrapped in urlencoding::encode. Since extract_datadome_cookie already returns just the token value (no ; or ,), the encoding is likely unnecessary and could mismatch what DataDome expects.
Please link the DataDome spec line that requires encoding for clientid but not ip, or drop the urlencoding::encode.
| # | ||
| # This file is merged into the embedded binary config at build time by | ||
| # crates/trusted-server-core/build.rs. It is separate from trusted-server.toml to keep | ||
| # customer-specific configuration out of the shared application template. |
There was a problem hiding this comment.
📌 out of scope — backends.toml commits ~170 Arena Group publisher domains to the OSS repo.
Third time this has been flagged. Customer business data (newsweek.com, nytimes.com, washingtonpost.com, etc. mapped to raven-public.prod.saymedia.com) is visible to anyone reading the public repo. Either move to a Fastly config/secret store loaded at runtime, or add a documented private overlay file (e.g., backends.private.toml git-ignored).
If the team has decided this is fine, a comment at the top of the file saying so would close the loop.
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
📝 note — Test gap: validate_request_inner HTTP path has no coverage.
All existing tests exit early (disabled, missing-key, sample=0, sample=100+missing-key). None exercises the actual HTTP request/response flow: header construction, 200→allow, 403→block, error→fail-open.
Viceroy can host a mock backend for this. Without it, a regression like lowercasing x-datadome-params:Key or dropping the method header would ship silently.
| .next() | ||
| .unwrap_or("api-js.datadome.co") | ||
| // split() on &str always yields at least one element, so this is unreachable | ||
| .unwrap_or(url) |
There was a problem hiding this comment.
⛏ nitpick — .unwrap_or(url) with "unreachable" comment is misleading.
Prefer .expect("split always yields at least one element") — clippy/future-you wouldn't wonder why the fallback was chosen.
| settings.proxy.certificate_check, | ||
| ) | ||
| .map_err(|e| { | ||
| log::error!("Failed to build backend router: {:?}", e); |
There was a problem hiding this comment.
⛏ nitpick — {:?} on Report<_> prints the full debug backtrace; {err} (Display) is usually what you want in edge logs. Same on line 343. Fine if you're sinking to Fastly real-time logging with structured output — confirm before changing.
| /crates/integration-tests/browser/test-results/ | ||
| /crates/integration-tests/browser/playwright-report/ | ||
| /crates/integration-tests/browser/.browser-test-state.json | ||
|
|
There was a problem hiding this comment.
⛏ nitpick — Global *.vcl ignore with no .vcl files in tree. Scope to a specific directory or drop until there's content.
|
@ChristianPavilonis Please split into 2 prs |
38bb4d4 to
84f091b
Compare
No description provided.