Add Sourcepoint first-party CMP integration#625
Add Sourcepoint first-party CMP integration#625ChristianPavilonis wants to merge 20 commits intomainfrom
Conversation
… through first-party proxy
… URLs to first-party paths
…visibility, improve docs
aram356
left a comment
There was a problem hiding this comment.
Summary
Adds a well-structured Sourcepoint CMP first-party proxy with four rewriting layers (HTML attributes, JS body, runtime config trap, client-side DOM guard). The implementation follows existing integration patterns closely, has 14 Rust + 6 JS tests, and ships disabled by default. A few items need attention before merge.
Blocking
🔧 wrench
take_body_str()panics on non-UTF-8 upstream responses:response.take_body_str()at line 400 will panic if the upstream CDN returns invalid UTF-8 with a JS Content-Type. Read bytes and attempt conversion with a fallback instead. (sourcepoint.rs:400)- CI failure —
format-typescript: Prettier check is failing ontest/integrations/sourcepoint/script_guard.test.ts. Must be fixed before merge.
❓ question
- Redirect
Locationheaders not rewritten: The JS body rewrite is gated onstatus == 200. If the CDN returns a 3xx redirect with aLocationpointing tocdn.privacy-mgmt.com, the browser follows it back to the third-party CDN, defeating the proxy. (sourcepoint.rs:394)
Non-blocking
🤔 thinking
- Upstream response headers forwarded verbatim:
Set-Cookieand other headers fromcdn.privacy-mgmt.compass through to the client on the non-rewritten path. Same as Didomi, but worth confirming this is intended. (sourcepoint.rs:418) - Regex matches mismatched quotes: The CDN URL pattern can match
'url"(single open, double close). Extremely unlikely in minified JS but could produce malformed output. (sourcepoint.rs:65)
♻️ refactor
- Redundant
rewrite_sdkguard: Checked in bothhandles_attribute()andrewrite()— the second is unreachable. (sourcepoint.rs:437) - Head injector config property names: Hardcoded Sourcepoint config keys could go stale. A test asserting known property names appear in the generated script would catch omissions. (
sourcepoint.rs:467)
🌱 seedling
- No body size limit: The body is read entirely into memory with no upper bound. A
Content-Lengthcheck would prevent OOM from unexpected large responses. (sourcepoint.rs:400) - Missing validation test for invalid
cdn_origin: No test proves thatregister()fails whencdn_originis not a valid URL. (sourcepoint.rs:153)
⛏ nitpick
- Manual
[sourcepoint]log prefix: Other integrations rely on thelogcrate's module path rather than a manual bracketed prefix. (sourcepoint.rs:355)
CI Status
- cargo fmt: PASS
- cargo test: PASS
- vitest: PASS
- format-typescript: FAIL (Prettier issue in
script_guard.test.ts) - format-docs: PASS
- browser integration tests: PASS
- integration tests: PASS
- CodeQL: FAIL (likely infra)
prk-Jr
left a comment
There was a problem hiding this comment.
Reviewed commit a211eb0. Three blocking issues require fixes before merge; the rest are non-blocking.
Blocking (must fix before merge)
B1 — CI: format-typescript still failing on current HEAD
The latest commit (a211eb02) re-introduced a Prettier violation in crates/js/lib/test/integrations/sourcepoint/script_guard.test.ts. CI's format-typescript check will fail on this commit. Fix with:
cd crates/js/lib && npm run format -- --write
then commit the result.
B2 — SSRF via cdn_origin (inline comment on line 96)
See inline comment. A syntactically valid but host-unrestricted cdn_origin allows proxying to any IP including 169.254.169.254 (cloud metadata). Needs a custom host validator.
B3 — PUT/PATCH unreachable (inline comment on line 337)
routes() only registers GET and POST; the PUT/PATCH arms in handle() are dead code. Either register the methods or remove the arms.
Important (non-blocking)
I1 — Single-quoted '/unified/' missed by SP_ORIGIN_UNIFIED_PATTERN (inline on line 81) — fix is a one-character change to the regex character class.
I2 — BackendConfig::from_url per request (inline on line 357) — pre-compute in new() following the pattern in aps.rs / prebid.rs.
I3 — PR checklist says "Uses tracing macros" but codebase uses log crate
The checklist item reads "Uses tracing macros (not println!)" but CLAUDE.md specifies log. The source correctly uses log::info! throughout. The checklist entry should read "Uses log macros (not println!)" to avoid confusion for future contributors.
I4 — JS-rewrite path drops all upstream headers (inline on line 403) — Response::new() discards Vary, CORS, and security headers. Start from response.clone_without_body() instead.
I5 — No test for build_target_url with a path-bearing cdn_origin (inline on line 337) — set_path silently drops any path prefix in cdn_origin. Should be tested and either fixed or documented as an explicit constraint.
Suggestions (non-blocking)
S1 — URL normalisation logic duplicated across Rust and TypeScript
parse_sourcepoint_url in sourcepoint.rs and normalizeSourcepointUrl in script_guard.ts implement identical protocol-relative URL normalisation in parallel. Not a defect, but a cross-reference comment in each file would prevent future fixes being applied to only one side.
S2 — Redundant rewrite_sdk guard in rewrite() (inline on line 437) — unreachable because handles_attribute already gates on it. Either remove with a comment or document the intentional belt-and-suspenders.
S3 — No test for single-quoted CDN URLs (inline on line 81)
S4 — Unchecked WASM build in PR checklist
The test plan has an unchecked - [ ] WASM build item, but CI's WASM build passes. The checkbox should be checked to accurately reflect build status.
S5 — register missing # Examples doc section (inline on line 314) — CLAUDE.md requires this for all public API functions.
- Replace #[validate(url)] with a custom validator that restricts cdn_origin to *.privacy-mgmt.com hosts, preventing SSRF via arbitrary origins (e.g. cloud metadata endpoints). - Remove unreachable PUT/PATCH arms from the request body match since routes() only registers GET and POST. - Fix Prettier formatting in script_guard.test.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace take_body_str() with take_body_bytes() + String::from_utf8() to avoid panicking on non-UTF-8 upstream responses. - Rewrite Location headers on 3xx redirects that point to cdn.privacy-mgmt.com so browsers stay on the first-party proxy. - Preserve upstream CORS headers on the JS-rewrite path instead of discarding them when building a fresh Response. - Extend SP_ORIGIN_UNIFIED_PATTERN to match both single- and double-quoted "/unified/" chunk paths, preserving the original quote character in the replacement. - Normalise log prefixes from [sourcepoint] to Sourcepoint: for consistency with APS/Prebid style. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove redundant rewrite_sdk check from rewrite() since handles_attribute() already gates on it; update test to verify the guard at the handles_attribute level. - Add # Examples section to register() per documentation standards. - Add tests for cdn_origin validation (rejects non-privacy-mgmt.com hosts, accepts valid origins). - Add test for single-quoted origin+'/unified/' rewrite pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review feedback addressedAll blocking and important findings have been fixed across three commits. Replies posted on each inline thread. Blocking (fixed)
Important non-blocking (fixed)
Suggestions (fixed)
Intentionally skipped (with rationale in thread replies)
All CI checks pass locally: |
Refactor normalizeSourcepointUrl to remove the bare-domain startsWith check that triggered CodeQL "Incomplete URL substring sanitization" alerts. The host === exact match was already the security boundary; now the normalization layer no longer references the CDN hostname at all, eliminating the static analysis finding. Add a Content-Length guard (5 MB) before reading upstream response bodies into memory for JavaScript rewriting, preventing unbounded memory consumption from unexpectedly large responses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
aram356
left a comment
There was a problem hiding this comment.
Summary
Well-executed revision — every prior blocker is addressed and the four-layer rewriting architecture (HTML attrs / JS body / window._sp_ trap / client DOM guard) is cleanly separated. Two remaining correctness issues in the upstream response handling need fixes before merge: the body-size guard is bypassed when Content-Length is absent, and the UTF-8 fallback path silently drops upstream headers. A few non-blocking refinements around redirect rewriting, config scope, and test coverage follow.
Blocking
🔧 wrench
- Body-size guard bypassed on chunked/unknown-length responses —
MAX_REWRITE_BODY_SIZEonly applies whenContent-Lengthis present; otherwisetake_body_bytes()reads unbounded (sourcepoint.rs:456-470) - UTF-8 fallback drops upstream headers — the fallback constructs a fresh
Responsewith only status + Content-Type, discarding Set-Cookie, CORS, Cache-Control, Vary, etc., diverging from the passthrough path (sourcepoint.rs:480-488)
Non-blocking
♻️ refactor
- Redirect
Locationrewrite is a literalString::replace— misses protocol-relative//cdn.privacy-mgmt.com/...values (sourcepoint.rs:430-438) validate_cdn_origindoes not check scheme —ftp://cdn.privacy-mgmt.compasses validation and only fails at the first proxied request (sourcepoint.rs:135-150)
🤔 thinking
- Config/rewriter host scope mismatch —
validate_cdn_originaccepts*.privacy-mgmt.combut rewriters hardcodecdn.privacy-mgmt.com; configuring a different subdomain silently half-enables the integration (sourcepoint.rs:143, 69, 197, 598) - Rewritten-JS path hard-codes
Cache-Control— diverges from the passthrough path'sapply_cache_headerspolicy without explanation (sourcepoint.rs:498-501) - Property trap only catches reassignment of
window._sp_— nested mutation (window._sp_.config.X = ...) is not patched; alsos.replace()in the JS helper is first-occurrence only (sourcepoint.rs:568-600)
🌱 seedling
copy_headersforwardsAuthorization— credential-leak footgun if publishers reuse bearer tokens across origins (sourcepoint.rs:222-233)- No direct test for
handle()— redirect rewriting, UTF-8 fallback, Content-Length guard, Content-Type gating all lack unit coverage (sourcepoint.rs:372-522)
👍 praise
- Solid response to the prior review — every prior 🔧/❓ addressed with well-chosen fixes; four-layer rewriting architecture is cleanly separated and well documented
CI Status
- cargo fmt: PASS
- cargo clippy: PASS
- cargo test: PASS
- vitest: PASS
- format-typescript: PASS
- format-docs: PASS
- browser integration tests: PASS
- CodeQL: PASS
…t rewrite - Skip JS rewrite when Content-Length is absent (chunked/unknown) to prevent unbounded memory reads - Reuse original response on UTF-8 fallback to preserve upstream headers - Extract redirect Location rewriting into a URL-parsing helper that handles absolute, protocol-relative, and relative redirects - Tighten cdn_origin validation to exactly cdn.privacy-mgmt.com and reject non-HTTP(S) schemes - Drop Authorization from forwarded headers to prevent credential leakage - Document Cache-Control divergence and property trap limitations - Add 7 new tests for validation and redirect rewriting
aram356
left a comment
There was a problem hiding this comment.
Summary
Round-3 blockers are all correctly addressed — UTF-8 fallback now reuses the original response (preserving headers), Content-Length-missing short-circuits the rewrite before take_body_bytes, rewrite_redirect_location handles absolute / protocol-relative / relative via Url::join, validate_cdn_origin checks scheme and pins the host, and Authorization is excluded from forwarded headers. No remaining 🔧/❓ findings. Submitting as change-request per author request; the items below are refinements (🤔/♻️/🌱/⛏) rather than correctness blockers — each can be resolved or deferred to a follow-up at the author's discretion.
Non-blocking
♻️ refactor
apply_cache_headersinvoked at five return sites — redirect branch, CL-too-large branch, CL-missing branch, UTF-8-fallback branch, and the fallthrough (sourcepoint.rs:476, 505, 513, 528, 567). Restructuringhandle()around a single exit point (or wrapping the proxiedResponsein a post-handle adapter) would remove the per-branch bookkeeping. Low priority.
🌱 seedling
- No direct test for
handle()(sourcepoint.rs:414-569) — redirect rewriting, UTF-8 fallback, CL-missing skip, CL-too-large skip, and non-JS passthrough branches all lack unit coverage. Acknowledged in the round-3 response; tracking here for follow-up. - CSP interaction not documented (
docs/guide/integrations/sourcepoint.md) — publishers withContent-Security-Policy: script-src https://cdn.privacy-mgmt.comwill see rewritten first-party scripts blocked the moment they enable the integration. A note under "HTML Rewriting" calling out that CSP must allow the first-party host would save an incident report.
CI Status
- cargo fmt: PASS
- cargo clippy: PASS
- cargo test: PASS
- vitest: PASS
- format-typescript: PASS
- format-docs: PASS
- browser integration tests: PASS
- CodeQL: PASS
|
Implemented Sourcepoint follow-up updates and pushed commit. What I changed:
I also resolved remaining open review threads and re-requested review from @prk-Jr. Validation:
Commit: e2fdb0c |
prk-Jr
left a comment
There was a problem hiding this comment.
Summary
The Sourcepoint integration covers the expected rewrite layers, but I found two blocking issues around cookie semantics plus one follow-up consistency concern in the JS guard.
Blocking
🔧 wrench
-
Sourcepoint cookie state never makes the round trip: the proxy does not forward any
Cookieheader upstream, so Sourcepoint cookies previously set on the first-party endpoint are never sent back to Sourcepoint. That breaks the mainbaseEndpoint/ CNAME flow and authenticated-consent-style cookie-backed state. (crates/trusted-server-core/src/integrations/sourcepoint.rs:256) -
Cookie-setting Sourcepoint responses are still marked public-cacheable: both the passthrough path and the rewritten-JS path can return
Cache-Control: public, max-age=...even when upstream sendsSet-Cookie. If Sourcepoint uses server-set cookies on the first-party endpoint, that risks caching user-specific consent state. (crates/trusted-server-core/src/integrations/sourcepoint.rs:269,crates/trusted-server-core/src/integrations/sourcepoint.rs:377)
Non-blocking
🤔 thinking
- JS and Rust host checks diverge on explicit ports: the client-side DOM guard compares
URL.host, while the Rust rewriter compares hostname-only. That creates inconsistent behavior for URLs likehttps://cdn.privacy-mgmt.com:443/.... (crates/js/lib/src/integrations/sourcepoint/script_guard.ts:35,crates/trusted-server-core/src/integrations/sourcepoint.rs:228)
CI Status
- fmt: PASS
- clippy: FAIL
- rust tests: FAIL
- js tests: PASS
- integration artifact prep: FAIL
|
Addressed the latest review round in commit 1a40868 and pushed to feature/sourcepoint-integration. What changed:
Validation:
I replied to and resolved the remaining review threads and re-requested review from @prk-Jr and @aram356. |
…ab/trusted-server into feature/sourcepoint-integration
aram356
left a comment
There was a problem hiding this comment.
Summary
Four-layer rewriting architecture (HTML attrs / JS body regex / window._sp_ trap / client DOM guard) is well-structured, and most prior 🔧 / ❓ findings have been addressed across rounds 1–4 (SSRF guard, UTF-8 safety, redirect rewriting, cookie allowlist, no-store on cookie responses, hostname-only host checks, scheme/host/path/query/fragment validation). However, a stale merge from main introduced a hard build break that fails three CI jobs.
Blocking
🔧 wrench
- Build break:
IntegrationProxy::handlesignature drifted from the trait —#581addedservices: &RuntimeServicesto the trait; the merge from main did not update Sourcepoint's impl.cargo fmt,cargo test, andprepare integration artifactsall fail withE0050. (sourcepoint.rs:540-549) - Use platform-agnostic
client_ipfromRuntimeServices—copy_headerscalls Fastly-specificoriginal_req.get_client_ip_addr(). WithRuntimeServicesnow plumbed through, switch toservices.client_info().client_ipto match Didomi's pattern and keeptrusted-server-coreportable. (sourcepoint.rs:269-272)
Non-blocking
🤔 thinking
- Property-trap config field allowlist is hardcoded and silently goes stale — only
baseEndpoint,mmsDomain,wrapperAPIOrigin,cmpOrigin, andmetricUrlare patched. New Sourcepoint fields would silently leak. A snapshot test asserting the exact patched set would force review when the list changes. (sourcepoint.rs:735-742) auth_cookie_nameis unvalidated — empty string, whitespace, or;/=characters are accepted. Empty string especially: a malformed cookie pair=valueparses to name"", whichauth_cookie_name = ""would forward. Add#[validate](non-empty trimmed, length ≤ 64,[A-Za-z0-9_-]). (sourcepoint.rs:124-129)
♻️ refactor
is_likely_javascript_pathmatches non-JS API paths —/wrapper/v2/*is JSON but is treated as JS-likely, forcingAccept-Encoding: identityfor every CMP API hit. Tightening the heuristic (e.g. only.jsand/unified/) preserves the optimisation only where applicable. (sourcepoint.rs:431-433)
📌 out of scope (carryover from earlier rounds, still open)
- No direct unit test for
handle()— redirect rewriting, UTF-8 fallback,Content-Lengthmissing/too-large branches still lack coverage. apply_cache_headersinvoked at five return sites inhandle()— restructuring around a single exit point would remove the per-branch bookkeeping.- CSP interaction undocumented — publishers with
Content-Security-Policy: script-src https://cdn.privacy-mgmt.comwill see rewritten first-party scripts blocked the moment they enable rewriting. A note indocs/guide/integrations/sourcepoint.mdunder "HTML Rewriting" would save an incident report.
CI Status
- cargo fmt: FAIL (caused by trait-signature drift)
- cargo test: FAIL (caused by trait-signature drift)
- prepare integration artifacts: FAIL (caused by trait-signature drift)
- vitest: PASS
- format-typescript: PASS
- format-docs: PASS
- CodeQL: PASS
- Analyze (rust / actions / javascript-typescript): PASS
aram356
left a comment
There was a problem hiding this comment.
Summary
Fifth-round review against main. CI is fully green and the prior 84 review comments across 4 rounds have all been addressed. New findings below are incremental hardening — one cache-poisoning concern that depends on upstream Cache-Control behavior and several non-blocking quality observations.
Blocking
🔧 wrench
- Cache poisoning risk: forwarded
Cookienot reflected inVaryfor default cache headers (sourcepoint.rs:386)
Non-blocking
🤔 thinking
- Inline
<script>IIFE has notry/catch— can crash publisher hydration (sourcepoint.rs:784) - JS-rewrite success path leaves stale
Vary: Accept-Encoding(sourcepoint.rs:499) BackendConfig::from_urlcalled per-request even thoughcdn_originis now static (sourcepoint.rs:593)apply_cache_headersis a no-op for redirects withoutSet-Cookie(sourcepoint.rs:637)
♻️ refactor
- Duplicated
Set-Cookie → private, no-storepolicy betweenapply_cache_headersandrewrite_javascript_response(sourcepoint.rs:492) parse_sourcepoint_url's bare-host branch uses literalstarts_with(sourcepoint.rs:516)
🌱 seedling
validate_auth_cookie_namedoes not reject reserved RFC 6265 attribute names (sourcepoint.rs:235)is_likely_javascript_pathdoes not include.mjs(sourcepoint.rs:467)
CI Status
- fmt: PASS
- clippy: PASS
- rust tests: PASS
- js tests: PASS
- CodeQL: PASS
- browser/integration tests: PASS
aram356
left a comment
There was a problem hiding this comment.
Summary
All previously-flagged blockers are addressed. Multi-layer rewriting architecture is sound, validation is tight (host pinned, scheme/path/query/fragment all rejected), cookie hygiene is correct (allowlist + private, no-store on Set-Cookie), and 41 unit tests cover the slices. Approving with non-blocking observations the author should consider.
Blocking
❓ question
expect()on backend creation in constructor: panics on non-NameInUseFastly errors; diverges from peer integrations and CLAUDE.md startup-error guidance (sourcepoint.rs:255-256)- OPTIONS / HEAD methods not registered: CORS preflight will 404 against
/integrations/sourcepoint/cdn/*; needs empirical confirmation that Sourcepoint's API surface doesn't trigger preflight (sourcepoint.rs:636-638)
Non-blocking
🤔 thinking
- JSON responses are passthrough:
is_javascript_responsegates rewriting on JS content-types only; Sourcepoint message-API responses may contain CDN URLs in JSON fields that leak through (sourcepoint.rs:710-713) Content-Length-absent responses skip rewriting: chunked / unknown-length bodies silently bail; safe but coverage depends on upstream behavior (sourcepoint.rs:733-740)- Pre-computed backend diverges from peer integrations:
permutive,lockr,datadome,apsall callBackendConfig::from_urlinline at request time. Sourcepoint pre-computes innewwith no Fastly Compute perf benefit. Either align or document (linked to Q1) - Head-injector trap silently swallows install errors:
try{...}catch(e){}discards everything; consent traffic could leak to the third-party CDN with no observability (sourcepoint.rs:828-852)
♻️ refactor
copy_headersreturns an unlabeled bool: a named enum makes the intent (forwarded_cookies) self-documenting (sourcepoint.rs:315-351)- Error messages duplicate the integration name:
TrustedServerError::Integrationalready carriesintegration: SOURCEPOINT_INTEGRATION_ID; messages like "Failed to build Sourcepoint target URL" repeat what the type already says is_javascript_responseusescontains: a CT liketext/plain; charset=javascript-utf8would trigger rewriting; stricter parsing would be more defensive (very minor)
🌱 seedling
- Cookie allowlist is hardcoded (sourcepoint.rs:57-69): future PR could expose this as a list in config so new Sourcepoint cookies don't require a code release
is_likely_javascript_pathcovers only.js,.mjs,/unified/: maintained list; document as such (sourcepoint.rs:497-499)- No end-to-end test for
handle(): redirect rewriting, UTF-8 fallback, Content-Length skip, cookie filtering, and Cache-Control logic are individually unit-covered, but no test exercises the full path (hard to mockRequest::send) - Property trap doesn't catch nested mutation (sourcepoint.rs:818-824): documented limitation; a
Proxyono.configwould catch nested writes - 5 MB body cap is hardcoded: could be configurable per-integration (sourcepoint.rs:50)
⛏ nitpick
# Examplesdoctest isignored: doesn't compile and doesn't really demonstrate usage; either replace with a real doctest or omit since every other integration'sregisterfollows the same shape (sourcepoint.rs:609-613)- Comment block mixes "what" and "why" (sourcepoint.rs:325-331): the Authorization-omission rationale sits inside the Accept-Encoding paragraph; splitting separates two distinct decisions
📌 out of scope
- HEAD method support: some publishers do
fetch(url, {method: 'HEAD'})for asset checks; track separately if real traffic shows the need - JSON body rewriting: future PR could parse JSON and rewrite known URL-bearing fields, mirroring how the head-injector trap rewrites runtime config
CI Status
- cargo fmt: PASS
- cargo clippy: PASS
- cargo test: PASS (41 Sourcepoint tests pass locally)
- vitest: PASS
- format-typescript: PASS
- format-docs: PASS
- browser integration tests: PASS
- integration tests: PASS
- CodeQL: PASS
| impl SourcepointIntegration { | ||
| fn new(config: Arc<SourcepointConfig>) -> Arc<Self> { | ||
| let backend_name = BackendConfig::from_url(&config.cdn_origin, true) | ||
| .expect("should configure Sourcepoint backend from validated origin"); |
There was a problem hiding this comment.
❓ question — expect() here will panic if Backend::builder().finish() returns anything other than NameInUse (host-with-port length, transient platform errors). validate_cdn_origin ensures URL parsing succeeds, but it can't guarantee the Fastly registration call. CLAUDE.md says invalid enabled config should surface as startup/config errors, not panics. Peer integrations (permutive, lockr, datadome, aps) defer backend creation to request handling and propagate errors via Result.
Since Fastly Compute reinitializes per request, there's no perf benefit to pre-computing in the constructor. Suggest changing new to return Result, propagating up through build() → register().
| } | ||
|
|
||
| fn routes(&self) -> Vec<IntegrationEndpoint> { | ||
| vec![self.get("/cdn/*"), self.post("/cdn/*")] |
There was a problem hiding this comment.
❓ question — Only GET and POST are registered. CORS preflight (OPTIONS) requests will 404 and block any cross-origin XHR that requires preflight. Was OPTIONS verified unnecessary against Sourcepoint's actual API surface? The /wrapper/v2/messages POST with Content-Type: application/json and custom headers would normally trigger a preflight.
| if response.get_status() == StatusCode::OK | ||
| && self.config.rewrite_sdk | ||
| && Self::is_javascript_response(&response) | ||
| { |
There was a problem hiding this comment.
🤔 thinking — is_javascript_response only triggers rewriting on javascript/ecmascript content types. Sourcepoint's /wrapper/v2/messages and similar endpoints respond with JSON that may contain cdn.privacy-mgmt.com URLs in fields like mmsDomain. Those URLs flow through to the client; only the runtime trap and DOM guard catch them.
The head-injector doc already mentions reporting missed fields, which suggests this is partially expected — but explicitly documenting "JSON bodies are not rewritten" would set the right expectation for future maintainers.
| None => { | ||
| log::warn!( | ||
| "Sourcepoint: no Content-Length for {path}, \ | ||
| skipping rewrite to avoid unbounded memory read (reason: missing_content_length)" |
There was a problem hiding this comment.
🤔 thinking — Chunked / unknown-length responses silently bail out of body rewriting. Memory-safe, but coverage now depends on upstream behavior. A code comment that this is a known coverage gap (relying on the trap and DOM-guard backstops) would help the next maintainer. log::info! may also be more appropriate than warn! here since this is the safe path, not a degraded one.
| "get:function(){{return v}},", | ||
| "set:function(n){{v=p(n)}}", | ||
| "}});", | ||
| "}}catch(e){{}}}})();", |
There was a problem hiding this comment.
🤔 thinking — try{...}catch(e){} discards every error. If the trap fails to install, consent traffic leaks to the third-party CDN with zero observability for the publisher.
A single console.warn('[ts:sourcepoint] _sp_ trap install failed', e) in the catch keeps the script best-effort while making install failures debuggable.
| client_ip: Option<IpAddr>, | ||
| original_req: &Request, | ||
| proxy_req: &mut Request, | ||
| ) -> bool { |
There was a problem hiding this comment.
♻️ refactor — The bool return is "did we forward Sourcepoint cookies upstream?" which then gates cache policy at call sites. A named enum (enum CookieForwarding { None, Forwarded }) makes the intent self-documenting. Minor.
| /// is a conservative preflight — false negatives just mean we skip the | ||
| /// `Accept-Encoding: identity` optimisation for that request. | ||
| fn is_likely_javascript_path(path: &str) -> bool { | ||
| path.ends_with(".js") || path.ends_with(".mjs") || path.starts_with("/unified/") |
There was a problem hiding this comment.
🌱 seedling — If Sourcepoint introduces another chunk path convention beyond /unified/, the Accept-Encoding: identity optimization silently turns off for those paths. Correctness is preserved (the post-arrival is_javascript_response check still gates rewriting), it's just a quiet perf regression. A short comment noting this is a maintained list helps maintenance.
| // initial assignment is not caught. The JS body regex rewriter | ||
| // covers that case for string literals in bundled code. | ||
| // - `s.replace()` replaces only the first occurrence per call, which | ||
| // is fine for the current set of scalar URL config fields. |
There was a problem hiding this comment.
🌱 seedling — Documented limitation: window._sp_.config.baseEndpoint = "..." after the initial assignment isn't trapped. A Proxy on o.config would catch nested writes; the JS body regex covers most real-world cases so this can wait.
| /// Maximum response body size (5 MB) that will be read into memory for | ||
| /// JavaScript rewriting. Responses larger than this are passed through | ||
| /// unmodified to avoid unbounded memory consumption. | ||
| const MAX_REWRITE_BODY_SIZE: u64 = 5 * 1024 * 1024; |
There was a problem hiding this comment.
🌱 seedling — 5 MB body cap is a hardcoded constant. Could be configurable per-integration if a publisher needs larger CMP bundles. Not urgent.
| /// | ||
| /// # Examples | ||
| /// | ||
| /// ```ignore |
There was a problem hiding this comment.
⛏ nitpick — The ignored doctest doesn't compile and doesn't really demonstrate usage. CLAUDE.md says public APIs need examples; an ignored snippet meets the letter but not the spirit. Either replace with a real doctest or omit (every other integration's register follows the same shape).
Summary
cdn.privacy-mgmt.comandgeo.privacymanager.iotraffic through Trusted Server, eliminating third-party requests for consent management assets.Changes
crates/trusted-server-core/src/integrations/sourcepoint.rsrewrite_script_content), head injector (window._sp_property trap), Accept-Encoding scoping, backend selection, and 14 unit testscrates/trusted-server-core/src/integrations/mod.rssourcepointmodule and wire it into the integration registrycrates/js/lib/src/integrations/sourcepoint/index.tscrates/js/lib/src/integrations/sourcepoint/script_guard.ts<script>/<link>elements pointing at Sourcepoint CDN and rewrites them to first-party pathscrates/js/lib/test/integrations/sourcepoint/script_guard.test.tsdocs/guide/integrations/sourcepoint.mddocs/guide/integrations-overview.mdtrusted-server.toml[integrations.sourcepoint]config stanza (disabled by default)Closes
Closes #145
Closes #344
Closes #345
Test plan
cargo test --workspacecargo clippy --workspace --all-targets --all-features -- -D warningscargo fmt --all -- --checkcd crates/js/lib && npx vitest runcd crates/js/lib && npm run formatcd docs && npm run formatcargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1fastly compute serveChecklist
unwrap()in production code — useexpect("should ...")logmacros (notprintln!)