e2e: hermetic ADK cassettes, matchFileSnapshot migration, seinfeld gzip fix #1966

Open
Stephen Belanger (Qard) wants to merge 7 commits into main from t3code/e2e-followup


Summary

  • fix(seinfeld): buildResponse was preserving content-encoding: gzip while serving already-decoded body bytes (undici decompresses at the HTTP layer). Callers such as Google ADK would then attempt a second gunzip and fail with a zlib "incorrect header check" error. Fixed by stripping content-encoding, transfer-encoding, and content-length from replayed/recorded responses. handleRecord also now returns buildResponse() instead of realResponse.clone() for non-binary-draft bodies, which avoids the empty bodies that clone() can return on some Node versions after recordResponseDraft() has tee'd the stream.

  • feat(e2e/google-adk): Records hermetic cassettes for both ADK variants (0.6.1 and 1.0.0). Each cassette has two Gemini entries — call 0 returns a functionCall for get_weather, call 1 returns the final answer. A per-scenario cassette-filter.mjs ignores the ?key= query param and all body fields (volatile functionCallId UUIDs), so matching relies solely on callIndex.

  • refactor(e2e): Adds a matchFileSnapshot wrapper in helpers/file-snapshot.ts that is a no-op in canary mode. All scenario test files and assertions modules are migrated from toMatchFileSnapshot to the new helper, so canary runs skip snapshot comparison for non-deterministic live API responses.

  • chore(e2e): Restores DRAIN_DELAY_MS to 2000ms and removes the temporary onRecord debug callback from cassette-preload.mjs. Also adds the installRecordModeGuard function that prevents premature cassette flush during multi-step ADK tool-call flows.
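
For reviewers who want to see the failure behind the seinfeld fix in isolation, here is a minimal sketch using only Node's built-in zlib module. It is not seinfeld code: `tryGunzip` is a hypothetical helper, and `decoded` stands in for the body bytes that reach the handler after the HTTP layer has already decompressed them.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Bytes as they would travel on the wire with content-encoding: gzip.
const wireBytes = gzipSync(Buffer.from(JSON.stringify({ ok: true })));

// Stand-in for what the handler sees: already-decoded plain JSON bytes.
const decoded = gunzipSync(wireBytes);

// Hypothetical helper: what a caller does when it trusts a stale
// content-encoding: gzip header and tries to decompress the body itself.
function tryGunzip(bytes: Uint8Array): string {
  try {
    return gunzipSync(bytes).toString("utf8");
  } catch (err) {
    return `error: ${(err as Error).message}`;
  }
}

console.log(tryGunzip(wireBytes)); // first (legitimate) gunzip succeeds
console.log(tryGunzip(decoded)); // second gunzip fails: bytes are plain JSON
```

The second call is expected to report zlib's "incorrect header check", the same error described in the first bullet above.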

Test plan

  • pnpm run test:e2e:hermetic -t "google adk" passes (32/32) without any Google API key
  • Running hermetic twice produces identical output (no snapshot drift)
  • All wiped cassettes restored from git — hermetic suite for other scenarios unaffected
  • matchFileSnapshot migration verified: zero remaining toMatchFileSnapshot calls in e2e/scenarios/

🤖 Generated with Claude Code

Stephen Belanger and others added 7 commits May 6, 2026 16:38
- Remove dotenv loading from vitest.setup.ts — mise handles .env loading
- Drop the dotenv dev-dependency from e2e/package.json
- Simplify test:e2e:record to an inline env-var prefix; delete the
  record-cassettes.mjs wrapper script that did nothing beyond that
- Delete dev-packages/seinfeld/LICENSE (repo-level license applies)
- Delete dev-packages/seinfeld/scripts/migrate-from-legacy.mjs — the
  one-time migration it performed is complete and the file is unreferenced
- Clarify the format versioning rationale in seinfeld/src/format/v1.ts
- Sort JSON keys when writing cassette files (file-store.ts) so
  re-recordings produce deterministic diffs and snapshot comparisons
  are not confused by non-deterministic key insertion order

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
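
The key-sorting change in the last item above can be illustrated with a short sketch. This is an assumption about the general technique, not the actual file-store.ts code; `sortKeys` and `stableStringify` are hypothetical names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Rebuild a JSON value so that every object has its keys in sorted order.
function sortKeys(value: Json): Json {
  if (Array.isArray(value)) {
    // Array order is meaningful, so only the elements are normalized.
    return value.map((item) => sortKeys(item));
  }
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  return value; // primitives pass through unchanged
}

// Serialize with sorted keys so two recordings of the same data
// produce byte-identical files regardless of key insertion order.
function stableStringify(value: Json): string {
  return JSON.stringify(sortKeys(value), null, 2);
}
```

With this shape, re-recording a cassette whose entries were built in a different key order would still write the same text, which keeps diffs limited to real changes.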
…faults

- Replace global cassette-filters.mjs registry with per-scenario cassette-filter.mjs
  files; cassette-preload.mjs now dynamically imports them from the scenario dir
- Default redact to 'paranoid' in seinfeld recorder (was opt-in)
- Gate provider key placeholder injection on replay mode only (not record/passthrough)
- Delete obsolete cassette-filters.mjs and record-cassettes.mjs helper scripts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Node.js undici decompresses gzip/deflate at the HTTP layer before
passing the body to MSW handlers. The stored body bytes are therefore
already plain JSON/text. buildResponse was preserving the original
content-encoding header, which caused callers (e.g. Google ADK) to
attempt a second gunzip of already-decoded bytes, producing a zlib
"incorrect header check" error and making the response unreadable.

Fix: strip content-encoding, transfer-encoding, and content-length
from the Response built by buildResponse (both replay and record
return paths).

Also switch handleRecord to return buildResponse() instead of
realResponse.clone() for non-binary-draft bodies. After
recordResponseDraft() tees the body stream, clone() can return an
empty body on some Node versions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
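
A minimal sketch of the header stripping described in this commit, assuming headers are held as a plain name-to-value map. The real buildResponse may be structured differently; `WIRE_HEADERS` and `stripWireHeaders` are hypothetical names.

```typescript
// Headers that describe how the original body was encoded on the wire.
// They no longer apply once the stored body is already decoded.
const WIRE_HEADERS = ["content-encoding", "transfer-encoding", "content-length"];

// Return a copy of the headers without the wire-encoding entries.
// Header names are compared case-insensitively.
function stripWireHeaders(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    if (WIRE_HEADERS.indexOf(name.toLowerCase()) === -1) {
      result[name] = headers[name];
    }
  }
  return result;
}
```

The filtered map could then be passed as the `headers` option when constructing the replayed Response, so the runtime derives the length and framing from the decoded body itself.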
DRAIN_DELAY_MS was temporarily raised to 15000ms during ADK cassette
debugging. The root cause (gzip content-encoding bug in seinfeld) is
now fixed, so restore the original 2-second drain delay.

Also remove the temporary onRecord stderr callback that was added for
diagnostics.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Record cassettes for both ADK versions (0.6.1 and 1.0.0) and update
snapshots to match. The cassette filter ignores query params (Google
API key) and all body fields (volatile functionCall IDs), relying on
callIndex alone for stable matching.

Both variants now produce two cassette entries: call 0 returns a
functionCall for get_weather, call 1 returns the final answer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
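
The matching strategy in this commit can be sketched as follows. The real cassette-filter.mjs interface is not shown in this PR text, so every name here (`RequestKey`, `normalizeForMatch`, `sameRequest`) is hypothetical. The sketch also assumes method and URL path are still compared, which the commit message does not state.

```typescript
// Hypothetical shape of what a matcher sees for each request.
interface RequestKey {
  method: string;
  url: string;
  body: unknown;
  callIndex: number;
}

// Drop the volatile parts: the query string (API key) and the whole body
// (function-call IDs change on every run).
function normalizeForMatch(req: RequestKey): { method: string; url: string; callIndex: number } {
  const queryStart = req.url.indexOf("?");
  const url = queryStart === -1 ? req.url : req.url.slice(0, queryStart);
  return { method: req.method, url, callIndex: req.callIndex };
}

// Two requests match when their normalized forms agree, so in practice
// callIndex is what distinguishes call 0 from call 1.
function sameRequest(a: RequestKey, b: RequestKey): boolean {
  const x = normalizeForMatch(a);
  const y = normalizeForMatch(b);
  return x.method === y.method && x.url === y.url && x.callIndex === y.callIndex;
}
```

Under these assumptions a replayed request with a different key and different body fields still resolves to the recorded entry with the same callIndex.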
The new matchFileSnapshot wrapper in helpers/file-snapshot.ts is a
no-op in canary mode (where snapshot comparison is skipped because live
API responses are non-deterministic). All scenario test files and
assertions modules are migrated to use the new helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
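
A rough sketch of how such a wrapper could look, assuming the snapshot assertion is injected and canary mode is passed in as a boolean. How helpers/file-snapshot.ts actually detects canary mode is not shown here; `SnapshotAssert` and `makeMatchFileSnapshot` are hypothetical names.

```typescript
// Signature of the underlying snapshot assertion.
type SnapshotAssert = (value: string, path: string) => Promise<void>;

// Build a matchFileSnapshot-style helper that skips comparison in canary mode.
function makeMatchFileSnapshot(isCanary: boolean, assertSnapshot: SnapshotAssert): SnapshotAssert {
  return async (value, path) => {
    if (isCanary) {
      return; // live API responses are non-deterministic, so do not compare
    }
    await assertSnapshot(value, path);
  };
}
```

In a Vitest suite the injected function might be `(value, path) => expect(value).toMatchFileSnapshot(path)`, which keeps call sites identical in hermetic and canary runs.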