diff --git a/.agents/skills/e2e-tests/SKILL.md b/.agents/skills/e2e-tests/SKILL.md
index 5127fdc46..339021626 100644
--- a/.agents/skills/e2e-tests/SKILL.md
+++ b/.agents/skills/e2e-tests/SKILL.md
@@ -53,6 +53,7 @@ Cassettes mock provider HTTP responses (OpenAI, Anthropic, ...) so scenarios tha
 Then run again in `BRAINTRUST_E2E_CASSETTE_MODE=replay` with no provider keys to confirm the cassette is sufficient.
 
 - Volatile fields in request bodies (e.g. AI-SDK `experimental_generateMessageId`) need a per-scenario filter. Add the scenario name and a `FilterSpec` to `e2e/helpers/cassette-filters.mjs`. The cassette layer is backed by `@braintrust/seinfeld` (`dev-packages/seinfeld`); the preload entry point is `e2e/helpers/cassette-preload.mjs`.
+- **After recording, always inspect the cassette for leaked API keys before committing.** The recorder redacts common patterns (`paranoid` preset), but confirm that no real keys appear in request headers, response bodies, or any URL query parameters. If a key slips through, remove the cassette, add a custom `redact` rule, and re-record.
 
 ## Preferred Patterns
 
diff --git a/dev-packages/seinfeld/LICENSE b/dev-packages/seinfeld/LICENSE
deleted file mode 100644
index 61b436e2f..000000000
--- a/dev-packages/seinfeld/LICENSE
+++ /dev/null
@@ -1,21 +0,0 @@
-MIT License
-
-Copyright (c) 2026 Stephen Belanger
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
diff --git a/dev-packages/seinfeld/README.md b/dev-packages/seinfeld/README.md
index c3edcea90..cb46eb788 100644
--- a/dev-packages/seinfeld/README.md
+++ b/dev-packages/seinfeld/README.md
@@ -5,24 +5,16 @@ Generic VCR/cassette library for Node.js, built on [MSW](https://mswjs.io). Reco
 ## Features
 
 - **Normalizers** (always-on, lossy) transform requests before matching. They strip volatile fields like `Authorization` headers, dynamic IDs (`experimental_generateMessageId`), or query nonces so two structurally-identical requests still match across runs. Their output is internal — never serialized.
-- **Redactors** (opt-in) transform what gets persisted to disk. They mask credentials before the cassette hits version control. Disabled by default; cassettes contain the real on-the-wire bytes unless you opt in.
+- **Redactors** transform what gets persisted to disk. They mask credentials before the cassette hits version control. The `'paranoid'` preset is applied by default; pass `redact: []` to disable.
 
 ## Security note
 
-> **Cassettes contain real request and response bytes by default, including `Authorization` headers.** This is the safer default for fidelity (downstream consumers see real responses) but it means you must either (a) enable redaction, (b) write a custom `RedactionConfig`, or (c) add cassette files to `.gitignore` if they may contain credentials.
-
-Three body-redaction gaps are worth knowing:
+Three body-redaction gaps are worth knowing even with the default `'paranoid'` preset:
 
 1. **Non-canonical content-type** — some servers return JSON with `Content-Type: text/plain`. `redactBodyFields` covers this because seinfeld attempts to parse `text` bodies as JSON before masking.
 2. **SSE event data** — streaming endpoints (OpenAI, Anthropic) emit JSON in `data:` lines. `redactBodyFields` applies to parseable `data:` lines; `redactBodyText` handles non-JSON SSE content.
 3. **Plain-text credentials** — form-encoded bodies, XML, or log-like text are opaque to field-path rules. Use `redactBodyText` with a regex.
 
-For cassettes committed to version control, use the `'paranoid'` preset, which covers all three paths:
-
-```ts
-createCassette({ name: "demo", redact: "paranoid" });
-```
-
 `'paranoid'` redacts credential headers, common credential field names at any JSON depth (`apiKey`, `token`, `secret`, `password`, `authorization`), and Bearer / `sk-` style tokens in text bodies.
 
 To detect misconfigurations at record time, add `strict: true`:
diff --git a/dev-packages/seinfeld/scripts/migrate-from-legacy.mjs b/dev-packages/seinfeld/scripts/migrate-from-legacy.mjs
deleted file mode 100644
index 3df98346b..000000000
--- a/dev-packages/seinfeld/scripts/migrate-from-legacy.mjs
+++ /dev/null
@@ -1,300 +0,0 @@
-#!/usr/bin/env node
-// @ts-check
-/**
- * Migrate cassette files from the legacy e2e VCR format to the seinfeld format.
- *
- * Usage:
- *   node dev-packages/seinfeld/scripts/migrate-from-legacy.mjs [glob-pattern-or-dir]
- *
- * Example:
- *   # Migrate all cassettes in the e2e scenarios directory
- *   node dev-packages/seinfeld/scripts/migrate-from-legacy.mjs e2e/scenarios
- *
- *   # Migrate a single file
- *   node dev-packages/seinfeld/scripts/migrate-from-legacy.mjs e2e/scenarios/openai-instrumentation/__cassettes__/openai-v6.json
- *
- * Legacy format:
- *   { version, scenario, variantKey, createdAt, entries: [{ key, request: { method, url, headers, bodyEncoding, body, bodyHash }, response: { status, statusText, headers, bodyEncoding, body?, chunks? } }] }
- *
- * Seinfeld format:
- *   { version: 1, meta: { createdAt, seinfeldVersion }, entries: [{ id, matchKey, callIndex, recordedAt, request: { method, url, headers, body: BodyPayload }, response: { status, statusText, headers, body: BodyPayload } }] }
- */
-
-import { createHash } from "node:crypto";
-import { promises as fs } from "node:fs";
-import * as path from "node:path";
-
-const SEINFELD_VERSION = "0.0.0";
-
-const args = process.argv.slice(2);
-if (args.length === 0) {
-  console.error(
-    "Usage: migrate-from-legacy.mjs [...more-paths]",
-  );
-  process.exit(1);
-}
-
-let converted = 0;
-let skipped = 0;
-let errors = 0;
-
-for (const inputPath of args) {
-  const stat = await fs.stat(inputPath).catch(() => null);
-  if (!stat) {
-    console.error(`Not found: ${inputPath}`);
-    errors++;
-    continue;
-  }
-  if (stat.isDirectory()) {
-    for await (const file of walkCassettes(inputPath)) {
-      await migrateFile(file);
-    }
-  } else {
-    await migrateFile(inputPath);
-  }
-}
-
-console.log(
-  `\nDone: ${converted} converted, ${skipped} skipped, ${errors} errors.`,
-);
-if (errors > 0) process.exit(1);
-
-// ---- migration -----------------------------------------------------------
-
-async function migrateFile(filePath) {
-  const raw = await fs.readFile(filePath, "utf8");
-  let parsed;
-  try {
-    parsed = JSON.parse(raw);
-  } catch {
-    console.error(`  SKIP (invalid JSON): ${filePath}`);
-    skipped++;
-    return;
-  }
-
-  if (!isLegacyFormat(parsed)) {
-    console.log(`  SKIP (already seinfeld or unknown format): ${filePath}`);
-    skipped++;
-    return;
-  }
-
-  try {
-    const converted_data = convertCassette(parsed);
-    const output = JSON.stringify(converted_data, null, 2) + "\n";
-    await fs.writeFile(filePath, output, "utf8");
-    console.log(`  OK: ${filePath}`);
-    converted++;
-  } catch (err) {
-    console.error(
-      `  ERROR: ${filePath}: ${err instanceof Error ? err.message : String(err)}`,
-    );
-    errors++;
-  }
-}
-
-/**
- * Returns true if the file looks like a legacy cassette (has `scenario`/`variantKey` top-level keys).
- * Returns false if it already has seinfeld's `meta` field.
- *
- * @param {unknown} data
- */
-function isLegacyFormat(data) {
-  if (!data || typeof data !== "object") return false;
-  const d = /** @type {Record} */ (data);
-  // Already converted to seinfeld format
-  if (d["meta"] !== undefined) return false;
-  // Legacy: has `scenario` or `variantKey` or flat `createdAt` at top level
-  return (
-    typeof d["scenario"] === "string" ||
-    typeof d["variantKey"] === "string" ||
-    typeof d["createdAt"] === "string"
-  );
-}
-
-/**
- * Convert a legacy cassette object to seinfeld format.
- *
- * @param {Record} legacy
- * @returns {Record}
- */
-function convertCassette(legacy) {
-  const createdAt =
-    typeof legacy["createdAt"] === "string"
-      ? legacy["createdAt"]
-      : new Date().toISOString();
-
-  const rawEntries = Array.isArray(legacy["entries"]) ? legacy["entries"] : [];
-
-  /** @type {Map} */
-  const callCounts = new Map();
-
-  const entries = rawEntries.map((raw) => {
-    const entry = /** @type {Record} */ (raw);
-    const req = /** @type {Record} */ (entry["request"] ?? {});
-    const res = /** @type {Record} */ (
-      entry["response"] ?? {}
-    );
-
-    const method = String(req["method"] ?? "GET").toUpperCase();
-    const url = String(req["url"] ?? "");
-    const matchKey = computeMatchKey(method, url);
-    const callIndex = bumpCallCount(callCounts, matchKey);
-
-    const requestBody = convertBody(
-      req["bodyEncoding"],
-      req["body"],
-      req["chunks"],
-    );
-    const responseBody = convertBody(
-      res["bodyEncoding"],
-      res["body"],
-      res["chunks"],
-    );
-
-    const id = makeEntryId(matchKey, callIndex, requestBody);
-
-    return {
-      id,
-      matchKey,
-      callIndex,
-      recordedAt: createdAt,
-      request: {
-        method,
-        url,
-        headers: req["headers"] ?? {},
-        body: requestBody,
-      },
-      response: {
-        status: res["status"] ?? 200,
-        statusText: res["statusText"] ?? "OK",
-        headers: res["headers"] ?? {},
-        body: responseBody,
-      },
-    };
-  });
-
-  return {
-    version: 1,
-    meta: { createdAt, seinfeldVersion: SEINFELD_VERSION },
-    entries,
-  };
-}
-
-/**
- * Convert legacy bodyEncoding/body/chunks to a seinfeld BodyPayload.
- *
- * @param {unknown} encoding
- * @param {unknown} body
- * @param {unknown} chunks
- * @returns {Record}
- */
-function convertBody(encoding, body, chunks) {
-  if (!encoding || encoding === "empty" || (body == null && !chunks)) {
-    return { kind: "empty" };
-  }
-  switch (encoding) {
-    case "json": {
-      if (body == null) return { kind: "empty" };
-      return { kind: "json", value: body };
-    }
-    case "utf8":
-    case "text": {
-      const val = String(body ?? "");
-      if (val === "") return { kind: "empty" };
-      return { kind: "text", value: val };
-    }
-    case "base64": {
-      return { kind: "base64", value: String(body ?? "") };
-    }
-    case "sse-chunks": {
-      // Legacy SSE: arbitrary byte-stream fragments, each base64-encoded.
-      // The fragments may NOT align with SSE event boundaries — a single SSE
-      // event can span multiple chunks. Seinfeld's chunk array must contain
-      // complete SSE events (one entry per \n\n-terminated event).
-      // Fix: concatenate all decoded fragment bytes first, then split on \n\n.
-      const rawChunks = Array.isArray(chunks)
-        ? chunks
-        : Array.isArray(body)
-          ? body
-          : [];
-      const fullText = rawChunks
-        .map((c) => {
-          if (typeof c === "string") return c;
-          const obj = /** @type {Record} */ (c);
-          if (obj["encoding"] === "base64" && typeof obj["data"] === "string") {
-            return Buffer.from(String(obj["data"]), "base64").toString("utf8");
-          }
-          return String(obj["data"] ?? c);
-        })
-        .join("");
-      // Split on \n\n (SSE event separator) and drop trailing empty entry.
-      const parts = fullText.replace(/\r\n/g, "\n").split("\n\n");
-      if (parts.length > 0 && parts[parts.length - 1] === "") parts.pop();
-      return { kind: "sse", chunks: parts };
-    }
-    default:
-      // Unknown encoding — fall back to empty
-      return { kind: "empty" };
-  }
-}
-
-/**
- * Compute seinfeld matchKey: "METHOD host/path".
- *
- * @param {string} method
- * @param {string} url
- */
-function computeMatchKey(method, url) {
-  try {
-    const parsed = new URL(url);
-    return `${method.toUpperCase()} ${parsed.host}${parsed.pathname}`;
-  } catch {
-    return `${method.toUpperCase()} ${url}`;
-  }
-}
-
-/**
- * Bump and return the current call count for a matchKey.
- *
- * @param {Map} counts
- * @param {string} key
- */
-function bumpCallCount(counts, key) {
-  const current = counts.get(key) ?? 0;
-  counts.set(key, current + 1);
-  return current;
-}
-
-/**
- * Compute a seinfeld entry ID: sha256(matchKey + '\n' + callIndex + '\n' + JSON.stringify(body)).slice(0, 16)
- *
- * @param {string} matchKey
- * @param {number} callIndex
- * @param {unknown} body
- */
-function makeEntryId(matchKey, callIndex, body) {
-  const raw = `${matchKey}\n${callIndex}\n${JSON.stringify(body)}`;
-  return createHash("sha256").update(raw).digest("hex").slice(0, 16);
-}
-
-/**
- * Recursively walk a directory and yield paths matching *.json inside __cassettes__ dirs.
- *
- * @param {string} dir
- * @returns {AsyncGenerator}
- */
-async function* walkCassettes(dir) {
-  const entries = await fs.readdir(dir, { withFileTypes: true });
-  for (const entry of entries) {
-    const fullPath = path.join(dir, entry.name);
-    if (entry.isDirectory()) {
-      yield* walkCassettes(fullPath);
-    } else if (
-      entry.isFile() &&
-      entry.name.endsWith(".json") &&
-      path.basename(path.dirname(fullPath)) === "__cassettes__"
-    ) {
-      yield fullPath;
-    }
-  }
-}
diff --git a/dev-packages/seinfeld/src/format/v1.ts b/dev-packages/seinfeld/src/format/v1.ts
index 67b34bee7..43b768d95 100644
--- a/dev-packages/seinfeld/src/format/v1.ts
+++ b/dev-packages/seinfeld/src/format/v1.ts
@@ -3,8 +3,11 @@ import { z } from "zod";
 /**
  * Zod schema for cassette format version 1.
  *
- * Validates cassettes at load time so corrupt files fail loudly rather than
- * mysteriously silent at match time.
+ * Cassette files carry a `version` field so that: (a) load-time validation
+ * fails loudly on corrupt files rather than silently at match time, and
+ * (b) a future breaking change can introduce v2 while the library still reads
+ * older cassettes without ambiguity. See `format/index.ts` for the dispatch
+ * logic and the rule for when to bump the version.
  */
 export const CURRENT_FORMAT_VERSION = 1 as const;
 
diff --git a/dev-packages/seinfeld/src/recorder.ts b/dev-packages/seinfeld/src/recorder.ts
index e22ffcb2c..0b8b77aeb 100644
--- a/dev-packages/seinfeld/src/recorder.ts
+++ b/dev-packages/seinfeld/src/recorder.ts
@@ -348,7 +348,7 @@ export interface CassetteOptions {
   store?: CassetteStore;
   /** Filter spec (matching-only normalization). */
   filters?: FilterSpec;
-  /** Redaction spec (applied to persisted bytes). Defaults to none. */
+  /** Redaction spec (applied to persisted bytes). Defaults to `'paranoid'`. Pass `[]` to disable. */
   redact?: RedactionSpec;
   /** Hosts to intercept. Other hosts pass through. Defaults to all hosts. */
   hosts?: Array;
@@ -403,7 +403,7 @@ export function createCassette(options: CassetteOptions): Cassette {
       options.store ?? createJsonFileStore({ rootDir: DEFAULT_CASSETTE_DIR }),
     matcher: options.matcher ?? createDefaultMatcher(),
     filters: options.filters,
-    redact: options.redact,
+    redact: options.redact ?? "paranoid",
     hosts: options.hosts,
     passthroughHosts: options.passthroughHosts,
     threshold: resolveThreshold(options.externalBlobThreshold),
diff --git a/dev-packages/seinfeld/src/store/file-store.ts b/dev-packages/seinfeld/src/store/file-store.ts
index 07193dc93..edb88d4dd 100644
--- a/dev-packages/seinfeld/src/store/file-store.ts
+++ b/dev-packages/seinfeld/src/store/file-store.ts
@@ -98,9 +98,10 @@ export function createJsonFileStore(
     async save(name, cassette) {
       const path = pathFor(name);
       await mkdir(dirname(path), { recursive: true });
+      const sorted = sortKeys(cassette);
       const json = pretty
-        ? JSON.stringify(cassette, null, 2)
-        : JSON.stringify(cassette);
+        ? JSON.stringify(sorted, null, 2)
+        : JSON.stringify(sorted);
       const content = pretty ? json + "\n" : json;
       // Write atomically: temp file + rename so partial writes are never visible.
       const tmp = `${path}.tmp-${process.pid}-${randomBytes(4).toString("hex")}`;
@@ -163,6 +164,19 @@ export function createJsonFileStore(
   };
 }
 
+/** Recursively sort object keys so cassette files are deterministic across recording runs. */
+function sortKeys(value: unknown): unknown {
+  if (Array.isArray(value)) return value.map(sortKeys);
+  if (value !== null && typeof value === "object") {
+    return Object.fromEntries(
+      Object.keys(value as Record)
+        .sort()
+        .map((k) => [k, sortKeys((value as Record)[k])]),
+    );
+  }
+  return value;
+}
+
 /**
  * Assert that `resolved` lies within `rootDir`. Uses `path.relative` so the
  * check is path-semantic rather than string-prefix-based (avoids false passes
diff --git a/e2e/README.md b/e2e/README.md
index ce1f05a81..4b9490e53 100644
--- a/e2e/README.md
+++ b/e2e/README.md
@@ -178,7 +178,7 @@ unset ANTHROPIC_API_KEY OPENAI_API_KEY GEMINI_API_KEY COHERE_API_KEY GROQ_API_KE
 pnpm --filter=@braintrust/js-e2e-tests run test:e2e:hermetic
 ```
 
-If a scenario records but later replay fails because of volatile fields in the request body (e.g. AI-SDK's generated message ids), add or update the filter for that scenario in `e2e/helpers/cassette-filters.mjs`, then re-record.
+If a scenario records but later replay fails because of volatile fields in the request body (e.g. AI-SDK's generated message ids), add or update `/cassette-filter.mjs` for that scenario, then re-record.
 
 ### In-scope scenarios
 
diff --git a/e2e/helpers/cassette-filters.mjs b/e2e/helpers/cassette-filters.mjs
deleted file mode 100644
index 8afb53b2e..000000000
--- a/e2e/helpers/cassette-filters.mjs
+++ /dev/null
@@ -1,94 +0,0 @@
-/**
- * Per-scenario seinfeld filter specs. Each entry maps a scenario/normalizer
- * name to a seinfeld FilterSpec, which is passed to createCassette({ filters }).
- *
- * Wildcard paths: `*` matches one segment, `**` matches any number.
- */
-
-const AI_SDK_VOLATILE_FIELDS = {
-  ignoreBodyFields: [
-    // Ignore all body fields — deterministic call order makes callIndex
-    // the sole discriminator, which is stable across SDK releases.
-    "**",
-    // AI SDK volatile fields (change per-run)
-    "experimental_generateMessageId",
-    "messageId",
-    "messages.*.id",
-    "messages.*.experimental_messageId",
-    "input.*.id",
-    "input.*.experimental_messageId",
-    // OpenAI Responses API fields added as defaults in newer client versions.
-    // These don't affect request semantics but change between SDK releases.
-    "store",
-    "background",
-    "truncation",
-    "instructions",
-    "moderation",
-    "reasoning",
-    "reasoning.effort",
-    "reasoning.summary",
-    "safety_identifier",
-    "service_tier",
-    "text",
-    "text.format",
-    "text.format.type",
-    "text.verbosity",
-    "metadata",
-    "top_logprobs",
-    "top_p",
-    "presence_penalty",
-    "frequency_penalty",
-    "parallel_tool_calls",
-    "max_tool_calls",
-    "prompt_cache_key",
-    "prompt_cache_retention",
-    "previous_response_id",
-    "user",
-    "include",
-  ],
-};
-
-const MISTRAL_VOLATILE_FIELDS = {
-  normalizeRequest: (req) => {
-    if (
-      req.body.kind !== "json" ||
-      req.body.value === null ||
-      typeof req.body.value !== "object" ||
-      Array.isArray(req.body.value)
-    ) {
-      return req;
-    }
-    const value = /** @type {Record} */ (req.body.value);
-    if (
-      typeof value["name"] === "string" &&
-      /** @type {string} */ (value["name"]).startsWith("braintrust-e2e-")
-    ) {
-      return {
-        ...req,
-        body: {
-          kind: "json",
-          value: { ...value, name: "braintrust-e2e-" },
-        },
-      };
-    }
-    return req;
-  },
-};
-
-const OPENROUTER_VOLATILE_FIELDS = {
-  ignoreBodyFields: [
-    // Ignore all body fields — deterministic call order makes callIndex
-    // the sole discriminator, which is stable across SDK releases.
-    "**",
-  ],
-};
-
-export const CASSETTE_FILTERS = {
-  default: "default",
-  "ai-sdk": ["default", AI_SDK_VOLATILE_FIELDS],
-  "ai-sdk-instrumentation": ["default", AI_SDK_VOLATILE_FIELDS],
-  "ai-sdk-otel-export": ["default", AI_SDK_VOLATILE_FIELDS],
-  "mistral-instrumentation": ["default", MISTRAL_VOLATILE_FIELDS],
-  "openrouter-agent-instrumentation": ["default", OPENROUTER_VOLATILE_FIELDS],
-  "openrouter-instrumentation": ["default", OPENROUTER_VOLATILE_FIELDS],
-};
diff --git a/e2e/helpers/cassette-preload.mjs b/e2e/helpers/cassette-preload.mjs
index ea8ceda0e..54e153dd6 100644
--- a/e2e/helpers/cassette-preload.mjs
+++ b/e2e/helpers/cassette-preload.mjs
@@ -8,20 +8,22 @@
  *   BRAINTRUST_E2E_CASSETTE_MODE — replay | record | passthrough
  *   BRAINTRUST_E2E_CASSETTE_VARIANT — variant key (cassette name, no extension)
  *   BRAINTRUST_E2E_MOCK_HOST — host:port of the Braintrust mock server (always passthrough)
- *   BRAINTRUST_E2E_CASSETTE_NORMALIZER — name of the request-body filter to use
  *
  * The preload exits silently if the cassette path env var is not set, so
  * it's safe to install for scenarios that haven't migrated yet (the
  * harness only sets the env vars for opted-in scenarios).
+ *
+ * Per-scenario request-body filters live in `/cassette-filter.mjs`
+ * (optional). The file should export a named `filter` conforming to the
+ * seinfeld `FilterSpec` type. If absent, the seinfeld `"default"` preset is used.
  */
+import * as path from "node:path";
 import { createCassette, createJsonFileStore } from "@braintrust/seinfeld";
-import { CASSETTE_FILTERS } from "./cassette-filters.mjs";
 
 const CASSETTE_DIR = process.env.BRAINTRUST_E2E_CASSETTE_PATH;
 const MODE_RAW = process.env.BRAINTRUST_E2E_CASSETTE_MODE ?? "replay";
 const VARIANT_KEY = process.env.BRAINTRUST_E2E_CASSETTE_VARIANT ?? "default";
 const MOCK_HOST = process.env.BRAINTRUST_E2E_MOCK_HOST;
-const NORMALIZER_NAME = process.env.BRAINTRUST_E2E_CASSETTE_NORMALIZER;
 
 if (CASSETTE_DIR) {
   await bootCassettePreload(CASSETTE_DIR);
@@ -32,8 +34,7 @@ if (CASSETTE_DIR) {
  */
 async function bootCassettePreload(cassetteDir) {
   const mode = resolveMode(MODE_RAW);
-  const filters =
-    CASSETTE_FILTERS[NORMALIZER_NAME ?? ""] ?? CASSETTE_FILTERS["default"];
+  const filters = await loadScenarioFilter(cassetteDir);
   const passthroughHosts = MOCK_HOST ? [MOCK_HOST] : [];
 
   const cassette = createCassette({
@@ -41,7 +42,6 @@ async function bootCassettePreload(cassetteDir) {
     mode,
     store: createJsonFileStore({ rootDir: cassetteDir }),
     filters,
-    redact: "paranoid",
     passthroughHosts,
     onMiss: (req) => {
       process.stderr.write(`[cassette] MISS: ${req.method} ${req.url}\n`);
@@ -62,6 +62,28 @@ async function bootCassettePreload(cassetteDir) {
   });
 }
 
+/**
+ * Try to load a per-scenario cassette filter from `/cassette-filter.mjs`.
+ * Falls back to the seinfeld `"default"` preset if the file is absent.
+ *
+ * @param {string} cassetteDir Absolute path to the __cassettes__ directory.
+ * @returns {Promise}
+ */
+async function loadScenarioFilter(cassetteDir) {
+  // cassetteDir is /__cassettes__ — parent is the scenario root.
+  const scenarioDir = path.resolve(cassetteDir, "..");
+  const filterPath = path.join(scenarioDir, "cassette-filter.mjs");
+  try {
+    const mod = await import(filterPath);
+    if (mod.filter !== undefined) {
+      return mod.filter;
+    }
+  } catch {
+    // File absent or not a valid module — fall through to default.
+  }
+  return "default";
+}
+
 /**
  * @param {string} raw
  * @returns {import('@braintrust/seinfeld').CassetteMode}
diff --git a/e2e/helpers/scenario-harness.ts b/e2e/helpers/scenario-harness.ts
index 4fee45c2f..897aa84f9 100644
--- a/e2e/helpers/scenario-harness.ts
+++ b/e2e/helpers/scenario-harness.ts
@@ -49,12 +49,6 @@ export interface ScenarioCassetteConfig {
    * Defaults to `runContext.variantKey ?? "default"`.
    */
   variantKey?: string;
-  /**
-   * Name of the request-body normalizer registered in
-   * `e2e/helpers/cassette/normalizers/index.mjs`. Falls back to a
-   * scenario-name based lookup if omitted.
-   */
-  normalizerName?: string;
 }
 
 export interface ScenarioRunContext {
@@ -271,20 +265,15 @@ interface CassetteWiring {
   cassetteDir: string;
   variantKey: string;
   mockHost: string;
-  normalizerName?: string;
 }
 
 function getCassetteEnv(wiring: CassetteWiring): Record {
-  const env: Record = {
+  return {
     BRAINTRUST_E2E_CASSETTE_PATH: wiring.cassetteDir,
     BRAINTRUST_E2E_CASSETTE_MODE: process.env[CASSETTE_MODE_ENV] ?? "replay",
     BRAINTRUST_E2E_CASSETTE_VARIANT: wiring.variantKey,
     BRAINTRUST_E2E_MOCK_HOST: wiring.mockHost,
   };
-  if (wiring.normalizerName) {
-    env.BRAINTRUST_E2E_CASSETTE_NORMALIZER = wiring.normalizerName;
-  }
-  return env;
 }
 
 /**
@@ -679,15 +668,18 @@ export async function withScenarioHarness(
         return {};
       }
 
-      const normalizerName = config.normalizerName ?? scenarioName;
+      const isReplayMode = !isRecordingMode && cassetteModeRaw !== "passthrough";
 
       return {
-        ...getProviderKeyPlaceholders(),
+        // Only inject placeholder keys in replay mode. In record mode the
+        // subprocess needs the real provider keys to make live API calls;
+        // injecting a fake key causes a confusing "invalid key" error instead
+        // of the clear "missing key" error the SDK would otherwise produce.
+        ...(isReplayMode ? getProviderKeyPlaceholders() : {}),
         ...getCassetteEnv({
           cassetteDir: path.dirname(cassettePath),
           variantKey,
           mockHost: urlToHostHeader(server.url),
-          normalizerName,
         }),
       };
     };
diff --git a/e2e/package.json b/e2e/package.json
index 8e4e52a4a..a1b43d598 100644
--- a/e2e/package.json
+++ b/e2e/package.json
@@ -8,7 +8,7 @@
     "test:e2e:hermetic": "vitest run --tags-filter=hermetic",
     "test:e2e:canary": "node ./scripts/run-canary-tests-docker.mjs",
     "test:e2e:update": "vitest run --update",
-    "test:e2e:record": "node ./scripts/record-cassettes.mjs"
+    "test:e2e:record": "BRAINTRUST_E2E_CASSETTE_MODE=record vitest run"
   },
   "devDependencies": {
     "@braintrust/langchain-js": "workspace:^",
@@ -19,7 +19,6 @@
     "@opentelemetry/sdk-trace-base": ">=1.9.0",
     "@types/node": "^20.10.5",
     "braintrust": "workspace:^",
-    "dotenv": "^17.2.3",
     "msw": "^2.6.6",
     "tsx": "^4.21.0",
     "typescript": "5.4.4",
diff --git a/e2e/scenarios/ai-sdk-instrumentation/cassette-filter.mjs b/e2e/scenarios/ai-sdk-instrumentation/cassette-filter.mjs
new file mode 100644
index 000000000..8052b33b9
--- /dev/null
+++ b/e2e/scenarios/ai-sdk-instrumentation/cassette-filter.mjs
@@ -0,0 +1,47 @@
+// @ts-check
+/** @type {import("@braintrust/seinfeld").FilterSpec} */
+export const filter = [
+  "default",
+  {
+    ignoreBodyFields: [
+      // Ignore all body fields — deterministic call order makes callIndex
+      // the sole discriminator, which is stable across SDK releases.
+      "**",
+      // AI SDK volatile fields (change per-run)
+      "experimental_generateMessageId",
+      "messageId",
+      "messages.*.id",
+      "messages.*.experimental_messageId",
+      "input.*.id",
+      "input.*.experimental_messageId",
+      // OpenAI Responses API fields added as defaults in newer client versions.
+      // These don't affect request semantics but change between SDK releases.
+      "store",
+      "background",
+      "truncation",
+      "instructions",
+      "moderation",
+      "reasoning",
+      "reasoning.effort",
+      "reasoning.summary",
+      "safety_identifier",
+      "service_tier",
+      "text",
+      "text.format",
+      "text.format.type",
+      "text.verbosity",
+      "metadata",
+      "top_logprobs",
+      "top_p",
+      "presence_penalty",
+      "frequency_penalty",
+      "parallel_tool_calls",
+      "max_tool_calls",
+      "prompt_cache_key",
+      "prompt_cache_retention",
+      "previous_response_id",
+      "user",
+      "include",
+    ],
+  },
+];
diff --git a/e2e/scenarios/ai-sdk-otel-export/cassette-filter.mjs b/e2e/scenarios/ai-sdk-otel-export/cassette-filter.mjs
new file mode 100644
index 000000000..18ed83184
--- /dev/null
+++ b/e2e/scenarios/ai-sdk-otel-export/cassette-filter.mjs
@@ -0,0 +1,3 @@
+// @ts-check
+// Same AI SDK volatile fields as ai-sdk-instrumentation.
+export { filter } from "../ai-sdk-instrumentation/cassette-filter.mjs";
diff --git a/e2e/scenarios/mistral-instrumentation/cassette-filter.mjs b/e2e/scenarios/mistral-instrumentation/cassette-filter.mjs
new file mode 100644
index 000000000..7bc4965b4
--- /dev/null
+++ b/e2e/scenarios/mistral-instrumentation/cassette-filter.mjs
@@ -0,0 +1,39 @@
+// @ts-check
+/** @type {import("@braintrust/seinfeld").FilterSpec} */
+export const filter = [
+  "default",
+  {
+    /**
+     * Mistral's client generates a unique `name` field per session
+     * (e.g. "braintrust-e2e-"). Normalize it so cassette matching
+     * isn't broken by the per-run suffix.
+     *
+     * @param {import("@braintrust/seinfeld").RecordedRequest} req
+     * @returns {import("@braintrust/seinfeld").RecordedRequest}
+     */
+    normalizeRequest(req) {
+      if (
+        req.body.kind !== "json" ||
+        req.body.value === null ||
+        typeof req.body.value !== "object" ||
+        Array.isArray(req.body.value)
+      ) {
+        return req;
+      }
+      const value = /** @type {Record} */ (req.body.value);
+      if (
+        typeof value["name"] === "string" &&
+        /** @type {string} */ (value["name"]).startsWith("braintrust-e2e-")
+      ) {
+        return {
+          ...req,
+          body: {
+            kind: "json",
+            value: { ...value, name: "braintrust-e2e-" },
+          },
+        };
+      }
+      return req;
+    },
+  },
+];
diff --git a/e2e/scenarios/openrouter-agent-instrumentation/cassette-filter.mjs b/e2e/scenarios/openrouter-agent-instrumentation/cassette-filter.mjs
new file mode 100644
index 000000000..747d4d3a4
--- /dev/null
+++ b/e2e/scenarios/openrouter-agent-instrumentation/cassette-filter.mjs
@@ -0,0 +1,3 @@
+// @ts-check
+// Same body-agnostic matching as openrouter-instrumentation.
+export { filter } from "../openrouter-instrumentation/cassette-filter.mjs";
diff --git a/e2e/scenarios/openrouter-instrumentation/cassette-filter.mjs b/e2e/scenarios/openrouter-instrumentation/cassette-filter.mjs
new file mode 100644
index 000000000..4c1f82e4c
--- /dev/null
+++ b/e2e/scenarios/openrouter-instrumentation/cassette-filter.mjs
@@ -0,0 +1,12 @@
+// @ts-check
+/** @type {import("@braintrust/seinfeld").FilterSpec} */
+export const filter = [
+  "default",
+  {
+    ignoreBodyFields: [
+      // Ignore all body fields — deterministic call order makes callIndex
+      // the sole discriminator, which is stable across SDK releases.
+      "**",
+    ],
+  },
+];
diff --git a/e2e/scripts/record-cassettes.mjs b/e2e/scripts/record-cassettes.mjs
deleted file mode 100644
index 87437a446..000000000
--- a/e2e/scripts/record-cassettes.mjs
+++ /dev/null
@@ -1,39 +0,0 @@
-#!/usr/bin/env node
-// @ts-check
-/**
- * Convenience wrapper for recording cassettes locally.
- *
- *   pnpm --filter=@braintrust/js-e2e-tests run test:e2e:record [-- vitest args]
- *
- * Sets BRAINTRUST_E2E_CASSETTE_MODE=record, which overwrites cassette files
- * with fresh recordings.
- */
-import { spawn } from "node:child_process";
-import * as path from "node:path";
-import { fileURLToPath } from "node:url";
-
-const SCRIPT_DIR = path.dirname(fileURLToPath(import.meta.url));
-const E2E_ROOT = path.resolve(SCRIPT_DIR, "..");
-
-const args = process.argv.slice(2);
-
-const PNPM = process.platform === "win32" ? "pnpm.cmd" : "pnpm";
-
-const env = {
-  ...process.env,
-  BRAINTRUST_E2E_CASSETTE_MODE: "record",
-};
-
-const child = spawn(PNPM, ["exec", "vitest", "run", ...args], {
-  cwd: E2E_ROOT,
-  env,
-  stdio: "inherit",
-});
-
-child.on("error", (error) => {
-  console.error(error);
-  process.exit(1);
-});
-child.on("close", (code) => {
-  process.exit(code ?? 0);
-});
diff --git a/e2e/vitest.setup.ts b/e2e/vitest.setup.ts
index 3b910c5b8..f91599cb3 100644
--- a/e2e/vitest.setup.ts
+++ b/e2e/vitest.setup.ts
@@ -1,19 +1,3 @@
-import * as path from "node:path";
-import { fileURLToPath } from "node:url";
-import { config as loadDotenv } from "dotenv";
 import { initializeProdForwarding } from "./helpers/prod-forwarding";
 
-// Load `.env` from the repo root (and `.env.local` if present, for
-// developer-local overrides) into process.env so that local test runs and
-// recordings can pick up provider keys without exporting them in the
-// shell. Existing env values are preserved (override: false).
-const setupDir = path.dirname(fileURLToPath(import.meta.url));
-const repoRoot = path.resolve(setupDir, "..");
-loadDotenv({ path: path.join(repoRoot, ".env"), override: false, quiet: true });
-loadDotenv({
-  path: path.join(repoRoot, ".env.local"),
-  override: false,
-  quiet: true,
-});
-
 await initializeProdForwarding();
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index b8d228f2e..bf0290577 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -87,9 +87,6 @@ importers:
       braintrust:
         specifier: workspace:^
         version: link:../js
-      dotenv:
-        specifier: ^17.2.3
-        version: 17.3.1
       msw:
         specifier: ^2.6.6
         version: 2.6.6(@types/node@20.19.16)(typescript@5.4.4)
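
Review note on the `sortKeys` helper added to `file-store.ts`: the sketch below reproduces its recursive logic in isolation to illustrate why sorted keys make saved cassettes byte-stable. It is a minimal illustration, not the library's code: the `Record<string, unknown>` type argument is an assumption (the type parameters are not visible in the diff as shown), and the `first` / `second` objects are hypothetical stand-ins for two recordings of the same entry.

```typescript
/** Recursively sort object keys; arrays keep their element order. */
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => sortKeys(item));
  if (value !== null && typeof value === "object") {
    // Assumed cast: treat any non-array object as a string-keyed record.
    const record = value as Record<string, unknown>;
    return Object.fromEntries(
      Object.keys(record)
        .sort()
        .map((k) => [k, sortKeys(record[k])]),
    );
  }
  return value;
}

// Hypothetical data: the same entry captured with different key insertion order.
const first = { response: { status: 200, headers: { b: "2", a: "1" } }, id: "x" };
const second = { id: "x", response: { headers: { a: "1", b: "2" }, status: 200 } };

const serializedFirst = JSON.stringify(sortKeys(first));
const serializedSecond = JSON.stringify(sortKeys(second));

// Both recordings serialize identically once keys are sorted.
console.log(serializedFirst === serializedSecond);
```

One caveat worth keeping in mind: JavaScript objects always enumerate integer-like keys (such as `"10"`) in ascending numeric order before other string keys, so lexicographic sorting only governs the non-integer keys. The output is still deterministic for a given set of keys.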