fix: derive wire from spec version + fold tasks/MRTR into harness conventions#8
Conversation
`RunContext.wire` was a parallel knob to `specVersion` that the tasks and MRTR harnesses used to loop scenarios across both the legacy session wire and the SEP-2575 stateless wire. After SEP-2575 collapsed the legacy wire on DRAFT-2026-v1 (a6f7c27), the matrix shrank to a single column on draft — and an explicit `wire` field was both redundant and unsafe: it let callers go out of sync with the `specVersion` they declared on the wire. The CLI's `runServerConformanceTest` never set the field, so every `npm start -- server --scenario tasks-* --url ...` invocation silently emitted the legacy `initialize` handshake against a draft server, producing requests with no `_meta.io.modelcontextprotocol/*` envelope. Strict SEP-2575 servers correctly rejected the traffic; lenient ones accepted it and masked the gap. Changes: - Drop `wire?` from `RunContext`. - Add `isStateless(ctx)` in `_shared/wire-mode.ts`; scenarios derive `stateless` from `POST_SEP_2575_VERSIONS.has(ctx.specVersion)`. - Extend the CLI's spec-version inference to default extension scenarios to draft (today every extension scenario in the repo lives there). Without this, the inference stayed on `LATEST_SPEC_VERSION` for tasks scenarios and kept emitting legacy traffic. - Flatten the now-1-element wire loop in the tasks + mrtr harnesses; remove the parseWireModes/effectiveWireModes/MCP_WIRE_MODES infra that drove it (no remaining consumers). Adds a vitest regression in `runner/server.test.ts` that spins up an in-process recording HTTP server, drives `runServerConformanceTest` at `tasks-lifecycle`, and asserts the first outgoing call is `server/discover` carrying `_meta.io.modelcontextprotocol/*` — the SEP-2575 fingerprint — so the CLI wire selection can't silently regress.
Folds the parallel mini-framework the tasks/MRTR suites had grown into
the existing harness shape pcarleton called out in conformance PR 262.
No new spec coverage; existing scenarios still pass 241/241 unit tests
and the regression added for the wire-derivation fix still pins the CLI
fingerprint.
Layout
scenarios/server/_shared/ is gone. The helper bundle moves to
scenarios/server/tasks-mrtr-helpers.ts (matching the existing
input-required-result-helpers.ts precedent); raw-session.ts and its
test fold into connection/stateless.ts.
Per-suite runners + READMEs (tasks/all-scenarios.test.ts,
mrtr/all-scenarios.test.ts, _shared/test-runner.ts, both READMEs) are
removed. The scenarios are already registered in
scenarios/index.ts pendingClientScenariosList, so the CLI runner
(`npm start -- server --scenario tasks-lifecycle --url ...` or
`--suite pending --url ...`) is the supported entry point. Per
AGENTS.md: don't reimplement the runner.
Connection extension
Connection grows the three things tasks/MRTR scenarios actually need
beyond the existing surface:
- `request(method, params, extraHeaders?)` for the SEP-2243
header-mismatch tests.
- `discover(): Promise<Record<string, unknown>>` for
capability-negotiation scenarios. Resolved eagerly from the SDK
accessors on the stateful wire; issued lazily as `server/discover`
on the stateless wire and memoized.
- `ConnectOptions { capabilities?, clientInfo? }` on
RunContext.connect() so scenarios can negotiate extensions
(`extensions: { 'io.modelcontextprotocol/tasks': {} }`) at
bootstrap on either wire.
Crucially the stateless wire does NOT eagerly emit `server/discover`
when capabilities are declared. SEP-2575 says per-request
`_meta.io.modelcontextprotocol/clientCapabilities` is authoritative;
a server that gates extensions on a prior `server/discover` is
non-conformant, and the conformance suite's job is to surface that,
not paper over it.
Scenario migration
Every tasks/MRTR scenario switches from `initRawSession(...)` to
`await ctx.connect({ capabilities })`, and the capability scenario
switches from `session.initializeResult.capabilities` to
`await withExt.discover()`. The wire choice now flows entirely from
`specVersion` — `isStateless(ctx)` (moved to connection/select.ts
next to `connectFor`) is the single source of truth.
Description sweep
Each scenario's `description` string now spells out the required
server fixtures (tool names + behavior) directly. Previously that
content lived only in the file-header comment, so an implementer
reading the runner's emitted description was missing it.
Reviewer notes
- 23 files changed, +455 / -291 (net + because the Connection
abstraction now carries surface that used to live in raw-session;
the raw-session deletion was 692 lines on its own).
- Unit tests: 241 pass (the 12 deleted tests were redundant with
connection-level coverage; the genuinely-unique SSE multi-line
`data:` test pinned behavior the connection parser doesn't have).
- Reference fixture runs: tasks scenarios surface a fixture-side
spec gap (per-request `_meta` not honored without a prior
`server/discover`). That's the suite working; the gap will be
filed upstream against the fixture, not papered over here.
…old composition into tasks/ Of the 8 checks in mrtr/ephemeral-flow.ts, 7 substantially overlapped the SEP-2322 InputRequiredResult* scenarios that modelcontextprotocol#188 already shipped (basic-elicitation, sampling, list-roots, request-state, multiple-input- requests, multi-round, wrong-input-key). Only sep-2663-mrtr-synchronous- before-task-creation — the SEP-2663 commit 451f5e1 composition path where the MRTR loop's final round returns a CreateTaskResult — is genuinely new versus modelcontextprotocol#188's coverage, and it's SEP-2663-specific, not SEP-2322-specific. Moves that one check to its own tasks scenario in tasks/composition.ts (extensionId = io.modelcontextprotocol/tasks, registered alongside the other tasks-* scenarios in scenarios/index.ts) and deletes the entire mrtr/ directory. Reuses the existing modelcontextprotocol#188 helpers (isInputRequiredResult, mockElicitResponse from input-required-result-helpers.ts) rather than reintroducing the parallel mrtr/helpers.ts ones.
| } catch { | ||
| // non-JSON body — record nothing | ||
| } |
There was a problem hiding this comment.
Wouldn't hitting this be a significant issue?
There was a problem hiding this comment.
True - when I doing this - the catch body was meant to ignore things if NOT json - thougth this would be fine but you are right. Even though the harness is only supposed to ever send JSON RPC bodies, the rare case where it may not be shoudnt be suppressed.
| const response = await send('server/discover'); | ||
| drainEvents(response); | ||
| return unwrap<Record<string, unknown>>('server/discover', response); |
There was a problem hiding this comment.
This just reduces to request('server/discover'), doesn't it?
There was a problem hiding this comment.
yep was on a different train of thought when i was doing this. But that didnt stick so shouldnt have been here. Simplified it now.
| it('returns stateful for dated 2025-x versions', () => { | ||
| expect(connectFor('2025-03-26')).toBe(connectStateful); | ||
| expect(connectFor('2025-06-18')).toBe(connectStateful); | ||
| expect(connectFor('2025-11-25')).toBe(connectStateful); | ||
| // connectFor now always wraps the underlying impl in a closure (to thread | ||
| // ConnectOptions through), so identity with connectStateful/connectStateless | ||
| // no longer holds; assert wire-shape behaviour in the dedicated suites | ||
| // below and via stateless.test.ts. | ||
| it('returns a function for dated 2025-x versions', () => { | ||
| expect(typeof connectFor('2025-03-26')).toBe('function'); | ||
| expect(typeof connectFor('2025-06-18')).toBe('function'); | ||
| expect(typeof connectFor('2025-11-25')).toBe('function'); | ||
| }); | ||
| it('returns stateless for the draft version', () => { | ||
| // connectFor wraps connectStateless in a closure (to pass the spec | ||
| // version through), so identity with connectStateless no longer holds; | ||
| // assert it did not select the stateful implementation. The wire-level | ||
| // behaviour of the wrapper is covered in stateless.test.ts. | ||
| it('returns a function for the draft version', () => { | ||
| expect(typeof connectFor('DRAFT-2026-v1')).toBe('function'); | ||
| expect(connectFor('DRAFT-2026-v1')).not.toBe(connectStateful); | ||
| expect(connectFor('DRAFT-2026-v1')).not.toBe(connectStateless); | ||
| }); |
There was a problem hiding this comment.
This looks tautological? It's not clear to me how this could fail, and what it would tell us if it does.
- connectStateless.discover() now reuses the request() pipeline (send + drainEvents + unwrap) instead of inlining a duplicate copy. The local memoization stays via discoverPromise ??=. - runner/server.test.ts mock server no longer silently swallows unparseable bodies. They land in a parseFailures sidecar that the test asserts is empty — a non-JSON body to this mock is exactly the kind of regression the test exists to catch. - Drop the tautological connectFor identity tests. They asserted `typeof === 'function'` and `not.toBe(connectStateful)`, both of which hold trivially once connectFor wraps in a closure. Wire-shape behavior is already covered by the connectStateless mock-fetch suite and by runner/server.test.ts.
Conflicts in src/connection/{select.ts, connection.test.ts}.
- select.ts: keep the ConnectOptions threading from this branch and the
isStatefulVersion / ALL_SPEC_VERSIONS / STATELESS_SPEC_VERSIONS
helpers introduced upstream for the version-aware MockServer. connectFor
now goes through isStatefulVersion while still forwarding opts; isStateless
uses the same helper.
- connection.test.ts: keep upstream's STATELESS_SPEC_VERSIONS assertions
(genuine round-trip against isStatefulVersion); drop the old connectFor
identity test that this branch already removed as tautological once
connectFor wraps in a closure.
Full vitest run: 267/267 pass; tsc clean.
Addresses the review comments on conformance#262 plus a CLI bug where draft-only scenarios silently used the legacy
initializewire instead of SEP-2575, on a single follow-up branch so it can be reviewed in isolation before merging back into PR 262.What's in here
Wire derivation (fixes a CLI
_meta-on-draft bug, addresses comment 3341183243)wire?field fromRunContext. Scenarios derive wire fromspecVersionviaisStateless(ctx)(returns true when the spec version is post-SEP-2575). Now lives inconnection/select.tsnext toconnectFor, sharing the singleSTATEFUL_VERSIONSsource of truth.LATEST_SPEC_VERSIONfor tasks scenarios so the CLI silently emitted the legacyinitializehandshake against draft instead of the SEP-2575 per-request_metaenvelope.src/runner/server.test.ts: spins up an in-process recording HTTP server, drivesrunServerConformanceTest("tasks-lifecycle"), asserts noinitializewas sent and the first body carries the SEP-2575_meta.io.modelcontextprotocol/*envelope.Layout: stop forking the harness (addresses 3341935238, 3341984677, 3341986870, 3341988987, 3341965202, 3341967601, 3341970447, 3341972921, 3341942597)
scenarios/server/_shared/entirely.checks.ts+sep-refs.ts+wire-format.tsconsolidate into a single top-levelscenarios/server/tasks-mrtr-helpers.tsmatching the existinginput-required-result-helpers.tspattern.tasks/all-scenarios.test.ts,mrtr/all-scenarios.test.ts,_shared/test-runner.ts. Per AGENTS.md line 55 ("Don't reimplement the runner"), the existing CLI runner is the supported entry point —npm start -- server --scenario tasks-lifecycle --url <fixture>or--suite pending --url <fixture>to run all of them.tasks/README.mdandmrtr/README.md. Fixture requirements moved into each scenario'sdescriptionstring (addresses 3341953147).Connection abstraction (addresses 3341919092)
_shared/raw-session.ts(692 lines) intoconnection/stateless.ts. The TS SDK didn't have theConnectionabstraction when this PR opened; PR 318 landed June 1 with exactly that, so this is the rebase the review asked for.Connectiongrows three additions tasks/MRTR scenarios actually use:request(method, params, extraHeaders?)— third positional arg, for the SEP-2243 header-mismatch tests.discover()— lazy on the stateless wire, eager-resolved from SDK accessors on the stateful wire. Returns the bootstrap-shaped{capabilities, serverInfo, instructions}.ConnectOptions { capabilities?, clientInfo? }threaded throughRunContext.connect()→connectFor→connectStateful/connectStateless.server/discoverwhen capabilities are declared. SEP-2575 says per-request_meta.clientCapabilitiesis authoritative; servers that gate extensions on a priorserver/discoverare non-conformant, and the suite's job is to surface that, not paper over it.MRTR dedupe (addresses 3341960887)
mrtr/ephemeral-flow.tssubstantially overlapped Conformance Tests for SEP-2322 MRTR modelcontextprotocol/conformance#188'sInputRequiredResult*scenarios (basic-elicitation, sampling, list-roots, request-state, multiple-input-requests, multi-round, wrong-input-key). They go away with the wholemrtr/directory.sep-2663-mrtr-synchronous-before-task-creation(MRTR → Tasks composition, SEP-2663 commit 451f5e1), moves to its own tasks scenario attasks/composition.ts(extensionId = tasks, registered alongside the othertasks-*scenarios), reusing Conformance Tests for SEP-2322 MRTR modelcontextprotocol/conformance#188's helpers (isInputRequiredResult,mockElicitResponsefrominput-required-result-helpers.ts).Reviewer's guide
Three commits, each topically self-contained. Reviewing in order matches the historical evolution; reviewing in reverse matches a "what's the final shape" pass.
Commit
6466f05— wire derivation (the CLI_meta-on-draft fix)Smallest, most behavioral. Read this first if you only have 10 minutes.
wire?field dropped fromRunContext. Single-line diff, anchors everything downstream.DRAFT_PROTOCOL_VERSION. The actual fix for the CLI bug; subtle becausetasks-*scenarios were falling through toLATEST_SPEC_VERSION(a 2025-x stateful version) and silently emitting legacyinitialize.initializeis never sent + first body has_meta. This is the test the bug would regress.stateless: ctx.wire === 'stateless'→stateless: isStateless(ctx)); skim, don't grep.Commit
3d026d4— layout refactor + Connection extensionThe bulk of the diff. Two distinct sub-changes that landed together because the layout flatten needed Connection to absorb raw-session's surface first.
Connection extension (read first to ground the rest):
ConnectOptionstype, newdiscover()method,extraHeaders3rd-arg onrequest. The whole API surface fits on one screen; the rest of the commit just propagates.mcpNameForRequestgainstasks/get|update|cancel(usesparams.taskId);withRequestMetanow accepts capabilities/clientInfo overrides;connectStatelessdoes lazy memoizedserver/discover(NOT eager — deliberately).discover()from the SDK Client's post-initializeaccessors.extraHeadersthrows on stateful (no current 2025-x scenario needs it; flag loudly if one shows up).connectFornow threadsConnectOptions;isStateless(ctx)helper lives here alongsideSTATEFUL_VERSIONS(single source of truth for the wire decision).ConnectOptionsthrough.Layout flatten (read after Connection lands in your head):
_shared/{checks,sep-refs,wire-format}.ts. Matches the existinginput-required-result-helpers.tsprecedent.src/scenarios/server/_shared/*— all 7 files deleted. raw-session's behavior is inconnection/stateless.tsnow; the others fold intotasks-mrtr-helpers.ts.descriptionstring.tasks/*.ts,mrtr/ephemeral-flow.ts) are largelyinitRawSession(...)→ctx.connect({...})plus removed unusedserverUrldestructures. Each scenario'sdescriptionalso picks up aRequired server fixturesblock at the bottom — that's the answer to comment 3341953147.Commit
aeb450e— MRTR dedupe (the smallest blast radius)Pure net-negative diff (+190 / −725). Last in the sequence because it required the layout to have landed.
mrtr/directory is gone.sep-2663-mrtr-synchronous-before-task-creation). The one truly new piece of coverage from the deleted file, repackaged under the tasks suite. ImportsisInputRequiredResult+mockElicitResponsefrominput-required-result-helpers.tsrather than reintroducing parallel helpers.TasksMrtrCompositionScenario, dropsMrtrEphemeralFlowScenario.Pressure-test
Things worth a second pair of eyes:
request(..., extraHeaders?)as a 3rd positional arg;discover()as a method, not a Promise property;ConnectOptions { capabilities?, clientInfo? }). All three were judgment calls — extraHeaders could have been an options bag;discovercould have been an awaitable property;ConnectOptionscould have grown more fields. Current shape minimizes the diff at call sites at the cost of being slightly less future-proof.src/runner/server.ts— defaulting extension-keyed scenarios to draft is correct today (every extension in the repo lives on draft), but a future non-draft extension would need an explicit lookup. If we should be doing that lookup now (e.g. per-extension map keyed onextensionId), flag it.connectStatelessdoes NOT eagerly emitserver/discoverwhen capabilities are declared). Per SEP-2575, per-request_meta.clientCapabilitiesis authoritative; servers that gate extensions on a priorserver/discoverare non-conformant. The conformance suite's job is to surface that, not paper over it — but if you'd rather the harness behaved like real-world clients (eager bootstrap when capabilities are declared) and surfaced the gap differently, now's the time to push back.src/scenarios/server/tasks-mrtr-helpers.ts—errMsg/failureCheck/skipCheck/AnyResultare generic enough that they could live at a more general top-level location (e.g.scenarios/server/check-helpers.ts). Bundled here because no non-tasks/MRTR scenario needs them today; promote if/when one does.Test plan
npm test— 241 pass (was 253; -12 is the deletedraw-session.test.ts. ~6 were redundant withconnection/connection.test.ts, ~6 pinned behavior the current connection parser doesn't have).npm run lint— clean.npm run build— 427 kB bundle.all-scenarios.test.ts(the existing full conformance suite, spawns the bundled everything-server fixture): 47/47 scenarios pass.runner/server.test.tspins the CLI's SEP-2575 wire fingerprint so the bug can't silently regress.tasks-capability-negotiationpasses 4/4 because it explicitly callsawait withExt.discover()and registers the session. Other tasks scenarios surface a fixture-side spec gap (per-request_metaignored without a priorserver/discover) — to be filed upstream against the fixture, not patched here.What's NOT in here
--spawn-cmdCLI flag to auto-launch fixture binaries - useful but the harness change be in a follow up PR.Today's workflow = spawn the fixture in the background, run
npm start -- server --suite pending --url ..., kill the fixture.