Conform Tool Schemas to JSON Schema 2020-12 per SEP-2106 #417
Open
koic wants to merge 1 commit into
## Motivation and Context

SEP-2106 (modelcontextprotocol/modelcontextprotocol#2106, merged for the 2026-07-28 spec release) makes tool `inputSchema` and `outputSchema` conform to the full JSON Schema 2020-12 vocabulary: an input schema keeps `type: "object"` at the root but may use any 2020-12 keyword below it; an output schema may be any valid schema (object, array, primitive, or a root-level composition); and `CallToolResult.structuredContent` widens from an object to any JSON value. The SEP also adds resource bounds: `$ref` resolution is restricted (same-document only in the reference implementation) and composition-heavy documents must be bounded to avoid excessive validation cost.

This follows the TypeScript SDK's reference implementation (typescript-sdk#2249; the Python SDK tracks the same work in python-sdk#2792):

- `Tool::Schema` now validates against the JSON Schema 2020-12 metaschema rather than the draft-04 metaschema. The draft-04 pin was a stopgap from when the SDK used the `json-schema` gem, which did not support 2020-12; `json_schemer` does, so `$defs`/`$ref` and the rest of the 2020-12 vocabulary now resolve natively. This matches the dialect the SDK already advertises in emitted schemas, and it matches the Python SDK, where `jsonschema.validate` selects the validator from the schema's `$schema`.
- `Tool::Schema` moves root-type defaulting into an overridable `apply_default_root_type!` hook. `InputSchema` keeps the historical `type: "object"` default; `OutputSchema` now applies it only when no root schema keyword (`type`, `$ref`, `oneOf`, `anyOf`, `allOf`, `not`, `if`, `const`, `enum`) is present. The previous unconditional default merged `type: "object"` into root combinators such as `{ oneOf: [...] }`, producing a wrong schema, so that case is a bug fix.
- `Tool::Schema` enforces the TypeScript SDK's schema bounds at construction time: only same-document `$ref`/`$dynamicRef`s (starting with `#`, so schema handling can never trigger network or file access), `MAX_SCHEMA_DEPTH = 64` nesting levels, and `MAX_SUBSCHEMA_COUNT = 10_000` subschema objects, all raising `ArgumentError` on violation.
- `Server#call_tool` mirrors non-object `structuredContent` into `content` as serialized JSON text when the tool provided no content blocks, so pre-SEP clients that only read `content` still receive the data. Object results and explicit content are untouched.

Resolves modelcontextprotocol#377.

## How Has This Been Tested?

- `test/mcp/tool/output_schema_test.rb`: root-level `oneOf`, `$ref`+`$defs`, primitive, and `enum` schemas serialize without an injected `type` and validate results correctly; the `properties`-only shorthand still serializes with `type: "object"` (wire-format regression); explicit `type: "array"` keeps working.
- `test/mcp/tool/input_schema_test.rb`: an input schema using `$defs`, `$ref`, `oneOf`, `if`/`then`, and `allOf` keeps its object root and round-trips all keywords; a draft-04-only boolean `exclusiveMinimum` is rejected under the 2020-12 dialect while the numeric form is accepted.
- `test/mcp/tool/schema_test.rb`: depth and subschema-count bound violations raise `ArgumentError`; non-same-document `$ref`s (remote URI, sibling file) are rejected while `#/$defs/...` is accepted. The previous unbounded-depth caching test is replaced, since the depth bound now rejects such documents by design.
- `test/mcp/server_test.rb`: `tools/call` with array `structuredContent` and no content gains the serialized TextContent fallback; explicit content is not overwritten; object `structuredContent` gets no fallback.

`bundle exec rake` (tests, RuboCop, and conformance baseline, including the `json-schema-2020-12` server scenario) passes.
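For reviewers who want a concrete picture of the construction-time bounds exercised in `test/mcp/tool/schema_test.rb`, here is a minimal sketch. It is not the code in this PR: the `SchemaBounds` module and its `check!` method are hypothetical names, the schema is assumed to be nested Hashes and Arrays with string keys, and the root is assumed to count as depth 1. The sketch also treats every Hash as a subschema (including keyword maps such as `properties`), so it over-approximates both depth and count compared with a keyword-aware walk.

```ruby
# Illustrative only: a simplified stand-in for the bounds described above.
module SchemaBounds
  MAX_SCHEMA_DEPTH = 64
  MAX_SUBSCHEMA_COUNT = 10_000
  REF_KEYWORDS = ["$ref", "$dynamicRef"].freeze

  module_function

  # Walks the schema iteratively (so deep input cannot overflow the Ruby
  # call stack) and raises ArgumentError on the first violated bound.
  def check!(schema)
    count = 0
    stack = [[schema, 1]] # assumption: the root schema is depth 1

    until stack.empty?
      node, depth = stack.pop

      case node
      when Hash
        if depth > MAX_SCHEMA_DEPTH
          raise ArgumentError, "schema nesting exceeds #{MAX_SCHEMA_DEPTH} levels"
        end

        count += 1
        if count > MAX_SUBSCHEMA_COUNT
          raise ArgumentError, "schema has more than #{MAX_SUBSCHEMA_COUNT} subschema objects"
        end

        node.each do |key, value|
          # Only same-document references (starting with "#") are allowed.
          if REF_KEYWORDS.include?(key) && !(value.is_a?(String) && value.start_with?("#"))
            raise ArgumentError, "#{key} must be a same-document reference, got #{value.inspect}"
          end

          stack.push([value, depth + 1])
        end
      when Array
        # Array items sit at the same depth as the array itself.
        node.each { |item| stack.push([item, depth]) }
      end
    end

    schema
  end
end

# Usage: a same-document reference passes, a remote one is rejected.
SchemaBounds.check!({ "$defs" => { "id" => { "type" => "string" } }, "$ref" => "#/$defs/id" })

begin
  SchemaBounds.check!({ "$ref" => "https://example.com/schema.json" })
rescue ArgumentError => e
  warn e.message
end
```

Rejecting anything that does not start with `#` is what keeps schema handling from ever reaching the network or the file system, at the cost of refusing relative references that other validators might resolve.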
## Breaking Changes

Three narrow behavior changes, all intentional per the SEP:

- Runtime validation now uses the JSON Schema 2020-12 metaschema instead of draft-04. Schemas that rely on draft-04-only syntax are rejected at construction time. The practical case is the boolean `exclusiveMinimum`/`exclusiveMaximum` form, which draft-06 replaced with a numeric bound; such schemas must now use the numeric form. The Python SDK rejects the boolean form the same way. Other draft-04 spellings (`definitions`, `id`, `dependencies`) still validate, since 2020-12 tolerates unknown keywords.
- Schemas that exceed the new resource bounds (nesting deeper than 64, more than 10,000 subschema objects) or use a non-same-document `$ref`/`$dynamicRef` now raise `ArgumentError` at construction time. Previously such documents were accepted (external references were already never fetched, only ignored).
- An `OutputSchema` whose root declares a schema keyword other than `type` (e.g. `oneOf`) no longer has `type: "object"` merged into it. The old output was an invalid hybrid schema, so no conforming consumer could have relied on it.
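To make the third change concrete, the following is a rough sketch of how an overridable root-type hook could behave. It is an assumption-laden illustration rather than the SDK's implementation: the `Sketch*` classes and the `ROOT_SCHEMA_KEYWORDS` constant are hypothetical, and only the `apply_default_root_type!` hook name and the keyword list come from the description above.

```ruby
# Illustrative only: hypothetical stand-ins for Tool::Schema and its subclasses.
class SketchSchema
  def initialize(schema = {})
    # Shallow key normalization is enough for a root-level check.
    @schema = schema.transform_keys(&:to_sym)
    apply_default_root_type!
  end

  def to_h
    @schema
  end

  private

  # Historical behavior: default the root to an object when no type is given.
  def apply_default_root_type!
    @schema[:type] ||= "object"
  end
end

# Input schemas keep the object default.
class SketchInputSchema < SketchSchema
end

class SketchOutputSchema < SketchSchema
  # Root keywords that already say what the schema is (list from the PR text).
  ROOT_SCHEMA_KEYWORDS = %i[type $ref oneOf anyOf allOf not if const enum].freeze

  private

  # Only inject the default when the root carries none of those keywords.
  def apply_default_root_type!
    return if ROOT_SCHEMA_KEYWORDS.any? { |keyword| @schema.key?(keyword) }

    @schema[:type] = "object"
  end
end

union = { oneOf: [{ type: "string" }, { type: "integer" }] }
shorthand = { properties: { name: { type: "string" } } }

p SketchOutputSchema.new(union).to_h.key?(:type)
p SketchOutputSchema.new(shorthand).to_h[:type]
p SketchInputSchema.new(union).to_h[:type]
```

In this sketch the root-level `oneOf` output schema is left without a `type`, the `properties`-only shorthand still gains `type: "object"`, and the input schema keeps its object root in both cases.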