fix(client): tolerate invalid UTF-8 from server stdout in stdio_client #2873

Open
Bartok9 wants to merge 1 commit into modelcontextprotocol:main from Bartok9:fix/2454-stdio-client-utf8-replace

Bartok9 commented Jun 15, 2026

Summary

  • Default StdioServerParameters.encoding_error_handler to "replace" so a server emitting malformed UTF-8 no longer crashes the client transport.
  • Malformed bytes now surface as an in-stream JSON parse error and the transport stays alive for subsequent valid messages.

Motivation

Closes #2454.

Until now, stdio_client() decoded child stdout with encoding_error_handler="strict". When a spawned server writes invalid UTF-8 bytes to stdout, the UnicodeDecodeError raised during TextReceiveStream iteration escapes the decode loop's except clauses and tears down the transport task group. It surfaces as an ExceptionGroup out of the context manager instead of being delivered as a normal in-stream parse error.

The issue itself points at the analogous server-side hardening (#2302, errors="replace" on stdin), which deliberately chose to replace invalid bytes with U+FFFD, let JSON validation fail on the malformed line, and keep the transport alive. This change brings the client to parity: with encoding_error_handler defaulting to "replace", the bad line fails JSON-RPC validation and is delivered as an Exception via _parse_line, while later valid messages still come through.
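The difference between the two error handlers can be seen with the standard library alone. The following is a minimal sketch, not the SDK's code: decode_lines is a hypothetical helper, and json.JSONDecodeError stands in for the JSON-RPC validation error that the real transport would deliver.

```python
import json


def decode_lines(raw: bytes, errors: str) -> list[object]:
    """Decode newline-delimited JSON; return a dict or an Exception per line.

    Hypothetical helper for illustration only, not part of the SDK.
    """
    # With errors="strict" this call raises UnicodeDecodeError on bad bytes,
    # before any line can be parsed. With errors="replace" each invalid byte
    # becomes U+FFFD and decoding continues.
    text = raw.decode("utf-8", errors=errors)

    results: list[object] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            results.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # The bad line is reported in-stream; later lines are still read.
            results.append(exc)
    return results


# Two invalid bytes on their own line, followed by a valid ping.
RAW = b'\xff\xfe\n{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n'

if __name__ == "__main__":
    try:
        decode_lines(RAW, "strict")
    except UnicodeDecodeError as exc:
        print("strict:", type(exc).__name__)

    for item in decode_lines(RAW, "replace"):
        print("replace:", type(item).__name__)
```

Under "strict" the failure happens in the decode step, outside any per-line error handling, which is the shape of the crash described above. Under "replace" the failure moves into the per-line parse, where it can be handled and reported.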

Verification

  • uv run pytest tests/client/test_stdio.py -q — 34 passed, 1 skipped
  • uv run pytest tests/client/ tests/interaction/transports/test_stdio.py -q — 238 passed, 1 skipped, 1 xfailed (no regression from the default change)
  • New regression test test_invalid_utf8_mid_session_surfaces_as_an_in_stream_exception fails without the fix (5s task-group hang) and passes with it.
  • Manual repro from the issue (\xff\xfe\n followed by a valid ping) crashed the task group on main with an ExceptionGroup(UnicodeDecodeError, ...); with the fix it surfaces a ValidationError then reads the following valid message.
  • ruff format --check + ruff check + pyright clean on both changed files.
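For readers who want to reproduce the behaviour without the SDK, here is a rough standalone sketch of the manual repro. It assumes a throwaway child script (CHILD, hypothetical) that writes the same bytes as the issue's example, and it uses subprocess text-mode decoding in place of the real transport, so it shows the decoding behaviour only and not the task-group teardown.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a misbehaving server: it writes two invalid bytes
# on their own line, then one valid JSON-RPC ping, straight to stdout.
CHILD = r'''
import sys
sys.stdout.buffer.write(b"\xff\xfe\n")
sys.stdout.buffer.write(b'{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n')
sys.stdout.buffer.flush()
'''


def read_messages(errors: str) -> list[object]:
    """Read the child's stdout line by line using the given error handler."""
    received: list[object] = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        encoding="utf-8",
        errors=errors,
    ) as proc:
        assert proc.stdout is not None
        # With errors="strict", iterating raises UnicodeDecodeError here.
        for line in proc.stdout:
            try:
                received.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # The malformed line is recorded and reading continues.
                received.append(exc)
    return received


if __name__ == "__main__":
    for item in read_messages("replace"):
        print(type(item).__name__)
```

Calling read_messages("strict") instead raises UnicodeDecodeError from the read loop, so the valid ping after the bad line is never seen.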

Notes

  • Salvages the intent of the stale, now-conflicting #2456 ("fix(stdio_client): tolerate invalid UTF-8 from child stdout") by @shaun0927, which targeted the same root cause with the same default flip. Because the transport was refactored and the decode/drain plumbing moved, this is a fresh implementation on the new code path, rebuilt against current main. It adds a mid-session regression test, distinct from the existing test_invalid_utf8_flushed_by_a_dying_server_does_not_break_shutdown, which only covers the raw-bytes shutdown drain.

Closes modelcontextprotocol#2454.

stdio_client decoded child stdout with encoding_error_handler="strict",
so a server emitting malformed bytes mid-session raised UnicodeDecodeError
inside the decode loop, escaping both except clauses and tearing down the
transport task group (surfacing as an ExceptionGroup out of the context
manager) instead of surfacing the bad line as an in-stream parse error.

Default encoding_error_handler to "replace" so invalid bytes become U+FFFD;
the malformed line then fails JSON validation and is delivered as an Exception
via _parse_line, keeping the transport alive for subsequent valid messages.
This mirrors the server-side stdin hardening (errors="replace") referenced
in the issue (modelcontextprotocol#2302).

Verified: repro that crashed the task group on main now surfaces a parse
error then reads the following valid message; new regression test fails
(5s task-group hang) without the fix, passes with it. ruff + pyright clean,
238 client/stdio tests pass.

Development

Successfully merging this pull request may close these issues.

stdio_client crashes on malformed UTF-8 from child stdout instead of surfacing parse error