fix(cli): handle streaming error chunks robustly in chat#1792
fix(cli): handle streaming error chunks robustly in chat#1792ianliuy wants to merge 1 commit intoNVIDIA-NeMo:developfrom
Conversation
Fix two bugs in cli/chat.py streaming error handling (issue NVIDIA-NeMo#1771): 1. Empty assistant message appended to history on streaming error: When an error chunk breaks the streaming loop, bot_message_list is empty/partial, causing a blank or incomplete assistant message to be unconditionally appended to conversation history. Fixed by tracking streaming_error flag and guarding history.append(). 2. Fragile error detection via chunk.startswith('{"error"'): Replaced raw string prefix check with json.loads() + dict key check, matching the pattern already used in iorails.py. This handles whitespace variations and alternate JSON formatting. Added 6 regression tests covering both bugs. Fixes NVIDIA-NeMo#1771 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>
Documentation preview |
📝 WalkthroughWalkthroughThis pull request fixes a streaming error handling bug in the CLI chat module. When streaming errors occur, the code now uses proper JSON parsing instead of fragile string matching for error detection and prevents empty assistant messages from being appended to conversation history. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nemoguardrails/cli/chat.py`:
- Around line 87-96: The current check in the streaming chunk handler that
inspects chunk, parses JSON and treats any top-level {"error": {...}} as a
transport failure is too aggressive; change the logic in the chunk-handling
block (the code that reads chunk, error_data, and sets streaming_error) to only
treat a chunk as an internal streaming error when it contains an explicit
sentinel/metadata key (e.g. a reserved field like
"__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain
JSON "error" object; keep normal user-produced {"error":...} content as regular
assistant output (do not drop it from history) and only mark streaming_error =
True when the sentinel/metadata is present or when a true transport/parse
failure occurs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 84751eb0-4912-4e2d-b8c9-b923a147a48a
📒 Files selected for processing (3)
CHANGELOG.mdnemoguardrails/cli/chat.pytests/cli/test_chat.py
| if isinstance(chunk, str) and chunk.strip().startswith("{"): | ||
| try: | ||
| error_data = json.loads(chunk) | ||
| except (json.JSONDecodeError, ValueError): | ||
| error_data = None | ||
| if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict): | ||
| error_msg = error_data["error"].get("message", "Unknown error") | ||
| console.print(f"\n\n[red]Streaming error: {error_msg}[/]") | ||
| except (json.JSONDecodeError, KeyError): | ||
| console.print(f"\n\n[red]Streaming error: {chunk}[/]") | ||
| break | ||
| streaming_error = True | ||
| break |
There was a problem hiding this comment.
Content-based error detection still breaks valid {"error": {...}} output.
Lines 87-96 are still inferring transport failures from model text. If the assistant is asked to stream a normal JSON object with a top-level error dict, this branch will print it as a streaming failure and drop it from history. Please make the producer emit an unambiguous sentinel/metadata chunk for internal errors instead of reinterpreting user content.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nemoguardrails/cli/chat.py` around lines 87 - 96, The current check in the
streaming chunk handler that inspects chunk, parses JSON and treats any
top-level {"error": {...}} as a transport failure is too aggressive; change the
logic in the chunk-handling block (the code that reads chunk, error_data, and
sets streaming_error) to only treat a chunk as an internal streaming error when
it contains an explicit sentinel/metadata key (e.g. a reserved field like
"__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain
JSON "error" object; keep normal user-produced {"error":...} content as regular
assistant output (do not drop it from history) and only mark streaming_error =
True when the sentinel/metadata is present or when a true transport/parse
failure occurs.
Greptile SummaryThis PR fixes two streaming error-handling bugs in One existing subtle bug is also resolved as a side-effect: the old code could raise
|
| Filename | Overview |
|---|---|
| nemoguardrails/cli/chat.py | Adds streaming_error flag and replaces startswith error detection with JSON parsing; minor unintended behavioral change in the history guard for non-streaming paths. |
| tests/cli/test_chat.py | Adds 7 regression tests covering both bugs: empty/partial history pollution and whitespace/format variants in error detection; test assertions are correct and baselines hold. |
| CHANGELOG.md | Adds [Unreleased] section with bug fix entry linking to issue #1771; entry is accurate and follows existing format. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[New user turn - streaming_error = False] --> B{server_url?}
B -- No --> C{streaming AND rails_app?}
C -- Yes --> D[async for chunk in stream_async]
D --> E{chunk is str AND starts with curly brace?}
E -- Yes --> F[json.loads chunk]
F --> G{JSONDecodeError?}
G -- Yes --> H[error_data = None - fall through as content]
G -- No --> I{error_data is dict with dict error key?}
H --> J[print green + append to bot_message_list]
I -- Yes --> K[print red error - streaming_error = True - break]
I -- No --> J
E -- No --> J
D -- loop ends --> L[bot_message = join bot_message_list]
K --> L
C -- No --> M[generate_async to bot_message]
B -- Yes --> N[HTTP request to bot_message]
L --> O{bot_message has content AND NOT streaming_error?}
M --> O
N --> O
O -- Yes --> P[history.append bot_message]
O -- No --> Q[skip append - turn preserved without assistant reply]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 176-177
Comment:
**`content` truthiness guard affects all code paths, not just streaming errors**
The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:
```suggestion
if not streaming_error and bot_message.get("content") is not None:
history.append(bot_message)
```
Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 87-99
Comment:
**Malformed error JSON is now silently treated as content**
Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(cli): handle streaming error chunks ..." | Re-trigger Greptile
| if bot_message.get("content") and not streaming_error: | ||
| history.append(bot_message) |
There was a problem hiding this comment.
content truthiness guard affects all code paths, not just streaming errors
The bot_message.get("content") check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, streaming_error is always False, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns {"role": "assistant", "content": ""}) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:
| if bot_message.get("content") and not streaming_error: | |
| history.append(bot_message) | |
| if not streaming_error and bot_message.get("content") is not None: | |
| history.append(bot_message) |
Alternatively, limiting the content check to the streaming path (where bot_message_text is always a string join) keeps the non-streaming and server-URL paths unchanged.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 176-177
Comment:
**`content` truthiness guard affects all code paths, not just streaming errors**
The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:
```suggestion
if not streaming_error and bot_message.get("content") is not None:
history.append(bot_message)
```
Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged.
How can I resolve this? If you propose a fix, please make it concise.| if isinstance(chunk, str) and chunk.strip().startswith("{"): | ||
| try: | ||
| error_data = json.loads(chunk) | ||
| except (json.JSONDecodeError, ValueError): | ||
| error_data = None | ||
| if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict): | ||
| error_msg = error_data["error"].get("message", "Unknown error") | ||
| console.print(f"\n\n[red]Streaming error: {error_msg}[/]") | ||
| except (json.JSONDecodeError, KeyError): | ||
| console.print(f"\n\n[red]Streaming error: {chunk}[/]") | ||
| break | ||
| streaming_error = True | ||
| break | ||
|
|
||
| console.print("[green]" + f"{chunk}" + "[/]", end="") | ||
| bot_message_list.append(chunk) |
There was a problem hiding this comment.
Malformed error JSON is now silently treated as content
Previously, if a chunk matched startswith('{"error"') but was malformed JSON, the old code printed "Streaming error: <chunk>" and broke out of the loop. With the new logic, if json.loads raises JSONDecodeError, error_data is set to None, the inner isinstance check is skipped, and the malformed chunk falls through to be printed as green content and appended to bot_message_list. A server that emits a malformed error payload starting with { will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 87-99
Comment:
**Malformed error JSON is now silently treated as content**
Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.
How can I resolve this? If you propose a fix, please make it concise.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Description
Fix two streaming error-handling bugs in
cli/chat.pydiscovered during review of #1766 and tracked in #1771:Empty assistant message on error: When a streaming error chunk breaks the loop,
bot_message_listis empty/partial, causing a blank or incomplete{"role": "assistant", "content": ""}to be unconditionally appended to conversation history, corrupting subsequent turns. Fixed by tracking astreaming_errorflag and guardinghistory.append().Fragile error-chunk detection: Errors were matched via
chunk.startswith('{"error"')— a raw string prefix check. Replaced withjson.loads()+ structural validation (isinstance(error_data.get("error"), dict)), matching the pattern already used iniorails.py:360-364. This handles leading whitespace, alternate JSON formatting, and avoids false positives on LLM-generated JSON with a string"error"value.Changes
nemoguardrails/cli/chat.py: Replacestartswithwith proper JSON parsing gated bystartswith("{")pre-check; addstreaming_errorflag; guardhistory.append()to skip empty/errored messages.tests/cli/test_chat.py: Add 7 regression tests covering both bugs (empty history, partial tokens, whitespace errors, formatting variants, false-positive prevention, baselines).CHANGELOG.md: Add bug fix entry under[Unreleased].Testing
All 30 tests in
tests/cli/test_chat.pypass:Pre-commit checks (ruff lint + ruff format) pass.
Related Issue(s)
Fixes #1771
@tgasser-nv @Pouyanpi
Summary by CodeRabbit
Bug Fixes
Tests