fix(cli): handle streaming error chunks robustly in chat by ianliuy · Pull Request #1792 · NVIDIA-NeMo/Guardrails

ianliuy · 2026-04-14T20:20:22Z

Description

Fix two streaming error-handling bugs in cli/chat.py discovered during review of #1766 and tracked in #1771:

Empty assistant message on error: When a streaming error chunk breaks the loop, bot_message_list is empty/partial, causing a blank or incomplete {"role": "assistant", "content": ""} to be unconditionally appended to conversation history, corrupting subsequent turns. Fixed by tracking a streaming_error flag and guarding history.append().
Fragile error-chunk detection: Errors were matched via chunk.startswith('{"error"') — a raw string prefix check. Replaced with json.loads() + structural validation (isinstance(error_data.get("error"), dict)), matching the pattern already used in iorails.py:360-364. This handles leading whitespace, alternate JSON formatting, and avoids false positives on LLM-generated JSON with a string "error" value.

Changes

nemoguardrails/cli/chat.py: Replace startswith with proper JSON parsing gated by startswith("{") pre-check; add streaming_error flag; guard history.append() to skip empty/errored messages.
tests/cli/test_chat.py: Add 7 regression tests covering both bugs (empty history, partial tokens, whitespace errors, formatting variants, false-positive prevention, baselines).
CHANGELOG.md: Add bug fix entry under [Unreleased].

Testing

All 30 tests in tests/cli/test_chat.py pass:

tests/cli/test_chat.py ............... 30 passed in 12.63s

Pre-commit checks (ruff lint + ruff format) pass.

Related Issue(s)

Fixes #1771

@tgasser-nv @Pouyanpi

Summary by CodeRabbit

Bug Fixes
- Fixed CLI chat to prevent empty assistant messages from being appended to conversation history when streaming errors occur
- Enhanced error detection mechanism to properly identify and handle streaming errors with improved reliability
Tests
- Added comprehensive regression test suite for streaming error handling scenarios in CLI chat, including partial response and edge case coverage

Fix two bugs in cli/chat.py streaming error handling (issue NVIDIA-NeMo#1771): 1. Empty assistant message appended to history on streaming error: When an error chunk breaks the streaming loop, bot_message_list is empty/partial, causing a blank or incomplete assistant message to be unconditionally appended to conversation history. Fixed by tracking streaming_error flag and guarding history.append(). 2. Fragile error detection via chunk.startswith('{"error"'): Replaced raw string prefix check with json.loads() + dict key check, matching the pattern already used in iorails.py. This handles whitespace variations and alternate JSON formatting. Added 6 regression tests covering both bugs. Fixes NVIDIA-NeMo#1771 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>

github-actions · 2026-04-14T20:22:53Z

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1792

coderabbitai · 2026-04-14T20:26:06Z

📝 Walkthrough

Walkthrough

This pull request fixes a streaming error handling bug in the CLI chat module. When streaming errors occur, the code now uses proper JSON parsing instead of fragile string matching for error detection and prevents empty assistant messages from being appended to conversation history.

Changes

Cohort / File(s)	Summary
Documentation `CHANGELOG.md`	Added [Unreleased] section documenting the bug fix for streaming error handling and JSON parsing improvement.
Streaming Error Handling `nemoguardrails/cli/chat.py`	Introduced `streaming_error` flag to track errors during streaming. Improved error detection by attempting JSON parsing on chunks starting with `{`, validating as error only when parsed result is a dict containing an `error` dict. Prevents appending empty assistant messages to history when streaming errors occur.
Test Coverage `tests/cli/test_chat.py`	Added `TestStreamingErrorRegression` test class with comprehensive coverage for streaming error scenarios, including tests for error detection with JSON parsing, prevention of empty/partial assistant messages in history, and edge cases with whitespace/formatting variations in error payloads.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(cli): handle streaming error chunks robustly in chat' directly and specifically describes the main change—improving the robustness of streaming error handling in the CLI chat module.
Test Results For Major Changes	✅ Passed	PR is a minor bug fix with minimal code changes (+9/-5 lines) in CLI streaming error-handling. Test results documented: 30 tests passed in 12.63s with pre-commit checks passing. Comprehensive regression test coverage (322 lines) added.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoguardrails/cli/chat.py`:
- Around line 87-96: The current check in the streaming chunk handler that
inspects chunk, parses JSON and treats any top-level {"error": {...}} as a
transport failure is too aggressive; change the logic in the chunk-handling
block (the code that reads chunk, error_data, and sets streaming_error) to only
treat a chunk as an internal streaming error when it contains an explicit
sentinel/metadata key (e.g. a reserved field like
"__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain
JSON "error" object; keep normal user-produced {"error":...} content as regular
assistant output (do not drop it from history) and only mark streaming_error =
True when the sentinel/metadata is present or when a true transport/parse
failure occurs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 84751eb0-4912-4e2d-b8c9-b923a147a48a

📥 Commits

Reviewing files that changed from the base of the PR and between 6a7be49 and 860d58a.

📒 Files selected for processing (3)

CHANGELOG.md
nemoguardrails/cli/chat.py
tests/cli/test_chat.py

coderabbitai · 2026-04-14T20:26:09Z

+                        if isinstance(chunk, str) and chunk.strip().startswith("{"):
                            try:
                                error_data = json.loads(chunk)
+                            except (json.JSONDecodeError, ValueError):
+                                error_data = None
+                            if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict):
                                error_msg = error_data["error"].get("message", "Unknown error")
                                console.print(f"\n\n[red]Streaming error: {error_msg}[/]")
-                            except (json.JSONDecodeError, KeyError):
-                                console.print(f"\n\n[red]Streaming error: {chunk}[/]")
-                            break
+                                streaming_error = True
+                                break


⚠️ Potential issue | 🟠 Major

Content-based error detection still breaks valid {"error": {...}} output.

Lines 87-96 are still inferring transport failures from model text. If the assistant is asked to stream a normal JSON object with a top-level error dict, this branch will print it as a streaming failure and drop it from history. Please make the producer emit an unambiguous sentinel/metadata chunk for internal errors instead of reinterpreting user content.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@nemoguardrails/cli/chat.py` around lines 87 - 96, The current check in the streaming chunk handler that inspects chunk, parses JSON and treats any top-level {"error": {...}} as a transport failure is too aggressive; change the logic in the chunk-handling block (the code that reads chunk, error_data, and sets streaming_error) to only treat a chunk as an internal streaming error when it contains an explicit sentinel/metadata key (e.g. a reserved field like "__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain JSON "error" object; keep normal user-produced {"error":...} content as regular assistant output (do not drop it from history) and only mark streaming_error = True when the sentinel/metadata is present or when a true transport/parse failure occurs.

greptile-apps · 2026-04-14T20:26:18Z

Greptile Summary

This PR fixes two streaming error-handling bugs in nemoguardrails/cli/chat.py: empty/partial assistant messages being unconditionally appended to history after a streaming error, and fragile startswith('{\"error\"') error detection that missed leading whitespace and alternate JSON formatting. The streaming_error flag is correctly scoped (reset each loop iteration) and the structured JSON check matches the pattern already used in iorails.py. Seven targeted regression tests cover both bugs and the main edge cases.

One existing subtle bug is also resolved as a side-effect: the old code could raise AttributeError when a chunk matched {\"error\": \"string_value\"} because str.get() does not exist on strings, which was not caught by the (JSONDecodeError, KeyError) handler.

Confidence Score: 5/5

Safe to merge; both bugs are correctly fixed, tests are comprehensive, and no regressions introduced in the primary streaming path.

All findings are P2: the content-truthiness guard broadening and the silent treatment of malformed JSON are minor behavioral changes unlikely to affect real usage. The core bug fixes are correct and well-tested with 7 new regression cases.

nemoguardrails/cli/chat.py line 176 — the bot_message.get('content') guard applies to non-streaming paths as a side effect.

Important Files Changed

Filename	Overview
nemoguardrails/cli/chat.py	Adds `streaming_error` flag and replaces `startswith` error detection with JSON parsing; minor unintended behavioral change in the history guard for non-streaming paths.
tests/cli/test_chat.py	Adds 7 regression tests covering both bugs: empty/partial history pollution and whitespace/format variants in error detection; test assertions are correct and baselines hold.
CHANGELOG.md	Adds `[Unreleased]` section with bug fix entry linking to issue #1771; entry is accurate and follows existing format.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[New user turn - streaming_error = False] --> B{server_url?}
    B -- No --> C{streaming AND rails_app?}
    C -- Yes --> D[async for chunk in stream_async]
    D --> E{chunk is str AND starts with curly brace?}
    E -- Yes --> F[json.loads chunk]
    F --> G{JSONDecodeError?}
    G -- Yes --> H[error_data = None - fall through as content]
    G -- No --> I{error_data is dict with dict error key?}
    H --> J[print green + append to bot_message_list]
    I -- Yes --> K[print red error - streaming_error = True - break]
    I -- No --> J
    E -- No --> J
    D -- loop ends --> L[bot_message = join bot_message_list]
    K --> L
    C -- No --> M[generate_async to bot_message]
    B -- Yes --> N[HTTP request to bot_message]
    L --> O{bot_message has content AND NOT streaming_error?}
    M --> O
    N --> O
    O -- Yes --> P[history.append bot_message]
    O -- No --> Q[skip append - turn preserved without assistant reply]

Prompt To Fix All With AI

This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 176-177

Comment:
**`content` truthiness guard affects all code paths, not just streaming errors**

The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:

```suggestion
        if not streaming_error and bot_message.get("content") is not None:
            history.append(bot_message)
```

Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 87-99

Comment:
**Malformed error JSON is now silently treated as content**

Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(cli): handle streaming error chunks ..." | Re-trigger Greptile}

greptile-apps · 2026-04-14T20:26:22Z

+        if bot_message.get("content") and not streaming_error:
+            history.append(bot_message)


content truthiness guard affects all code paths, not just streaming errors

The bot_message.get("content") check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, streaming_error is always False, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns {"role": "assistant", "content": ""}) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:

Suggested change

if bot_message.get("content") and not streaming_error:

history.append(bot_message)

if not streaming_error and bot_message.get("content") is not None:

history.append(bot_message)

Alternatively, limiting the content check to the streaming path (where bot_message_text is always a string join) keeps the non-streaming and server-URL paths unchanged.

Prompt To Fix With AI

This is a comment left during a code review. Path: nemoguardrails/cli/chat.py Line: 176-177 Comment: **`content` truthiness guard affects all code paths, not just streaming errors** The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path: ```suggestion if not streaming_error and bot_message.get("content") is not None: history.append(bot_message) ``` Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-04-14T20:26:23Z

+                        if isinstance(chunk, str) and chunk.strip().startswith("{"):
                            try:
                                error_data = json.loads(chunk)
+                            except (json.JSONDecodeError, ValueError):
+                                error_data = None
+                            if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict):
                                error_msg = error_data["error"].get("message", "Unknown error")
                                console.print(f"\n\n[red]Streaming error: {error_msg}[/]")
-                            except (json.JSONDecodeError, KeyError):
-                                console.print(f"\n\n[red]Streaming error: {chunk}[/]")
-                            break
+                                streaming_error = True
+                                break

                        console.print("[green]" + f"{chunk}" + "[/]", end="")
                        bot_message_list.append(chunk)


Malformed error JSON is now silently treated as content

Previously, if a chunk matched startswith('{"error"') but was malformed JSON, the old code printed "Streaming error: <chunk>" and broke out of the loop. With the new logic, if json.loads raises JSONDecodeError, error_data is set to None, the inner isinstance check is skipped, and the malformed chunk falls through to be printed as green content and appended to bot_message_list. A server that emits a malformed error payload starting with { will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.

Prompt To Fix With AI

This is a comment left during a code review. Path: nemoguardrails/cli/chat.py Line: 87-99 Comment: **Malformed error JSON is now silently treated as content** Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON. How can I resolve this? If you propose a fix, please make it concise.

codecov · 2026-04-14T20:28:31Z

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
nemoguardrails/cli/chat.py	77.77%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

ianliuy mentioned this pull request Apr 14, 2026

bug: Chat streaming error handling bugs #1771

Open

4 tasks

coderabbitai bot reviewed Apr 14, 2026

View reviewed changes

greptile-apps bot reviewed Apr 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli): handle streaming error chunks robustly in chat#1792

fix(cli): handle streaming error chunks robustly in chat#1792
ianliuy wants to merge 1 commit intoNVIDIA-NeMo:developfrom
ianliuy:fix/issue-1771-streaming-error-handling

ianliuy commented Apr 14, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

coderabbitai bot commented Apr 14, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 14, 2026

Uh oh!

greptile-apps bot commented Apr 14, 2026

Confidence Score: 5/5

Flowchart

Uh oh!

greptile-apps bot Apr 14, 2026

Uh oh!

greptile-apps bot Apr 14, 2026

Uh oh!

codecov bot commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		if bot_message.get("content") and not streaming_error:
		history.append(bot_message)

Conversation

ianliuy commented Apr 14, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

Testing

Related Issue(s)

Summary by CodeRabbit

Uh oh!

github-actions bot commented Apr 14, 2026

Documentation preview

Uh oh!

coderabbitai bot commented Apr 14, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Apr 14, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Apr 14, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ianliuy commented Apr 14, 2026 •

edited by coderabbitai bot

Loading