Skip to content

fix(cli): handle streaming error chunks robustly in chat#1792

Open
ianliuy wants to merge 1 commit intoNVIDIA-NeMo:developfrom
ianliuy:fix/issue-1771-streaming-error-handling
Open

fix(cli): handle streaming error chunks robustly in chat#1792
ianliuy wants to merge 1 commit intoNVIDIA-NeMo:developfrom
ianliuy:fix/issue-1771-streaming-error-handling

Conversation

@ianliuy
Copy link
Copy Markdown

@ianliuy ianliuy commented Apr 14, 2026

Description

Fix two streaming error-handling bugs in cli/chat.py discovered during review of #1766 and tracked in #1771:

  1. Empty assistant message on error: When a streaming error chunk breaks the loop, bot_message_list is empty/partial, causing a blank or incomplete {"role": "assistant", "content": ""} to be unconditionally appended to conversation history, corrupting subsequent turns. Fixed by tracking a streaming_error flag and guarding history.append().

  2. Fragile error-chunk detection: Errors were matched via chunk.startswith('{"error"') — a raw string prefix check. Replaced with json.loads() + structural validation (isinstance(error_data.get("error"), dict)), matching the pattern already used in iorails.py:360-364. This handles leading whitespace, alternate JSON formatting, and avoids false positives on LLM-generated JSON with a string "error" value.

Changes

  • nemoguardrails/cli/chat.py: Replace startswith with proper JSON parsing gated by startswith("{") pre-check; add streaming_error flag; guard history.append() to skip empty/errored messages.
  • tests/cli/test_chat.py: Add 7 regression tests covering both bugs (empty history, partial tokens, whitespace errors, formatting variants, false-positive prevention, baselines).
  • CHANGELOG.md: Add bug fix entry under [Unreleased].

Testing

All 30 tests in tests/cli/test_chat.py pass:

tests/cli/test_chat.py ............... 30 passed in 12.63s

Pre-commit checks (ruff lint + ruff format) pass.

Related Issue(s)

Fixes #1771

@tgasser-nv @Pouyanpi

Summary by CodeRabbit

  • Bug Fixes

    • Fixed CLI chat to prevent empty assistant messages from being appended to conversation history when streaming errors occur
    • Enhanced error detection mechanism to properly identify and handle streaming errors with improved reliability
  • Tests

    • Added comprehensive regression test suite for streaming error handling scenarios in CLI chat, including partial response and edge case coverage

Fix two bugs in cli/chat.py streaming error handling (issue NVIDIA-NeMo#1771):

1. Empty assistant message appended to history on streaming error:
   When an error chunk breaks the streaming loop, bot_message_list is
   empty/partial, causing a blank or incomplete assistant message to be
   unconditionally appended to conversation history. Fixed by tracking
   streaming_error flag and guarding history.append().

2. Fragile error detection via chunk.startswith('{"error"'):
   Replaced raw string prefix check with json.loads() + dict key check,
   matching the pattern already used in iorails.py. This handles
   whitespace variations and alternate JSON formatting.

Added 6 regression tests covering both bugs.

Fixes NVIDIA-NeMo#1771

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1792

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

Walkthrough

This pull request fixes a streaming error handling bug in the CLI chat module. When streaming errors occur, the code now uses proper JSON parsing instead of fragile string matching for error detection and prevents empty assistant messages from being appended to conversation history.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md
Added [Unreleased] section documenting the bug fix for streaming error handling and JSON parsing improvement.
Streaming Error Handling
nemoguardrails/cli/chat.py
Introduced streaming_error flag to track errors during streaming. Improved error detection by attempting JSON parsing on chunks starting with {, validating as error only when parsed result is a dict containing an error dict. Prevents appending empty assistant messages to history when streaming errors occur.
Test Coverage
tests/cli/test_chat.py
Added TestStreamingErrorRegression test class with comprehensive coverage for streaming error scenarios, including tests for error detection with JSON parsing, prevention of empty/partial assistant messages in history, and edge cases with whitespace/formatting variations in error payloads.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(cli): handle streaming error chunks robustly in chat' directly and specifically describes the main change—improving the robustness of streaming error handling in the CLI chat module.
Test Results For Major Changes ✅ Passed PR is a minor bug fix with minimal code changes (+9/-5 lines) in CLI streaming error-handling. Test results documented: 30 tests passed in 12.63s with pre-commit checks passing. Comprehensive regression test coverage (322 lines) added.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoguardrails/cli/chat.py`:
- Around line 87-96: The current check in the streaming chunk handler that
inspects chunk, parses JSON and treats any top-level {"error": {...}} as a
transport failure is too aggressive; change the logic in the chunk-handling
block (the code that reads chunk, error_data, and sets streaming_error) to only
treat a chunk as an internal streaming error when it contains an explicit
sentinel/metadata key (e.g. a reserved field like
"__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain
JSON "error" object; keep normal user-produced {"error":...} content as regular
assistant output (do not drop it from history) and only mark streaming_error =
True when the sentinel/metadata is present or when a true transport/parse
failure occurs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 84751eb0-4912-4e2d-b8c9-b923a147a48a

📥 Commits

Reviewing files that changed from the base of the PR and between 6a7be49 and 860d58a.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • nemoguardrails/cli/chat.py
  • tests/cli/test_chat.py

Comment on lines +87 to +96
if isinstance(chunk, str) and chunk.strip().startswith("{"):
try:
error_data = json.loads(chunk)
except (json.JSONDecodeError, ValueError):
error_data = None
if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict):
error_msg = error_data["error"].get("message", "Unknown error")
console.print(f"\n\n[red]Streaming error: {error_msg}[/]")
except (json.JSONDecodeError, KeyError):
console.print(f"\n\n[red]Streaming error: {chunk}[/]")
break
streaming_error = True
break
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Content-based error detection still breaks valid {"error": {...}} output.

Lines 87-96 are still inferring transport failures from model text. If the assistant is asked to stream a normal JSON object with a top-level error dict, this branch will print it as a streaming failure and drop it from history. Please make the producer emit an unambiguous sentinel/metadata chunk for internal errors instead of reinterpreting user content.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemoguardrails/cli/chat.py` around lines 87 - 96, The current check in the
streaming chunk handler that inspects chunk, parses JSON and treats any
top-level {"error": {...}} as a transport failure is too aggressive; change the
logic in the chunk-handling block (the code that reads chunk, error_data, and
sets streaming_error) to only treat a chunk as an internal streaming error when
it contains an explicit sentinel/metadata key (e.g. a reserved field like
"__nemoguardrails_internal_error__" or "_ng_stream_error") rather than any plain
JSON "error" object; keep normal user-produced {"error":...} content as regular
assistant output (do not drop it from history) and only mark streaming_error =
True when the sentinel/metadata is present or when a true transport/parse
failure occurs.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR fixes two streaming error-handling bugs in nemoguardrails/cli/chat.py: empty/partial assistant messages being unconditionally appended to history after a streaming error, and fragile startswith('{\"error\"') error detection that missed leading whitespace and alternate JSON formatting. The streaming_error flag is correctly scoped (reset each loop iteration) and the structured JSON check matches the pattern already used in iorails.py. Seven targeted regression tests cover both bugs and the main edge cases.

One existing subtle bug is also resolved as a side-effect: the old code could raise AttributeError when a chunk matched {\"error\": \"string_value\"} because str.get() does not exist on strings, which was not caught by the (JSONDecodeError, KeyError) handler.

Confidence Score: 5/5

Safe to merge; both bugs are correctly fixed, tests are comprehensive, and no regressions introduced in the primary streaming path.

All findings are P2: the content-truthiness guard broadening and the silent treatment of malformed JSON are minor behavioral changes unlikely to affect real usage. The core bug fixes are correct and well-tested with 7 new regression cases.

nemoguardrails/cli/chat.py line 176 — the bot_message.get('content') guard applies to non-streaming paths as a side effect.

Important Files Changed

Filename Overview
nemoguardrails/cli/chat.py Adds streaming_error flag and replaces startswith error detection with JSON parsing; minor unintended behavioral change in the history guard for non-streaming paths.
tests/cli/test_chat.py Adds 7 regression tests covering both bugs: empty/partial history pollution and whitespace/format variants in error detection; test assertions are correct and baselines hold.
CHANGELOG.md Adds [Unreleased] section with bug fix entry linking to issue #1771; entry is accurate and follows existing format.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[New user turn - streaming_error = False] --> B{server_url?}
    B -- No --> C{streaming AND rails_app?}
    C -- Yes --> D[async for chunk in stream_async]
    D --> E{chunk is str AND starts with curly brace?}
    E -- Yes --> F[json.loads chunk]
    F --> G{JSONDecodeError?}
    G -- Yes --> H[error_data = None - fall through as content]
    G -- No --> I{error_data is dict with dict error key?}
    H --> J[print green + append to bot_message_list]
    I -- Yes --> K[print red error - streaming_error = True - break]
    I -- No --> J
    E -- No --> J
    D -- loop ends --> L[bot_message = join bot_message_list]
    K --> L
    C -- No --> M[generate_async to bot_message]
    B -- Yes --> N[HTTP request to bot_message]
    L --> O{bot_message has content AND NOT streaming_error?}
    M --> O
    N --> O
    O -- Yes --> P[history.append bot_message]
    O -- No --> Q[skip append - turn preserved without assistant reply]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 176-177

Comment:
**`content` truthiness guard affects all code paths, not just streaming errors**

The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:

```suggestion
        if not streaming_error and bot_message.get("content") is not None:
            history.append(bot_message)
```

Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 87-99

Comment:
**Malformed error JSON is now silently treated as content**

Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(cli): handle streaming error chunks ..." | Re-trigger Greptile

Comment on lines +176 to +177
if bot_message.get("content") and not streaming_error:
history.append(bot_message)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 content truthiness guard affects all code paths, not just streaming errors

The bot_message.get("content") check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, streaming_error is always False, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns {"role": "assistant", "content": ""}) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:

Suggested change
if bot_message.get("content") and not streaming_error:
history.append(bot_message)
if not streaming_error and bot_message.get("content") is not None:
history.append(bot_message)

Alternatively, limiting the content check to the streaming path (where bot_message_text is always a string join) keeps the non-streaming and server-URL paths unchanged.

Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 176-177

Comment:
**`content` truthiness guard affects all code paths, not just streaming errors**

The `bot_message.get("content")` check at line 176 is a broader behavioral change than the stated fix. For the non-streaming path and the server-URL path, `streaming_error` is always `False`, so the only relevant guard is the content truthiness check. Any legitimately empty bot response (e.g. a server returns `{"role": "assistant", "content": ""}`) will now be silently skipped and never appended to history, which was not the original behavior. The intended fix only required guarding against the error-path:

```suggestion
        if not streaming_error and bot_message.get("content") is not None:
            history.append(bot_message)
```

Alternatively, limiting the content check to the streaming path (where `bot_message_text` is always a string join) keeps the non-streaming and server-URL paths unchanged.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +87 to 99
if isinstance(chunk, str) and chunk.strip().startswith("{"):
try:
error_data = json.loads(chunk)
except (json.JSONDecodeError, ValueError):
error_data = None
if isinstance(error_data, dict) and isinstance(error_data.get("error"), dict):
error_msg = error_data["error"].get("message", "Unknown error")
console.print(f"\n\n[red]Streaming error: {error_msg}[/]")
except (json.JSONDecodeError, KeyError):
console.print(f"\n\n[red]Streaming error: {chunk}[/]")
break
streaming_error = True
break

console.print("[green]" + f"{chunk}" + "[/]", end="")
bot_message_list.append(chunk)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Malformed error JSON is now silently treated as content

Previously, if a chunk matched startswith('{"error"') but was malformed JSON, the old code printed "Streaming error: <chunk>" and broke out of the loop. With the new logic, if json.loads raises JSONDecodeError, error_data is set to None, the inner isinstance check is skipped, and the malformed chunk falls through to be printed as green content and appended to bot_message_list. A server that emits a malformed error payload starting with { will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.

Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/cli/chat.py
Line: 87-99

Comment:
**Malformed error JSON is now silently treated as content**

Previously, if a chunk matched `startswith('{"error"')` but was malformed JSON, the old code printed `"Streaming error: <chunk>"` and broke out of the loop. With the new logic, if `json.loads` raises `JSONDecodeError`, `error_data` is set to `None`, the inner `isinstance` check is skipped, and the malformed chunk falls through to be printed as green content and appended to `bot_message_list`. A server that emits a malformed error payload starting with `{` will now silently appear as garbled output in the chat rather than triggering an error message. Consider logging or printing a warning when parsing fails but the chunk looked like JSON.

How can I resolve this? If you propose a fix, please make it concise.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
nemoguardrails/cli/chat.py 77.77% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Chat streaming error handling bugs

1 participant