
Commit b9625bf

lorenzbaraldi authored and GWeale committed
fix: Litellm preserve streamed reasoning deltas in LiteLLM adapter
Merge #4952

**Please ensure you have read the [contribution guide](https://github.com/google/adk-python/blob/main/CONTRIBUTING.md) before creating a pull request.**

### Link to Issue or Description of Change

**1. Link to an existing issue (if applicable):**

**2. Or, if no issue exists, describe the change:**

Fixes: #5645

**Problem:**

In `LiteLlm` message conversion, reasoning parts were combined with newline injection:

`reasoning_content = _NEW_LINE.join(text for text in reasoning_texts if text)`

For providers that stream reasoning in delta fragments (for example, vLLM-style reasoning chunks), this mutates the original stream by inserting extra separators. The reconstructed reasoning can differ compared to provider output.

**Solution:**

Preserve reasoning text exactly as streamed by concatenating fragments without adding separators:

`reasoning_content = "".join(text for text in reasoning_texts if text)`

This avoids corruption of chunked reasoning while still preserving explicit newlines already present in fragments.

Also added targeted regression tests to lock behavior:

- `test_content_to_message_param_preserves_chunked_reasoning_deltas`
- `test_content_to_message_param_preserves_reasoning_newlines`

### Testing Plan

**Unit Tests:**

- [x] I have added or updated unit tests for my change.
- [x] All unit tests pass locally.

Summary of local `pytest` runs:

1. `python -m pytest tests/unittests/models/test_litellm.py -k "content_to_message_param_assistant_thought_and_content_message or preserves_chunked_reasoning_deltas or preserves_reasoning_newlines"`
   - Result: `3 passed, 244 deselected`
2. `python -m pytest tests/unittests/models/test_litellm.py -k "preserves_chunked_reasoning_deltas or preserves_reasoning_newlines"`
   - Result: `2 passed, 245 deselected`

**Manual End-to-End (E2E) Tests:**

### Checklist

- [x] I have read the [CONTRIBUTING.md](https://github.com/google/adk-python/blob/main/CONTRIBUTING.md) document.
- [x] I have performed a self-review of my own code.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [x] I have added tests that prove my fix is effective or that my feature works.
- [x] New and existing unit tests pass locally with my changes.
- [ ] I have manually tested my changes end-to-end.
- [ ] Any dependent changes have been merged and published in downstream modules.

### Additional context

Scope is intentionally minimal and low risk:

- 1-line behavior change in reasoning-content reconstruction.
- 2 regression tests added.
- Anthropic `thinking_blocks` path is unchanged.

Co-authored-by: George Weale <gweale@google.com>
COPYBARA_INTEGRATE_REVIEW=#4952 from lorenzbaraldi:fix/reasoning-accumulation 5a09d55
PiperOrigin-RevId: 938260836
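The effect of the separator change described in the commit message can be reproduced in isolation. The sketch below is illustrative only: `join_with_newlines` and `join_verbatim` are hypothetical helpers rather than functions from the adapter, and `_NEW_LINE` is assumed to be `"\n"`.

```python
# Hypothetical stand-ins for the old and new join strategies.
# These are NOT the adapter's functions; they only mirror the one-line change.
_NEW_LINE = "\n"  # assumption: value of the adapter's constant


def join_with_newlines(fragments: list[str]) -> str:
  """Old behavior: insert a newline between non-empty fragments."""
  return _NEW_LINE.join(text for text in fragments if text)


def join_verbatim(fragments: list[str]) -> str:
  """New behavior: concatenate non-empty fragments unchanged."""
  return "".join(text for text in fragments if text)


# A word streamed as two deltas is split by the injected newline.
streamed = ["Hel", "lo"]
print(repr(join_with_newlines(streamed)))  # 'Hel\nlo'
print(repr(join_verbatim(streamed)))  # 'Hello'

# A fragment that already ends in a newline gains a second one.
with_newline = ["line 1\n", "line 2"]
print(repr(join_with_newlines(with_newline)))  # 'line 1\n\nline 2'
print(repr(join_verbatim(with_newline)))  # 'line 1\nline 2'
```

With the verbatim join, the `if text` filter no longer affects the result, since empty fragments contribute nothing to a concatenation.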
1 parent 0a9ce0f commit b9625bf

2 files changed

Lines changed: 35 additions & 1 deletion


src/google/adk/models/lite_llm.py

Lines changed: 3 additions & 1 deletion
@@ -1030,7 +1030,9 @@ async def _content_to_message_param(
       ):
         reasoning_texts.append(_decode_inline_text_data(part.inline_data.data))
 
-    reasoning_content = _NEW_LINE.join(text for text in reasoning_texts if text)
+    # Preserve reasoning deltas exactly as received. Injecting separators
+    # between fragments can corrupt provider-streamed thinking text.
+    reasoning_content = "".join(text for text in reasoning_texts if text)
     return ChatCompletionAssistantMessage(
         role=role,
         content=final_content,

tests/unittests/models/test_litellm.py

Lines changed: 32 additions & 0 deletions
@@ -2217,6 +2217,38 @@ async def test_content_to_message_param_assistant_thought_and_content_message():
   assert message["reasoning_content"] == "internal reasoning"
 
 
+@pytest.mark.asyncio
+async def test_content_to_message_param_preserves_chunked_reasoning_deltas():
+  thought_part_1 = types.Part.from_text(text="Hel")
+  thought_part_1.thought = True
+  thought_part_2 = types.Part.from_text(text="lo")
+  thought_part_2.thought = True
+  content = types.Content(
+      role="assistant", parts=[thought_part_1, thought_part_2]
+  )
+
+  message = await _content_to_message_param(content)
+
+  assert message["role"] == "assistant"
+  assert message["content"] is None
+  assert message["reasoning_content"] == "Hello"
+
+
+@pytest.mark.asyncio
+async def test_content_to_message_param_preserves_reasoning_newlines():
+  thought_part_1 = types.Part.from_text(text="line 1\n")
+  thought_part_1.thought = True
+  thought_part_2 = types.Part.from_text(text="line 2")
+  thought_part_2.thought = True
+  content = types.Content(
+      role="assistant", parts=[thought_part_1, thought_part_2]
+  )
+
+  message = await _content_to_message_param(content)
+
+  assert message["reasoning_content"] == "line 1\nline 2"
+
+
 @pytest.mark.asyncio
 async def test_content_to_message_param_function_call():
   content = types.Content(
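The two regression tests pin specific fragment pairs. A broader way to think about the property they protect is that joining must restore the original text no matter where the provider cuts it. The following is a minimal sketch of that check in plain Python; `split_points` and `roundtrips` are hypothetical names introduced here and are not part of the test suite or the adapter.

```python
from typing import Iterator


def split_points(text: str) -> Iterator[list[str]]:
  """Yield every way to cut `text` into two non-empty fragments."""
  for i in range(1, len(text)):
    yield [text[:i], text[i:]]


def roundtrips(text: str, sep: str) -> bool:
  """Return True if joining every two-way split with `sep` restores `text`."""
  return all(sep.join(chunks) == text for chunks in split_points(text))


sample = "step 1\nstep 2"
print(roundtrips(sample, ""))  # True: concatenation is chunking-independent
print(roundtrips(sample, "\n"))  # False: an injected newline alters the text
```

Only the empty separator satisfies the round-trip for arbitrary cut positions, which is why the fix does not attempt to guess where separators belong.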
