fix(inference): replay deepseek reasoning_content on tool-call turns (Sentry TAURI-RUST-4KB)#2876

Merged
M3gA-Mind merged 6 commits into
tinyhumansai:mainfrom
M3gA-Mind:fix/sentry-tauri-rust-4kb-deepseek-reasoning-content
May 28, 2026

@M3gA-Mind

Summary

Rebased and fixed version of #2817 by @CodeGhost21.

What changed from #2817:

  1. Added one commit to fix the two failing tests (parse_native_response_captures_reasoning_content and parse_native_response_blank_reasoning_is_none). The tests correctly expected reasoning_content to be trimmed and whitespace-only values to become None, but the implementation cloned the raw, untrimmed string. Fixed with the same .as_deref().map(str::trim).filter(|s| !s.is_empty()).map(str::to_owned) pattern used in the chat_with_tools and streaming paths.
  2. Merged current upstream/main to resolve the stale base (includes fix(inference): preserve reasoning_content in multi-turn thinking model conversations #2818, fix(observability): classify list_models 404 as ProviderUserState (Sentry TAURI-RUST-YJ) #2873, fix(cron): accept bare cron-expression string in Schedule deserializer (Sentry CORE-RUST-FY) #2874, etc.).
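
The normalisation described in item 1 can be sketched as a small standalone helper. This is a minimal illustration, not the code from the PR: the function name `normalize_reasoning` and its signature are assumptions, and only the combinator chain is taken from the description above.

```rust
/// Hypothetical helper (name and signature are assumptions for illustration).
/// Trims a provider-supplied reasoning string and treats blank values as absent.
fn normalize_reasoning(raw: &Option<String>) -> Option<String> {
    raw.as_deref()                 // Option<String> -> Option<&str>
        .map(str::trim)            // strip leading and trailing whitespace
        .filter(|s| !s.is_empty()) // whitespace-only input collapses to None
        .map(str::to_owned)        // back to an owned String
}
```

Under this sketch, a padded value such as `"  think step  "` comes back as `Some("think step")`, while a whitespace-only value or `None` both yield `None`, which is the behaviour the two tests expect.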

Note on the supersede claim: the comment on #2817 says it was superseded by #2818, but that is not accurate. PR #2818 (already on main) fixes the main session-turn path via extra_metadata in turn.rs. PR #2817 fixes two separate paths (the tool loop and the subagent runner), together with the provider-side changes they rely on:

  • tool_loop.rs — passes reasoning_content to build_native_assistant_history
  • subagent_runner/ops.rs — also passes reasoning_content to build_native_assistant_history
  • chat_with_tools — returns actual reasoning_content instead of None
  • convert_messages_for_native — lifts reasoning_content from JSON content (with fallback to extra_metadata)

These are complementary to #2818, not redundant with it. On current main, multi-turn reasoning model tool calls via the tool loop still fail because build_native_assistant_history doesn't embed reasoning_content.
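
To make the "JSON content first, extra_metadata as fallback" lookup in convert_messages_for_native concrete, here is a rough sketch. Everything in it is hypothetical: the `StoredAssistantTurn` type, its field names, the `"reasoning_content"` metadata key, and the choice to skip blank values are assumptions made for illustration, not the actual types or behaviour in compatible.rs.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one stored assistant history entry.
struct StoredAssistantTurn {
    /// Reasoning embedded in the JSON content (the tool-loop path).
    content_reasoning: Option<String>,
    /// Side-channel metadata (the session-turn path from #2818).
    extra_metadata: HashMap<String, String>,
}

/// Returns a trimmed copy of `s`, or None when it is blank.
fn non_blank(s: &str) -> Option<String> {
    let trimmed = s.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_owned())
    }
}

/// Prefer reasoning stored in the content; otherwise fall back to metadata.
/// Assumption: a blank value in the content also falls through to metadata.
fn lift_reasoning(turn: &StoredAssistantTurn) -> Option<String> {
    turn.content_reasoning
        .as_deref()
        .and_then(non_blank)
        .or_else(|| {
            turn.extra_metadata
                .get("reasoning_content")
                .and_then(|s| non_blank(s))
        })
}
```

With this ordering, history written by the tool loop and history written by the session-turn path can both be replayed by the same conversion step, which is why the two PRs complement each other.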

Closes #2817.


Original PR description from @CodeGhost21:

  • DeepSeek's thinking mode rejects multi-turn tool calls because we never replayed the model's reasoning_content on the follow-up request.
  • Round-trips reasoning_content for tool-call assistant turns through all four layers of the OpenAI-compatible inference path.
  • Gated by skip_serializing_if = Option::is_none so non-reasoning providers see zero change on the wire.
  • Fixes Sentry TAURI-RUST-4KB (issue 5236) — 31 events since v0.56.0, all multi-turn deepseek-reasoner tool calls.
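
The "zero change on the wire" claim can be illustrated with a dependency-free sketch. The PR relies on serde's `skip_serializing_if = "Option::is_none"` field attribute; the hand-rolled serializer below only imitates that behaviour so the example compiles with the standard library alone. `WireMessage`, `escape`, and `to_wire_json` are hypothetical names, and the real `NativeMessage` has more fields.

```rust
/// Hypothetical, simplified wire message (not the real `NativeMessage`).
struct WireMessage {
    role: String,
    content: String,
    reasoning_content: Option<String>,
}

/// Minimal escaping for backslashes and double quotes only.
/// Not a complete JSON string escaper; sufficient for this illustration.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Emits the `reasoning_content` key only when a value is present,
/// imitating serde's `skip_serializing_if = "Option::is_none"`.
fn to_wire_json(msg: &WireMessage) -> String {
    let mut fields = vec![
        format!("\"role\":\"{}\"", escape(&msg.role)),
        format!("\"content\":\"{}\"", escape(&msg.content)),
    ];
    if let Some(reasoning) = &msg.reasoning_content {
        fields.push(format!("\"reasoning_content\":\"{}\"", escape(reasoning)));
    }
    format!("{{{}}}", fields.join(","))
}
```

A message with `reasoning_content: None` serializes without the key at all, so a provider that never returns reasoning receives the same request body as before the change.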

Test plan

  • cargo test --lib inference::provider::compatible::tests::parse_native_response — 7 passed, 0 failed (includes the 2 previously-failing tests)
  • cargo test --lib "reasoning" (26 tests), cargo test --lib "agent::" (890 tests), cargo test --lib "inference::provider::" (316 tests) — all pass
  • cargo fmt --check clean
  • Diff coverage ≥ 80%

CodeGhost21 and others added 6 commits May 28, 2026 10:40
…(Sentry TAURI-RUST-4KB)

Resolves Sentry issue 5236 (TAURI-RUST-4KB):
https://sentry.tinyhumans.ai/organizations/tinyhumans/issues/5236/

DeepSeek's thinking mode returns `reasoning_content` alongside `tool_calls`
and requires that reasoning to be replayed on the follow-up request. Our
OpenAI-compatible provider dropped it: `ChatResponse`, the assistant history
JSON, and the `NativeMessage` wire type had no carrier for `reasoning_content`,
so the next request omitted it and DeepSeek returned:

  400 Bad Request: The `reasoning_content` in the thinking mode must be
  passed back to the API.

The agent loop (`run_chat_task`) then failed every multi-turn tool call
against deepseek-reasoner (31 events since v0.56.0).

Fix: round-trip `reasoning_content` for tool-call assistant turns across all
four layers —
  - `ChatResponse.reasoning_content` (captured in `parse_native_response`
    and `chat_with_tools`, trimmed; empty -> None)
  - `build_native_assistant_history` writes it into the assistant history
    JSON (omitted when empty)
  - `convert_messages_for_native` lifts it back onto the wire message
  - `NativeMessage.reasoning_content` serializes only when present

Because the field is `skip_serializing_if = Option::is_none` and only
populated for reasoning models, non-reasoning providers see zero change on
the wire.

Tests: provider capture (`parse_native_response_captures_reasoning_content`,
blank -> None), wire round-trip (`convert_preserves/omits_reasoning_content`),
and history-builder round-trip in `parse_tests`.
…Response initializers

The new `ChatResponse.reasoning_content` field was added to every `src/`
initializer but the `tests/calendar_grounding_e2e.rs` integration test was
missed, so the test build failed to compile (error[E0063]: missing field
`reasoning_content`). That broke the Rust Core Tests + Quality, Rust Core
Coverage, and Linux Rust integration-suite checks on this PR. Set the field
to None at both mock-provider initializers; `cargo test --no-run` now
compiles all test targets cleanly.
…ontent

Resolved conflicts in:
- inference/provider/traits.rs: doc comment wording (took main's)
- inference/provider/compatible_types.rs: doc comment wording (took main's)
- inference/provider/compatible.rs: combined both storage approaches —
  prefer JSON-content (tool_loop path) or fall back to extra_metadata
  (session-turn path), so both replay paths work correctly
- agent/dispatcher_tests.rs: indentation (took main's)
- agent/harness/session/turn_tests.rs: indentation (took main's)
- agent/tests.rs: indentation (took main's)

Both the PR and main added parse_native_response_captures_reasoning_content,
each testing a different code path. Rename the second one (non-streaming API
response path) to avoid the duplicate-symbol compile error.

The two tests added in this PR (parse_native_response_captures_reasoning_content
and parse_native_response_blank_reasoning_is_none) expected the field to be
normalised: trimmed and empty-after-trim → None. The implementation was cloning
the raw value verbatim, so whitespace-padded strings weren't trimmed and
whitespace-only strings weren't collapsed to None.

Apply the same `.as_deref().map(str::trim).filter(|s| !s.is_empty()).map(str::to_owned)`
pattern already used in the chat_with_tools and streaming paths.
@M3gA-Mind M3gA-Mind requested a review from a team May 28, 2026 22:21
@coderabbitai
coderabbitai Bot commented May 28, 2026

Warning

Review limit reached

@M3gA-Mind, we couldn't start this review because you've reached your PR review rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 7fbcbe8 and 6de6c1a.

📒 Files selected for processing (6)
  • src/openhuman/agent/harness/parse.rs
  • src/openhuman/agent/harness/parse_tests.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/harness/tool_loop.rs
  • src/openhuman/inference/provider/compatible.rs
  • src/openhuman/inference/provider/compatible_tests.rs

@M3gA-Mind M3gA-Mind merged commit fa50ceb into tinyhumansai:main May 28, 2026
28 checks passed