fix: preserve reasoning_content in multi-turn tool-call sessions (MiMo compatibility) #259
Open
Noogear wants to merge 3 commits into
…sions

MiMo API returns 400 when historical assistant messages with tool_calls are missing reasoning_content. VS Code's LanguageModelThinkingPart may not be preserved in conversation history between requests.

Changes:
- commonApi: accumulate raw reasoning_content during streaming, track emitted tool call IDs, expose getters
- openaiApi: convertMessages accepts optional reasoningContentCache and lastReasoningContent; replays cached reasoning for assistant messages with tool_calls when VS Code history lacks thinking parts
- provider: manage reasoning content cache (Map) and lastReasoningContent (string) across requests; pass to convertMessages and store after stream

Fallback chain (tool_calls messages only): joinedThinking -> cache by tool-call IDs -> lastReasoningContent -> placeholder

Only activates when include_reasoning_in_request: true is set.
The previous fix only covered OpenAI mode. The Anthropic path had the identical problem: when VS Code history doesn't preserve LanguageModelThinkingPart, thinking content is lost and falls back to placeholder 'Next step.', causing MiMo 400 errors on Anthropic endpoint.

Changes:
- anthropicApi: convertMessages now accepts reasoningContentCache and lastReasoningContent, with same fallback chain as openaiApi
- provider: pass cache to Anthropic convertMessages, store reasoning content after streaming (mirrors OpenAI path)
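The streaming-side bookkeeping described in the first commit message (accumulate raw reasoning_content, track emitted tool call IDs, expose getters) could look roughly like the sketch below. The `StreamDelta` shape and the `ReasoningAccumulator` class are hypothetical names chosen for illustration, not the actual code in commonApi.

```typescript
// Hypothetical delta shape, loosely modeled on OpenAI-style streaming chunks.
// Later chunks of the same tool call usually omit the id, so it is optional.
interface StreamDelta {
  reasoning_content?: string;
  tool_calls?: { id?: string }[];
}

// Illustrative stand-in for the state the commit message says commonApi keeps.
class ReasoningAccumulator {
  private accumulatedReasoningContent = "";
  private emittedToolCallIds: string[] = [];

  // Append reasoning text and record each tool-call ID the first time it appears.
  consume(delta: StreamDelta): void {
    if (delta.reasoning_content) {
      this.accumulatedReasoningContent += delta.reasoning_content;
    }
    for (const call of delta.tool_calls ?? []) {
      if (call.id && !this.emittedToolCallIds.includes(call.id)) {
        this.emittedToolCallIds.push(call.id);
      }
    }
  }

  // Getters return the raw accumulated text and a copy of the ID list.
  get reasoningContent(): string {
    return this.accumulatedReasoningContent;
  }

  get toolCallIds(): string[] {
    return [...this.emittedToolCallIds];
  }
}
```

After the stream ends, the provider would read both getters and store the reasoning text under the collected tool-call IDs.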
Same issue. Please merge it.
Owner
Did you set
Problem
When using models such as Xiaomi MiMo that require `reasoning_content` (or thinking blocks) to be preserved in historical assistant messages containing `tool_calls`, the API returns a 400 error on subsequent requests in multi-turn agent sessions.

This happens because VS Code's `LanguageModelThinkingPart` is not guaranteed to be preserved in conversation history between requests, so reasoning content is lost when messages are reconstructed.

See: https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/passing-back-reasoning_content
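To make the failure condition concrete, here is a minimal sketch that scans an OpenAI-style history for assistant messages that carry `tool_calls` but no `reasoning_content`. The `ChatMessage` and `ToolCall` interfaces and the `findMissingReasoning` helper are simplified assumptions for illustration; the extension's real types may differ.

```typescript
// Hypothetical, simplified message shapes (assumption, not the extension's types).
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: ToolCall[];
  reasoning_content?: string;
  tool_call_id?: string;
}

// Returns the indices of assistant messages that have tool_calls but no
// non-empty reasoning_content, i.e. the messages the description says
// cause the 400 response.
function findMissingReasoning(messages: ChatMessage[]): number[] {
  const missing: number[] = [];
  messages.forEach((msg, index) => {
    const hasToolCalls =
      msg.role === "assistant" && (msg.tool_calls?.length ?? 0) > 0;
    if (hasToolCalls && !msg.reasoning_content) {
      missing.push(index);
    }
  });
  return missing;
}

// Example history: the second assistant turn lost its reasoning content.
const readFileCall = (id: string): ToolCall => ({
  id,
  type: "function",
  function: { name: "read_file", arguments: "{}" },
});

const history: ChatMessage[] = [
  { role: "user", content: "Fix the failing test." },
  { role: "assistant", content: null, tool_calls: [readFileCall("call_1")], reasoning_content: "Inspect the test first." },
  { role: "tool", content: "file contents", tool_call_id: "call_1" },
  { role: "assistant", content: null, tool_calls: [readFileCall("call_2")] },
];

console.log(findMissingReasoning(history));
```

A check like this is only a diagnostic; the fix itself restores the missing field before the request is sent.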
Solution
Implemented a three-level reasoning content caching mechanism, gated by the existing `include_reasoning_in_request` config option (opt-in only):
- `_reasoningContentCache`: keyed by sorted tool-call IDs; persists across requests within a session
- `_lastReasoningContent`: stores the most recent reasoning content as a cross-restart fallback
- `"Next step."` placeholder: used when no cached content is available

Files changed
- `src/commonApi.ts`: reasoning content accumulation (`_accumulatedReasoningContent`) and tool-call ID tracking (`_emittedToolCallIds`) during streaming
- `src/openai/openaiApi.ts`: `convertMessages()` now accepts an optional reasoning cache; restores `reasoning_content` for assistant messages with tool_calls
- `src/anthropic/anthropicApi.ts`: `convertMessages()` now accepts an optional reasoning cache; restores `thinking` content blocks for assistant messages with tool_use (was completely missing before)
- `src/provider.ts`: manages the `_reasoningContentCache` Map and `_lastReasoningContent` string; passes the cache to both OpenAI and Anthropic paths; stores reasoning content after streaming completes

Behavior
- Only activates when `include_reasoning_in_request: true` is set in model config

Testing
Verified with Xiaomi MiMo (`mimo-v2.5-pro`) using Anthropic protocol mode.
Multi-turn agent sessions with tool calls no longer produce 400 errors.
Config example:
{
  "id": "mimo-v2.5-pro",
  "apiMode": "anthropic",
  "baseUrl": "https://token-plan-sgp.xiaomimimo.com/anthropic",
  "include_reasoning_in_request": true
}
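The fallback chain from the Solution section (thinking parts from history, then the cache keyed by sorted tool-call IDs, then the last reasoning content, then the placeholder) can be sketched as below. The names `cacheKey`, `ReasoningSources`, and `resolveReasoning` are hypothetical, and joining the sorted IDs with a comma is an assumption; the PR only states that the key is built from sorted tool-call IDs.

```typescript
const PLACEHOLDER = "Next step.";

// Build an order-independent cache key from tool-call IDs.
// The comma separator is an assumption made for this sketch.
function cacheKey(toolCallIds: string[]): string {
  return [...toolCallIds].sort().join(",");
}

// Hypothetical bundle of everything the lookup can draw on.
interface ReasoningSources {
  joinedThinking: string; // thinking text recovered from VS Code history, possibly empty
  toolCallIds: string[]; // IDs of the tool calls in the assistant message
  cache?: Map<string, string>; // reasoning stored per tool-call ID set
  lastReasoningContent?: string; // most recent reasoning seen, if any
}

// Walk the chain: history -> cache -> last reasoning -> placeholder.
function resolveReasoning(src: ReasoningSources): string {
  if (src.joinedThinking) {
    return src.joinedThinking;
  }
  const cached = src.cache?.get(cacheKey(src.toolCallIds));
  if (cached) {
    return cached;
  }
  if (src.lastReasoningContent) {
    return src.lastReasoningContent;
  }
  return PLACEHOLDER;
}
```

Sorting the IDs before building the key means a lookup succeeds even when the history lists the tool calls in a different order than the stream emitted them.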