fix: preserve tool messages in history to prevent Gemini ordering errors#251
fix: preserve tool messages in history to prevent Gemini ordering errors#251elpiarthera wants to merge 3 commits into
Conversation
docsToModelMessages filtered out messages with empty content, which could drop tool-role messages after orphan filtering. This breaks Gemini's strict functionCall -> functionResponse ordering requirement. Also adds console.warn when filterOutOrphanedToolMessages drops tool messages, making it easier to diagnose missing tool results. Fixes get-convex#200
| .map((m) => m.message) | ||
| .filter((m) => !!m) | ||
| .filter((m) => !!m.content.length) | ||
| .filter((m) => m.role === "tool" || !!m.content.length) |
There was a problem hiding this comment.
why are we passing through all tools here just to filter them out elsewhere?
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Re: @ianmacartney comment on line 138 — good point. The intent was defensive: if a tool message ends up with empty content after filterOutOrphanedToolMessages partially strips it, the old filter would drop it entirely, breaking Gemini's strict functionCall → functionResponse ordering. But you're right that the better fix is upstream — filterOutOrphanedToolMessages should either keep the full tool message or drop it cleanly (not leave an empty shell). The console.warn in this PR helps diagnose when that happens. If you'd prefer, I can move the fix to filterOutOrphanedToolMessages instead — make it either keep all parts of a tool message or drop the entire message (no partial stripping). That way docsToModelMessages doesn't need the special case. Let me know which approach you prefer. Orchestrator: Zeta — VantageOS Team Dev | 2026-04-10 |
|
Good thinking. Moving it to `filterOutOrphanedToolMessages` sounds right to
me
…On Fri, Apr 10, 2026 at 1:44 AM elpiarthera ***@***.***> wrote:
*elpiarthera* left a comment (get-convex/agent#251)
<#251 (comment)>
Re: @ianmacartney <https://github.com/ianmacartney> comment on line 138 —
good point.
The intent was defensive: if a tool message ends up with empty content
after filterOutOrphanedToolMessages partially strips it, the old filter
would drop it entirely, breaking Gemini's strict functionCall →
functionResponse ordering.
But you're right that the better fix is upstream —
filterOutOrphanedToolMessages should either keep the full tool message or
drop it cleanly (not leave an empty shell). The console.warn in this PR
helps diagnose when that happens.
If you'd prefer, I can move the fix to filterOutOrphanedToolMessages
instead — make it either keep all parts of a tool message or drop the
entire message (no partial stripping). That way docsToModelMessages doesn't
need the special case. Let me know which approach you prefer.
Orchestrator: Zeta — VantageOS Team Dev | 2026-04-10
—
Reply to this email directly, view it on GitHub
<#251 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACZQWZRCB6A322NIGA4QDL4VCX5XAVCNFSM6AAAAACXS6KD6OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DEMRSGMYDEOBVG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Per maintainer feedback on get-convex#251: keep tool messages fully intact or drop them cleanly — no partial-stripped shells. Adds a two-pass filter so tool-results are matched against SURVIVING tool-calls (assistant messages that weren't dropped), preventing cascade orphans that caused Gemini's functionCall → functionResponse ordering to break. Reverts the docsToModelMessages workaround — no longer needed since filterOutOrphanedToolMessages now guarantees consistency.
|
Thanks for the quick feedback, @ianmacartney — pushed the revision. Moved the cascade handling fully into
Approval-workflow logic ( All 262 tests pass (including existing Orchestrator: Zeta — VantageOS Team Dev | 2026-04-22 |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/client/search.ts (1)
309-330:⚠️ Potential issue | 🟠 MajorDo not drop pending assistant tool-call turns.
Line 309 filters out assistant
tool-callparts without a result/approval, and Lines 316-330 then drop tool-call-only assistant messages. That regresses the existingsrc/client/index.test.tsexpectation where[response1, call2]should keepcall2while dropping only the orphaned tool response.🐛 Proposed direction
- const content = doc.message.content.filter( - (p) => - p.type !== "tool-call" || - toolResultIds.has(p.toolCallId) || - hasApprovalResponse(p.toolCallId) || - hasApprovalRequest(p.toolCallId), - ); + const content = doc.message.content;If assistant tool-call pruning is now intentional, the existing tests and behavior contract should be updated explicitly; otherwise, keep assistant tool calls intact and only filter tool-role responses against surviving assistant calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/client/search.ts` around lines 309 - 330, The current filter in src/client/search.ts removes assistant tool-call parts when they lack results/approvals, which causes assistant tool-call turns to be dropped later; change the predicate so assistant messages with type "tool-call" are always retained and only filter out tool-role (p.role === "tool") responses that lack a surviving call/result/approval. Concretely, modify the content filter (the block using toolResultIds, hasApprovalResponse, hasApprovalRequest) to apply the toolResultIds/approval checks only when p.role === "tool" (or equivalently keep all p.type === "tool-call" where p.role === "assistant"), leaving survivingToolCallIds and intermediate construction unchanged so assistant tool-call turns like call2 are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/client/search.ts`:
- Around line 345-351: The current filter on doc.message.content keeps every
non-"tool-result" part, which lets orphaned "tool-approval-response" entries
pass; update the filter so that when c.type === "tool-approval-response" you
only keep it if its c.toolCallId is present in survivingToolCallIds (similar to
the "tool-result" branch), otherwise drop it; preserve the existing behavior for
truly non-tool parts by still returning true for other types. Use the existing
identifiers doc.message.content, survivingToolCallIds, and the c.type checks to
locate and change the logic.
---
Outside diff comments:
In `@src/client/search.ts`:
- Around line 309-330: The current filter in src/client/search.ts removes
assistant tool-call parts when they lack results/approvals, which causes
assistant tool-call turns to be dropped later; change the predicate so assistant
messages with type "tool-call" are always retained and only filter out tool-role
(p.role === "tool") responses that lack a surviving call/result/approval.
Concretely, modify the content filter (the block using toolResultIds,
hasApprovalResponse, hasApprovalRequest) to apply the toolResultIds/approval
checks only when p.role === "tool" (or equivalently keep all p.type ===
"tool-call" where p.role === "assistant"), leaving survivingToolCallIds and
intermediate construction unchanged so assistant tool-call turns like call2 are
preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| const content = doc.message.content.filter((c) => { | ||
| // tool-result parts have toolCallId | ||
| if (c.type === "tool-result") { | ||
| return toolCallIds.has(c.toolCallId); | ||
| return survivingToolCallIds.has(c.toolCallId); | ||
| } | ||
| // tool-approval-response parts don't have toolCallId, so include them | ||
| return true; |
There was a problem hiding this comment.
Filter approval responses against surviving approval requests.
Line 351 keeps every non-tool-result part, so orphaned tool-approval-response parts survive even when their approval request/tool call was dropped or never fetched.
🐛 Proposed fix
const survivingToolCallIds = new Set<string>();
+ const survivingApprovalIds = new Set<string>();
const intermediate: MessageDoc[] = [];
@@
for (const p of content) {
if (p.type === "tool-call") {
survivingToolCallIds.add(p.toolCallId);
+ } else if (p.type === "tool-approval-request") {
+ survivingApprovalIds.add(p.approvalId);
}
}
@@
if (c.type === "tool-result") {
return survivingToolCallIds.has(c.toolCallId);
}
- // tool-approval-response parts don't have toolCallId, so include them
+ if (c.type === "tool-approval-response") {
+ return survivingApprovalIds.has(c.approvalId);
+ }
return true;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/client/search.ts` around lines 345 - 351, The current filter on
doc.message.content keeps every non-"tool-result" part, which lets orphaned
"tool-approval-response" entries pass; update the filter so that when c.type ===
"tool-approval-response" you only keep it if its c.toolCallId is present in
survivingToolCallIds (similar to the "tool-result" branch), otherwise drop it;
preserve the existing behavior for truly non-tool parts by still returning true
for other types. Use the existing identifiers doc.message.content,
survivingToolCallIds, and the c.type checks to locate and change the logic.
|
Thanks for the refactor! I’m trying to understand whether the corruption was:
Without the concrete message sequence, it’s hard to tell whether commit 2 fixes the real failure or just a plausible orphan-filtering case. Could you add or share a minimal reproduction using the original bad history? |
|
@magicseth — fair pushback, let me be transparent. I don't have the original reporter's failing thread either — ianmacartney asked the same question in January (#200 (comment): "can you dump the messages in the thread earlier than that") and it was never answered. So commit 2 was reasoned from the failure mode described in the issue, not from a captured production trace. Mapping the cascade fix against your enumeration:
So the honest claim is narrower than "fixes #200": this PR closes one specific class of orphan cascade (case 2). Whether that's the actual root cause of the reporter's intermittent 400 is genuinely unknown without their thread dump. I can put together a synthetic minimal repro that constructs the structural state this fix prevents (assistant message with all-orphan tool-calls + a subsequent tool message referencing one of them) and asserts the post-filter output is Gemini-valid — happy to add it to this PR if useful. Let me know. Orchestrator: Zeta — VantageOS Team Dev | 2026-05-18 |
Constructs the structural state described in get-convex#200 (assistant tool-call with no matching tool-result, followed by a tool message referencing it) and asserts the three-pass filter drops both — closing the orphan cascade that previously reached docsToModelMessages. Adds four cases: - cascade drop: assistant with orphan tool-call + stray tool message both dropped; output respects Gemini invariant. - valid pair preserved: assistant tool-call + matching tool-result both survive untouched. - mixed valid+orphan: assistant with one valid and one orphan tool-call keeps the valid one; tool message preserves the valid tool-result only. - Gemini invariant property check: every surviving tool-result has a corresponding surviving assistant tool-call in the output. Addresses @magicseth's request for a concrete repro on get-convex#251.
|
Orchestrator: Zeta — VantageOS Team Dev | 2026-05-18 |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/client/search.test.ts (1)
529-644: ⚡ Quick winConsider extracting duplicate invariant validation logic.
The Gemini invariant validation logic (lines 608-637) is nearly identical to the validation in the cascade test (lines 499-526). Extracting this into a helper function would improve maintainability and reduce duplication.
♻️ Proposed refactor to extract helper
Add a helper function in the test file:
+// Helper to validate Gemini invariant: every tool-result has a corresponding assistant tool-call +function validateGeminiInvariant(result: MessageDoc[]) { + const survivingCallIds = new Set<string>(); + for (const d of result) { + if ( + d.message?.role === "assistant" && + Array.isArray(d.message.content) + ) { + for (const p of d.message.content) { + if (p.type === "tool-call") { + survivingCallIds.add( + (p as { toolCallId: string }).toolCallId, + ); + } + } + } + } + + for (const d of result) { + if (d.message?.role === "tool" && Array.isArray(d.message.content)) { + for (const c of d.message.content) { + if (c.type === "tool-result") { + expect( + survivingCallIds.has( + (c as { toolCallId: string }).toolCallId, + ), + ).toBe(true); + } + } + } + } +}Then replace both validation blocks with:
- const survivingAssistantCallIds = new Set<string>(); - for (const d of result) { - // ... validation logic ... - } + validateGeminiInvariant(result);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/client/search.test.ts` around lines 529 - 644, The test duplicates the Gemini invariant validation logic; extract it into a shared helper (e.g. assertEveryToolResultHasSurvivingAssistantCall or validateGeminiInvariant) that takes the filtered messages array and performs: 1) collect surviving assistant tool-call ids by scanning message.role === "assistant" and items with type === "tool-call", 2) assert every tool message item with type === "tool-result" references a collected id, and 3) optionally assert presence/absence of specific _id's as needed; then replace the inline loops in both the cascade test and the current test with calls to this helper to remove the duplicate blocks and improve readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/client/search.test.ts`:
- Around line 529-644: The test duplicates the Gemini invariant validation
logic; extract it into a shared helper (e.g.
assertEveryToolResultHasSurvivingAssistantCall or validateGeminiInvariant) that
takes the filtered messages array and performs: 1) collect surviving assistant
tool-call ids by scanning message.role === "assistant" and items with type ===
"tool-call", 2) assert every tool message item with type === "tool-result"
references a collected id, and 3) optionally assert presence/absence of specific
_id's as needed; then replace the inline loops in both the cascade test and the
current test with calls to this helper to remove the duplicate blocks and
improve readability.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: get-convex/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 07a2eaf7-7990-498e-9c96-d25a086397df
📒 Files selected for processing (1)
src/client/search.test.ts
commit: |
Summary
docsToModelMessagesto preserve tool-role messages even when content appears empty — prevents Gemini's strictfunctionCall → functionResponseordering from breakingconsole.warninfilterOutOrphanedToolMessageswhen tool messages are dropped, logging the message ID, its toolCallIds, and available toolCallIds for diagnosticsRoot cause
When tool messages have empty content after orphan filtering,
docsToModelMessagesfiltered them out via.filter(m => !!m.content.length). This left gaps in the conversation history (missingfunctionResponseafterfunctionCall), which Gemini rejects with "Please ensure that function call turn comes immediately after a user turn or after a function response turn."Test plan
npm run buildpassesnpm run lintpasses (0 errors)npm testpasses (262/262)Fixes #200
Orchestrator: Zeta — VantageOS Team Dev | 2026-04-09