fix: preserve tool messages in history to prevent Gemini ordering errors by elpiarthera · Pull Request #251 · get-convex/agent

elpiarthera · 2026-04-09T19:21:59Z

Summary

Fixed docsToModelMessages to preserve tool-role messages even when content appears empty — prevents Gemini's strict functionCall → functionResponse ordering from breaking
Added console.warn in filterOutOrphanedToolMessages when tool messages are dropped, logging the message ID, its toolCallIds, and available toolCallIds for diagnostics

Root cause

When tool messages have empty content after orphan filtering, docsToModelMessages filtered them out via .filter(m => !!m.content.length). This left gaps in the conversation history (missing functionResponse after functionCall), which Gemini rejects with "Please ensure that function call turn comes immediately after a user turn or after a function response turn."

Test plan

npm run build passes
npm run lint passes (0 errors)
npm test passes (262/262)

Fixes #200

Orchestrator: Zeta — VantageOS Team Dev | 2026-04-09

docsToModelMessages filtered out messages with empty content, which could drop tool-role messages after orphan filtering. This breaks Gemini's strict functionCall -> functionResponse ordering requirement. Also adds console.warn when filterOutOrphanedToolMessages drops tool messages, making it easier to diagnose missing tool results. Fixes get-convex#200

ianmacartney · 2026-04-09T19:23:27Z

    .map((m) => m.message)
    .filter((m) => !!m)
-    .filter((m) => !!m.content.length)
+    .filter((m) => m.role === "tool" || !!m.content.length)


why are we passing through all tools here just to filter them out elsewhere?

coderabbitai · 2026-04-09T19:25:41Z

📝 Walkthrough

Walkthrough

The filterOutOrphanedToolMessages function is refactored from a single-pass to a three-pass filtering approach: (1) collect all tool-call, tool-result, and approval identifiers; (2) filter assistant messages into an intermediate list while computing surviving tool-call ids; (3) filter tool-role messages to retain only tool-result parts whose toolCallId exists in the surviving set. It now logs a console.warn when a non-empty tool message is dropped. New Vitest tests were added to reproduce the cascade regression and assert invariants and regression cases.

Changes

Cohort / File(s)	Summary
Tool Message Filtering & Tests `src/client/search.ts`, `src/client/search.test.ts`	Refactored `filterOutOrphanedToolMessages` to a three-pass algorithm that tracks surviving tool-call ids and filters tool messages accordingly; added warnings when non-empty tool messages are dropped. Added four tests covering cascade regression, invariant property, regression pair preservation, and mixed-case partial-resolution behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

get-convex/agent#230: Also modifies filterOutOrphanedToolMessages in src/client/search.ts to change tool-call/tool-result/approval processing.

Suggested reviewers

ianmacartney

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: fixing tool message preservation in history to prevent Gemini ordering errors, which is the central fix implemented in the PR.
Description check	✅ Passed	The description clearly explains the fix (preserving tool messages in history), root cause (empty content filtering), and provides test plan verification, all directly related to the changeset.
Linked Issues check	✅ Passed	The PR addresses the core issue `#200` requirements: prevents tool-result message drops via three-pass filtering, adds console.warn diagnostics, and ensures the Gemini-required message sequence by preserving tool messages rather than silently dropping them.
Out of Scope Changes check	✅ Passed	All changes are scoped to fixing issue `#200`: refactored filterOutOrphanedToolMessages, added diagnostic logging, and included comprehensive test coverage for the cascade behavior—no unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

elpiarthera · 2026-04-10T08:43:50Z

Re: @ianmacartney comment on line 138 — good point.

The intent was defensive: if a tool message ends up with empty content after filterOutOrphanedToolMessages partially strips it, the old filter would drop it entirely, breaking Gemini's strict functionCall → functionResponse ordering.

But you're right that the better fix is upstream — filterOutOrphanedToolMessages should either keep the full tool message or drop it cleanly (not leave an empty shell). The console.warn in this PR helps diagnose when that happens.

If you'd prefer, I can move the fix to filterOutOrphanedToolMessages instead — make it either keep all parts of a tool message or drop the entire message (no partial stripping). That way docsToModelMessages doesn't need the special case. Let me know which approach you prefer.

Orchestrator: Zeta — VantageOS Team Dev | 2026-04-10

ianmacartney · 2026-04-10T10:03:53Z

Good thinking. Moving it to `filterOutOrphanedToolMessages` sounds right to me

…

On Fri, Apr 10, 2026 at 1:44 AM elpiarthera ***@***.***> wrote: *elpiarthera* left a comment (get-convex/agent#251) <#251 (comment)> Re: @ianmacartney <https://github.com/ianmacartney> comment on line 138 — good point. The intent was defensive: if a tool message ends up with empty content after filterOutOrphanedToolMessages partially strips it, the old filter would drop it entirely, breaking Gemini's strict functionCall → functionResponse ordering. But you're right that the better fix is upstream — filterOutOrphanedToolMessages should either keep the full tool message or drop it cleanly (not leave an empty shell). The console.warn in this PR helps diagnose when that happens. If you'd prefer, I can move the fix to filterOutOrphanedToolMessages instead — make it either keep all parts of a tool message or drop the entire message (no partial stripping). That way docsToModelMessages doesn't need the special case. Let me know which approach you prefer. Orchestrator: Zeta — VantageOS Team Dev | 2026-04-10 — Reply to this email directly, view it on GitHub <#251 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACZQWZRCB6A322NIGA4QDL4VCX5XAVCNFSM6AAAAACXS6KD6OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DEMRSGMYDEOBVG4> . You are receiving this because you were mentioned.Message ID: ***@***.***>

Per maintainer feedback on get-convex#251: keep tool messages fully intact or drop them cleanly — no partial-stripped shells. Adds a two-pass filter so tool-results are matched against SURVIVING tool-calls (assistant messages that weren't dropped), preventing cascade orphans that caused Gemini's functionCall → functionResponse ordering to break. Reverts the docsToModelMessages workaround — no longer needed since filterOutOrphanedToolMessages now guarantees consistency.

elpiarthera · 2026-04-23T13:31:33Z

Thanks for the quick feedback, @ianmacartney — pushed the revision.

Moved the cascade handling fully into filterOutOrphanedToolMessages (no more partial stripping leaking downstream), and reverted the docsToModelMessages workaround. The filter now runs in two relevant passes after ID collection:

Filter assistant messages first — any tool-call whose toolCallId has no matching tool-result and no approval request/response is stripped. If that leaves the assistant message empty, the whole message is dropped (existing behavior).
Track survivingToolCallIds — only the tool-calls in kept assistant messages make it into this set. Tool-calls that were stripped OR lived in a dropped message do NOT survive.
Filter tool messages against survivingToolCallIds (instead of the raw toolCallIds from pass 1) — this closes the cascade: when an assistant message is dropped, its orphaned tool-results in later messages are now correctly filtered out too. Messages become either fully kept or fully dropped — no empty shells.

Approval-workflow logic (approvalRequestsByToolCallId, approvalResponseIds, hasApprovalResponse, hasApprovalRequest) is untouched. The console.warn now logs "surviving toolCallIds" to reflect the new semantic.

All 262 tests pass (including existing filterOutOrphanedToolMessages coverage in src/client/search.test.ts). Diff is tight: 2 files, +32/-7.

Orchestrator: Zeta — VantageOS Team Dev | 2026-04-22

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/client/search.ts (1)
309-330: ⚠️ Potential issue | 🟠 Major

Do not drop pending assistant tool-call turns.

Line 309 filters out assistant tool-call parts without a result/approval, and Lines 316-330 then drop tool-call-only assistant messages. That regresses the existing src/client/index.test.ts expectation where [response1, call2] should keep call2 while dropping only the orphaned tool response.
🐛 Proposed direction
-      const content = doc.message.content.filter(
-        (p) =>
-          p.type !== "tool-call" ||
-          toolResultIds.has(p.toolCallId) ||
-          hasApprovalResponse(p.toolCallId) ||
-          hasApprovalRequest(p.toolCallId),
-      );
+      const content = doc.message.content;
If assistant tool-call pruning is now intentional, the existing tests and behavior contract should be updated explicitly; otherwise, keep assistant tool calls intact and only filter tool-role responses against surviving assistant calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/client/search.ts` around lines 309 - 330, The current filter in
src/client/search.ts removes assistant tool-call parts when they lack
results/approvals, which causes assistant tool-call turns to be dropped later;
change the predicate so assistant messages with type "tool-call" are always
retained and only filter out tool-role (p.role === "tool") responses that lack a
surviving call/result/approval. Concretely, modify the content filter (the block
using toolResultIds, hasApprovalResponse, hasApprovalRequest) to apply the
toolResultIds/approval checks only when p.role === "tool" (or equivalently keep
all p.type === "tool-call" where p.role === "assistant"), leaving
survivingToolCallIds and intermediate construction unchanged so assistant
tool-call turns like call2 are preserved.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/client/search.ts`:
- Around line 345-351: The current filter on doc.message.content keeps every
non-"tool-result" part, which lets orphaned "tool-approval-response" entries
pass; update the filter so that when c.type === "tool-approval-response" you
only keep it if its c.toolCallId is present in survivingToolCallIds (similar to
the "tool-result" branch), otherwise drop it; preserve the existing behavior for
truly non-tool parts by still returning true for other types. Use the existing
identifiers doc.message.content, survivingToolCallIds, and the c.type checks to
locate and change the logic.

---

Outside diff comments:
In `@src/client/search.ts`:
- Around line 309-330: The current filter in src/client/search.ts removes
assistant tool-call parts when they lack results/approvals, which causes
assistant tool-call turns to be dropped later; change the predicate so assistant
messages with type "tool-call" are always retained and only filter out tool-role
(p.role === "tool") responses that lack a surviving call/result/approval.
Concretely, modify the content filter (the block using toolResultIds,
hasApprovalResponse, hasApprovalRequest) to apply the toolResultIds/approval
checks only when p.role === "tool" (or equivalently keep all p.type ===
"tool-call" where p.role === "assistant"), leaving survivingToolCallIds and
intermediate construction unchanged so assistant tool-call turns like call2 are
preserved.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78add62d-6c7d-456f-becc-0a1c8797e849

📥 Commits

Reviewing files that changed from the base of the PR and between 06c6795 and 0648e28.

📒 Files selected for processing (1)

src/client/search.ts

coderabbitai · 2026-04-23T13:35:57Z

      const content = doc.message.content.filter((c) => {
        // tool-result parts have toolCallId
        if (c.type === "tool-result") {
-          return toolCallIds.has(c.toolCallId);
+          return survivingToolCallIds.has(c.toolCallId);
        }
        // tool-approval-response parts don't have toolCallId, so include them
        return true;


⚠️ Potential issue | 🟠 Major

Filter approval responses against surviving approval requests.

Line 351 keeps every non-tool-result part, so orphaned tool-approval-response parts survive even when their approval request/tool call was dropped or never fetched.

🐛 Proposed fix

const survivingToolCallIds = new Set<string>(); + const survivingApprovalIds = new Set<string>(); const intermediate: MessageDoc[] = []; @@ for (const p of content) { if (p.type === "tool-call") { survivingToolCallIds.add(p.toolCallId); + } else if (p.type === "tool-approval-request") { + survivingApprovalIds.add(p.approvalId); } } @@ if (c.type === "tool-result") { return survivingToolCallIds.has(c.toolCallId); } - // tool-approval-response parts don't have toolCallId, so include them + if (c.type === "tool-approval-response") { + return survivingApprovalIds.has(c.approvalId); + } return true;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/client/search.ts` around lines 345 - 351, The current filter on doc.message.content keeps every non-"tool-result" part, which lets orphaned "tool-approval-response" entries pass; update the filter so that when c.type === "tool-approval-response" you only keep it if its c.toolCallId is present in survivingToolCallIds (similar to the "tool-result" branch), otherwise drop it; preserve the existing behavior for truly non-tool parts by still returning true for other types. Use the existing identifiers doc.message.content, survivingToolCallIds, and the c.type checks to locate and change the logic.

magicseth · 2026-04-27T23:56:56Z

Thanks for the refactor!
Do we have the actual stored thread messages from the failing #200 case, especially the messages immediately before the Gemini 400?

I’m trying to understand whether the corruption was:

an assistant tool-call was preserved but the matching tool-result/function response was dropped,
a tool-result/approval response was preserved after its assistant tool-call was dropped,
an empty role: "tool" message reached docsToModelMessages and was dropped there,
Gemini/provider conversion changed a tool response into a user role in a way this filter doesn’t see, or
something else in the saved order/stepOrder sequence.

Without the concrete message sequence, it’s hard to tell whether commit 2 fixes the real failure or just a plausible orphan-filtering case. Could you add or share a minimal reproduction using the original bad history?

elpiarthera · 2026-05-18T19:20:06Z

@magicseth — fair pushback, let me be transparent.

I don't have the original reporter's failing thread either — ianmacartney asked the same question in January (#200 (comment): "can you dump the messages in the thread earlier than that") and it was never answered. So commit 2 was reasoned from the failure mode described in the issue, not from a captured production trace.

Mapping the cascade fix against your enumeration:

assistant tool-call preserved, matching tool-result dropped — not addressed by this PR; would require a different fix (drop the orphan assistant call).
tool-result preserved after its assistant tool-call was dropped — this is what commit 2 targets. The previous single-pass version filtered tool messages against the original toolCallIds set, so if pass 1 of the assistant filter dropped an assistant message whose calls had no matching results, the tool messages still pointing at those (now-stale) toolCallIds would survive into docsToModelMessages. The three-pass version tracks survivingToolCallIds from the post-filter assistant set, closing that cascade.
empty role: "tool" reaching docsToModelMessages — addressed indirectly: if all of a tool message's content items were orphans, the message was already dropped by the original filter; cascade handling doesn't change that.
provider conversion flipping role to user — out of scope; happens after this filter in the AI SDK's Gemini adapter.

So the honest claim is narrower than "fixes #200": this PR closes one specific class of orphan cascade (case 2). Whether that's the actual root cause of the reporter's intermittent 400 is genuinely unknown without their thread dump.

I can put together a synthetic minimal repro that constructs the structural state this fix prevents (assistant message with all-orphan tool-calls + a subsequent tool message referencing one of them) and asserts the post-filter output is Gemini-valid — happy to add it to this PR if useful. Let me know.

Orchestrator: Zeta — VantageOS Team Dev | 2026-05-18

@magicseth

Constructs the structural state described in get-convex#200 (assistant tool-call with no matching tool-result, followed by a tool message referencing it) and asserts the three-pass filter drops both — closing the orphan cascade that previously reached docsToModelMessages. Adds four cases: - cascade drop: assistant with orphan tool-call + stray tool message both dropped; output respects Gemini invariant. - valid pair preserved: assistant tool-call + matching tool-result both survive untouched. - mixed valid+orphan: assistant with one valid and one orphan tool-call keeps the valid one; tool message preserves the valid tool-result only. - Gemini invariant property check: every surviving tool-result has a corresponding surviving assistant tool-call in the output. Addresses @magicseth's request for a concrete repro on get-convex#251.

elpiarthera · 2026-05-18T19:39:10Z

Orchestrator: Zeta — VantageOS Team Dev | 2026-05-18

coderabbitai

🧹 Nitpick comments (1)

src/client/search.test.ts (1)

529-644: ⚡ Quick win

Consider extracting duplicate invariant validation logic.

The Gemini invariant validation logic (lines 608-637) is nearly identical to the validation in the cascade test (lines 499-526). Extracting this into a helper function would improve maintainability and reduce duplication.

♻️ Proposed refactor to extract helper

Add a helper function in the test file:

+// Helper to validate Gemini invariant: every tool-result has a corresponding assistant tool-call
+function validateGeminiInvariant(result: MessageDoc[]) {
+  const survivingCallIds = new Set<string>();
+  for (const d of result) {
+    if (
+      d.message?.role === "assistant" &&
+      Array.isArray(d.message.content)
+    ) {
+      for (const p of d.message.content) {
+        if (p.type === "tool-call") {
+          survivingCallIds.add(
+            (p as { toolCallId: string }).toolCallId,
+          );
+        }
+      }
+    }
+  }
+
+  for (const d of result) {
+    if (d.message?.role === "tool" && Array.isArray(d.message.content)) {
+      for (const c of d.message.content) {
+        if (c.type === "tool-result") {
+          expect(
+            survivingCallIds.has(
+              (c as { toolCallId: string }).toolCallId,
+            ),
+          ).toBe(true);
+        }
+      }
+    }
+  }
+}

Then replace both validation blocks with:

-      const survivingAssistantCallIds = new Set<string>();
-      for (const d of result) {
-        // ... validation logic ...
-      }
+      validateGeminiInvariant(result);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client/search.test.ts` around lines 529 - 644, The test duplicates the
Gemini invariant validation logic; extract it into a shared helper (e.g.
assertEveryToolResultHasSurvivingAssistantCall or validateGeminiInvariant) that
takes the filtered messages array and performs: 1) collect surviving assistant
tool-call ids by scanning message.role === "assistant" and items with type ===
"tool-call", 2) assert every tool message item with type === "tool-result"
references a collected id, and 3) optionally assert presence/absence of specific
_id's as needed; then replace the inline loops in both the cascade test and the
current test with calls to this helper to remove the duplicate blocks and
improve readability.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/client/search.test.ts`:
- Around line 529-644: The test duplicates the Gemini invariant validation
logic; extract it into a shared helper (e.g.
assertEveryToolResultHasSurvivingAssistantCall or validateGeminiInvariant) that
takes the filtered messages array and performs: 1) collect surviving assistant
tool-call ids by scanning message.role === "assistant" and items with type ===
"tool-call", 2) assert every tool message item with type === "tool-result"
references a collected id, and 3) optionally assert presence/absence of specific
_id's as needed; then replace the inline loops in both the cascade test and the
current test with calls to this helper to remove the duplicate blocks and
improve readability.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: get-convex/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 07a2eaf7-7990-498e-9c96-d25a086397df

📥 Commits

Reviewing files that changed from the base of the PR and between 0648e28 and d857523.

📒 Files selected for processing (1)

src/client/search.test.ts

pkg-pr-new · 2026-05-27T22:30:54Z

Open in StackBlitz

npm i https://pkg.pr.new/@convex-dev/agent@251

commit: d857523

elpiarthera mentioned this pull request Apr 9, 2026

Gemini Tool Call History Corruption #200

Open

ianmacartney reviewed Apr 9, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

ianmacartney requested a review from sethconvex April 27, 2026 23:12

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

robelest mentioned this pull request May 24, 2026

fix(mapping): preserve tool messages in multi-step history (#200) #272

Open

Conversation

elpiarthera commented Apr 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Test plan

Uh oh!

ianmacartney Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

elpiarthera commented Apr 10, 2026

Uh oh!

ianmacartney commented Apr 10, 2026 via email

Uh oh!

elpiarthera commented Apr 23, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

magicseth commented Apr 27, 2026

Uh oh!

elpiarthera commented May 18, 2026

Uh oh!

elpiarthera commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pkg-pr-new Bot commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

elpiarthera commented Apr 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 9, 2026 •

edited

Loading