## Summary

When the `identify_improvements` tool (or any similar tool requiring document context) is invoked, `toolCall.Arguments` currently embeds the full paper text directly in the function arguments. For large documents (10k+ tokens), this creates two problems:
- **Unnecessary duplication of content** – the full text already exists in project storage and does not need to be repeated in every tool call.
- **Poor frontend UX** – the UI remains stuck on "Preparing Tools…" for several seconds because it waits for the full payload to be serialized and streamed.

Example debug state:

```
toolCall.Name = "identify_improvements"
len(toolCall.Arguments) = 10108 bytes
```
## Expected Behavior

Instead of embedding the entire document in the tool call payload, the `Arguments` should reference the document via a lightweight pointer, e.g.:

```json
{
  "doc_id": "paper_123",
  "focus_areas": null,
  "part": "full"
}
```
The backend should resolve the actual text server-side before invoking the registered tool.
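A minimal sketch of that server-side resolution step, assuming a hypothetical `DocumentStore` interface over project storage and a `ResolveArguments` helper (neither is the project's actual API; field names mirror the pointer example above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PointerArgs is the lightweight payload the LLM emits instead of full text.
type PointerArgs struct {
	DocID      string   `json:"doc_id"`
	FocusAreas []string `json:"focus_areas"`
	Part       string   `json:"part"`
}

// DocumentStore abstracts project storage (hypothetical interface).
type DocumentStore interface {
	Get(docID string) (string, error)
}

// ResolveArguments expands a pointer payload into the full-text arguments the
// registered tool expects, so the LLM never has to re-send the document.
func ResolveArguments(store DocumentStore, raw []byte) (map[string]any, error) {
	var p PointerArgs
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("parse pointer args: %w", err)
	}
	text, err := store.Get(p.DocID)
	if err != nil {
		return nil, fmt.Errorf("resolve %q: %w", p.DocID, err)
	}
	return map[string]any{
		"text":        text,
		"focus_areas": p.FocusAreas,
		"part":        p.Part,
	}, nil
}

// memStore is an in-memory stand-in for project storage, for illustration.
type memStore map[string]string

func (m memStore) Get(id string) (string, error) {
	if t, ok := m[id]; ok {
		return t, nil
	}
	return "", fmt.Errorf("unknown doc_id %q", id)
}

func main() {
	store := memStore{"paper_123": "full paper text…"}
	args, err := ResolveArguments(store, []byte(`{"doc_id":"paper_123","part":"full"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("resolved text length:", len(args["text"].(string)))
}
```

The key property is that the large payload only ever travels between the backend and storage, never over the LLM or websocket path.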
## Current Behavior

- The LLM response includes the full document text inside `output[].function_call.arguments`.
- The VS Code debugger and client transport show large serialized payloads.
- The frontend UI blocks with a spinner ("Preparing Tools…") until the full JSON loads.
## Impact

| Area | Impact |
|---|---|
| Latency | 🚩 High – tool call dispatch takes significantly longer |
| Frontend UX | 🚩 Poor – user perceives the UI as stuck |
| Token Usage | 🚩 Wasteful – the LLM resends content it has already processed |
| Memory | 🚩 Elevated – the document text is duplicated in every tool call payload |
## Risks

- Tools expecting raw text will require minor adaptation to resolve the pointer.
- Migration must ensure backward compatibility for existing workflows.
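One way to keep both risks small is to accept either payload shape during the migration. A sketch, assuming a hypothetical `normalizeArgs` helper and illustrative `"text"` / `"doc_id"` key names (not necessarily the project's actual schema):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeArgs keeps legacy callers working during migration: payloads that
// already embed the document under a "text" key pass through unchanged, while
// pointer payloads carrying "doc_id" are flagged for server-side resolution.
func normalizeArgs(raw []byte) (needsResolve bool, args map[string]any, err error) {
	if err = json.Unmarshal(raw, &args); err != nil {
		return false, nil, err
	}
	if _, ok := args["text"]; ok {
		return false, args, nil // legacy full-text payload: use as-is
	}
	if _, ok := args["doc_id"]; ok {
		return true, args, nil // new pointer payload: resolve before dispatch
	}
	return false, nil, fmt.Errorf("arguments carry neither text nor doc_id")
}

func main() {
	needs, _, err := normalizeArgs([]byte(`{"doc_id":"paper_123","part":"full"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("needs server-side resolution:", needs)
}
```

With this shim in place, existing workflows that still send full text keep working unchanged while new pointer-based calls take the lightweight path.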
## Acceptance Criteria

- The UI is no longer stuck in "Preparing Tools…" during tool dispatch.
- WebSocket payload size is ≤ 1 KB for pointer calls.
- Tool execution still resolves and operates on the correct document.
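The size criterion is easy to guard in code. A minimal sketch, assuming a hypothetical `CheckPointerPayloadSize` helper and the 1 KB budget from the criteria above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CheckPointerPayloadSize serializes a pointer payload and verifies it stays
// within the 1 KB budget from the acceptance criteria.
func CheckPointerPayloadSize(args map[string]any) (size int, ok bool, err error) {
	b, err := json.Marshal(args)
	if err != nil {
		return 0, false, err
	}
	return len(b), len(b) <= 1024, nil
}

func main() {
	size, ok, err := CheckPointerPayloadSize(map[string]any{
		"doc_id":      "paper_123",
		"focus_areas": nil,
		"part":        "full",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("payload: %d bytes, within budget: %v\n", size, ok)
}
```

A check like this could run in a unit test or as a dispatch-time assertion, catching any regression where full text sneaks back into the arguments.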
## Labels

`performance`, `backend`, `UX`, `enhancement`