## Summary

When the `identify_improvements` tool (or any similar tool requiring document context) is invoked, `toolCall.Arguments` currently embeds the full paper text directly in the function arguments. For large documents (10k+ tokens), this creates two problems:
- **Unnecessary duplication of content** – the full text already exists in project storage and does not need to be repeated in every tool call.
- **Poor frontend UX** – the UI remains stuck on "Preparing Tools…" for several seconds because it waits for the full payload to be serialized and streamed.

Example debug state:

```
toolCall.Name = "identify_improvements"
len(toolCall.Arguments) = 10108 bytes
```
## Expected Behavior

Instead of embedding the entire document in the tool call payload, the `Arguments` should reference the document via a lightweight pointer, e.g.:

```json
{
  "doc_id": "paper_123",
  "focus_areas": null,
  "part": "full"
}
```
The backend should resolve the actual text server-side before invoking the registered tool.
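A minimal sketch of that server-side resolution step, assuming a hypothetical `DocumentStore` interface over project storage and a `ResolveArguments` helper (neither is the project's actual API; field names mirror the pointer example above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PointerArgs is the lightweight payload the LLM emits instead of full text.
type PointerArgs struct {
	DocID      string   `json:"doc_id"`
	FocusAreas []string `json:"focus_areas"`
	Part       string   `json:"part"`
}

// DocumentStore abstracts project storage (hypothetical interface).
type DocumentStore interface {
	Get(docID string) (string, error)
}

// ResolveArguments expands a pointer payload into the full-text arguments the
// registered tool expects, so the LLM never has to re-send the document.
func ResolveArguments(store DocumentStore, raw []byte) (map[string]any, error) {
	var p PointerArgs
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("parse pointer args: %w", err)
	}
	text, err := store.Get(p.DocID)
	if err != nil {
		return nil, fmt.Errorf("resolve %q: %w", p.DocID, err)
	}
	return map[string]any{
		"text":        text,
		"focus_areas": p.FocusAreas,
		"part":        p.Part,
	}, nil
}

// memStore is an in-memory stand-in for project storage, for illustration.
type memStore map[string]string

func (m memStore) Get(id string) (string, error) {
	if t, ok := m[id]; ok {
		return t, nil
	}
	return "", fmt.Errorf("unknown doc_id %q", id)
}

func main() {
	store := memStore{"paper_123": "full paper text…"}
	args, err := ResolveArguments(store, []byte(`{"doc_id":"paper_123","part":"full"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("resolved text length:", len(args["text"].(string)))
}
```

The key property is that the large payload only ever travels between the backend and storage, never over the LLM or websocket path.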
## Current Behavior

- The LLM response includes the full document text inside `output[].function_call.arguments`.
- The VS Code debugger and client transport show large serialized payloads.
- The frontend UI blocks with a spinner ("Preparing Tools…") until the full JSON loads.
## Impact

| Area | Impact |
|---|---|
| Latency | 🚩 High – tool call dispatch takes significantly longer |
| Frontend UX | 🚩 Poor – user perceives the UI as stuck |
| Token Usage | 🚩 Wasteful – the LLM resends content it has already processed |
| Memory | 🚩 Elevated – the document text is duplicated in every tool call payload |
## Risks

- Tools expecting raw text will require minor adaptation to resolve the pointer.
- Migration must ensure backward compatibility for existing workflows.
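One way to keep both risks small is to accept either payload shape during the migration. A sketch, assuming a hypothetical `normalizeArgs` helper and illustrative `"text"` / `"doc_id"` key names (not necessarily the project's actual schema):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeArgs keeps legacy callers working during migration: payloads that
// already embed the document under a "text" key pass through unchanged, while
// pointer payloads carrying "doc_id" are flagged for server-side resolution.
func normalizeArgs(raw []byte) (needsResolve bool, args map[string]any, err error) {
	if err = json.Unmarshal(raw, &args); err != nil {
		return false, nil, err
	}
	if _, ok := args["text"]; ok {
		return false, args, nil // legacy full-text payload: use as-is
	}
	if _, ok := args["doc_id"]; ok {
		return true, args, nil // new pointer payload: resolve before dispatch
	}
	return false, nil, fmt.Errorf("arguments carry neither text nor doc_id")
}

func main() {
	needs, _, err := normalizeArgs([]byte(`{"doc_id":"paper_123","part":"full"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("needs server-side resolution:", needs)
}
```

With this shim in place, existing workflows that still send full text keep working unchanged while new pointer-based calls take the lightweight path.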
## Acceptance Criteria

- The UI is no longer stuck in "Preparing Tools…" during tool dispatch.
- WebSocket payload size is ≤ 1 KB for pointer calls.
- Tool execution still resolves and operates on the correct document.
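The size criterion is easy to guard in code. A minimal sketch, assuming a hypothetical `CheckPointerPayloadSize` helper and the 1 KB budget from the criteria above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CheckPointerPayloadSize serializes a pointer payload and verifies it stays
// within the 1 KB budget from the acceptance criteria.
func CheckPointerPayloadSize(args map[string]any) (size int, ok bool, err error) {
	b, err := json.Marshal(args)
	if err != nil {
		return 0, false, err
	}
	return len(b), len(b) <= 1024, nil
}

func main() {
	size, ok, err := CheckPointerPayloadSize(map[string]any{
		"doc_id":      "paper_123",
		"focus_areas": nil,
		"part":        "full",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("payload: %d bytes, within budget: %v\n", size, ok)
}
```

A check like this could run in a unit test or as a dispatch-time assertion, catching any regression where full text sneaks back into the arguments.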
## Labels

`performance`, `backend`, `UX`, `enhancement`