feat: autonomous tool-calling agent with extended thinking#115
Open
nv78 wants to merge 2 commits intoclaude/add-multimodal-support-QBQcafrom
Open
feat: autonomous tool-calling agent with extended thinking#115nv78 wants to merge 2 commits intoclaude/add-multimodal-support-QBQcafrom
nv78 wants to merge 2 commits intoclaude/add-multimodal-support-QBQcafrom
Conversation
- New AutonomousDocumentAgent in backend/agents/autonomous_agent.py
- Native Anthropic tool_use agentic loop with extended thinking support
(claude-3-7+ models) and graceful fallback for other models
- Native OpenAI function-calling agentic loop for model_type=0
- Tools: retrieve_documents, list_documents, get_chat_history, run_python
- run_python enables real code execution for data analysis tasks
- Streams thinking blocks as {type:"thinking"} + backward-compat llm_reasoning events
- Streams tool_start / tool_end events per iteration
- Replaces sequential LangGraph workflow with a true reason→act→observe loop
- app.py now instantiates AutonomousDocumentAgent instead of ReactiveDocumentAgent
https://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu
…ug fixes Critical bug fix: - Extended thinking was incorrectly enabled for claude-3-5-sonnet (only claude-3-7+ supports it); fix: check "claude-3-7" only, with betas header New capabilities: - search_web tool: DuckDuckGo HTML search (no API key required), returns titles + snippets + URLs for up to 10 results - fetch_url tool: requests + BeautifulSoup to extract readable text from any public web page (boilerplate removed, truncated to 6k chars) Streaming improvements: - Anthropic path now uses client.messages.stream() for real-time text_token and thinking_delta events (users see tokens as they arrive) - OpenAI path uses streaming=True for real-time text_token events - Both paths accumulate tool_use input JSON deltas correctly Robustness: - Context window pruning (_prune_messages): keeps initial message + last 4 tool-result rounds to prevent context overflow on long agent loops - Tool output truncation: capped at 8000 chars with clear truncation notice - Non-streaming fallback if thinking-enabled call fails - Retry without thinking on Anthropic API errors (e.g. unsupported model) Improved system prompt: - 10 explicit numbered principles for tool ordering and citation behavior - Instructs agent to do multi-step tasks (search → fetch → summarize) https://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
backend/agents/autonomous_agent.pytool_useAPI with extended thinking enabled for claude-3-7+ models; graceful fallback for other modelsretrieve_documents— semantic search over uploaded docslist_documents— enumerate available filesget_chat_history— fetch recent conversation contextrun_python— execute Python code for data analysis / calculationsthinking,tool_start,tool_end, andcompleteevents — fully compatible with the existing SSE frontend protocol (no frontend changes needed)app.pynow instantiatesAutonomousDocumentAgentinstead ofReactiveDocumentAgentWhy this matters
The old system used a LangChain ReAct prompt loop that forced the model to output
Thought/Action/Action Inputtext and parsed it with regex — fragile and not truly agentic. The LangGraph system was a fixed sequential pipeline with no planning loop.The new agent lets Claude (or GPT-4o) natively decide when to call tools, in what order, and how many times — with full reasoning visible to users via extended thinking blocks.
Test plan
retrieve_documents, cites sourcesrun_pythonto analyzeget_chat_historytool_start/tool_end/completeevents in browser DevToolshttps://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu