Skip to content

feat: autonomous tool-calling agent with extended thinking#115

Open
nv78 wants to merge 2 commits intoclaude/add-multimodal-support-QBQcafrom
claude/autonomous-agent-core
Open

feat: autonomous tool-calling agent with extended thinking#115
nv78 wants to merge 2 commits intoclaude/add-multimodal-support-QBQcafrom
claude/autonomous-agent-core

Conversation

@nv78
Copy link
Member

@nv78 nv78 commented Mar 24, 2026

Summary

  • Replaces the sequential LangChain ReAct + LangGraph pipeline with a true autonomous agentic loop in backend/agents/autonomous_agent.py
  • Anthropic path (model_type=1): native tool_use API with extended thinking enabled for claude-3-7+ models; graceful fallback for other models
  • OpenAI path (model_type=0): native function-calling agentic loop via openai SDK
  • Tools available to the agent:
    • retrieve_documents — semantic search over uploaded docs
    • list_documents — enumerate available files
    • get_chat_history — fetch recent conversation context
    • run_python — execute Python code for data analysis / calculations
  • Streams thinking, tool_start, tool_end, and complete events — fully compatible with the existing SSE frontend protocol (no frontend changes needed)
  • app.py now instantiates AutonomousDocumentAgent instead of ReactiveDocumentAgent

Why this matters

The old system used a LangChain ReAct prompt loop that forced the model to output Thought/Action/Action Input text and parsed it with regex — fragile and not truly agentic. The LangGraph system was a fixed sequential pipeline with no planning loop.

The new agent lets Claude (or GPT-4o) natively decide when to call tools, in what order, and how many times — with full reasoning visible to users via extended thinking blocks.

Test plan

  • Send a message with an uploaded PDF → agent calls retrieve_documents, cites sources
  • Ask a data question with a CSV uploaded → agent calls run_python to analyze
  • Ask a follow-up that requires context → agent calls get_chat_history
  • Verify SSE stream produces tool_start / tool_end / complete events in browser DevTools
  • Test with model_type=0 (OpenAI) and model_type=1 (Anthropic)
  • Verify guest mode returns a general-knowledge response without document access

https://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu

claude added 2 commits March 24, 2026 21:41
- New AutonomousDocumentAgent in backend/agents/autonomous_agent.py
- Native Anthropic tool_use agentic loop with extended thinking support
  (claude-3-7+ models) and graceful fallback for other models
- Native OpenAI function-calling agentic loop for model_type=0
- Tools: retrieve_documents, list_documents, get_chat_history, run_python
- run_python enables real code execution for data analysis tasks
- Streams thinking blocks as {type:"thinking"} + backward-compat llm_reasoning events
- Streams tool_start / tool_end events per iteration
- Replaces sequential LangGraph workflow with a true reason→act→observe loop
- app.py now instantiates AutonomousDocumentAgent instead of ReactiveDocumentAgent

https://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu
…ug fixes

Critical bug fix:
- Extended thinking was incorrectly enabled for claude-3-5-sonnet (only
  claude-3-7+ supports it); fix: check "claude-3-7" only, with betas header

New capabilities:
- search_web tool: DuckDuckGo HTML search (no API key required), returns
  titles + snippets + URLs for up to 10 results
- fetch_url tool: requests + BeautifulSoup to extract readable text from
  any public web page (boilerplate removed, truncated to 6k chars)

Streaming improvements:
- Anthropic path now uses client.messages.stream() for real-time text_token
  and thinking_delta events (users see tokens as they arrive)
- OpenAI path uses streaming=True for real-time text_token events
- Both paths accumulate tool_use input JSON deltas correctly

Robustness:
- Context window pruning (_prune_messages): keeps initial message + last 4
  tool-result rounds to prevent context overflow on long agent loops
- Tool output truncation: capped at 8000 chars with clear truncation notice
- Non-streaming fallback if thinking-enabled call fails
- Retry without thinking on Anthropic API errors (e.g. unsupported model)

Improved system prompt:
- 10 explicit numbered principles for tool ordering and citation behavior
- Instructs agent to do multi-step tasks (search → fetch → summarize)

https://claude.ai/code/session_01C9mHttiQ4ZAaBbQecVV7uu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants