Support text-embedded extra tool calls in agent#6
Support text-embedded extra tool calls in agent#6sweettastebuds wants to merge 1 commit intofeature/agent-loopfrom
Conversation
The agent loop's text-embedded tool call detection only matched "execute" tool calls. When the LLM output other tools (e.g. smart_search) as JSON text instead of using the tool-calling API, the raw JSON was returned as the final response and posted as a Gitea comment. - Generalize _TEXT_TOOL_RE regex to match any tool name - Add _TextToolCall dataclass to hold parsed name + arguments - Refactor _extract_text_tool_call() to return _TextToolCall | None - Update agent loop to dispatch text-embedded calls to the correct handler (_handle_tool_call for execute, _dispatch_extra_tool for registered extra tools) - Validate tool name against registered tools before dispatching https://claude.ai/code/session_012TQ4HhX4ys5dEbVbeS7CVP
There was a problem hiding this comment.
Pull request overview
This PR extends the agent’s “text-embedded tool call” support so that JSON tool calls emitted in plain text can dispatch not only execute, but also any registered extra_tools, while sharing the same per-session cap.
Changes:
- Generalized text tool-call detection to extract
{name, arguments}into a new_TextToolCallstructure. - Updated the agent loop to validate text-embedded tool names against registered tools and dispatch either
executeor an extra tool accordingly. - Added/updated tests to cover structured extraction and extra-tool text-embedded scenarios (including cap behavior and mixed-tool sessions).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
forge_bot/agent.py |
Generalizes regex + extraction to return {name, arguments} and updates run() to dispatch extra tools from text-embedded JSON. |
tests/test_agent.py |
Updates extraction assertions for _TextToolCall and adds new integration tests for text-embedded extra tool dispatch/cap/mixed sessions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class TestTextEmbeddedExtraTools: | ||
| @pytest.mark.asyncio | ||
| async def test_text_embedded_extra_tool_dispatched(self) -> None: | ||
| """Text-embedded extra tool calls are dispatched via _dispatch_extra_tool.""" | ||
| text_with_json = 'Let me search: {"name": "fake_search", "arguments": {"query": "auth"}}' | ||
| fake_tool = _FakeTool(result="Found auth module.") | ||
| agent, llm, container = _make_agent( | ||
| llm_responses=[ | ||
| _llm_text(text_with_json), | ||
| _llm_text("Auth uses JWT based on search results."), | ||
| ], | ||
| extra_tools=[fake_tool], | ||
| ) | ||
| reply = await agent.run("system prompt", "explain auth") | ||
| assert reply == "Auth uses JWT based on search results." | ||
| # execute should NOT have been called | ||
| container.exec.assert_not_awaited() | ||
|
|
There was a problem hiding this comment.
These new tests validate that the loop continues and container.exec isn't used, but they don't assert that the extra tool was actually executed with the expected arguments. To prevent regressions where _dispatch_extra_tool() returns early (e.g., unknown tool / parsing issues) while the test still passes, consider instrumenting _FakeTool.execute (e.g., wrap with AsyncMock or add a counter) and assert it was awaited the expected number of times with the expected query values.
| valid_names = {"execute"} | {t.name for t in self._extra_tools} | ||
| text_call = _extract_text_tool_call(msg.content or "") | ||
| if text_call and text_call.name in valid_names and state.text_call_count < 3: | ||
| state.text_call_count += 1 | ||
| messages.append({"role": "assistant", "content": msg.content}) | ||
| result = await self._handle_tool_call(text_cmd, state, round_num) | ||
| if text_call.name == "execute": | ||
| command = text_call.arguments.get("command", "") | ||
| result = await self._handle_tool_call(command, state, round_num) | ||
| else: |
There was a problem hiding this comment.
Behavior change vs previous implementation: the loop now treats any parsed text tool object as dispatchable (subject to name), even if required arguments are missing. For execute, this means a text snippet like { "name": "execute", "arguments": {} } will now increment text_call_count, call _handle_tool_call(""), and continue (returning an error), whereas previously an empty extracted command would fall through and return the text. Consider validating required args (e.g., non-empty command for execute) before counting/dispatching.
| @@ -160,10 +169,13 @@ def _extract_text_tool_call(text: str) -> str | None: | |||
| end = i + 1 | |||
| break | |||
| data = json.loads(text[start:end]) | |||
There was a problem hiding this comment.
_extract_text_tool_call() uses text.index("{") to pick the first { in the message, which can fail to parse a valid tool-call JSON later in the text (e.g., if the message contains earlier braces/code blocks before the tool JSON). Use the regex match position (or otherwise locate the { that starts the matched tool object) as the start index before scanning for the matching closing brace.
|
|
||
| # Detect text-embedded tool calls (models that output JSON instead of using API). | ||
| _TEXT_TOOL_RE = re.compile(r'\{\s*"name"\s*:\s*"execute"') | ||
| _TEXT_TOOL_RE = re.compile(r'\{\s*"name"\s*:\s*"[a-zA-Z_]\w*"') |
There was a problem hiding this comment.
The updated _TEXT_TOOL_RE matches any JSON object with a "name" field, even when it isn't a tool call (e.g., examples/config JSON that happen to use a name key). If the name coincides with a registered tool, the agent may dispatch unintentionally. Consider tightening detection to require an "arguments" field (and optionally ensure the parsed object has the expected shape) before attempting dispatch.
| _TEXT_TOOL_RE = re.compile(r'\{\s*"name"\s*:\s*"[a-zA-Z_]\w*"') | |
| _TEXT_TOOL_RE = re.compile( | |
| r'\{\s*"name"\s*:\s*"[a-zA-Z_]\w*"\s*,\s*"arguments"\s*:\s*\{' | |
| ) |
Summary
Extended the agent's text-embedded tool call handling to support extra tools beyond just the
executecommand. Previously, the agent only recognized and dispatchedexecutecommands embedded as JSON in LLM text output. Now it can handle any registered extra tool in the same way.Key Changes
_TextToolCalldataclass: Changed return type of_extract_text_tool_call()from a simple command string to a structured_TextToolCallobject containing both the tool name and parsed arguments_TEXT_TOOL_REto match any valid tool name (not just "execute"), allowing detection of extra tool calls in textexecutecommands go through_handle_tool_call()as before_dispatch_extra_tool()with full argumentsImplementation Details
_extract_text_tool_call()function now returns a complete_TextToolCallobject with name and arguments dict, rather than just extracting the command stringhttps://claude.ai/code/session_012TQ4HhX4ys5dEbVbeS7CVP