[bot] Anthropic tool_runner agentic loop not instrumented #153

@braintrust-bot

Summary

The Anthropic Ruby SDK provides client.beta.messages.tool_runner(...), a beta agentic loop API that automatically executes tools and manages the multi-turn conversation cycle. This surface is not instrumented. The Braintrust Ruby SDK currently instruments beta.messages.create() and beta.messages.stream() (via BetaMessagesPatcher), so individual LLM calls made by the runner may already produce spans, but the tool executions and the overall agentic run are invisible in traces.

What is missing

The tool_runner API (Anthropic::Resources::Beta::Messages#tool_runner) returns a runner object with these execution methods:

  • each_message { |msg| ... } — iterates through the agentic loop, auto-executing tools via BaseTool#call between iterations
  • run_until_finished — runs the full loop and returns all messages
  • next_message — step-by-step manual iteration
  • each_streaming { |event| ... } — streaming variant of the agentic loop

At each iteration, the runner:

  1. Sends a message to Claude (this call IS traced via existing BetaMessagesPatcher)
  2. Detects tool_use blocks in Claude's response
  3. Executes BaseTool#call on the matching tool (NOT traced)
  4. Sends tool results back and loops
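
The four steps above can be sketched as a plain loop. This is a minimal illustration, not the Anthropic SDK's implementation: FakeClient, EchoTool, and run_loop are hypothetical stand-ins, and responses are plain hashes rather than SDK message objects. The log shows which steps would already be covered by an LLM span and which would not.

```ruby
# Hypothetical stand-in for a model client: replays scripted responses
# instead of calling the Anthropic API.
class FakeClient
  def initialize(responses)
    @responses = responses.dup
  end

  def create(_messages)
    @responses.shift
  end
end

# Hypothetical stand-in for a tool; the real runner dispatches to BaseTool#call.
class EchoTool
  def name
    "echo"
  end

  def call(input)
    "echo: #{input[:text]}"
  end
end

# Runs the loop and returns a log of which kind of step ran, in order.
# Assumes every requested tool name has a matching tool in `tools`.
def run_loop(client, tools, messages)
  log = []
  loop do
    response = client.create(messages) # step 1: model call (already traced)
    log << :llm_call

    # step 2: detect tool_use blocks in the response
    tool_uses = response[:content].select { |block| block[:type] == "tool_use" }
    break if tool_uses.empty?

    # step 3: execute the matching tool (currently untraced)
    results = tool_uses.map do |block|
      tool = tools.find { |t| t.name == block[:name] }
      log << :tool_call
      { type: "tool_result", tool_use_id: block[:id], content: tool.call(block[:input]) }
    end

    # step 4: send tool results back and loop
    messages << { role: "assistant", content: response[:content] }
    messages << { role: "user", content: results }
  end
  log
end
```

With one scripted tool_use response followed by a plain text response, the log contains two model calls with one tool call between them. Only the model calls correspond to spans that exist today.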

What instrumentation should capture

Parent span for the agentic run (e.g., anthropic.tool_runner):

  • Input: initial messages and tool definitions
  • Output: final response after all tool loops complete
  • Metrics: aggregate token usage across all iterations, total duration
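
One way the aggregate token metrics could be computed is to sum the per-call usage records collected during the run. This is a sketch under assumptions: the Usage struct and aggregate_usage helper are hypothetical, the input_tokens and output_tokens fields mirror Anthropic's usage object, and the output key names (prompt_tokens, completion_tokens, tokens) are assumed rather than taken from this repo.

```ruby
# Hypothetical per-call usage record, one per model call in a run.
Usage = Struct.new(:input_tokens, :output_tokens, keyword_init: true)

# Sums usage across all iterations of a run. Missing (nil) counts are
# treated as zero. The output key names are an assumption.
def aggregate_usage(usages)
  totals = { prompt_tokens: 0, completion_tokens: 0 }
  usages.each do |usage|
    totals[:prompt_tokens] += usage.input_tokens.to_i
    totals[:completion_tokens] += usage.output_tokens.to_i
  end
  totals[:tokens] = totals[:prompt_tokens] + totals[:completion_tokens]
  totals
end
```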

Child spans for each tool execution (e.g., anthropic.tool.{tool_name}):

  • Input: tool name + arguments (from Claude's tool_use block)
  • Output: tool result (from BaseTool#call return value)
  • Span attributes: {type: "tool"}

This pattern is already established in this repo — the RubyLLM integration creates ruby_llm.tool.{tool_name} child spans with braintrust.span_attributes: {type: "tool"} for each tool execution in lib/braintrust/contrib/ruby_llm/instrumentation/chat.rb.
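
The parent/child span shape described above could be produced by wrapping the tool's call method with Module#prepend. The following is a self-contained sketch, not the RubyLLM integration's code or a proposed patch: MemoryTracer, ToolTracing, WeatherTool, and tool_name are hypothetical names, and the in-memory tracer stands in for a real tracing backend.

```ruby
# Hypothetical in-memory tracer standing in for a real tracing backend.
class MemoryTracer
  Span = Struct.new(:name, :parent, :attributes, :input, :output)

  attr_reader :spans

  def initialize
    @spans = []
    @stack = []
  end

  # Records a span, nests it under the currently open span (if any),
  # and returns the block's result.
  def in_span(name, attributes: {}, input: nil)
    span = Span.new(name, @stack.last&.name, attributes, input, nil)
    @spans << span
    @stack.push(span)
    begin
      span.output = yield
    ensure
      @stack.pop
    end
  end
end

# Prepended onto a tool class so that #call runs inside a child span.
module ToolTracing
  class << self
    attr_accessor :tracer
  end

  def call(input)
    ToolTracing.tracer.in_span(
      "anthropic.tool.#{tool_name}",
      attributes: { type: "tool" },
      input: input
    ) do
      super # invokes the tool's original #call with the same argument
    end
  end
end

# Hypothetical tool; a real one would subclass the SDK's BaseTool.
class WeatherTool
  prepend ToolTracing

  def tool_name
    "get_weather"
  end

  def call(input)
    "Sunny in #{input[:city]}"
  end
end
```

Opening an anthropic.tool_runner span around the run and calling the tool inside it yields two recorded spans, with the tool span's parent set to the runner span and its attributes set to {type: "tool"}. Prepending keeps the tool's own call method untouched, which is why it is used here instead of redefining the method.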

Braintrust docs status

  • Braintrust documents "tool" as a first-class span type: "A tool call made by the model — an external API, code execution, database query, etc." (tracing guide)
  • Ruby-specific tool_runner instrumentation: not found

Upstream sources

Local repo files inspected

  • lib/braintrust/contrib/anthropic/patcher.rb — defines MessagesPatcher and BetaMessagesPatcher; no tool_runner patcher
  • lib/braintrust/contrib/anthropic/instrumentation/beta_messages.rb — wraps create() and stream() only; no tool_runner wrapper
  • Grep for tool_runner, BaseTool, each_message, run_until_finished across lib/ — zero matches
  • lib/braintrust/contrib/ruby_llm/instrumentation/chat.rb — demonstrates the existing pattern for tool execution tracing with nested spans
