Releases: Siddhant-K-code/agent-trace

v0.22.0

11 Apr 17:48

Semantic Session Diff

Ever re-run an agent on the same task and wondered what actually changed? agent-trace diff --semantic compares two sessions by outcome rather than raw event order — so you see meaningful differences like new tools used, changed error rates, or diverging token costs, not just line-by-line noise.

# Compare two runs of the same task
agent-trace diff SESSION_A SESSION_B --semantic

# Export a structured JSON report for CI or further processing
agent-trace diff SESSION_A SESSION_B --semantic --eval-config eval.json

What the report covers:

  • Tools added or removed between runs
  • Change in total tool calls, errors, and token usage
  • Whether the session outcome improved, regressed, or stayed the same
  • A plain-language summary you can paste into a PR description or eval log

This is useful for regression testing agent behaviour — run the same prompt twice (or against two model versions) and get a structured diff you can assert on in CI.
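
To make the idea of comparing by outcome concrete, here is a minimal sketch of such a diff. The `SessionSummary` class, its fields, and the `semantic_diff` function are hypothetical names chosen for illustration; they are not agent-trace's actual data model or report schema.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSummary:
    """Hypothetical per-session rollup; not agent-trace's real data model."""
    tools: set = field(default_factory=set)
    tool_calls: int = 0
    errors: int = 0
    tokens: int = 0
    succeeded: bool = False


def semantic_diff(a: SessionSummary, b: SessionSummary) -> dict:
    """Compare two sessions by aggregate outcome, ignoring event order."""
    if a.succeeded == b.succeeded:
        outcome = "unchanged"
    elif b.succeeded:
        outcome = "improved"
    else:
        outcome = "regressed"
    return {
        "tools_added": sorted(b.tools - a.tools),
        "tools_removed": sorted(a.tools - b.tools),
        "tool_calls_delta": b.tool_calls - a.tool_calls,
        "errors_delta": b.errors - a.errors,
        "tokens_delta": b.tokens - a.tokens,
        "outcome": outcome,
    }


before = SessionSummary({"read_file", "bash"}, tool_calls=14, errors=2,
                        tokens=9000, succeeded=False)
after = SessionSummary({"read_file", "write_file"}, tool_calls=11, errors=0,
                       tokens=7500, succeeded=True)
report = semantic_diff(before, after)
print(report)
```

A CI job could then assert on individual keys, for example failing the build when `outcome` is `"regressed"` or `errors_delta` is positive.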

v0.21.0 — Token Budget Tracking and Context Window Early Warning

11 Apr 17:37
5ac2eda

Token Budget Tracking and Context Window Early Warning

Long-running agents can silently burn through a model's context window and then fail or degrade in quality. This release adds a token budget tracker that watches cumulative usage and warns you before you hit the limit.

# Check token usage for a session
agent-trace token-budget SESSION_ID

# Specify the model to get an accurate limit
agent-trace token-budget SESSION_ID --model claude-3-5-sonnet

# Warn at 75% instead of the default 80%
agent-trace token-budget SESSION_ID --model gpt-4o --warn-at 75

In watch mode, the same threshold applies in real time:

agent-trace watch --max-context-pct 80 SESSION_ID

Supported models and their limits:

Model Context
claude-3-5-sonnet 200k tokens
claude-3-opus 200k tokens
gpt-4o 128k tokens
gpt-4-turbo 128k tokens
gemini-1.5-pro 1M tokens

For models not in the list, you can pass --limit to set a custom token ceiling.
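
The threshold check itself is simple arithmetic. The sketch below assumes the limits in the table are exact token counts (200k read as 200,000, and so on) and that the warning fires once usage reaches the threshold percentage. `MODEL_LIMITS` and `budget_status` are illustrative names, not part of agent-trace.

```python
# Context limits copied from the table above, read as exact token counts (an assumption).
MODEL_LIMITS = {
    "claude-3-5-sonnet": 200_000,
    "claude-3-opus": 200_000,
    "gpt-4o": 128_000,
    "gpt-4-turbo": 128_000,
    "gemini-1.5-pro": 1_000_000,
}


def budget_status(used_tokens, model=None, limit=None, warn_at=80.0):
    """Return (percent_used, should_warn) for a session's cumulative tokens."""
    if limit is None:
        if model not in MODEL_LIMITS:
            raise ValueError(f"unknown model {model!r}; pass an explicit limit")
        limit = MODEL_LIMITS[model]
    pct = 100.0 * used_tokens / limit
    return pct, pct >= warn_at


# 100,000 of 128,000 tokens used, with the warning threshold lowered to 75%.
print(budget_status(100_000, model="gpt-4o", warn_at=75))
```

Passing `limit` directly plays the role that `--limit` plays on the command line for models missing from the table.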

v0.20.0 — Replay Annotations

11 Apr 17:37
be73e05

Replay Annotations

Add notes, labels, and bookmarks to any event in a recorded session. Useful for code review, debugging, and building eval datasets — you can mark interesting moments and come back to them later.

# Add a note to event #12 in a session
agent-trace annotate SESSION_ID 12 --note "Why did it call bash here instead of write_file?"

# Tag an event with a label
agent-trace annotate SESSION_ID 12 --label regression

# Bookmark an event for quick navigation in the HTML viewer
agent-trace annotate SESSION_ID 12 --bookmark

# List all annotations on a session
agent-trace annotate SESSION_ID --list

# Remove an annotation
agent-trace annotate SESSION_ID 12 --delete ANNOTATION_ID

Annotations are stored alongside the session, so they persist across replays and show up in shared HTML reports (v0.18.0+) as a bookmarks sidebar. To build an eval dataset, label sessions as pass / fail / interesting and filter on those labels later.
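
Filtering on labels could look something like the sketch below. The `Annotation` record and its field names are assumptions made for illustration; the on-disk annotation format is not described in these notes.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical annotation record; field names are illustrative."""
    session_id: str
    event_index: int
    note: str = ""
    label: str = ""
    bookmark: bool = False


def sessions_with_label(annotations, label):
    """Return the sorted session IDs that carry at least one matching label."""
    return sorted({a.session_id for a in annotations if a.label == label})


annotations = [
    Annotation("sess-a", 12, label="regression"),
    Annotation("sess-a", 30, bookmark=True),
    Annotation("sess-b", 4, label="pass"),
    Annotation("sess-c", 7, label="regression", note="unexpected bash call"),
]
print(sessions_with_label(annotations, "regression"))
```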

v0.19.0 — OTLP Span Hierarchy and GenAI Semantic Conventions

11 Apr 17:37
5335593

OTLP Export Improvements

The OTLP exporter now produces traces that look right in Jaeger, Grafana Tempo, and other OpenTelemetry backends — with proper parent-child span relationships and standard GenAI attributes.

Span hierarchy: Tool call spans are nested under their parent LLM span, which is nested under the session root. You can now see the full call tree in your tracing UI without any post-processing.

GenAI semantic conventions: LLM spans now carry standard attributes like gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens. These work out of the box with dashboards built on the OpenTelemetry GenAI spec.

Multi-agent traces: Use tree_to_otlp() to export a root session and all its subagents as a single linked trace — so you see the whole multi-agent run as one waterfall in your tracing UI.

from agent_trace.otlp import export_session, tree_to_otlp

# Single session
export_session(session, endpoint="http://localhost:4318")

# Full subagent tree as one trace
export_session(root, endpoint="http://localhost:4318", children=[child1, child2])
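
To show what the nesting amounts to, the sketch below builds the session, LLM, and tool-call spans as plain dictionaries linked by `parent_span_id`. It does not call the agent-trace exporter or the OpenTelemetry SDK; `make_span` and `depth` are hypothetical helpers, and the attribute values are made up. The ID sizes (16-byte trace ID, 8-byte span ID) follow the OpenTelemetry trace model.

```python
import secrets


def make_span(name, trace_id, parent_span_id=None, attributes=None):
    """Build a plain-dict span; a stand-in for a real OTLP span."""
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),  # 8 random bytes, hex-encoded
        "parent_span_id": parent_span_id,
        "name": name,
        "attributes": attributes or {},
    }


def depth(span, by_id):
    """Count ancestors by following parent_span_id up to the root."""
    d = 0
    while span["parent_span_id"] is not None:
        span = by_id[span["parent_span_id"]]
        d += 1
    return d


trace_id = secrets.token_hex(16)  # 16 random bytes shared by every span in the trace
root = make_span("session", trace_id)
llm = make_span(
    "llm_call", trace_id, parent_span_id=root["span_id"],
    attributes={
        "gen_ai.system": "anthropic",
        "gen_ai.request.model": "claude-3-5-sonnet",
        "gen_ai.usage.input_tokens": 1200,
        "gen_ai.usage.output_tokens": 350,
    },
)
tool = make_span("tool_call: bash", trace_id, parent_span_id=llm["span_id"])

by_id = {s["span_id"]: s for s in (root, llm, tool)}
for s in (root, llm, tool):
    print("  " * depth(s, by_id) + s["name"])
```

Because every span shares one trace ID and points at its parent, a tracing backend can reconstruct the waterfall without extra metadata.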

v0.18.0 — Multi-Agent Session Tree in Share HTML

11 Apr 17:36
4d0bf49

Multi-Agent Session Tree in Shared Reports

When you share a session that spawned subagents, the HTML report now shows the full agent tree — not just the root session. Each node in the tree is clickable and shows its own tool call count, token usage, and error count.

agent-trace share SESSION_ID -o report.html

Open the report and you'll find a collapsible tree sidebar on the left. Click any subagent node to jump to its events. If you've added annotations (new in v0.20.0), bookmarks appear in a separate sidebar for quick navigation.

This makes it much easier to understand what happened in complex multi-agent runs — you can see at a glance which subagent did the most work, which one errored, and how the overall task was decomposed.

v0.17.0 — Multi-Session Dashboard

11 Apr 17:36
d8b0ba7

Multi-Session Dashboard

Get a bird's-eye view across all your agent sessions. The new dashboard command aggregates stats, shows trends over time, and surfaces which tools your agents use most — all in the terminal or as a self-contained HTML report.

# View stats for all sessions
agent-trace dashboard

# Scope to recent sessions
agent-trace dashboard --last 20
agent-trace dashboard --since 2024-06-01

# Export as an HTML report to share with your team
agent-trace dashboard --html report.html

What you'll see:

  • Total sessions, tool calls, errors, tokens, and estimated cost
  • ASCII sparkline charts showing how tool usage, token consumption, and error rates trend over time
  • A per-session breakdown table so you can spot outliers at a glance
  • Top tools ranked by call frequency

The HTML export is self-contained — no server needed, just open it in a browser.
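
For readers curious how a terminal sparkline is drawn, here is one common approach: scale each value onto a short ramp of characters. The ramp and the `sparkline` function are assumptions for illustration; the characters agent-trace actually uses are not stated in these notes.

```python
RAMP = "_.:-=+*#"  # eight ASCII levels, lowest to highest (an arbitrary choice)


def sparkline(values):
    """Map each value onto one ramp character, scaled to the data range."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to scale against; draw it at the lowest level.
        return RAMP[0] * len(values)
    top = len(RAMP) - 1
    return "".join(RAMP[round((v - lo) * top / (hi - lo))] for v in values)


# Example: tool calls per session across seven sessions.
print(sparkline([3, 5, 9, 4, 12, 7, 0]))
```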

v0.16.0 — Session Attribution

11 Apr 17:36
f24004f

Session Attribution

Every session now records who and what spawned it. When you open a trace, you'll see the OS user, the detected agent provider, the git repo and branch the agent was working in, and the chain of parent processes that launched it.

This makes it much easier to answer questions like:

  • Which developer's machine did this run on?
  • Was this a Claude Code session or a Cursor session?
  • What branch was the agent working on when it made this change?

Attribution is collected automatically — nothing to configure. It shows up in agent-trace show and in shared HTML reports.

Detected providers: claude-code, cursor, github-copilot, cody, continue, and a generic mcp-client fallback.

agent-trace show SESSION_ID
# Attribution
#   User:     alice
#   Provider: claude-code
#   Branch:   feat/my-feature
#   Commit:   a1b2c3d
#   CWD:      /home/alice/projects/myapp

v0.15.0 — Per-Operation Enforcement in Watch Mode

11 Apr 17:36
3630a27

Per-Operation Enforcement in Watch Mode

Watch mode can now enforce rate limits per tool, not just per session. This is handy when you want to let an agent call read_file freely but cap how many times it can call bash or write_file in a single run.

from agent_trace.watch import WatcherConfig, OperationRule

config = WatcherConfig(
    max_tool_calls=100,
    operation_rules=[
        OperationRule(operation="bash", max_calls=10),
        OperationRule(operation="write_file", max_calls=5, window_seconds=60),
    ],
    max_context_pct=80.0,  # warn when context window is 80% full
)

You can also set a context window threshold from the CLI:

agent-trace watch --max-context-pct 80 SESSION_ID

When the threshold is crossed, watch mode fires a warning before the session hits the hard limit — giving you time to intervene or checkpoint.
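
As a rough illustration of per-operation limits, the sketch below caps calls per tool, optionally within a sliding time window. It is not the `agent_trace.watch` implementation: `OperationLimiter` is a hypothetical class, and treating `window_seconds` as a sliding window is an assumption.

```python
import time
from collections import defaultdict, deque


class OperationLimiter:
    """Hypothetical per-tool call limiter; not the agent_trace.watch implementation."""

    def __init__(self, rules):
        # rules maps operation -> (max_calls, window_seconds or None for whole-run caps)
        self.rules = rules
        self.calls = defaultdict(deque)

    def allow(self, operation, now=None):
        """Record the call and return True if within the rule; otherwise return False."""
        if operation not in self.rules:
            return True  # no rule, so the tool is uncapped
        now = time.monotonic() if now is None else now
        max_calls, window = self.rules[operation]
        history = self.calls[operation]
        if window is not None:
            # Drop timestamps that have aged out of the sliding window.
            while history and now - history[0] >= window:
                history.popleft()
        if len(history) >= max_calls:
            return False
        history.append(now)
        return True


limiter = OperationLimiter({"bash": (10, None), "write_file": (5, 60)})
# Seven write_file calls one second apart: the sixth and seventh exceed the cap.
results = [limiter.allow("write_file", now=t) for t in range(7)]
print(results)
```

Passing `now` explicitly keeps the example deterministic; in live use the monotonic clock would supply it.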

v0.14.0 — Live PII and Sensitive Data Masking

11 Apr 17:35
603da53

Live PII Masking

Sensitive data in your agent's tool calls and responses is now masked before it ever hits disk. This is useful when tracing agents that handle user data, API keys, or credentials — you get full observability without storing raw PII.

Masked by default:

  • Email addresses
  • Phone numbers
  • Credit card numbers
  • US Social Security Numbers
  • AWS ARNs

Masking is applied in the proxy layer, so it works transparently whether you're using the stdio proxy or the HTTP proxy.

# Use a custom masking config when starting a proxy
from agent_trace.masking import MaskingConfig
from agent_trace.proxy import start_proxy

config = MaskingConfig(
    redact_emails=True,
    redact_phone_numbers=True,
    redact_credit_cards=True,
)
start_proxy(masking_config=config)

You can also call mask_event_data() directly if you want to sanitise events from an existing session before sharing or exporting them.
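
Pattern-based masking of this kind can be sketched with regular expressions applied recursively to event payloads. The patterns below are deliberately simplified, and `mask_text`, `mask_value`, and the placeholder format are hypothetical; they are not the rules or output that `agent_trace.masking` uses.

```python
import re

# Simplified illustrative patterns; production PII detection needs stricter rules.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def mask_text(text):
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}_REDACTED]", text)
    return text


def mask_value(value):
    """Recursively mask strings inside nested dicts and lists."""
    if isinstance(value, str):
        return mask_text(value)
    if isinstance(value, dict):
        return {k: mask_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [mask_value(v) for v in value]
    return value  # numbers, booleans, and None pass through unchanged


event = {
    "tool": "bash",
    "args": {"cmd": "curl -u alice@example.com https://api.test"},
    "notes": ["SSN 123-45-6789", "card 4111 1111 1111 1111"],
}
masked = mask_value(event)
print(masked)
```

Masking before the event is written, rather than at display time, is what keeps the raw values off disk.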

v0.13.0

11 Apr 17:33

v0.13.0 - Policy suggestion

Generate a minimal .agent-scope.json allow-list directly from what your agent actually did, instead of writing one by hand.

New command

agent-strace policy                        # analyse all sessions, print suggested policy
agent-strace policy <session-id>...        # analyse specific sessions
agent-strace policy --output .agent-scope.json   # write to file
agent-strace policy --dry-run              # print without writing

How it works

agent-strace policy scans tool_call events across one or more sessions and collects:

  • Files read — from read, view, grep, glob tool calls
  • Files written — from write, edit, create tool calls
  • Commands — from bash tool calls
  • Network hosts — extracted from URLs in bash commands

Paths are collapsed to glob patterns when 3 or more files share a directory (e.g. src/a.py, src/b.py, src/c.py → src/**). Commands are collapsed to base-executable patterns (e.g. pytest tests/foo.py -x → pytest *).
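
The collapsing rules can be sketched as follows. This is an approximation under stated assumptions: files are grouped by their immediate parent directory, files at the repository root are never collapsed, and commands are split on whitespace. `collapse_paths` and `collapse_command` are illustrative names, not the tool's internals.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def collapse_paths(paths, threshold=3):
    """Replace files with '<dir>/**' when a directory holds `threshold` or more of them."""
    by_dir = defaultdict(list)
    for p in paths:
        by_dir[str(PurePosixPath(p).parent)].append(p)
    patterns = []
    for directory, files in by_dir.items():
        if directory != "." and len(files) >= threshold:
            patterns.append(f"{directory}/**")
        else:
            patterns.extend(files)  # too few files, or top-level files: keep as-is
    return sorted(set(patterns))


def collapse_command(command):
    """Reduce a command line to '<executable> *'."""
    parts = command.split()
    return f"{parts[0]} *" if parts else ""


print(collapse_paths(["src/a.py", "src/b.py", "src/c.py", "pyproject.toml"]))
print(collapse_command("pytest tests/foo.py -x"))
```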

Example output

{
  "files": {
    "read":  { "allow": ["src/**", "tests/**", "pyproject.toml"] },
    "write": { "allow": ["src/**"] }
  },
  "commands": {
    "allow": ["pytest *", "git *"]
  },
  "network": {
    "deny_all": true,
    "allow": ["api.anthropic.com"]
  }
}

The generated policy can be used directly with agent-strace audit to flag future sessions that exceed the observed scope.
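
Checking a session against such a policy amounts to glob matching. The sketch below uses Python's `fnmatch` semantics, where `*` also matches `/`; the matching rules of agent-strace audit itself are not documented here, so treat the `violations` function and the `(kind, target)` event shape as assumptions.

```python
from fnmatch import fnmatchcase

# A subset of the example policy above.
policy = {
    "files": {
        "read": {"allow": ["src/**", "tests/**", "pyproject.toml"]},
        "write": {"allow": ["src/**"]},
    },
    "commands": {"allow": ["pytest *", "git *"]},
}


def violations(events, policy):
    """Return the events whose target matches no allow pattern."""
    flagged = []
    for kind, target in events:  # kind is "read", "write", or "command"
        if kind == "command":
            allowed = policy["commands"]["allow"]
        else:
            allowed = policy["files"][kind]["allow"]
        if not any(fnmatchcase(target, pattern) for pattern in allowed):
            flagged.append((kind, target))
    return flagged


events = [
    ("read", "src/app/main.py"),
    ("write", "tests/test_main.py"),
    ("command", "pytest tests -x"),
    ("command", "rm -rf build"),
]
flagged = violations(events, policy)
print(flagged)
```

Here the write under tests/ and the rm command fall outside the observed scope, so both are flagged.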