Releases: Siddhant-K-code/agent-trace
v0.22.0
Semantic Session Diff
Ever re-run an agent on the same task and wondered what actually changed? agent-trace diff --semantic compares two sessions by outcome rather than raw event order — so you see meaningful differences like new tools used, changed error rates, or diverging token costs, not just line-by-line noise.
# Compare two runs of the same task
agent-trace diff SESSION_A SESSION_B --semantic
# Export a structured JSON report for CI or further processing
agent-trace diff SESSION_A SESSION_B --semantic --eval-config eval.json
What the report covers:
- Tools added or removed between runs
- Change in total tool calls, errors, and token usage
- Whether the session outcome improved, regressed, or stayed the same
- A plain-language summary you can paste into a PR description or eval log
This is useful for regression testing agent behaviour — run the same prompt twice (or against two model versions) and get a structured diff you can assert on in CI.
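As a sketch of what "assert on in CI" could look like: the field names below (`outcome`, `errors_delta`) are illustrative, not the documented report schema — check the actual JSON your version emits before wiring this into a pipeline.

```python
def assert_no_regression(report: dict) -> None:
    """Fail a CI step when a semantic diff report signals a regression."""
    if report.get("outcome") == "regressed":
        raise AssertionError("agent behaviour regressed between sessions")
    if report.get("errors_delta", 0) > 0:
        raise AssertionError(f"error count increased by {report['errors_delta']}")

# With a report parsed from the exported JSON:
sample = {"outcome": "improved", "errors_delta": 0, "tools_added": ["grep"]}
assert_no_regression(sample)  # passes silently
```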
v0.21.0 — Token Budget Tracking and Context Window Early Warning
Token Budget Tracking and Context Window Early Warning
Long-running agents can silently burn through a model's context window and then fail or degrade in quality. This release adds a token budget tracker that watches cumulative usage and warns you before you hit the limit.
# Check token usage for a session
agent-trace token-budget SESSION_ID
# Specify the model to get an accurate limit
agent-trace token-budget SESSION_ID --model claude-3-5-sonnet
# Warn at 75% instead of the default 80%
agent-trace token-budget SESSION_ID --model gpt-4o --warn-at 75
In watch mode, the same threshold applies in real time:
agent-trace watch --max-context-pct 80 SESSION_ID
Supported models and their limits:
| Model | Context |
|---|---|
| claude-3-5-sonnet | 200k tokens |
| claude-3-opus | 200k tokens |
| gpt-4o | 128k tokens |
| gpt-4-turbo | 128k tokens |
| gemini-1.5-pro | 1M tokens |
For models not in the list, you can pass --limit to set a custom token ceiling.
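The threshold arithmetic is straightforward; here is a minimal sketch of the check (the function names are illustrative, not part of the agent-trace API):

```python
def context_usage_pct(used_tokens: int, limit: int) -> float:
    """Cumulative usage as a percentage of the model's context window."""
    return 100.0 * used_tokens / limit

def should_warn(used_tokens: int, limit: int, warn_at: float = 80.0) -> bool:
    """Mirror the --warn-at flag: True once usage crosses warn_at percent."""
    return context_usage_pct(used_tokens, limit) >= warn_at

# 150k tokens against claude-3-5-sonnet's 200k window is 75% used:
print(should_warn(150_000, 200_000))              # default 80% threshold: False
print(should_warn(150_000, 200_000, warn_at=75))  # --warn-at 75: True
```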
v0.20.0 — Replay Annotations
Replay Annotations
Add notes, labels, and bookmarks to any event in a recorded session. Useful for code review, debugging, and building eval datasets — you can mark interesting moments and come back to them later.
# Add a note to event #12 in a session
agent-trace annotate SESSION_ID 12 --note "Why did it call bash here instead of write_file?"
# Tag an event with a label
agent-trace annotate SESSION_ID 12 --label regression
# Bookmark an event for quick navigation in the HTML viewer
agent-trace annotate SESSION_ID 12 --bookmark
# List all annotations on a session
agent-trace annotate SESSION_ID --list
# Remove an annotation
agent-trace annotate SESSION_ID 12 --delete ANNOTATION_ID
Annotations are stored alongside the session, so they persist across replays and show up in shared HTML reports (v0.18.0+) as a bookmarks sidebar. They're also useful for building eval datasets — label sessions as pass / fail / interesting and filter on those labels later.
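The "filter on those labels later" step might look like the sketch below. The annotation dict shape (`session`, `label` keys) is an assumption for illustration, not agent-trace's storage format:

```python
def sessions_with_label(annotations: list[dict], label: str) -> set[str]:
    """Collect the session ids whose annotations carry a given label."""
    return {a["session"] for a in annotations if a.get("label") == label}

notes = [
    {"session": "s1", "label": "regression"},
    {"session": "s2", "label": "pass"},
    {"session": "s3", "label": "regression"},
]
print(sorted(sessions_with_label(notes, "regression")))  # ['s1', 's3']
```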
v0.19.0 — OTLP Span Hierarchy and GenAI Semantic Conventions
OTLP Export Improvements
The OTLP exporter now produces traces that look right in Jaeger, Grafana Tempo, and other OpenTelemetry backends — with proper parent-child span relationships and standard GenAI attributes.
Span hierarchy: Tool call spans are nested under their parent LLM span, which is nested under the session root. You can now see the full call tree in your tracing UI without any post-processing.
GenAI semantic conventions: LLM spans now carry standard attributes like gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens. These work out of the box with dashboards built on the OpenTelemetry GenAI spec.
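For reference, a span carrying those attributes would hold keys like the following. The `gen_ai.*` names come from the OpenTelemetry GenAI semantic conventions; the helper function and the `"anthropic"` system value are illustrative:

```python
def genai_span_attributes(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Build the standard GenAI attribute set an LLM span carries."""
    return {
        "gen_ai.system": "anthropic",  # example provider value
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }

attrs = genai_span_attributes("claude-3-5-sonnet", 1200, 300)
print(attrs["gen_ai.request.model"])  # claude-3-5-sonnet
```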
Multi-agent traces: Use tree_to_otlp() to export a root session and all its subagents as a single linked trace — so you see the whole multi-agent run as one waterfall in your tracing UI.
from agent_trace.otlp import export_session, tree_to_otlp
# Single session
export_session(session, endpoint="http://localhost:4318")
# Full subagent tree as one trace
tree_to_otlp(root, endpoint="http://localhost:4318", children=[child1, child2])
v0.18.0 — Multi-Agent Session Tree in Share HTML
Multi-Agent Session Tree in Shared Reports
When you share a session that spawned subagents, the HTML report now shows the full agent tree — not just the root session. Each node in the tree is clickable and shows its own tool call count, token usage, and error count.
agent-trace share SESSION_ID -o report.html
Open the report and you'll find a collapsible tree sidebar on the left. Click any subagent node to jump to its events. If you've added annotations (new in v0.20.0), bookmarks appear in a separate sidebar for quick navigation.
This makes it much easier to understand what happened in complex multi-agent runs — you can see at a glance which subagent did the most work, which one errored, and how the overall task was decomposed.
v0.17.0 — Multi-Session Dashboard
Multi-Session Dashboard
Get a bird's-eye view across all your agent sessions. The new dashboard command aggregates stats, shows trends over time, and surfaces which tools your agents use most — all in the terminal or as a self-contained HTML report.
# View stats for all sessions
agent-trace dashboard
# Scope to recent sessions
agent-trace dashboard --last 20
agent-trace dashboard --since 2024-06-01
# Export as an HTML report to share with your team
agent-trace dashboard --html report.html
What you'll see:
- Total sessions, tool calls, errors, tokens, and estimated cost
- ASCII sparkline charts showing how tool usage, token consumption, and error rates trend over time
- A per-session breakdown table so you can spot outliers at a glance
- Top tools ranked by call frequency
The HTML export is self-contained — no server needed, just open it in a browser.
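Sparklines of this kind are simple to render with Unicode block characters. A minimal sketch of the idea (this is not agent-trace's implementation, just the general technique):

```python
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values: list[float]) -> str:
    """Render a numeric series as a one-line Unicode sparkline."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on flat series
    return "".join(BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values)

print(sparkline([1, 3, 2, 8, 5]))  # ▁▃▂█▅
```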
v0.16.0 — Session Attribution
Session Attribution
Every session now records who and what spawned it. When you open a trace, you'll see the OS user, the detected agent provider, the git repo and branch the agent was working in, and the chain of parent processes that launched it.
This makes it much easier to answer questions like:
- Which developer's machine did this run on?
- Was this a Claude Code session or a Cursor session?
- What branch was the agent working on when it made this change?
Attribution is collected automatically — nothing to configure. It shows up in agent-trace show and in shared HTML reports.
Detected providers: claude-code, cursor, github-copilot, cody, continue, and a generic mcp-client fallback.
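A detection heuristic like this typically walks the parent-process chain looking for known executable names. The mapping and fallback below are an illustrative sketch, not agent-trace's actual detection logic:

```python
def detect_provider(process_chain: list[str]) -> str:
    """Guess the agent provider from the names in a parent-process chain."""
    known = {
        "claude": "claude-code",
        "cursor": "cursor",
        "copilot": "github-copilot",
        "cody": "cody",
        "continue": "continue",
    }
    for name in process_chain:
        for needle, provider in known.items():
            if needle in name.lower():
                return provider
    return "mcp-client"  # generic fallback when nothing matches

print(detect_provider(["bash", "Cursor Helper", "node"]))  # cursor
print(detect_provider(["zsh", "python3"]))                 # mcp-client
```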
agent-trace show SESSION_ID
# Attribution
# User: alice
# Provider: claude-code
# Branch: feat/my-feature
# Commit: a1b2c3d
# CWD: /home/alice/projects/myapp
v0.15.0 — Per-Operation Enforcement in Watch Mode
Per-Operation Enforcement in Watch Mode
Watch mode can now enforce rate limits per tool, not just per session. This is handy when you want to let an agent call read_file freely but cap how many times it can call bash or write_file in a single run.
from agent_trace.watch import WatcherConfig, OperationRule
config = WatcherConfig(
max_tool_calls=100,
operation_rules=[
OperationRule(operation="bash", max_calls=10),
OperationRule(operation="write_file", max_calls=5, window_seconds=60),
],
max_context_pct=80.0, # warn when context window is 80% full
)
You can also set a context window threshold from the CLI:
agent-trace watch --max-context-pct 80 SESSION_ID
When the threshold is crossed, watch mode fires a warning before the session hits the hard limit — giving you time to intervene or checkpoint.
v0.14.0 — Live PII and Sensitive Data Masking
Live PII Masking
Sensitive data in your agent's tool calls and responses is now masked before it ever hits disk. This is useful when tracing agents that handle user data, API keys, or credentials — you get full observability without storing raw PII.
Masked by default:
- Email addresses
- Phone numbers
- Credit card numbers
- US Social Security Numbers
- AWS ARNs
Masking is applied in the proxy layer, so it works transparently whether you're using the stdio proxy or the HTTP proxy.
# Use a custom masking config when starting a proxy
from agent_trace.masking import MaskingConfig
from agent_trace.proxy import start_proxy
config = MaskingConfig(
redact_emails=True,
redact_phone_numbers=True,
redact_credit_cards=True,
)
start_proxy(masking_config=config)
You can also call mask_event_data() directly if you want to sanitise events from an existing session before sharing or exporting them.
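Under the hood this class of masking is regex substitution. The patterns below are a simplified sketch — agent-trace's built-in rules may be stricter (for example, Luhn validation for card numbers):

```python
import re

# Illustrative patterns only, not the library's actual rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def mask(text: str) -> str:
    """Replace sensitive matches with fixed placeholders before persisting."""
    text = EMAIL.sub("[EMAIL]", text)
    text = CARD.sub("[CARD]", text)
    return text

print(mask("contact alice@example.com, card 4111 1111 1111 1111"))
# contact [EMAIL], card [CARD]
```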
v0.13.0 — Policy Suggestion
Generate a minimal .agent-scope.json allow-list directly from what your agent actually did, instead of writing one by hand.
New command
agent-trace policy                              # analyse all sessions, print suggested policy
agent-trace policy <session-id>...              # analyse specific sessions
agent-trace policy --output .agent-scope.json   # write to file
agent-trace policy --dry-run                    # print without writing
How it works
agent-trace policy scans tool_call events across one or more sessions and collects:
- Files read — from `read`, `view`, `grep`, `glob` tool calls
- Files written — from `write`, `edit`, `create` tool calls
- Commands — from `bash` tool calls
- Network hosts — extracted from URLs in bash commands
Paths are collapsed to glob patterns when 3 or more files share a directory (e.g. src/a.py, src/b.py, src/c.py → src/**). Commands are collapsed to base-executable patterns (e.g. pytest tests/foo.py -x → pytest *).
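The path-collapsing rule can be sketched in a few lines. This is an illustrative reimplementation of the rule as described, not the actual code:

```python
from collections import defaultdict
from pathlib import PurePosixPath

def collapse_paths(paths: list, threshold: int = 3) -> list:
    """Collapse to dir/** when `threshold` or more files share a directory."""
    by_dir = defaultdict(list)
    for p in paths:
        by_dir[str(PurePosixPath(p).parent)].append(p)
    out = []
    for d, files in by_dir.items():
        if len(files) >= threshold:
            out.append(f"{d}/**")   # enough siblings: emit a glob
        else:
            out.extend(files)       # too few: keep exact paths
    return sorted(out)

print(collapse_paths(["src/a.py", "src/b.py", "src/c.py", "pyproject.toml"]))
# ['pyproject.toml', 'src/**']
```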
Example output
{
"files": {
"read": { "allow": ["src/**", "tests/**", "pyproject.toml"] },
"write": { "allow": ["src/**"] }
},
"commands": {
"allow": ["pytest *", "git *"]
},
"network": {
"deny_all": true,
"allow": ["api.anthropic.com"]
}
}
The generated policy can be used directly with agent-trace audit to flag future sessions that exceed the observed scope.