feat(studio): comparison analytics charts for skills/workflow benchmarking #1104

Merged
christso merged 3 commits into main from feat/1102-analytics-charts
Apr 15, 2026

christso (Collaborator) commented Apr 15, 2026

Summary

  • Renames "Compare" tab to "Analytics" throughout Studio (tab labels, route IDs, component refs)
  • Adds recharts as the sole charting library (no shadcn/charts, radix, or other UI libs)
  • Implements a ?baseline=<target> query param on the /api/compare endpoint; when it is set, the response adds delta and normalized_gain fields to every non-baseline cell
  • Adds collapsible Analytics section below the existing aggregated matrix with:
    • Normalized gain bar chart — horizontal bars showing g per experiment × target, color-coded green/red/gray
    • Tag × target heatmap — pass rate grid with emerald/yellow/red color coding
    • Negative delta table — filtered view of regressions vs. baseline
    • Score distribution histogram — variance visualization across test cases
    • Trend-over-time line chart — mean score per target over time from /api/runs
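
The description does not spell out how delta and normalized_gain are computed. A minimal sketch, assuming scores lie in [0, 1] and that g follows the common Hake-style definition g = (score - baseline) / (1 - baseline); the `Cell` shape, `meanScore` field, and function names are hypothetical and not taken from serve.ts:

```typescript
// Hypothetical cell shape; the real /api/compare response may differ.
interface Cell {
  target: string;
  meanScore: number; // assumed to lie in [0, 1]
}

interface CellWithBaseline extends Cell {
  delta?: number;
  normalized_gain?: number | null;
}

// Assumed definition: g = (score - baseline) / (1 - baseline).
// Returns null when the baseline is already at the ceiling, where g is undefined.
function normalizedGain(score: number, baseline: number): number | null {
  const headroom = 1 - baseline;
  if (headroom <= 0) return null;
  return (score - baseline) / headroom;
}

// Adds delta and normalized_gain to every cell except the baseline cell itself.
function annotate(cells: Cell[], baselineTarget: string): CellWithBaseline[] {
  const base = cells.find((c) => c.target === baselineTarget);
  if (!base) throw new Error(`unknown baseline target: ${baselineTarget}`);
  return cells.map((c) =>
    c.target === baselineTarget
      ? { ...c }
      : {
          ...c,
          delta: c.meanScore - base.meanScore,
          normalized_gain: normalizedGain(c.meanScore, base.meanScore),
        },
  );
}
```

Under this assumption, a target scoring 0.75 against a baseline of 0.5 has a delta of 0.25 and g of 0.5: it closed half of the remaining headroom. A negative g marks a regression, which is what the negative delta table would list.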

Test plan

  • Verify "Analytics" tab label shows in both root and project views
  • Verify existing matrix/table functionality is unchanged when analytics section is collapsed
  • Verify baseline selector dropdown populates with available targets
  • Verify normalized gain bar chart renders when baseline is selected with multiple targets
  • Verify ?baseline=<target> returns 400 for non-existent target
  • Verify no regression in existing API responses when ?baseline is omitted
  • All 2170+ tests pass, build/typecheck/lint clean
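
The two baseline checks in the test plan (400 for a non-existent target, unchanged response when the param is omitted) can be pictured as a small pure function. This is only a sketch: `resolveBaseline`, its result type, and the error text are hypothetical, and the actual handler may be structured differently.

```typescript
// Hypothetical result type: either proceed (with or without a baseline) or reject.
type BaselineResult =
  | { status: 200; baseline: string | null }
  | { status: 400; error: string };

// param is the raw ?baseline value, or null when the query param is omitted.
function resolveBaseline(param: string | null, targets: string[]): BaselineResult {
  // Omitted param: behave exactly as before, with no baseline fields added.
  if (param === null) return { status: 200, baseline: null };
  // Unknown target: reject instead of silently ignoring the param.
  if (!targets.includes(param)) {
    return { status: 400, error: `Unknown baseline target: ${param}` };
  }
  return { status: 200, baseline: param };
}
```

Keeping the omitted-param case as an early return makes it easy to verify that existing callers of /api/compare are unaffected.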

Closes #1102

🤖 Generated with Claude Code

Rename Compare tab to Analytics. Add recharts for visualization.
Implement ?baseline=<target> query param on /api/compare endpoint
to compute delta and normalized gain (g) per cell. Add collapsible
analytics section below the aggregated matrix with:
- Normalized gain bar chart (horizontal, color-coded by effect)
- Tag × target pass rate heatmap
- Negative delta regression table
- Score distribution histogram
- Trend-over-time line chart

Closes #1102

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages Bot commented Apr 15, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 95985e5
Status: ✅  Deploy successful!
Preview URL: https://fd4a5cfd.agentv.pages.dev
Branch Preview URL: https://feat-1102-analytics-charts.agentv.pages.dev


- Fix serve.ts: build cells with delta/normalized_gain fields upfront
  instead of mutating via type bypass
- Fix query key collision: use distinct keys for compare vs baseline queries
- Add `enabled: !!baseline` guard to prevent unnecessary API calls
- Remove dead CostVsImprovement component and unused recharts imports
- Fix misleading GainRow.testId → experiment naming
- Rename "Compare runs" heading to "Analyze runs"
- Fix biome formatting issues

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
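
The query key fix and the `enabled: !!baseline` guard from this commit can be sketched as plain option builders. This assumes Studio fetches data with a TanStack Query style hook that accepts `queryKey` and `enabled`; the builder names and key strings below are hypothetical, and the objects are plain data rather than calls into the library.

```typescript
// Plain objects shaped like query options; no library import is needed for the sketch.
interface QueryOptions {
  queryKey: readonly unknown[];
  enabled: boolean;
}

// Key for the plain compare request (no baseline).
function compareQueryOptions(): QueryOptions {
  return { queryKey: ["compare"], enabled: true };
}

// Distinct key for the baseline request so the two results never share a cache entry.
// The query stays disabled until a baseline has been selected.
function baselineQueryOptions(baseline: string | undefined): QueryOptions {
  return {
    queryKey: ["compare-baseline", baseline ?? null],
    enabled: !!baseline,
  };
}
```

Including the baseline value in the key also means that switching the selector from one target to another triggers a fresh request instead of reusing the previous target's result.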
christso marked this pull request as ready for review April 15, 2026 01:12
- Rename CompareTab.tsx → AnalyticsTab.tsx with updated exports
- Update imports in index.tsx and $benchmarkId.tsx route files
- Update studio.mdx docs: rename Compare section to Analytics
- Add analytics charts documentation with baseline selector, normalized
  gain chart, tag heatmap, negative delta table, score distribution,
  and trend-over-time chart descriptions
- Add three new screenshots: aggregated matrix, charts with baseline
  selector, and score trend over time

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
christso merged commit 5c5bb87 into main Apr 15, 2026
4 checks passed
christso deleted the feat/1102-analytics-charts branch April 15, 2026 02:08