feat(studio): comparison analytics charts for skills/workflow benchmarking #1104
Merged
Rename Compare tab to Analytics. Add recharts for visualization. Implement `?baseline=<target>` query param on `/api/compare` endpoint to compute delta and normalized gain (g) per cell.

Add collapsible analytics section below the aggregated matrix with:
- Normalized gain bar chart (horizontal, color-coded by effect)
- Tag × target pass rate heatmap
- Negative delta regression table
- Score distribution histogram
- Trend-over-time line chart

Closes #1102

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
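For readers unfamiliar with the per-cell fields, here is a minimal sketch of how `delta` and `normalized_gain` could be derived against a baseline target. It assumes pass rates in [0, 1] and the common Hake-style definition g = (score - baseline) / (1 - baseline); the PR text does not state the exact formula, and the `Cell` shape, `passRate` field, and function names are hypothetical.

```typescript
// Hypothetical cell shape. Only `delta` and `normalized_gain` are named in the PR;
// `target` and `passRate` are assumptions for illustration.
interface Cell {
  target: string;
  passRate: number; // assumed to lie in [0, 1]
}

interface CellWithGain extends Cell {
  delta?: number;
  normalized_gain?: number | null;
}

// Assumed definition: fraction of the remaining headroom that was gained.
// Returns null when the baseline is already at the ceiling (g is undefined there).
function normalizedGain(baseline: number, score: number): number | null {
  const headroom = 1 - baseline;
  if (headroom <= 0) return null;
  return (score - baseline) / headroom;
}

// Builds new cell objects with the extra fields up front rather than
// mutating existing cells. The baseline cell itself gets no extra fields.
function withBaseline(cells: Cell[], baselineTarget: string): CellWithGain[] {
  const base = cells.find((c) => c.target === baselineTarget);
  if (!base) throw new Error(`unknown baseline target: ${baselineTarget}`);
  return cells.map((c) =>
    c.target === baselineTarget
      ? { ...c }
      : {
          ...c,
          delta: c.passRate - base.passRate,
          normalized_gain: normalizedGain(base.passRate, c.passRate),
        },
  );
}
```

Under these assumptions, a baseline at 0.5 and a candidate at 0.75 give `delta = 0.25` and `g = 0.5`: the candidate closed half of the gap to a perfect score.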
Deploying agentv with
| Latest commit: | 95985e5 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://fd4a5cfd.agentv.pages.dev |
| Branch Preview URL: | https://feat-1102-analytics-charts.agentv.pages.dev |
- Fix serve.ts: build cells with delta/normalized_gain fields upfront instead of mutating via type bypass
- Fix query key collision: use distinct keys for compare vs baseline queries
- Add `enabled: !!baseline` guard to prevent unnecessary API calls
- Remove dead CostVsImprovement component and unused recharts imports
- Fix misleading GainRow.testId → experiment naming
- Rename "Compare runs" heading to "Analyze runs"
- Fix biome formatting issues

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
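The query key and `enabled` fixes above can be sketched as plain option builders. This assumes a TanStack Query style `useQuery({ queryKey, enabled, ... })` call site, which the commit message suggests but does not confirm; the helper names and key layout are hypothetical.

```typescript
// Hypothetical option builders. The returned objects are meant to be spread into a
// useQuery-style call; the key layout here is an assumption, not the PR's actual keys.
function compareQueryOptions(benchmarkId: string) {
  return {
    queryKey: ["compare", benchmarkId] as const,
    enabled: true,
  };
}

function baselineQueryOptions(benchmarkId: string, baseline?: string) {
  return {
    // A distinct first segment keeps the baseline result in its own cache entry,
    // and including the baseline value separates results for different baselines.
    queryKey: ["compare-baseline", benchmarkId, baseline ?? null] as const,
    // Skip the request entirely until a baseline has been selected.
    enabled: !!baseline,
  };
}
```

With this layout, selecting a different baseline produces a new key instead of overwriting the plain comparison result, and no baseline request is issued while the selector is empty.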
- Rename CompareTab.tsx → AnalyticsTab.tsx with updated exports
- Update imports in index.tsx and $benchmarkId.tsx route files
- Update studio.mdx docs: rename Compare section to Analytics
- Add analytics charts documentation with baseline selector, normalized gain chart, tag heatmap, negative delta table, score distribution, and trend-over-time chart descriptions
- Add three new screenshots: aggregated matrix, charts with baseline selector, and score trend over time

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- `recharts` as the sole charting library (no shadcn/charts, radix, or other UI libs)
- `?baseline=<target>` query param on `/api/compare` endpoint, adding `delta` and `normalized_gain` fields to non-baseline cells
- … `g` per experiment × target, color-coded green/red/gray
- … `/api/runs`

Test plan
- `?baseline=<target>` returns 400 for non-existent target
- … `?baseline` is omitted

Closes #1102
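The 400 case from the test plan could be exercised against a small validation helper like the one below. This is only a sketch: the function name, result shape, and error message are hypothetical, and the actual handler in serve.ts may be structured differently.

```typescript
// Hypothetical result type: either an accepted baseline (or none), or a 400 error.
type BaselineCheck =
  | { status: 200; baseline: string | null }
  | { status: 400; error: string };

// `baseline` is the raw query parameter value, or null when the parameter is omitted.
function resolveBaseline(baseline: string | null, knownTargets: string[]): BaselineCheck {
  // Omitted parameter: respond as usual, without delta/normalized_gain fields.
  if (baseline === null) return { status: 200, baseline: null };
  // Unknown target: reject instead of silently returning cells with no baseline.
  if (!knownTargets.includes(baseline)) {
    return { status: 400, error: `unknown baseline target: ${baseline}` };
  }
  return { status: 200, baseline };
}
```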
🤖 Generated with Claude Code