feat(pipeline): add --experiment flag for labeling eval run conditions by christso · Pull Request #802 · EntityProcess/agentv

christso · 2026-03-28T05:32:16Z

Summary

Add --experiment option to pipeline input and pipeline run commands
Experiment label flows through manifest.json → pipeline bench → index.jsonl entries + benchmark.json metadata
New feature example: examples/features/experiments/
Docs: experiment section in running-evals.mdx, experiments workflow in skill-improvement-workflow.mdx
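
The propagation described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Manifest` and `IndexEntry` shapes, the `evalFile`, `testId`, and `score` fields, and the helper names `buildManifest`, `labelEntries`, and `toJsonl` are all hypothetical. Only the `experiment` key and the idea that it is omitted when the flag is absent come from the description.

```typescript
// Hypothetical shapes: the real manifest.json and index.jsonl schemas
// are not shown in this PR, so every field except `experiment` is assumed.
interface Manifest {
  evalFile: string;
  experiment?: string;
}

interface IndexEntry {
  testId: string;
  score: number;
  experiment?: string;
}

// Build a manifest, adding `experiment` only when a label was supplied,
// so the key is absent (rather than null or "") when the flag is omitted.
function buildManifest(evalFile: string, experiment?: string): Manifest {
  return experiment === undefined ? { evalFile } : { evalFile, experiment };
}

// Copy the manifest's label onto each index entry; entries are left
// unlabeled when the manifest carries no experiment.
function labelEntries(manifest: Manifest, entries: IndexEntry[]): IndexEntry[] {
  return entries.map((entry) =>
    manifest.experiment === undefined
      ? { ...entry }
      : { ...entry, experiment: manifest.experiment },
  );
}

// Serialize entries as JSON Lines: one JSON object per line.
function toJsonl(entries: IndexEntry[]): string {
  return entries.map((entry) => JSON.stringify(entry)).join("\n");
}
```

Omitting the key entirely, instead of writing an empty value, keeps manifests from unlabeled runs identical to what they were before the flag existed, which matches the "omitted when not provided" test in the test plan.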

Motivation

Following the convex-evals pattern, an experiment is a run-level label that records conditions (with_skills, without_skills, web_search, etc.) while keeping eval files identical across runs. This enables dashboard filtering and structured A/B comparison.

agentv pipeline run evals/coding-ability.eval.yaml --experiment with_skills
agentv pipeline run evals/coding-ability.eval.yaml --experiment without_skills
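
Once both runs have written labeled entries, an A/B comparison could look roughly like the sketch below. It assumes each index.jsonl line is a JSON object with a numeric `score` and an optional `experiment` string; the `testId` and `score` field names and the helpers `parseJsonl` and `meanScoreByExperiment` are hypothetical and are not part of agentv.

```typescript
// Hypothetical entry shape; only `experiment` is named in this PR.
interface IndexEntry {
  testId: string;
  score: number;
  experiment?: string;
}

// Parse JSON Lines text into entries, skipping blank lines.
function parseJsonl(text: string): IndexEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as IndexEntry);
}

// Group entries by experiment label and compute the mean score per group.
// Entries without a label are collected under "(unlabeled)".
function meanScoreByExperiment(entries: IndexEntry[]): Map<string, number> {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const entry of entries) {
    const key = entry.experiment ?? "(unlabeled)";
    const total = totals.get(key) ?? { sum: 0, count: 0 };
    total.sum += entry.score;
    total.count += 1;
    totals.set(key, total);
  }
  const means = new Map<string, number>();
  totals.forEach((total, key) => means.set(key, total.sum / total.count));
  return means;
}
```

Because the eval file is the same in both runs, any difference between the `with_skills` and `without_skills` means in this sketch would reflect the run conditions rather than a change in the test cases.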

Test plan

2 new tests: experiment flag writes to manifest, omitted when not provided
All 14 pipeline tests pass
All 81 results tests pass
Eval YAML validates
Pre-commit hooks pass (build, typecheck, lint, test, validate)

🤖 Generated with Claude Code

cloudflare-workers-and-pages · 2026-03-28T05:33:05Z

Deploying agentv with Cloudflare Pages

Latest commit:	`61c2c7a`
Status:	⚡️ Build in progress...


Add --experiment option to pipeline input and pipeline run commands. The label is written to manifest.json and propagated through pipeline bench into index.jsonl entries and benchmark.json metadata.

- pipeline input: accepts --experiment, writes to manifest
- pipeline run: accepts --experiment, writes to manifest
- pipeline bench: reads manifest.experiment, includes in index entries
- New feature example: examples/features/experiments/
- Docs: add experiment section to running-evals.mdx
- Docs: add experiments workflow to skill-improvement-workflow.mdx
- Tests: 2 new tests for experiment flag presence/absence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

christso force-pushed the feat/experiment-flag branch from 2b7250f to 61c2c7a · March 28, 2026 05:51

christso merged commit 443766e into main Mar 28, 2026
1 of 2 checks passed

christso deleted the feat/experiment-flag branch March 28, 2026 05:53

christso merged 1 commit into main from feat/experiment-flag
