feat(config): add hooks.pre_run for pre-eval environment injection by christso · Pull Request #1150 · EntityProcess/agentv

christso · 2026-04-23T02:17:54Z

Summary

- Adds hooks.pre_run to .agentv/config.yaml and hooks.preRun to agentv.config.ts (defineConfig)
- Runs a shell command before the eval starts; parses stdout for env vars and injects them into process.env
- Existing env vars are never overwritten: process env always wins
- Stderr from the hook is forwarded to the user; non-zero exit aborts the eval
- Bonus fix: --dry-run mock response now satisfies all LLM grader schemas, so there are no more parse errors when running end-to-end harness tests with graders

Closes #1149

What was implemented

New file: `packages/core/src/evaluation/hooks.ts`

- parseEnvOutput(stdout): parses `export KEY="value"` and `KEY=value` lines from stdout
- runPreRunHook(command): spawns via sh -c, captures stdout, parses and injects env vars, forwards stderr
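The parsing rules above can be illustrated with a minimal sketch. This is not the code from hooks.ts: the function body, the key pattern, and the handling of single quotes are assumptions made for illustration, based only on the formats and test cases listed in this PR (dotenv, shell-export, quoted/unquoted, equals-in-value, comments, blanks, invalid lines).

```typescript
// Hypothetical sketch of a parseEnvOutput-style parser; the real
// implementation in hooks.ts may differ in details.
function parseEnvOutput(stdout: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of stdout.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.startsWith("#")) continue;
    // Drop an optional leading "export " (shell-export format).
    const body = line.replace(/^export\s+/, "");
    // Split on the first "=" only, so values may contain "=".
    const eq = body.indexOf("=");
    if (eq <= 0) continue; // No key or no "=": treated as an invalid line.
    const key = body.slice(0, eq);
    // Assumed key pattern: shell-style identifiers only.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) continue;
    let value = body.slice(eq + 1);
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}
```

For example, feeding it the two variables from the UAT run (`export TEST_SECRET="hello-from-hook"` and `HOOK_FIRED=1`) would yield an object with both keys and unquoted values.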

Config schema changes

- packages/core/src/evaluation/config.ts: added hooks.preRun (optional string) to AgentVConfigSchema
- packages/core/src/evaluation/loaders/config-loader.ts: added HooksConfig type, parseHooksConfig(), wired into loadConfig()
- plugins/agentv-dev/skills/agentv-eval-builder/references/config-schema.json: added hooks.pre_run for YAML IDE autocomplete

Wired into CLI

apps/cli/src/commands/eval/run-eval.ts: calls runPreRunHook after version check, before normalizeOptions — so secrets are available for all subsequent env lookups
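Because the hook runs before any later env lookup, the injection step only needs to fill in keys that are missing. The sketch below shows that "existing env wins" rule in isolation. The helper name `injectEnv` and its signature are hypothetical and not taken from the PR; it takes the environment as a parameter so it can be exercised without touching the real process.env.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: copies parsed variables into `env`, skipping any key
// that is already set. Returns the number of variables actually injected.
function injectEnv(parsed: Record<string, string>, env: Env): number {
  let injected = 0;
  for (const [key, value] of Object.entries(parsed)) {
    // Assumption: a key set to the empty string still counts as "already set".
    if (env[key] !== undefined) continue;
    env[key] = value;
    injected += 1;
  }
  return injected;
}
```

In a Node or Bun process this would be called as `injectEnv(parsed, process.env)`; the returned count corresponds to the kind of number reported in the "Pre-run hook injected 2 environment variable(s)." log line.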

Precedence

YAML config hooks.pre_run takes priority over TS config hooks.preRun, matching the existing pattern for other settings.
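The precedence rule can be sketched as a simple fallback. The `HooksConfig` shape and the `resolvePreRunHook` helper below are assumptions for illustration; the PR names a HooksConfig type but does not show its fields or how loadConfig() merges the two sources.

```typescript
// Assumed shape: both config sources are normalized to a camelCase preRun field.
interface HooksConfig {
  preRun?: string;
}

// Hypothetical resolver: the YAML value (hooks.pre_run) wins when present,
// otherwise the TS config value (hooks.preRun) is used.
function resolvePreRunHook(yamlHooks?: HooksConfig, tsHooks?: HooksConfig): string | undefined {
  return yamlHooks?.preRun ?? tsHooks?.preRun;
}
```

With both sources set, the YAML command is the one that runs; with neither set, the result is undefined and no hook runs.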

Dry-run mock response fix

- apps/cli/src/commands/eval/targets.ts + run-eval.ts: changed dry-run mock response from {"answer":"Mock dry-run response"} (invalid grader response) to {"score":1,"assertions":[],"checks":[],"overall_reasoning":"dry-run mock"}, which satisfies all three LLM grader schemas (freeform, rubric, score-range)
- packages/core/src/evaluation/graders/llm-grader.ts: exported scoreRangeEvaluationSchema
- packages/core/test/evaluation/graders/dry-run-mock-response.test.ts: regression test verifying all three schema validations
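To see why the old mock failed, it helps to check both payloads against the one requirement this PR confirms: the grader parse error reports a missing numeric `score`. The check below is a hand-written stand-in, not one of the real grader schemas, and `hasNumericScore` is a hypothetical name; the actual schemas validate more than this.

```typescript
// The two mock payloads quoted in this PR.
const oldMock = '{"answer":"Mock dry-run response"}';
const newMock = '{"score":1,"assertions":[],"checks":[],"overall_reasoning":"dry-run mock"}';

// Hypothetical stand-in for a grader schema: only the "score must be a
// number" requirement is taken from the parse error shown in this PR.
function hasNumericScore(json: string): boolean {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) return false;
  return typeof (parsed as Record<string, unknown>)["score"] === "number";
}
```

Under this check the old payload is rejected and the new one is accepted, which matches the red and green runs reported in the UAT evidence.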

Test plan

- 12 unit tests for parseEnvOutput covering dotenv, shell-export, quoted/unquoted, equals-in-value, comments, blanks, invalid lines
- 4 regression tests for dry-run mock response schema compatibility (freeform, rubric, score-range)
- All 2275 existing tests pass (bun run test)
- Build passes (bun run build)
- Lint passes (bun run lint)
- Pre-push hook (build + typecheck + lint + test + validate-examples) passes

Red/green UAT evidence

Pre-run hook

Red (before — no hook support):

Artifact directory: .agentv/results/runs/default/...
Using target: llm → llm-dry-run
0/7   🔄 code-review-javascript | llm → llm-dry-run
...

No hook output — hooks.pre_run field did not exist.

Green (with hooks.pre_run: "sh /tmp/test-agentv-hook.sh" in .agentv/config.yaml):

Running pre-run hook: sh /tmp/test-agentv-hook.sh
Pre-run hook injected 2 environment variable(s).
Artifact directory: .agentv/results/runs/default/...
Using target: llm → llm-dry-run
0/7   🔄 code-review-javascript | llm → llm-dry-run
...

Hook fires, 2 env vars injected (HOOK_FIRED=1, TEST_SECRET=hello-from-hook), eval proceeds normally.

Dry-run LLM grader fix

Red (before fix):

⚠ LLM grader "llm-grader" failed after 3 attempts (Failed to parse evaluator response after 3 attempts and 1 structure-fix attempt: [
  { "code": "invalid_type", "expected": "number", "received": "undefined", "path": ["score"], "message": "Required" }
]) — skipped
1/7   ⚠️ code-review-javascript | llm → llm-dry-run | 0% FAIL
2/7   ⚠️ feature-proposal-brainstorm | llm → llm-dry-run | 0% FAIL
...

Green (after fix):

1/7   ✅ code-review-javascript | llm → llm-dry-run | 100% PASS
2/7   ✅ feature-proposal-brainstorm | llm → llm-dry-run | 100% PASS
...
RESULT: FAIL  (6/7 scored >= 80%, mean: 93%)

No LLM grader parse errors. All tests run cleanly through the full grader pipeline.

Adds a hooks.pre_run field to both agentv.config.ts (as hooks.preRun) and .agentv/config.yaml (as hooks.pre_run) that runs a shell command before an eval starts and injects its exported env vars into process.env.

- New hooks.ts utility: parseEnvOutput + runPreRunHook
- Parses both `export KEY="value"` and `KEY=value` stdout formats
- Only injects keys not already set in process.env (existing env wins)
- Forwards stderr to process.stderr; non-zero exit aborts the eval
- Wired into runEvalCommand before normalizeOptions so secrets are available for all subsequent config and env lookups
- JSON schema updated for .agentv/config.yaml IDE autocomplete

Closes #1149

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-04-23T02:18:37Z

Deploying agentv with Cloudflare Pages

Latest commit:	`30ee637`
Status:	✅ Deploy successful!
Preview URL:	https://332cce72.agentv.pages.dev
Branch Preview URL:	https://feat-1149-pre-run-hook.agentv.pages.dev


--dry-run previously returned '{"answer":"Mock dry-run response"}' which caused LLM graders to fail with 'Required: score' parse errors after 3 attempts. The mock response now satisfies all three grader schemas (freeform, rubric, score-range) so --dry-run works end-to-end including grader plumbing without real LLM calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ypoint Hook now runs once per agentv invocation (not once per eval run), covering all commands including interactive mode. Equivalent to the project-level wrapper script pattern from the end user's perspective. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

christso and others added 6 commits April 23, 2026 12:24

docs(agents): replace in-progress label with project board status for…

1d481f1

… claim tracking Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs(agents): make code review step optional for focused changes

7e75597

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

style(cli): fix biome formatting for dry-run mock response

0e31f23

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

style: fix import order and blank line

30ee637

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

christso marked this pull request as ready for review April 23, 2026 03:16

christso merged commit 12c1dd9 into main Apr 23, 2026
4 checks passed

christso deleted the feat/1149-pre-run-hook branch April 23, 2026 03:16

christso merged 7 commits into main from feat/1149-pre-run-hook
