feat(config): add hooks.pre_run for pre-eval environment injection #1150
Merged
Adds a hooks.pre_run field to both agentv.config.ts (as hooks.preRun) and .agentv/config.yaml (as hooks.pre_run) that runs a shell command before an eval starts and injects its exported env vars into process.env.

- New hooks.ts utility: parseEnvOutput + runPreRunHook
- Parses both `export KEY="value"` and `KEY=value` stdout formats
- Only injects keys not already set in process.env (existing env wins)
- Forwards stderr to process.stderr; non-zero exit aborts the eval
- Wired into runEvalCommand before normalizeOptions so secrets are available for all subsequent config and env lookups
- JSON schema updated for .agentv/config.yaml IDE autocomplete

Closes #1149

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
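The behaviour listed in this commit message can be sketched roughly as follows. This is an illustrative re-implementation, not the code in hooks.ts: the names `parseEnvOutput` and `runPreRunHook` come from the commit message, but their signatures, the `injectEnv` helper, the regular expression, the quote handling, and the synchronous `spawnSync` call are all assumptions made for the sketch.

```typescript
import { spawnSync } from "node:child_process";

// Assumed parsing rules: accept `export KEY="value"` and `KEY=value`,
// skip blanks, comments, and lines that do not look like assignments.
function parseEnvOutput(stdout: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of stdout.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    // Optional "export " prefix; only the first "=" separates key and value.
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$/.exec(line);
    if (!match) continue;
    const key = match[1] ?? "";
    let value = match[2] ?? "";
    // Strip one matching pair of surrounding quotes, if present.
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    env[key] = value;
  }
  return env;
}

// Hypothetical helper: inject only keys that are not already set, so the
// existing environment wins. Returns the keys that were actually injected.
function injectEnv(
  parsed: Record<string, string>,
  target: Record<string, string | undefined> = process.env,
): string[] {
  const injected: string[] = [];
  for (const [key, value] of Object.entries(parsed)) {
    if (target[key] !== undefined) continue;
    target[key] = value;
    injected.push(key);
  }
  return injected;
}

// Run the hook through `sh -c`, capture stdout, let stderr pass through to
// the parent process, and abort on a non-zero exit status.
function runPreRunHook(command: string): string[] {
  const result = spawnSync("sh", ["-c", command], {
    encoding: "utf8",
    stdio: ["ignore", "pipe", "inherit"],
  });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`pre_run hook exited with status ${result.status}`);
  }
  return injectEnv(parseEnvOutput(result.stdout));
}
```

Under these assumptions, a hook script that prints `export TEST_SECRET="hello-from-hook"` would set `process.env.TEST_SECRET` only if the variable was not already defined when the hook ran.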
Deploying agentv with
| Latest commit: | 30ee637 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://332cce72.agentv.pages.dev |
| Branch Preview URL: | https://feat-1149-pre-run-hook.agentv.pages.dev |
… claim tracking Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
--dry-run previously returned '{"answer":"Mock dry-run response"}' which
caused LLM graders to fail with 'Required: score' parse errors after 3
attempts. The mock response now satisfies all three grader schemas
(freeform, rubric, score-range) so --dry-run works end-to-end including
grader plumbing without real LLM calls.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ypoint Hook now runs once per agentv invocation (not once per eval run), covering all commands including interactive mode. Equivalent to the project-level wrapper script pattern from the end user's perspective. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
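- Adds `hooks.pre_run` to `.agentv/config.yaml` and `hooks.preRun` to `agentv.config.ts` (`defineConfig`)
- The hook runs a shell command before an eval starts and injects its exported env vars into `process.env`
- `--dry-run` mock response now satisfies all LLM grader schemas, so there are no more parse errors when running end-to-end harness tests with graders

Closes #1149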
What was implemented
New file: `packages/core/src/evaluation/hooks.ts`

- `parseEnvOutput(stdout)`: parses `export KEY="value"` and `KEY=value` lines from stdout
- `runPreRunHook(command)`: spawns via `sh -c`, captures stdout, parses and injects env vars, forwards stderr

Config schema changes

- `packages/core/src/evaluation/config.ts`: added `hooks.preRun` (optional string) to `AgentVConfigSchema`
- `packages/core/src/evaluation/loaders/config-loader.ts`: added `HooksConfig` type and `parseHooksConfig()`, wired into `loadConfig()`
- `plugins/agentv-dev/skills/agentv-eval-builder/references/config-schema.json`: added `hooks.pre_run` for YAML IDE autocomplete

Wired into CLI

- `apps/cli/src/commands/eval/run-eval.ts`: calls `runPreRunHook` after the version check and before `normalizeOptions`, so secrets are available for all subsequent env lookups

Precedence

YAML config `hooks.pre_run` takes priority over TS config `hooks.preRun`, matching the existing pattern for other settings.

Dry-run mock response fix

- `apps/cli/src/commands/eval/targets.ts` + `run-eval.ts`: changed the dry-run mock response from `{"answer":"Mock dry-run response"}` (invalid grader response) to `{"score":1,"assertions":[],"checks":[],"overall_reasoning":"dry-run mock"}`, which satisfies all three LLM grader schemas (freeform, rubric, score-range)
- `packages/core/src/evaluation/graders/llm-grader.ts`: exported `scoreRangeEvaluationSchema`
- `packages/core/test/evaluation/graders/dry-run-mock-response.test.ts`: regression test verifying all three schema validations

Test plan

- `parseEnvOutput` tests covering dotenv, shell-export, quoted/unquoted, equals-in-value, comments, blanks, invalid lines
- Tests (`bun run test`)
- Build (`bun run build`)
- Lint (`bun run lint`)

Red/green UAT evidence
Pre-run hook
Red (before, no hook support):

No hook output: the `hooks.pre_run` field did not exist.

Green (with `hooks.pre_run: "sh /tmp/test-agentv-hook.sh"` in `.agentv/config.yaml`):

Hook fires, 2 env vars injected (`HOOK_FIRED=1`, `TEST_SECRET=hello-from-hook`), eval proceeds normally.

Dry-run LLM grader fix
Red (before fix):
Green (after fix):
No LLM grader parse errors. All tests run cleanly through the full grader pipeline.
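To illustrate why the old mock failed and the new one passes, here is a minimal sketch that checks both payloads quoted in the description against a simplified shape. The `validateGraderResponse` function and its rules are hypothetical stand-ins, not the project's grader schemas: the only requirement taken from the PR is that `score` must be present, and the remaining checks are assumptions about field types.

```typescript
// The two payloads quoted in the PR description.
const OLD_MOCK = '{"answer":"Mock dry-run response"}';
const NEW_MOCK =
  '{"score":1,"assertions":[],"checks":[],"overall_reasoning":"dry-run mock"}';

// Hypothetical validator: returns a list of problems, empty when the
// payload is acceptable. Only `score` is treated as required here.
function validateGraderResponse(raw: string): string[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return ["Invalid JSON"];
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return ["Expected a JSON object"];
  }
  const obj = data as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof obj.score !== "number") errors.push("Required: score");
  // Assumed optional fields: type-checked only when present.
  for (const key of ["assertions", "checks"]) {
    if (obj[key] !== undefined && !Array.isArray(obj[key])) {
      errors.push(`Expected array: ${key}`);
    }
  }
  if (
    obj.overall_reasoning !== undefined &&
    typeof obj.overall_reasoning !== "string"
  ) {
    errors.push("Expected string: overall_reasoning");
  }
  return errors;
}

console.log(validateGraderResponse(OLD_MOCK)); // the old payload lacks `score`
console.log(validateGraderResponse(NEW_MOCK)); // no problems reported
```

The regression test named in the description presumably runs the real exported schemas against the mock instead of a hand-written check like this one.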