fix(core): extract system messages in prompt builder for LLM grader by christso · Pull Request #983 · EntityProcess/agentv

christso · 2026-04-08T22:23:11Z

Closes #982

Summary

Fix buildPromptInputs to extract system messages into the systemMessage field
Fix the orchestrator to pass systemPrompt directly instead of inside metadata
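
A minimal sketch of the first change, to make the intent concrete. Only the `systemMessage` field name comes from this PR; the `ChatMessage` and `PromptInputs` shapes, the `question` field, and the function name `buildPromptInputsSketch` are assumptions for illustration and may not match the real agentv code.

```typescript
// Hypothetical message shape; the real agentv types may differ.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// `systemMessage` is the field named in the PR summary.
// `question` is an assumed name for the remaining conversation text.
interface PromptInputs {
  systemMessage: string | undefined;
  question: string;
}

// Separate system messages from the rest so the grader receives them
// in a dedicated field rather than mixed into the conversation text.
function buildPromptInputsSketch(messages: ChatMessage[]): PromptInputs {
  const systemParts: string[] = [];
  const otherParts: string[] = [];
  for (const message of messages) {
    if (message.role === "system") {
      systemParts.push(message.content);
    } else {
      otherParts.push(`${message.role}: ${message.content}`);
    }
  }
  return {
    systemMessage: systemParts.length > 0 ? systemParts.join("\n\n") : undefined,
    question: otherParts.join("\n"),
  };
}
```

With this shape, a caller such as the orchestrator can forward `systemMessage` as its own argument instead of nesting it in a metadata object.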

Test plan

Unit tests pass (1901 tests across all packages)
Build succeeds
Manual eval with system prompt produces correct grader scores

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The buildPromptInputs function now correctly extracts system messages and returns them in the systemMessage field. The orchestrator passes the system prompt directly instead of burying it in metadata.

Closes #982

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-04-08T22:23:44Z

Deploying agentv with Cloudflare Pages

Latest commit:	`2f58cfa`
Status:	✅ Deploy successful!
Preview URL:	https://78c2eebe.agentv.pages.dev
Branch Preview URL:	https://fix-982-grader-system-prompt.agentv.pages.dev


…ride (#982)

When a user writes `prompt: "Check step-by-step work"` in an llm-grader assertion, the text was used as the entire evaluator template, replacing DEFAULT_EVALUATOR_TEMPLATE, which contains the {{output}}, {{input}}, and {{criteria}} variables. As a result, the grader never saw the actual candidate response and always scored 0.

Bare inline prompt strings (those without template variables such as {{output}}) are now injected into the default template's {{criteria}} slot instead. Prompts that contain recognized template variables, and prompts from scripts or files, continue to work as full template overrides.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
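
The resolution rule described in this commit can be sketched roughly as follows. The template text `DEFAULT_TEMPLATE_SKETCH` is a short placeholder, not the real DEFAULT_EVALUATOR_TEMPLATE, and the function names `hasKnownVariable` and `resolveInlinePrompt` are hypothetical. The sketch covers only inline prompts; prompts from scripts or files are out of scope here.

```typescript
// Placeholder only: the real DEFAULT_EVALUATOR_TEMPLATE in agentv is not shown in this PR page.
const DEFAULT_TEMPLATE_SKETCH =
  "Criteria: {{criteria}}\nInput: {{input}}\nOutput: {{output}}";

// True when the prompt references one of the variables named in the commit message.
function hasKnownVariable(prompt: string): boolean {
  return /\{\{\s*(output|input|criteria)\s*\}\}/.test(prompt);
}

// Bare strings become the criteria; prompts with variables replace the whole template.
function resolveInlinePrompt(inlinePrompt: string): string {
  if (hasKnownVariable(inlinePrompt)) {
    return inlinePrompt; // full template override
  }
  // A replacer function avoids special handling of "$" sequences in the user's text.
  return DEFAULT_TEMPLATE_SKETCH.replace("{{criteria}}", () => inlinePrompt);
}
```

Under these assumptions, `resolveInlinePrompt("Check step-by-step work")` keeps the `{{input}}` and `{{output}}` slots, so the grader still sees the candidate response, while `resolveInlinePrompt("Grade {{output}}")` is returned unchanged.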

Replace while-loop assignment with matchAll iterator (biome noAssignInExpressions) and collapse short import to single line (biome formatter). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
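
For readers unfamiliar with the lint rule, here is a hedged illustration of the kind of rewrite this commit describes. The pattern `VARIABLE_PATTERN` and the function `extractVariableNames` are made up for the example; the actual code in prompt-resolution may look different.

```typescript
// Hypothetical pattern for {{name}} placeholders; must carry the "g" flag for matchAll.
const VARIABLE_PATTERN = /\{\{\s*(\w+)\s*\}\}/g;

// Before (flagged by biome's noAssignInExpressions because of the assignment
// inside the loop condition):
//
//   let match: RegExpExecArray | null;
//   while ((match = VARIABLE_PATTERN.exec(text)) !== null) {
//     names.push(match[1]);
//   }

// After: iterate over matchAll, which yields each match without an
// assignment expression in the loop condition.
function extractVariableNames(text: string): string[] {
  const names: string[] = [];
  for (const match of text.matchAll(VARIABLE_PATTERN)) {
    const name = match[1];
    if (name !== undefined) {
      names.push(name);
    }
  }
  return names;
}
```

`String.prototype.matchAll` requires a global regular expression and an ES2020 (or later) target or lib setting in TypeScript.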

Rerun of with-superpowers vs without-superpowers logic-puzzle evals after merging fix(core): extract system messages in prompt builder for LLM grader (EntityProcess/agentv#983). Both experiments now score 100%/0.990 on gemini and azure. Previous 0% for with-superpowers/azure was entirely a grader artifact from bug #982. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

christso mentioned this pull request Apr 8, 2026

bug: LLM grader reports 'no response provided' when system prompt is present in input #982

Closed

christso and others added 2 commits April 8, 2026 22:49

style: fix lint issues in prompt-resolution and builtin-evaluators

2f58cfa


christso mentioned this pull request Apr 8, 2026

bench: re-run with-superpowers vs without-superpowers after grader fix #989

Closed

3 tasks

christso marked this pull request as ready for review April 8, 2026 23:01

christso merged commit 8f4a29b into main Apr 8, 2026
4 checks passed

christso deleted the fix/982-grader-system-prompt branch April 8, 2026 23:01

christso mentioned this pull request Apr 9, 2026

feat: wire base_commit, multi-project dashboard default, assertion includes, SWE-bench importer fixes #995

Merged

5 tasks


christso merged 3 commits into main from fix/982-grader-system-prompt
