fix(core): extract system messages in prompt builder for LLM grader #983
Merged
The buildPromptInputs function now correctly extracts system messages and returns them in the systemMessage field. The orchestrator passes the system prompt directly instead of burying it in metadata.

Closes #982

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
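The extraction described above could look roughly like the following sketch. The `ChatMessage` and `PromptInputs` shapes and the `extractSystemMessage` helper are hypothetical names for illustration; the actual `buildPromptInputs` signature and types in agentv are not shown in this PR and may differ.

```typescript
// Hypothetical message shape; the real agentv types may differ.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical result shape: only the systemMessage field name comes from the PR text.
interface PromptInputs {
  systemMessage?: string;
  messages: ChatMessage[];
}

// Sketch: pull system messages out of the conversation so they are returned
// in a dedicated systemMessage field rather than hidden in metadata.
function extractSystemMessage(messages: ChatMessage[]): PromptInputs {
  const systemParts = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content);
  const rest = messages.filter((m) => m.role !== "system");
  return {
    // Joining multiple system messages with a blank line is an assumption.
    systemMessage: systemParts.length > 0 ? systemParts.join("\n\n") : undefined,
    messages: rest,
  };
}
```

A caller such as the orchestrator could then forward `systemMessage` to the grader as its system prompt and send only the remaining messages as the conversation.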
Deploying agentv
| Latest commit: | 2f58cfa |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://78c2eebe.agentv.pages.dev |
| Branch Preview URL: | https://fix-982-grader-system-prompt.agentv.pages.dev |
…ride (#982)

When a user writes `prompt: "Check step-by-step work"` in an llm-grader assertion, the text was being used as the entire evaluator template, replacing the DEFAULT_EVALUATOR_TEMPLATE which contains {{output}}, {{input}}, {{criteria}} variables. This meant the grader never saw the actual candidate response, always scoring 0.

Now bare inline prompt strings (without template variables like {{output}}) are injected into the default template's {{criteria}} slot instead. Prompts that contain recognized template variables, and prompts from scripts/files, continue to work as full template overrides.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
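As a rough illustration of the rule in this commit message, the sketch below routes a bare inline prompt into the `{{criteria}}` slot and leaves prompts with template variables untouched. `DEFAULT_TEMPLATE_SKETCH` is a made-up stand-in for DEFAULT_EVALUATOR_TEMPLATE (whose real text is not shown here), the set of recognized variables is assumed to be just the three named above, and `resolveTemplate` is a hypothetical helper name.

```typescript
// Assumed stand-in for DEFAULT_EVALUATOR_TEMPLATE; the real template text differs.
const DEFAULT_TEMPLATE_SKETCH =
  "Criteria: {{criteria}}\nInput: {{input}}\nCandidate output: {{output}}";

// Assumption: only these three variables count as "recognized".
const RECOGNIZED_VARIABLE = /\{\{\s*(output|input|criteria)\s*\}\}/;

function containsTemplateVariables(prompt: string): boolean {
  return RECOGNIZED_VARIABLE.test(prompt);
}

// Bare strings become the criteria; prompts with variables override the template.
function resolveTemplate(inlinePrompt: string): string {
  if (containsTemplateVariables(inlinePrompt)) {
    return inlinePrompt;
  }
  // A replacer function avoids special handling of "$" sequences in the prompt.
  return DEFAULT_TEMPLATE_SKETCH.replace("{{criteria}}", () => inlinePrompt);
}
```

With this shape, `resolveTemplate("Check step-by-step work")` still contains `{{output}}`, so the grader would see the candidate response, which is the behavior the fix restores.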
Replace while-loop assignment with matchAll iterator (biome noAssignInExpressions) and collapse short import to single line (biome formatter).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
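The lint-driven change mentioned here follows a common pattern, sketched below with a hypothetical `findTemplateVariables` helper; the actual function and regular expression touched by the commit are not shown in this PR.

```typescript
// Collect the variable names referenced as {{name}} in a template string.
function findTemplateVariables(template: string): string[] {
  const names: string[] = [];
  // Before (an assignment inside the loop condition, which
  // biome's noAssignInExpressions rule flags):
  //   let m;
  //   while ((m = pattern.exec(template)) !== null) { names.push(m[1]); }
  // After: iterate the matches directly. matchAll requires a global regex.
  for (const match of template.matchAll(/\{\{\s*(\w+)\s*\}\}/g)) {
    names.push(match[1] ?? "");
  }
  return names;
}
```

Both forms visit the same matches in order; the `matchAll` version simply removes the mutable loop variable and the assignment-in-condition idiom.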
christso added a commit to EntityProcess/agentv-bench-skills that referenced this pull request Apr 9, 2026
Rerun of with-superpowers vs without-superpowers logic-puzzle evals after merging fix(core): extract system messages in prompt builder for LLM grader (EntityProcess/agentv#983). Both experiments now score 100%/0.990 on gemini and azure. Previous 0% for with-superpowers/azure was entirely a grader artifact from bug #982.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Closes #982
Summary
- `buildPromptInputs` now extracts system messages into the `systemMessage` field
- The orchestrator passes `systemPrompt` directly instead of in metadata

Test plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>