Python: Improve the Python sample validation workflow to increase pass rate#6747
Python: Improve the Python sample validation workflow to increase pass rate#6747TaoChenOSU wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Automated Code Review
Reviewers: 5 | Confidence: 82%
✓ Correctness
The PR adds a SkillsProvider to the sample validation agent, allowing it to use file-based skills (specifically foundry-config-setup) to resolve common setup issues. The implementation is correct: SkillsProvider.from_paths accepts str|Path|Sequence, the SkillsProvider extends ContextProvider matching the expected type for context_providers, and the SKILLS_DIR path is correctly computed relative to the script. The SKILL.md content is clear and well-structured. No correctness issues found.
✓ Security Reliability
This PR adds a skills-based mechanism to the sample validation workflow, allowing the validation agent to fix known placeholder configurations in samples before re-running them. The changes are well-scoped: a static SKILL.md provides bounded instructions for replacing hardcoded Foundry placeholders with environment variable reads. The
SkillsProvider.from_pathsAPI is used correctly (parameter type matchesstr | Path | Sequence[str | Path]), and theSKILLS_DIRpath is safely constructed relative to the script file. No security or reliability concerns identified—the code modification scope is bounded by the skill content, which is committed to the repo alongside the script.
✓ Test Coverage
This PR adds a SkillsProvider with a file-based skill to the sample validation workflow and updates agent instructions. The changes are configuration/instruction-level modifications to an operational CI script that has no existing test suite. The underlying SkillsProvider API is well-tested in the core package. While adding tests for the workflow script would be beneficial, the lack of tests is a pre-existing condition and the new changes (a constant definition, a provider wiring, and prompt text updates) don't introduce complex testable logic beyond what's already covered by the SkillsProvider unit tests.
✓ Failure Modes
This PR adds a SkillsProvider to the sample validation workflow agents, along with a SKILL.md file for resolving hardcoded Foundry config placeholders. The integration uses the correct API (from_paths accepts str), the skills directory is co-located and committed in the same PR, and there are no silent failure paths, swallowed exceptions, or operational failure modes introduced by these changes.
✗ Design Approach
The new skill-based validation path changes the meaning of a passing result in a way that conflicts with the repo’s documented sample conventions: it tells the validator to rewrite basic
01-get-startedsamples to use environment variables and then report them assuccess, even though those samples are explicitly supposed to keep self-documenting placeholder values instead ofos.environ. I also found a narrower configuration issue in the same skill: it makesFOUNDRY_MODELmandatory, while the repo’s own Foundry guidance and existing samples treat that variable as optional with a default.
Flagged Issues
- The validator now reports
01-get-startedsamples as passing only after rewriting them into a form the repo explicitly forbids for basic samples (python/scripts/sample_validation/skills/foundry-config-setup/SKILL.md:50-52vspython/samples/SAMPLE_GUIDELINES.md:80-84).
Automated review by TaoChenOSU's agents
|
|
||
| These samples are intentionally written with hardcoded placeholders, so this | ||
| is expected setup—not a defect in the sample. After applying the change, | ||
| re-run the sample and report the result as a `success` if it now runs. Do not |
There was a problem hiding this comment.
This changes the validator from “validate the checked-in basic sample” to “rewrite the basic sample into a different style, then count it as success.” The repo’s own sample contract says otherwise: Basic samples should NOT use os.environ, load_dotenv(), or .env files. The placeholder values make the code self-documenting. (python/samples/SAMPLE_GUIDELINES.md:80-84). The targeted 01-get-started files really do follow that placeholder pattern today (python/samples/01-get-started/01_hello_agent.py:21-25, python/samples/01-get-started/02_add_tools.py:36-40), so this would silently mark guideline-violating rewrites as passing validation.
|
Flagged issue The validator now reports Source: automated DevFlow PR review |
There was a problem hiding this comment.
Pull request overview
This PR aims to improve the Python sample-validation workflow pass rate by enabling the validation agent to apply a curated, file-based “skill” when it encounters known setup/placeholder issues in samples (e.g., hardcoded Foundry endpoint/model), then re-run the sample.
Changes:
- Adds a file-based skill (
foundry-config-setup) that instructs how to fix placeholder/hardcoded Foundry configuration in samples by reading from environment variables. - Wires a
SkillsProviderinto the sample validation agents so they can load and follow file-based skills during validation. - Updates agent instructions to allow modifying sample code only when a relevant skill explicitly directs it.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/scripts/sample_validation/skills/foundry-config-setup/SKILL.md | Introduces a file-based skill for resolving Foundry placeholder configuration in samples. |
| python/scripts/sample_validation/create_dynamic_workflow_executor.py | Loads file-based skills into validation agents and relaxes “no edits” guidance when a skill applies. |
| These samples are intentionally written with hardcoded placeholders, so this | ||
| is expected setup—not a defect in the sample. After applying the change, | ||
| re-run the sample and report the result as a `success` if it now runs. Do not | ||
| include a suggested `fix`. |
Motivation & Context
Description & Review Guide
Related Issue
Fixes #
Contribution Checklist
breaking changelabel (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.