feat: add --runs-functional and --runs-trigger flags to report command #11

Merged
melanie531 merged 1 commit into aws-samples:main from udid-aws:feat/report-runs-flags
May 27, 2026
@udid-aws

Contributor

Problem

The report command runs functional and trigger evals internally but doesn't expose a way to control the number of runs. The standalone functional and trigger commands both have --runs, but report always uses the hardcoded defaults.

Solution

Adds two flags to the report command:

  • --runs-functional N — number of runs per functional eval case (default: 1)
  • --runs-trigger N — number of runs per trigger query (default: 3)

Defaults match the standalone commands, so without these flags behavior is unchanged.
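
As a rough illustration of the flag wiring described above, here is a minimal sketch using Python's standard argparse module. It assumes the CLI is an argparse-style program with a report subcommand; the function names build_parser and plan_runs and the positional argument name skill are hypothetical and not taken from the skill-eval source. Only the two flag names and their defaults (1 and 3) come from this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser layout; the real skill-eval CLI may be structured differently."""
    parser = argparse.ArgumentParser(prog="skill-eval")
    subparsers = parser.add_subparsers(dest="command", required=True)

    report = subparsers.add_parser("report", help="run evals and build a report")
    report.add_argument("skill", help="skill to evaluate (assumed positional argument)")
    # Defaults mirror the values stated in this PR: 1 for functional, 3 for trigger.
    report.add_argument(
        "--runs-functional", type=int, default=1, metavar="N",
        help="number of runs per functional eval case (default: 1)",
    )
    report.add_argument(
        "--runs-trigger", type=int, default=3, metavar="N",
        help="number of runs per trigger query (default: 3)",
    )
    return parser


def plan_runs(argv: list[str]) -> dict[str, int]:
    """Parse argv and return the run counts the report step would pass along."""
    args = build_parser().parse_args(argv)
    # argparse converts the dashes in flag names to underscores on the namespace.
    return {"functional": args.runs_functional, "trigger": args.runs_trigger}
```

With this layout, omitting both flags yields the same counts as before the change, which is the backward-compatibility property the PR relies on.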

Usage

skill-eval report myskill --runs-functional 3 --runs-trigger 5
skill-eval report myskill --runs-trigger 1    # quick single-run trigger

Testing

  • All 645 tests pass
  • Manually verified --runs-functional and --runs-trigger produce the expected number of runs
  • Regression tested without the flags — behavior identical to before

Fixes #10

The report command now supports controlling the number of runs per
functional eval case and trigger query, matching the --runs flag
available in the standalone functional and trigger commands.

Defaults remain unchanged: 1 for functional, 3 for trigger.

Fixes aws-samples#10
@melanie531 melanie531 merged commit 13b2277 into aws-samples:main May 27, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Add --runs-functional and --runs-trigger flags to report command