Commit 3427da1
feat: Add pointwise evaluation mode with pytest integration
- Add mode='pointwise' parameter to @evaluation_test decorator
- Enable elegant row-by-row evaluation where core logic is separated from test configuration
- Add comprehensive word_count example using pointwise mode with haikus dependency
- Update README.md with clean architecture documentation and Mermaid diagram
- Show parameterized evaluation components in visual diagram
- Include both pointwise and batch mode examples
- Add dataset adapter helper for word_count evaluation
- Deprecate old @reward_function pattern in favor of pytest-based approach
This gives users a simpler API: they define only the core evaluation logic,
and everything else (models, datasets, thresholds, rollout strategies) is parameterized
in the decorator, with full pytest integration for testing and CI/CD.
1 parent 2a399f4, commit 3427da1
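
The split described in the commit message (core per-row logic in the function, configuration in the decorator) can be illustrated with a minimal, self-contained sketch. This is not the eval_protocol implementation: `evaluation_test_sketch`, its `dataset` and `threshold` parameters, the `HAIKUS` rows, and the four-word scoring rule are all hypothetical stand-ins chosen for illustration. Only the `mode='pointwise'` idea comes from the commit itself.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def evaluation_test_sketch(
    *, dataset: List[Row], threshold: float, mode: str = "batch"
) -> Callable:
    """Hypothetical stand-in for the decorator; NOT the real eval_protocol API."""

    def decorator(fn: Callable) -> Callable:
        def run() -> float:
            if mode == "pointwise":
                # Call the core logic once per row and collect the scores.
                scores = [fn(row) for row in dataset]
            elif mode == "batch":
                # Hand the whole dataset to the function in a single call.
                scores = fn(dataset)
            else:
                raise ValueError(f"unknown mode: {mode!r}")
            mean = sum(scores) / len(scores)
            # Fail like a test would when the aggregate misses the threshold.
            assert mean >= threshold, f"mean {mean:.2f} below threshold {threshold}"
            return mean

        return run

    return decorator


# Hypothetical rows standing in for a dataset of haiku lines.
HAIKUS: List[Row] = [
    {"text": "An old silent pond"},
    {"text": "A frog jumps into the pond"},
]


@evaluation_test_sketch(dataset=HAIKUS, threshold=0.5, mode="pointwise")
def word_count_eval(row: Row) -> float:
    # Core logic only: score one row (1.0 if the line has at least four words).
    return 1.0 if len(str(row["text"]).split()) >= 4 else 0.0


mean_score = word_count_eval()
```

In this sketch the decorated function knows nothing about the dataset or the pass threshold, so the same scoring logic could be reused under a different configuration by changing only the decorator arguments.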
6 files changed
Lines changed: 449 additions & 549 deletions
File tree
- eval_protocol/pytest
- tests/pytest
- helper