@@ -23,8 +23,8 @@ capabilities.
 Here's a simple test function that checks if a model's response contains **bold** text formatting:
 
 ```python test_bold_format.py
-from eval_protocol.models import EvaluateResult, EvaluationRow
-from eval_protocol.pytest import default_single_turn_rollout_processor, evaluation_test
+from eval_protocol.models import EvaluateResult, EvaluationRow, Message
+from eval_protocol.pytest import SingleTurnRolloutProcessor, evaluation_test
 
 @evaluation_test(
     input_messages=[
@@ -33,8 +33,8 @@ from eval_protocol.pytest import default_single_turn_rollout_processor, evaluati
             Message(role="user", content="Explain why **evaluations** matter for building AI agents. Make it dramatic!"),
         ],
     ],
-    model=["accounts/fireworks/models/llama-v3p1-8b-instruct"],
-    rollout_processor=default_single_turn_rollout_processor,
+    completion_params=[{"model": "accounts/fireworks/models/llama-v3p1-8b-instruct"}],
+    rollout_processor=SingleTurnRolloutProcessor(),
     mode="pointwise",
 )
 def test_bold_format(row: EvaluationRow) -> EvaluationRow: