Commit 20ab7c1

fixing num_runs
1 parent 54b5cf7 commit 20ab7c1

1 file changed: +1 -1 lines changed

eval_protocol/benchmarks/test_tau_bench_airline.py

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ def tau_bench_airline_to_evaluation_row(data: List[Dict[str, Any]]) -> List[Eval
  rollout_processor=MCPGymRolloutProcessor(),
  rollout_processor_kwargs={"domain": "airline"},
  passed_threshold={"success": 0.4, "standard_error": 0.02},
- num_runs=2,
+ num_runs=8,
  mode="pointwise",
  max_concurrent_rollouts=50,
  server_script_path=_get_server_script_path(),
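
The commit message does not say why `num_runs` needed fixing. One plausible reading is that two runs are too few to estimate the spread that the `standard_error` entry in `passed_threshold` is compared against. The sketch below illustrates that reading only. It assumes the standard error is the sample standard deviation of per-run success rates divided by the square root of the run count, which may not match how the library computes it. The helper names `standard_error` and `se_shrink_factor` are hypothetical and are not part of `eval_protocol`.

```python
import math
import statistics
from typing import Sequence


def standard_error(run_scores: Sequence[float]) -> float:
    """Standard error of the mean over per-run scores (assumed definition)."""
    n = len(run_scores)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    return statistics.stdev(run_scores) / math.sqrt(n)


def se_shrink_factor(old_runs: int, new_runs: int) -> float:
    """How much the standard error shrinks if the per-run spread stays fixed."""
    return math.sqrt(new_runs / old_runs)


if __name__ == "__main__":
    # Hypothetical per-run success rates, not measured results.
    print(standard_error([0.3, 0.5]))
    print(se_shrink_factor(2, 8))
```

Under this assumption, moving from 2 to 8 runs halves the standard error for the same per-run spread, at the cost of four times as many rollouts.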
