fix(devserver): honor evaluator.project_id when request omits it #372
The dev-server's run_eval built EvalAsync(...) kwargs with
{**eval_kwargs, ..., "project_id": eval_data.get("project_id")}
The trailing key always wins in dict-spread merging, so a request body
that omits project_id silently overrode the registered evaluator's
project_id to None. EvalAsync(name=..., project_id=None) then fell back
to using the eval name as the project name (per Eval(... project_id) docs:
"If specified, uses the given project ID instead of the evaluator's name
to identify the project."), so experiments routed into a per-evaluator-name
auto-created project instead of the project the evaluator was registered
against.
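
The override described above can be reproduced in isolation. This is a minimal sketch, not the dev-server code: the dictionary contents are made-up stand-ins, and it assumes eval_kwargs already carries the registered project_id.

```python
# Illustrative stand-ins only; the real dev-server objects carry more fields.
eval_kwargs = {"name": "my-eval", "project_id": "proj-registered"}
eval_data = {"name": "my-eval"}  # request body that omits project_id

# In a dict display, a later key replaces an earlier one with the same name,
# so the None returned by .get() overwrites the registered value.
merged = {**eval_kwargs, "project_id": eval_data.get("project_id")}

print(merged["project_id"])
```

The registered id is lost even though the request never mentioned project_id, because dict.get returns None for a missing key and the key is still written into the merged dict.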
Use evaluator.project_id as a fallback when the request omits it. An
explicit project_id in the request still takes precedence.
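
One way to express that fallback is sketched below. FakeEvaluator and build_eval_kwargs are hypothetical names used only for illustration; the actual dev-server code builds these kwargs inline in run_eval and may differ in detail.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeEvaluator:
    """Hypothetical stand-in for a registered evaluator."""
    eval_name: str
    project_id: Optional[str] = None


def build_eval_kwargs(
    evaluator: FakeEvaluator,
    eval_kwargs: Dict[str, Any],
    eval_data: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge kwargs, using the evaluator's project_id when the request omits the key."""
    return {
        **eval_kwargs,
        # Request value wins if the key is present; otherwise use the registered id.
        "project_id": eval_data.get("project_id", evaluator.project_id),
    }


evaluator = FakeEvaluator(eval_name="my-eval", project_id="proj-registered")
omitted = build_eval_kwargs(evaluator, {"name": "my-eval"}, {})
explicit = build_eval_kwargs(evaluator, {"name": "my-eval"}, {"project_id": "proj-request"})
```

Here omitted["project_id"] is the registered id, while explicit["project_id"] is the value supplied in the request.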
Tests:
- test_eval_falls_back_to_evaluator_project_id_when_request_omits_it —
registers an evaluator with a known project_id, posts /eval without
project_id, asserts EvalAsync receives the registered id.
- test_eval_request_project_id_overrides_evaluator — confirms an
explicit request-level project_id still wins.
Force-pushed from bff47a1 to 69505ff.
Updated to use the cleaner two-arg form: project_id = eval_data.get("project_id", evaluator.project_id)
The previous
Both tests still pass (the regression test omits the key; the override test passes a non-empty string; neither exercises the empty-string edge case where the two forms diverge).
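
The divergence mentioned in this comment can be sketched as follows. The comment does not show the earlier form, so this assumes it was an `or` fallback; with_or and with_default are hypothetical helper names used only for the comparison.

```python
from typing import Any, Dict, Optional


def with_or(eval_data: Dict[str, Any], fallback: Optional[str]) -> Optional[str]:
    # Assumed earlier form: any falsy request value (missing, None, "") falls back.
    return eval_data.get("project_id") or fallback


def with_default(eval_data: Dict[str, Any], fallback: Optional[str]) -> Optional[str]:
    # Two-arg form: falls back only when the key is absent from the request.
    return eval_data.get("project_id", fallback)


registered = "proj-registered"
cases = {
    "omitted": {},
    "non_empty": {"project_id": "proj-request"},
    "empty_string": {"project_id": ""},
    "explicit_null": {"project_id": None},
}

results = {
    label: (with_or(body, registered), with_default(body, registered))
    for label, body in cases.items()
}
```

The two forms agree when the key is omitted or holds a non-empty string, which are the cases the tests cover. They differ when the key is present but falsy: the `or` form falls back to the registered id, while the two-arg form passes the empty string or None through unchanged.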
gonna push up a small adjustment for this, and we can merge it in. As always, thanks for the PR Will Frey (@willfrey)!
I know this is more correct, but our backend implementation actually uses
Thank you! Adjust away :) I appreciate your responsiveness!
Merged commit 7c04444 into braintrustdata:main.
Summary
The dev-server's run_eval builds EvalAsync(...) kwargs with:
{**eval_kwargs, ..., "project_id": eval_data.get("project_id")}
The trailing key always wins in dict-spread merging, so a request body that omits project_id silently overrides the registered evaluator's project_id to None. EvalAsync(name=..., project_id=None) then falls back to using the eval name as the project name (per Eval(...) docstring: "If specified, uses the given project ID instead of the evaluator's name to identify the project."), so experiments route into a per-evaluator-name auto-created project instead of the project the evaluator was registered against.
This bites consumers who mount the dev-server behind a custom auth layer and trigger evals from anything other than the Braintrust playground UI: every triggered run lands in a fresh eval-name-keyed project rather than the canonical project the registered Evaluator(project_id=...) named.
Fix
Fall back to evaluator.project_id when the request omits it. An explicit request-level project_id still takes precedence (no behavior change for the playground UI flow).
Test plan
- test_eval_falls_back_to_evaluator_project_id_when_request_omits_it: registers an evaluator with a known project_id, POSTs /eval without project_id, asserts EvalAsync receives the registered id. (Fails on main, passes with this fix.)
- test_eval_request_project_id_overrides_evaluator: confirms an explicit request-level project_id still wins.
- py/src/braintrust/devserver/ test suite green (21 passed, 2 pre-existing skips).
- nox -s pylint passes.