Bug
braintrust eval --filter metadata.name='...' crashes when data contains EvalCase objects:
AttributeError: 'EvalCase' object has no attribute 'get'
Cause
run_evaluator_task normalizes both dicts and EvalCase objects:
if isinstance(datum, dict):
    datum = EvalCase.from_dict(datum)
But evaluate_filter runs before this normalization and assumes dict-only input:
def evaluate_filter(object, filter: Filter):
    key = object
    for p in filter.path:
        key = key.get(p)  # fails: EvalCase has no .get()
The filtered_iterator sits between the raw data and run_evaluator_task, so EvalCase objects hit .get() before they can be normalized.
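The failure mode can be shown without braintrust installed. The sketch below is only an illustration: FakeEvalCase and FakeFilter are hypothetical stand-ins (not the library's classes), and str() stands in for the real serialization step. It mirrors the dict-only traversal quoted above and shows that a dict passes while an attribute-only object raises AttributeError.

```python
import re
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeEvalCase:
    """Hypothetical stand-in for EvalCase: attribute access only, no .get()."""
    input: Any
    metadata: Optional[dict] = None


@dataclass
class FakeFilter:
    """Hypothetical stand-in for Filter: a key path plus a compiled regex."""
    path: List[str]
    pattern: re.Pattern


def dict_only_filter(obj, flt):
    # Same traversal as the snippet above: assumes every level has .get().
    key = obj
    for p in flt.path:
        key = key.get(p)
        if key is None:
            return False
    # str() is a simplification of the real serialization helper.
    return flt.pattern.match(str(key)) is not None


flt = FakeFilter(path=["metadata", "name"], pattern=re.compile("foo"))

# A plain dict walks the path without trouble.
dict_result = dict_only_filter({"input": "hello", "metadata": {"name": "foo"}}, flt)

# The dataclass instance fails on the very first .get() call.
try:
    dict_only_filter(FakeEvalCase(input="hello", metadata={"name": "foo"}), flt)
    object_error = None
except AttributeError as exc:
    object_error = str(exc)
```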
Reproduction
from braintrust import Eval, EvalCase
Eval(
    "test",
    data=[EvalCase(input="hello", metadata={"name": "foo"})],
    task=lambda x: x,
    scores=[],
)
braintrust eval --filter metadata.name='foo' path/to/above.py
Affected versions
0.11.0 through 0.12.1 (latest).
Suggested fix
Normalize to a Mapping before filtering (matching run_evaluator_task's approach), or use __getitem__, which SerializableDataClass already provides. Checking against Mapping rather than dict is more precise, since .get() is defined on Mapping, not just dict:
from collections.abc import Mapping

def evaluate_filter(object, filter: Filter):
    key = object if isinstance(object, Mapping) else object.as_dict()
    for p in filter.path:
        key = key.get(p)
        if key is None:
            return False
    return filter.pattern.match(serialize_json_with_plain_string(key)) is not None
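A quick way to sanity-check the proposed normalization outside the library is sketched below. Everything here is an assumption for illustration: FakeEvalCase, FakeFilter, and plain_string are hypothetical stand-ins, with plain_string only approximating what serialize_json_with_plain_string is presumed to do (strings pass through, other values are JSON-encoded).

```python
import json
import re
from collections.abc import Mapping
from dataclasses import asdict, dataclass
from typing import Any, List, Optional


@dataclass
class FakeEvalCase:
    """Hypothetical stand-in for EvalCase that exposes as_dict()."""
    input: Any
    metadata: Optional[dict] = None

    def as_dict(self):
        return asdict(self)


@dataclass
class FakeFilter:
    """Hypothetical stand-in for Filter: a key path plus a compiled regex."""
    path: List[str]
    pattern: re.Pattern


def plain_string(value):
    # Assumed behavior: leave strings alone, JSON-encode everything else.
    return value if isinstance(value, str) else json.dumps(value)


def evaluate_filter_fixed(obj, flt):
    # Normalize non-Mapping inputs once, then walk the path with .get().
    key = obj if isinstance(obj, Mapping) else obj.as_dict()
    for p in flt.path:
        key = key.get(p)
        if key is None:
            return False
    return flt.pattern.match(plain_string(key)) is not None


flt = FakeFilter(path=["metadata", "name"], pattern=re.compile("foo"))

as_object = evaluate_filter_fixed(FakeEvalCase("hello", {"name": "foo"}), flt)
as_dict = evaluate_filter_fixed({"input": "hello", "metadata": {"name": "foo"}}, flt)
no_match = evaluate_filter_fixed(FakeEvalCase("hello", {"name": "bar"}), flt)
missing = evaluate_filter_fixed(FakeEvalCase("hello"), flt)  # metadata is None
```

With these stand-ins, dict and object inputs are filtered identically, and a case with no metadata is rejected rather than raising.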