Distributed Tracing for remote scorers #33

realark · 2026-02-04T21:32:50Z

Use the real Braintrust API client in TestHarness (pointed at VCR).
Do distributed traces for remote scorers.
Create one span per scorer in evals.

Produces eval traces like this:

https://www.braintrust.dev/app/braintrustdata.com/p/andrew-misc/trace?object_type=experiment&object_id=ff8cb35c-4e3a-4eb4-ad45-16ce97a9a3b8&r=0b3378817b514de1c5367fb9ba07c60c&s=d5042e3790a0e674
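As a rough illustration of "one span per scorer", the sketch below parents a child span for each scorer to the same eval-case span, so all scorer spans share one trace. It is a minimal stdlib-only model, not the PR's code: `SketchSpan`, `ScorerSpanSketch`, the `score:` name prefix, and the counter-based span IDs are all hypothetical, and the real SDK presumably creates its spans through OpenTelemetry.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tracing span; not the OpenTelemetry or Braintrust type. */
final class SketchSpan {
    final String name;
    final String traceId;
    final String spanId;
    final String parentSpanId; // null for a root span

    SketchSpan(String name, String traceId, String spanId, String parentSpanId) {
        this.name = name;
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }
}

/** Hypothetical helper that creates one child span per scorer. */
final class ScorerSpanSketch {
    private int nextId = 1;

    // Deterministic 16-hex-digit IDs keep the sketch easy to follow;
    // a real tracer would generate random span IDs.
    private String newSpanId() {
        return String.format("%016x", nextId++);
    }

    /** Returns one span per scorer, each in the eval span's trace and parented to it. */
    List<SketchSpan> spansForScorers(SketchSpan evalSpan, List<String> scorerNames) {
        List<SketchSpan> spans = new ArrayList<>();
        for (String scorerName : scorerNames) {
            spans.add(new SketchSpan(
                    "score:" + scorerName, evalSpan.traceId, newSpanId(), evalSpan.spanId));
        }
        return spans;
    }
}
```

For a remote scorer, the same trace ID and parent span ID would also have to travel with the request so that the remote side can attach its own spans to the trace; how this PR encodes that parent is not shown in the thread.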

src/main/java/dev/braintrust/api/BraintrustApiClient.java

delner

Personally not a big fan of explicit tracing code in the eval scorer implementation, but will defer on this (not blocking.)

delner · 2026-02-06T15:55:18Z

src/main/java/dev/braintrust/eval/ScorerBrainstoreImpl.java

+     * @return parent object for distributed tracing, or null if tracing context not available
+     */
+    @Nullable
+    private Map<String, Object> buildParentSpanComponents() {


IMO, having tracing in the Eval scorer implementation seems like a bit of a code smell: it creates a hard coupling between Evals and OpenTracing, which can be used separately from one another.

I'd recommend decoupling this through some appropriate abstraction; perhaps dependency injection, composition or some other pattern.

I don't follow? Evals and otel already have a coupling in the sense that the data created by evals is captured mostly through otel traces, but in terms of what appears in the public apis for evals this doesn't change anything.

https://github.com/braintrustdata/braintrust-sdk-java/blob/main/src/main/java/dev/braintrust/eval/Scorer.java

I could add an additional method allowing explicit parent info to be passed when scoring:

// something like this
List<Score> score(TaskResult<INPUT, OUTPUT> taskResult, ParentInfo parentInfo);

But I'm not sure what that buys us? In this case it would make the surface area of the public api a bit larger and would only be invoked by callers within otel traces.

It would make sense in the context of a larger refactor to decouple otel from evals, though. That seems beyond the scope of this PR, so I'll press on, but let's chat about it sometime.
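For reference, the overload floated above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TaskResult`, `Score`, and `ParentInfo` here are simplified hypothetical stand-ins, not the SDK's classes, and the single-argument `score` method is assumed from the shape of the proposed signature. The default method shows why the change would widen the public API without breaking existing scorers.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for the types named in the thread.
final class TaskResult<INPUT, OUTPUT> {
    final INPUT input;
    final OUTPUT output;

    TaskResult(INPUT input, OUTPUT output) {
        this.input = input;
        this.output = output;
    }
}

final class Score {
    final String name;
    final double value;

    Score(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

/** Hypothetical carrier for the explicit parent passed by a caller inside a trace. */
final class ParentInfo {
    final String traceId;
    final String spanId;

    ParentInfo(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }
}

interface Scorer<INPUT, OUTPUT> {
    // Assumed existing entry point.
    List<Score> score(TaskResult<INPUT, OUTPUT> taskResult);

    // Proposed overload: by default it ignores the parent and delegates,
    // so existing implementations compile unchanged. A remote scorer
    // could override it to forward parentInfo with its request.
    default List<Score> score(TaskResult<INPUT, OUTPUT> taskResult, ParentInfo parentInfo) {
        return score(taskResult);
    }
}
```

Only scorers that make remote calls would need to override the two-argument method, which matches the concern in the thread that the extra method would be invoked only by callers already inside a trace.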

src/main/java/dev/braintrust/trace/BraintrustShutdownHook.java

VCR otel export

0d06e27

realark added the enhancement New feature or request label Feb 4, 2026

realark force-pushed the ark/vcr-otel branch 3 times, most recently from 907fb9c to 6ed7152 Compare February 5, 2026 19:06

realark added 2 commits February 5, 2026 18:35

do a distributed trace when invoking remote scorers

a4981f1

record cassettes

61f2cee

realark force-pushed the ark/vcr-otel branch from 6ed7152 to 61f2cee Compare February 6, 2026 01:37

realark commented Feb 6, 2026

src/main/java/dev/braintrust/api/BraintrustApiClient.java

realark marked this pull request as ready for review February 6, 2026 01:40

realark requested a review from delner February 6, 2026 01:45

realark changed the title ~~Distributed Tracing for remote LLM scorers~~ Distributed Tracing for remote scorers Feb 6, 2026

delner approved these changes Feb 6, 2026

realark merged commit 1a2b623 into main Feb 6, 2026
1 check passed

realark deleted the ark/vcr-otel branch February 6, 2026 17:39
