-
Notifications
You must be signed in to change notification settings - Fork 1
Distributed Tracing for remote scorers #33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
907fb9c to
6ed7152
Compare
delner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally not a big fan of explicit tracing code in the eval scorer implementation, but will defer on this (not blocking.)
| * @return parent object for distributed tracing, or null if tracing context not available | ||
| */ | ||
| @Nullable | ||
| private Map<String, Object> buildParentSpanComponents() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, having tracing in the Eval scorer implementation seems like a bit of a code smell... it's creating hard coupling between Evals and OpenTracing which can be used separately from one another.
I'd recommend decoupling this through some appropriate abstraction; perhaps dependency injection, composition or some other pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow? Evals and otel already have a coupling in the sense that the data created by evals is captured mostly through otel traces, but in terms of what appears in the public apis for evals this doesn't change anything.
I could add an additional method allowing explicit parent info to be passed when scoring:
// something like this
List<Score> score(TaskResult<INPUT, OUTPUT> taskResult, ParentInfo parentInfo);
But I'm not sure what that buys us? In this case it would make the surface area of the public api a bit larger and would only be invoked by callers within otel traces.
It would make sense in the context of a larger refactor to decouple otel from evals though. That seems beyond the scope of this PR so I'll press on, but let's chat about it some time
produces eval traces like this:
https://www.braintrust.dev/app/braintrustdata.com/p/andrew-misc/trace?object_type=experiment&object_id=ff8cb35c-4e3a-4eb4-ad45-16ce97a9a3b8&r=0b3378817b514de1c5367fb9ba07c60c&s=d5042e3790a0e674