feat: add claude-cli provider for LLM scoring without API keys #43
Adds a new "claude-cli" provider that shells out to the locally installed `claude` binary. This enables LLM scoring for users who are already authenticated via the CLI (e.g. company or team subscriptions) without requiring an explicit API key.
dacharyc
left a comment
Thanks for this contribution! The claude-cli provider is a good addition; being able to score without managing API keys lowers the barrier to entry nicely. Merging as-is.
I'm going to make two small additions in a follow-up commit:
- Adding `--bare` to the CLI args. Without it, `claude -p` loads CLAUDE.md files, project memory, rules, and other local context into every call. The current scoring system was designed so that each file gets evaluated in isolation through a clean API request with only the system prompt and the file content. Local project context would act as a hidden confounder, potentially biasing scores in ways that aren't reproducible across environments. Adding `--bare` makes the CLI provider behave like the Anthropic and OpenAI providers: just the prompt, the content, and nothing else.
- Adding a preflight check for the `claude` binary. Right now if `claude` isn't installed, the error surfaces at scoring time as a generic exec failure. Adding an `exec.LookPath("claude")` check in `NewClient` will catch this early with a clear message.
Thanks again for the PR!
Well, dang. Just following up here, @marc0olo - looks like in addition to stripping the extra context (good for avoiding scoring interference!), the `--bare` flag also skips the CLI's existing authentication. So I'll skip adding the flag, but have instead added notes to the README and relevant spots that having this additional information in context can affect scoring. I've also filed this issue to request a bare-equivalent flag that retains auth, so if Anthropic adds this in the future, I'll revisit adding a variant of this flag to bring the scoring to parity with the API providers: anthropics/claude-code#38022