feat: add claude-cli provider for LLM scoring without API keys #43

Merged
dacharyc merged 1 commit into agent-ecosystem:main from marc0olo:feat/claude-cli-provider on Mar 23, 2026

@marc0olo
Contributor

Adds a new "claude-cli" provider that shells out to the locally installed claude binary. This enables LLM scoring for users who are already authenticated via the CLI (e.g. company or team subscriptions) without requiring an explicit API key.

Member

@dacharyc dacharyc left a comment

Thanks for this contribution! The claude-cli provider is a good addition; being able to score without managing API keys lowers the barrier to entry nicely. Merging as-is.

I'm going to make two small additions in a follow-up commit:

  1. Adding --bare to the CLI args. Without it, claude -p loads CLAUDE.md files, project memory, rules, and other local context into every call. The current scoring system was designed so that each file gets evaluated in isolation through a clean API request with only the system prompt and the file content. Local project context would act as a hidden confounder, potentially biasing scores in ways that aren't reproducible across environments. Adding --bare makes the CLI provider behave like the Anthropic and OpenAI providers: just the prompt, the content, and nothing else.
  2. Adding a preflight check for the claude binary. Right now if claude isn't installed, the error surfaces at scoring time as a generic exec failure. Adding an exec.LookPath("claude") check in NewClient will catch this early with a clear message.

Thanks again for the PR!

@dacharyc dacharyc merged commit 225a296 into agent-ecosystem:main Mar 23, 2026
2 checks passed
@dacharyc
Member

Well, dang. Just following up here, @marc0olo - looks like in addition to stripping the extra context (good for avoiding scoring interference!), the --bare flag also strips auth, which negates the benefit of adding the claude-cli provider.

So I'll skip adding the flag, but have instead added notes to the README and relevant spots that having this additional information in context can affect scoring.

I've also filed this issue to request a bare-equivalent flag that retains auth, so if Anthropic adds this in the future, I'll revisit adding a variant of this flag to bring the scoring to parity with the API providers: anthropics/claude-code#38022
