feat: add claude-cli provider for LLM scoring without API keys by marc0olo · Pull Request #43 · agent-ecosystem/skill-validator

marc0olo · 2026-03-23T11:43:21Z

Adds a new "claude-cli" provider that shells out to the locally installed `claude` binary. This enables LLM scoring for users who are already authenticated via the CLI (e.g. company or team subscriptions) without requiring an explicit API key.

dacharyc

Thanks for this contribution! The claude-cli provider is a good addition; being able to score without managing API keys lowers the barrier to entry nicely. Merging as-is.

I'm going to make two small additions in a follow-up commit:

- Adding `--bare` to the CLI args. Without it, `claude -p` loads CLAUDE.md files, project memory, rules, and other local context into every call. The current scoring system was designed so that each file gets evaluated in isolation through a clean API request with only the system prompt and the file content. Local project context would act as a hidden confounder, potentially biasing scores in ways that aren't reproducible across environments. Adding `--bare` makes the CLI provider behave like the Anthropic and OpenAI providers: just the prompt, the content, and nothing else.
- Adding a preflight check for the `claude` binary. Right now, if `claude` isn't installed, the error surfaces at scoring time as a generic exec failure. Adding an `exec.LookPath("claude")` check in `NewClient` will catch this early with a clear message.

Thanks again for the PR!

dacharyc · 2026-03-24T00:35:34Z

Well, dang. Just following up here, @marc0olo - looks like in addition to stripping the extra context (good for avoiding scoring interference!), the --bare flag also strips auth, which negates the benefit of adding the claude-cli provider.

So I'll skip adding the flag, but have instead added notes to the README and relevant spots that having this additional information in context can affect scoring.

I've also filed this issue to request a bare-equivalent flag that retains auth, so if Anthropic adds this in the future, I'll revisit adding a variant of this flag to bring the scoring to parity with the API providers: anthropics/claude-code#38022

This was referenced Mar 23, 2026

Replace custom validator with skill-validator CLI dfinity/icskills#114

Closed

feat: replace custom validator with skill-validator CLI dfinity/icskills#115

Merged

dacharyc approved these changes Mar 23, 2026

dacharyc merged commit 225a296 into agent-ecosystem:main Mar 23, 2026
2 checks passed

dacharyc mentioned this pull request Mar 24, 2026

fix: add preflight check and --bare for claude CLI client #44

Merged

dacharyc merged 1 commit into agent-ecosystem:main from marc0olo:feat/claude-cli-provider
