feat(cli): add HuggingFace dataset import command#986

Merged
christso merged 4 commits into main from
feat/978-huggingface-import
Apr 8, 2026
@christso christso commented Apr 8, 2026

Summary

  • Adds the `agentv import huggingface` CLI command to import datasets from HuggingFace Hub into AgentV EVAL.yaml format
  • The Python script (`scripts/import-huggingface.py`) uses the `datasets` library to load from HuggingFace and converts instances to EVAL.yaml files
  • Supports SWE-bench-style datasets with automatic field mapping: `instance_id` -> test id, `problem_statement` -> input, `FAIL_TO_PASS` -> code-grader assertions, `repo` -> docker workspace config
  • Extensible schema converter registry for supporting additional dataset formats in the future

Closes #978

Test plan

  • bun run build passes
  • bun run test — all 1901 tests pass
  • bun run typecheck passes
  • bun run lint passes
  • bun run validate:examples — all 55 example eval files valid
  • E2E: agentv import huggingface --repo SWE-bench/SWE-bench_Verified --split test --limit 2 --output /tmp/test/ produces valid EVAL.yaml files
  • Generated EVAL.yaml files pass agentv validate

Red/Green UAT

Red (before): agentv import huggingface command does not exist — running it produces an error.

Green (after):

$ bun apps/cli/src/cli.ts import huggingface --repo SWE-bench/SWE-bench_Verified --split test --limit 2 --output /tmp/agentv-hf-test/
Importing from HuggingFace: SWE-bench/SWE-bench_Verified (split=test)...
Loading dataset SWE-bench/SWE-bench_Verified (split=test)...
Converting 2 instances...
Created 2 EVAL.yaml files in /tmp/agentv-hf-test/
Imported 2 eval(s) from SWE-bench/SWE-bench_Verified → /tmp/agentv-hf-test/

$ bun apps/cli/src/cli.ts validate /tmp/agentv-hf-test-git/astropy__astropy-12907.EVAL.yaml
Validation Summary
✓ /tmp/agentv-hf-test-git/astropy__astropy-12907.EVAL.yaml
Total files: 1 | Valid: 1 | Invalid: 0

🤖 Generated with Claude Code

Add `agentv import huggingface` to import datasets from HuggingFace Hub
into AgentV EVAL.yaml format. Supports SWE-bench-style datasets with
automatic field mapping (instance_id -> test id, problem_statement ->
input, FAIL_TO_PASS -> code-grader assertions, repo -> docker workspace).

The command shells out to a Python script via `uv run` (per repo
convention for Python scripts). The script uses inline PEP 723 metadata
so `uv` auto-installs `datasets` and `pyyaml` dependencies.

Closes #978

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
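
For readers unfamiliar with PEP 723, the inline metadata mentioned in the commit message is a comment block at the top of the script that `uv run` reads to install dependencies before execution. The skeleton below is a sketch only: the dependency list comes from the commit message, while the `requires-python` bound, the argument names (assumed to mirror the CLI flags), and the default split are assumptions.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "datasets",
#     "pyyaml",
# ]
# ///
"""Hypothetical skeleton of an importer script with PEP 723 metadata."""
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse flags assumed to mirror `agentv import huggingface`."""
    parser = argparse.ArgumentParser(description="Import a HuggingFace dataset")
    parser.add_argument("--repo", required=True, help="Dataset repo id on the Hub")
    parser.add_argument("--split", default="test", help="Dataset split (assumed default)")
    parser.add_argument("--limit", type=int, default=None, help="Max instances to convert")
    parser.add_argument("--output", required=True, help="Directory for EVAL.yaml files")
    return parser.parse_args(argv)


def loading_message(args: argparse.Namespace) -> str:
    """Build the progress line printed before the dataset is loaded."""
    return f"Loading dataset {args.repo} (split={args.split})..."
```

A real script would call `parse_args()` under an `if __name__ == "__main__":` guard and could be started with `uv run scripts/import-huggingface.py` followed by its flags, with no separate virtual environment or `pip install` step.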
cloudflare-workers-and-pages Bot commented Apr 8, 2026

Deploying agentv with Cloudflare Pages

Latest commit: aa7072a
Status: ✅  Deploy successful!
Preview URL: https://b7f07d5b.agentv.pages.dev
Branch Preview URL: https://feat-978-huggingface-import.agentv.pages.dev

christso and others added 3 commits April 8, 2026 22:40
- Handle missing uv with clear error message
- Surface child process stderr on failure
- Add PASS_TO_PASS regression test assertion
- Wrap datasets import in try/except ImportError
- Validate --limit is positive

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
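
The hardening steps in this commit can be sketched as follows. The actual CLI wrapper is TypeScript (it uses `execFile`), so this Python version only illustrates the shape of the checks; the helper names `positive_int`, `load_datasets_module`, and `run_importer` and all error messages are hypothetical.

```python
import argparse
import shutil
import subprocess
import sys


def positive_int(value: str) -> int:
    """argparse `type=` callback that rejects zero, negatives, and non-integers."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"--limit must be an integer, got {value!r}")
    if number <= 0:
        raise argparse.ArgumentTypeError(f"--limit must be positive, got {number}")
    return number


def load_datasets_module():
    """Import `datasets` lazily so a missing install yields a readable error."""
    try:
        import datasets
    except ImportError:
        sys.exit("The 'datasets' package is required; run this script via 'uv run'.")
    return datasets


def run_importer(command: list[str]) -> str:
    """Run the child process, failing clearly if the binary is missing."""
    if shutil.which(command[0]) is None:
        raise RuntimeError(f"'{command[0]}' was not found on PATH; install it first.")
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the child's stderr instead of a bare exit code.
        raise RuntimeError(f"Importer failed:\n{result.stderr.strip()}")
    return result.stdout
```

Passing `type=positive_int` to `parser.add_argument("--limit", ...)` makes argparse report the problem as a normal usage error rather than a traceback.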
- Replace template literal with string literal (no interpolation needed)
- Fix execFile formatting to match biome style

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
base_commit is not informational metadata — it's required to reproduce
the evaluation environment. SWE-bench builds Docker images with
`git reset --hard {base_commit}` and resets test files to this commit
before running tests. Place it in workspace.docker where it belongs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
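
To make the placement concrete, the sketch below builds a workspace block that carries `base_commit` next to `repo`, plus the reset command quoted in the commit message. The key names under `docker` are assumptions for illustration and may differ from the real EVAL.yaml schema; `docker_workspace` and `reset_command` are hypothetical helpers.

```python
from typing import Any


def docker_workspace(instance: dict[str, Any]) -> dict[str, Any]:
    """Keep base_commit with the docker config, since it defines the environment."""
    return {
        "docker": {
            "repo": instance["repo"],
            "base_commit": instance["base_commit"],
        }
    }


def reset_command(base_commit: str) -> list[str]:
    """Command used to pin a checkout to the commit under evaluation."""
    return ["git", "reset", "--hard", base_commit]
```

Grouping the commit with the workspace means anything that rebuilds the environment has the repository and its exact revision in one place.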
@christso christso marked this pull request as ready for review April 8, 2026 23:01
@christso christso merged commit f5df231 into main Apr 8, 2026
4 checks passed
@christso christso deleted the feat/978-huggingface-import branch April 8, 2026 23:02