DV-Smith is a framework that converts SystemVerilog/UVM testbenches into containerized verification tasks (DV gyms), similar to SWE-smith/SWE-Gym but for hardware verification.
DV-Smith automates the process of:
- Analyzing UVM repositories to discover tests, sequences, and covergroups
- Building DV gyms with isolated tasks for each test
- Evaluating agent solutions based on coverage and health metrics
- Python 3.8+
- Git
- Anthropic API key (required for AI-powered analysis)
- (Optional) Simulators: Questa/ModelSim, Xcelium, VCS, or Verilator
# Clone the repository
git clone https://github.com/yourusername/dv-smith.git
cd dv-smith
# Install with uv (recommended)
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
# Or with pip
python -m venv .venv
source .venv/bin/activate
pip install -e .
# Required: Set up Anthropic API key for Claude-powered analysis
echo "ANTHROPIC_API_KEY=your-key-here" > .env
Important: Get your API key from https://console.anthropic.com/settings/keys. The API key is required for repository analysis and task generation using Claude 3.5 Sonnet.
The first step is to analyze a UVM testbench repository and create a profile:
# Ingest from GitHub
dvsmith ingest https://github.com/mbits-mirafra/apb_avip --name apb_avip
# Or ingest from local path
dvsmith ingest /path/to/local/repo --name my_bench
What happens during ingest:
- Claude 3.5 Sonnet analyzes the UVM repository structure
- Discovers tests, sequences, covergroups, and base classes
- Detects build system (Makefile, CMake, etc.)
- Identifies supported simulators
- Creates a profile YAML file in dvsmith_workspace/profiles/
Example output:
[dv-smith] Ingesting repository: https://github.com/mbits-mirafra/apb_avip
[dv-smith] Using Claude 3.5 Sonnet for AI-powered analysis...
[AI Analyzer] Gathering repository structure...
[AI Analyzer] Identifying key directories...
[AI Analyzer] Analyzing test files...
[AI Analyzer] Found 10 tests
[AI Analyzer] Found 12 sequences
[AI Analyzer] Found 2 covergroups
[AI Analyzer] Detected simulators: ['questa', 'vcs', 'xcelium']
[dv-smith] Profile saved to: dvsmith_workspace/profiles/apb_avip.yaml
Once you have a profile, build the gym:
# Build gym for all detected simulators
dvsmith build apb_avip
# Or specify specific simulators
dvsmith build apb_avip --sim xcelium,questa
What happens during build:
- Clones/copies the repository
- Removes original test files (keeps them as reference)
- Generates task specifications (Markdown files) for each test
- Sets up smoke tests for validation
- Creates directory structure
Example output:
[dv-smith] Building gym: apb_avip
[dv-smith] Repository: https://github.com/mbits-mirafra/apb_avip
[TaskGen] Processing 24 tests
[TaskGen] Smoke tests: ['base_test']
[TaskGen] Generated: task_001_8b_write.md
[TaskGen] Generated: task_002_16b_write.md
...
[TaskGen] Generated 23 tasks
[dv-smith] Gym built successfully at: dvsmith_workspace/gyms/apb_avip
Your gym is now ready! The directory structure looks like:
dvsmith_workspace/
├── profiles/
│   └── apb_avip.yaml            # Profile configuration
├── gyms/
│   └── apb_avip/
│       ├── repo/                # Cloned repository (tests removed)
│       ├── tasks/               # Task specifications
│       │   ├── task_001_8b_write.md
│       │   ├── task_002_16b_write.md
│       │   └── ...
│       ├── smoke_tests/         # Smoke test reference files
│       └── README.md            # Gym information
└── artifacts/                   # Evaluation results (created later)
Tasks are specified in Markdown format:
cat dvsmith_workspace/gyms/apb_avip/tasks/task_001_8b_write.md
Each task includes:
- Goal: What functionality to implement
- Description: Detailed requirements
- Hints: Helpful pointers
- Acceptance Criteria: Coverage targets and constraints
- Scoring Weights: How the solution will be graded
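The scoring weights combine the coverage and health metrics into a single grade. As a rough illustration only (the metric names and weights below are hypothetical, not DV-Smith's actual schema), a weighted score might be computed like this:

```python
# Hypothetical sketch of weighted scoring. The metric names and weights
# here are invented for illustration; see each task's Scoring Weights
# section for the real grading criteria.

def weighted_score(metrics: dict, weights: dict) -> float:
    """Combine per-metric scores (each in [0, 1]) into one grade in [0, 100]."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    raw = sum(weights[name] * metrics.get(name, 0.0) for name in weights)
    return 100.0 * raw / total_weight

# Example: functional coverage dominates, but simulation health still matters.
metrics = {"functional_coverage": 0.92, "code_coverage": 0.80, "health": 1.0}
weights = {"functional_coverage": 0.6, "code_coverage": 0.2, "health": 0.2}
print(round(weighted_score(metrics, weights), 1))
```

A missing metric simply contributes zero, so an agent that never runs the simulator cannot score well on coverage alone.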
To solve a task manually:
1. Navigate to the gym repository:
   cd dvsmith_workspace/gyms/apb_avip/repo
2. Create your test file in the appropriate directory (see profile for paths)
3. Implement your UVM test and sequences
4. Create a patch file:
   git diff > solution.patch
Solution validation now runs through the Terminal Bench harness. Once you have
your patch, use the tb CLI to execute the official checks:
tb run \
--dataset-path dvsmith_workspace/terminal_bench_tasks/axi4_avip \
--task-id assertion-master_assertions
Refer to the Terminal Bench documentation for simulator configuration, artifact uploads, and additional run options.
# Ingest multiple repositories
dvsmith ingest https://github.com/mbits-mirafra/apb_avip --name apb
dvsmith ingest https://github.com/mbits-mirafra/axi4_avip --name axi4
dvsmith ingest https://github.com/mbits-mirafra/spi_avip --name spi
# Build all gyms
dvsmith build apb
dvsmith build axi4
dvsmith build spi
# Use the Claude SDK agent (autonomous code generation)
python examples/agents/claude_sdk_agent.py \
dvsmith_workspace/gyms/apb_avip/tasks/task_008_8b_write.md \
solutions/task_008
# Validate the agent's solution with the Terminal Bench harness
tb run \
--dataset-path dvsmith_workspace/terminal_bench_tasks/apb_avip \
--task-id task_008_8b_write
# Evaluate all tasks in a gym
for task in dvsmith_workspace/gyms/apb_avip/tasks/*.md; do
task_id=$(basename "$task" .md)
python your_agent.py "$task" "solutions/$task_id"
tb run --dataset-path dvsmith_workspace/terminal_bench_tasks/apb_avip --task-id "$task_id"
done
Profiles are stored in dvsmith_workspace/profiles/<name>.yaml:
name: apb_avip
repo_path: https://github.com/mbits-mirafra/apb_avip
commit: main
# Discovered information
tests:
  - name: apb_8b_write_test
    file: src/hvlTop/tb/test/apb_8b_write_test.sv
    base_class: base_test
sequences:
  - name: apb_8b_write_seq
    file: src/hvlTop/sequences/apb_8b_write_seq.sv
covergroups:
  - apb_master_coverage.apb_tx_cg
# Build configuration
build:
  questa:
    work_dir: sim/questa_sim
    compile_cmd: make -C sim/questa_sim compile
    run_cmd: make -C sim/questa_sim simulate TEST={test} SEED={seed}
  xcelium:
    work_dir: sim/cadence_sim
    compile_cmd: make -C sim/cadence_sim compile
    run_cmd: make -C sim/cadence_sim simulate TEST={test} SEED={seed}

You can manually edit profiles to customize paths and commands.
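The `{test}` and `{seed}` placeholders in `run_cmd` are plain Python format fields, so custom tooling can expand them directly. A minimal sketch, using the Questa command from the example above:

```python
# Sketch: expand the {test}/{seed} placeholders in a profile's run_cmd.
# The placeholders are ordinary str.format() fields, so no extra library
# is needed; the command string matches the Questa example above.

run_cmd = "make -C sim/questa_sim simulate TEST={test} SEED={seed}"
cmd = run_cmd.format(test="apb_8b_write_test", seed=42)
print(cmd)
```

The same expansion applies to any simulator entry in the `build` section.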
# Required for AI-powered analysis (Claude 3.5 Sonnet)
ANTHROPIC_API_KEY=your-key-here
# Optional: Simulator paths (if not in PATH)
QUESTA_HOME=/path/to/questa
XCELIUM_HOME=/path/to/xcelium

DV-Smith provides full visibility into AI operations to help you understand and debug the analysis process.
All AI interactions are automatically logged to ~/.dvsmith/ai_calls.jsonl. You can view these logs using the ai-logs command:
# View recent AI calls (last 10 by default)
dvsmith ai-logs
# Show more entries
dvsmith ai-logs --tail 20
# Show full details including prompts and responses
dvsmith ai-logs --tail 5 --full
What's logged:
- Timestamp of each AI call
- Response model type (TestInfo, DirectoryInfo, BuildInfo, etc.)
- Call duration in milliseconds
- Success/error status
- Complete prompts and responses (with --full flag)
Example output:
[dv-smith] AI call logs: /home/user/.dvsmith/ai_calls.jsonl
[1] 2025-10-03T10:03:10.044668
Model: TestInfo
Duration: 14601ms
Status: ✓ Success
[2] 2025-10-03T10:03:35.238062
Model: TestFileList
Duration: 19051ms
Status: ✓ Success
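Because the log is JSON Lines, it is easy to post-process yourself. A sketch that averages call durations per response model; note that the field names `model` and `duration_ms` are assumptions inferred from the output above, not a documented schema:

```python
# Sketch: summarize AI call durations per response model from a JSON Lines
# log. The field names "model" and "duration_ms" are assumptions inferred
# from the ai-logs output shown above, not a documented schema.
import json
from collections import defaultdict

def summarize(lines):
    """Return {model: average duration in ms} across log entries."""
    totals = defaultdict(lambda: [0, 0])  # model -> [sum_ms, count]
    for line in lines:
        entry = json.loads(line)
        bucket = totals[entry["model"]]
        bucket[0] += entry["duration_ms"]
        bucket[1] += 1
    return {model: s / n for model, (s, n) in totals.items()}

# In practice you would iterate over the lines of ~/.dvsmith/ai_calls.jsonl.
sample = [
    '{"model": "TestInfo", "duration_ms": 14601}',
    '{"model": "TestFileList", "duration_ms": 19051}',
    '{"model": "TestInfo", "duration_ms": 9399}',
]
print(summarize(sample))
```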
During the ingest process, DV-Smith shows exactly which test files the AI identifies:
[AI Analyzer] Analyzing test files...
[AI Analyzer] AI identified 10 test files:
- dvsmith_workspace/clones/apb_avip/src/hvl_top/test/apb_16b_read_test.sv
- dvsmith_workspace/clones/apb_avip/src/hvl_top/test/apb_16b_write_test.sv
- dvsmith_workspace/clones/apb_avip/src/hvl_top/test/apb_24b_write_test.sv
- dvsmith_workspace/clones/apb_avip/src/hvl_top/test/apb_32b_write_test.sv
...
This transparency helps you:
- Verify that the AI correctly identified all test files
- Debug issues when tests are missing or incorrectly classified
- Understand the AI's decision-making process
- Track performance and identify slow API calls
If the AI analysis produces unexpected results:
1. Check the logs to see what the AI identified:
   dvsmith ai-logs --tail 20 --full
2. Look for errors in the log entries:
   grep -i error ~/.dvsmith/ai_calls.jsonl
3. Verify the API key is working:
   echo $ANTHROPIC_API_KEY
   python -c "from anthropic import Anthropic; Anthropic().messages.create(model='claude-3-5-sonnet-20241022', max_tokens=10, messages=[{'role':'user','content':'test'}])"
4. Review response models to understand what data the AI returned
Solution: Check your API key and network connection:
# Verify API key is set correctly
echo $ANTHROPIC_API_KEY
# Check if key is in .env file
cat .env | grep ANTHROPIC_API_KEY
# Test API connectivity
python -c "from anthropic import Anthropic; client = Anthropic(); print('API OK')"
Note: The AI analyzer requires a valid Anthropic API key. Unlike some other tools, DV-Smith has no static-analysis fallback; AI analysis is essential for accurate repository understanding.
Possible causes:
- Tests use non-standard naming (not *test*.sv or *Test.sv)
- Tests are in unexpected directories
- Repository structure is unusual
Solution: Provide hints in a JSON file:
{
"tests_dir": "custom/path/to/tests",
"sequences_dir": "custom/path/to/sequences",
"base_test": "my_custom_base_test"
}
Then ingest with hints:
dvsmith ingest /path/to/repo --name my_bench --hints hints.json
Check:
- Simulator is installed and in PATH
- Compile command in profile is correct
- Dependencies are available
Debug:
# Check simulator availability
which xrun # For Xcelium
which vsim # For Questa
# Try manual compilation
cd dvsmith_workspace/gyms/<name>/repo
make -C sim/cadence_sim compile
- Writing Agents: Learn how to create agents that solve tasks
- Understanding Evaluation: Deep dive into scoring and metrics
- Advanced Usage: Custom adapters, hooks, and integrations
- Check the FAQ
- Open an issue on GitHub
- Join our community discussions