[RFC] Testbench architecture: expressiveness vs. compilation model #42
Description
Context
In a real NoC design project (8×8 mesh, 64 nodes), we encountered fundamental limitations with the current @testbench approach when trying to verify complex dynamic behaviors like backpressure handling, multi-round traffic injection, and protocol-level assertions.
The current @testbench API (t.drive(), t.expect(), t.drive_when()) works well for simple smoke tests but struggles with scenarios that require:
- Reactive stimulus: deciding what to send based on DUT output (e.g. credit-based flow control)
- Complex state machines: multi-phase test sequences with branching logic
- Randomized/constrained verification: randomized traffic patterns with coverage tracking
- Reusable verification components: scoreboards, monitors, protocol checkers
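As a concrete illustration of the first point, here is a hedged sketch of the kind of reactive, credit-based stimulus loop a static drive/expect schedule cannot express: each cycle's send decision depends on the DUT's current credit count. The `DutModel` stub and all names here are hypothetical stand-ins for a compiled pyCircuit model, not existing API.

```python
# Hypothetical sketch: reactive, credit-based stimulus that a fixed
# drive/expect schedule cannot express. DutModel is an illustrative stub
# standing in for a compiled pyCircuit C++ model.

class DutModel:
    """Stub DUT: accepts a flit when credits remain, returns credits later."""
    def __init__(self, max_credits=4):
        self.credits = max_credits
        self.accepted = 0
        self._in_flight = []

    def step(self, send_valid):
        # Consume a credit when the TB sends a flit this cycle.
        if send_valid and self.credits > 0:
            self.credits -= 1
            self._in_flight.append(3)  # each flit drains after 3 cycles
            self.accepted += 1
        # Drain in-flight flits and return their credits.
        self._in_flight = [t - 1 for t in self._in_flight]
        self.credits += sum(1 for t in self._in_flight if t == 0)
        self._in_flight = [t for t in self._in_flight if t > 0]


def run_reactive_tb(num_flits, cycles=100):
    """Send num_flits flits, but only on cycles where credits are available."""
    dut = DutModel()
    sent = 0
    for _ in range(cycles):
        # The stimulus decision depends on DUT state observed this cycle.
        send = sent < num_flits and dut.credits > 0
        if send:
            sent += 1
        dut.step(send)
    return dut.accepted
```

The key point is the data dependence inside the loop: what gets driven on cycle N is a function of DUT outputs at cycle N, which a precomputed schedule of `t.drive(..., at=...)` calls cannot encode.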
This issue discusses possible architectural directions.
Option A: Full pyCircuit TB — extend MLIR to support dynamic behavior
Extend the @testbench DSL so it can express arbitrary Python-like control flow, which gets compiled via MLIR into C++/SystemVerilog testbenches.
Pros:
- Single-language design: everything stays in Python/pyCircuit
- Deterministic: compiled TB has no runtime overhead
- Portable: same TB works with C++ sim and Verilog sim
Cons:
- Requires MLIR to support a much richer subset of Python (loops with dynamic bounds, conditional branching on DUT outputs, data structures, etc.)
- Significant compiler complexity — essentially building a Python→C++ transpiler within the MLIR framework
- May never match the expressiveness of native C++/Python
Variant A2: Python-in-the-loop simulation
Instead of compiling Python to C++, keep the testbench in Python and call the C++ or Verilog model via FFI (similar to cocotb/PyVerilator).
- Pros: Full Python expressiveness, rapid iteration
- Cons: Performance overhead from Python↔C++ boundary crossing on every cycle; may be acceptable for small designs but problematic for large ones (our 64-node NoC runs ~10M cycles)
Option B: Hybrid — pyCircuit DUT + external hand-written TB
pyCircuit focuses on DUT compilation and provides optional lightweight TB for smoke tests. For serious verification, users write their own C++ or SV testbench against the generated model.
This requires:
- A `build` command without a mandatory `@testbench` (see the discussion in PR #39, "feat: add drive_when conditional drive API for reactive testbenches" / planned PR)
- A well-documented C++ model API so external TBs can instantiate and drive the DUT
- Optionally, pyCircuit generates a TB skeleton (clock/reset boilerplate, port declarations) that users extend
Pros:
- No compiler complexity — leverage existing C++/SV verification ecosystem
- Full expressiveness: users can use UVM, cocotb, or any framework
- Natural for agent-assisted workflows: an AI agent can generate C++ TBs using the documented model API
Cons:
- Two-language workflow (Python for design, C++ for verification)
- Users need to understand the generated C++ model API
Option C: TB skeleton generation + plugin points
A middle ground: pyCircuit generates a complete TB harness (clock, reset, port bindings, VCD tracing) but provides explicit plugin points where users inject custom stimulus/checking logic.
```python
@testbench
def tb(t: Tb) -> None:
    t.clock("clk")
    t.reset("rst", active_low=True)

    # Built-in: simple drives/expects for smoke test
    t.drive("start", 1, at=0)
    t.expect("done", 1, at=100)

    # Plugin: user-provided C++ function called every cycle
    t.on_cycle("my_stimulus.cpp::inject_traffic")

    # Plugin: user-provided checker
    t.on_cycle("my_checker.cpp::check_output")
```
The generated TB calls the user-supplied C++ functions at the appropriate points, with access to the DUT port struct.
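The harness side of this dispatch could be sketched as follows: the generated TB owns the cycle loop and calls registered hooks each cycle, handing them a view of the DUT ports. Everything here (`PortView`, `Harness`, the decorator registration) is a hypothetical illustration in Python rather than the actual generated C++.

```python
# Hypothetical sketch of the Option C harness: the generated TB owns the
# cycle loop and dispatches registered per-cycle plugins, passing them a
# read/write view of the DUT ports. Names are illustrative only.

class PortView:
    """Read/write view of the DUT ports handed to plugins each cycle."""
    def __init__(self):
        self.signals = {"start": 0, "done": 0, "data_out": 0}

    def __getitem__(self, name):
        return self.signals[name]

    def __setitem__(self, name, value):
        self.signals[name] = value

class Harness:
    def __init__(self):
        self.ports = PortView()
        self.hooks = []   # user plugin points
        self.cycle = 0

    def on_cycle(self, fn):
        self.hooks.append(fn)   # register a stimulus/checker callback
        return fn

    def run(self, cycles):
        for _ in range(cycles):
            for hook in self.hooks:
                hook(self.cycle, self.ports)   # plugin sees cycle + ports
            # ... clock edge / model.eval() would happen here ...
            self.cycle += 1

harness = Harness()
seen = []

@harness.on_cycle
def inject_traffic(cycle, ports):
    ports["start"] = 1 if cycle == 0 else 0

@harness.on_cycle
def check_output(cycle, ports):
    seen.append((cycle, ports["start"]))

harness.run(3)
```

The open API-design questions in the cons below map directly onto this sketch: what `PortView` exposes, and where plugin state lives between cycles.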
Pros:
- Best of both worlds: pyCircuit handles boilerplate, user handles complex logic
- Progressive complexity: start with `t.drive()`/`t.expect()`, upgrade to plugins when needed
- Single build flow
Cons:
- API design complexity (what context to pass to plugins, how to handle state)
- Still requires C++ for complex verification
Our Experience
In practice, we ended up with Option B organically:
- Used `@testbench` + `drive_when` for initial smoke tests
- Wrote a separate C++ testbench for multi-round all-to-all traffic verification
- Used Verilator with a hand-written SV testbench for final validation
The main pain point was that `build` currently requires `@testbench`, making it impossible to compile just the DUT without writing a dummy testbench function.
Questions for Discussion
- What is the long-term vision for `@testbench`: is it intended to grow into a full verification language, or remain a lightweight smoke-test tool?
- Is there interest in supporting `build` without `@testbench` (DUT-only compilation)?
- Would a plugin/hook mechanism (Option C) be worth exploring?
- For Python-in-the-loop (Option A2), are there known performance benchmarks for the FFI overhead?