[RFC] Testbench architecture: expressiveness vs. compilation model #42

@winfredsu

Description

Context

In a real NoC design project (8×8 mesh, 64 nodes), we encountered fundamental limitations with the current @testbench approach when trying to verify complex dynamic behaviors like backpressure handling, multi-round traffic injection, and protocol-level assertions.

The current @testbench API (t.drive(), t.expect(), t.drive_when()) works well for simple smoke tests but struggles with scenarios that require:

  • Reactive stimulus: deciding what to send based on DUT output (e.g. credit-based flow control)
  • Complex state machines: multi-phase test sequences with branching logic
  • Randomized/constrained verification: randomized traffic patterns with coverage tracking
  • Reusable verification components: scoreboards, monitors, protocol checkers

This issue discusses possible architectural directions.


Option A: Full pyCircuit TB — extend MLIR to support dynamic behavior

Extend the @testbench DSL so it can express arbitrary Python-like control flow, which gets compiled via MLIR into C++/SystemVerilog testbenches.
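For concreteness, here is a hedged sketch of the kind of reactive control flow Option A would need the DSL (and thus MLIR) to express. `t.read()` and `t.step()` are illustrative names, not current pyCircuit API, and the `Tb` stub exists only so the sketch runs standalone:

```python
# Minimal stand-in for a testbench handle; NOT pyCircuit API.
class Tb:
    def __init__(self):
        self.cycle = 0
        self.credits = 0
    def read(self, port):          # sample a DUT output (here: a credit count)
        return self.credits
    def step(self):                # advance one clock; DUT grants a credit
        self.cycle += 1
        self.credits += 1
    def drive(self, port, value):  # drive an input; sending consumes a credit
        self.credits -= 1

def tb(t):
    sent = 0
    for rnd in range(4):               # loop with dynamic bounds
        while t.read("credits") == 0:  # branch on a DUT output
            t.step()                   # stall until flow control allows a send
        t.drive("flit_in", rnd)        # reactive, credit-gated injection
        sent += 1
    return sent

print(tb(Tb()))   # 4 flits injected
```

Every construct here (data-dependent loops, branching on DUT outputs, mutable host state) would need a lowering through MLIR, which is the compiler-complexity cost listed below.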

Pros:

  • Single-language design: everything stays in Python/pyCircuit
  • Deterministic: compiled TB has no runtime overhead
  • Portable: same TB works with C++ sim and Verilog sim

Cons:

  • Requires MLIR to support a much richer subset of Python (loops with dynamic bounds, conditional branching on DUT outputs, data structures, etc.)
  • Significant compiler complexity — essentially building a Python→C++ transpiler within the MLIR framework
  • May never match the expressiveness of native C++/Python

Variant A2: Python-in-the-loop simulation

Instead of compiling Python to C++, keep the testbench in Python and call the C++ or Verilog model via FFI (similar to cocotb/PyVerilator).

  • Pros: Full Python expressiveness, rapid iteration
  • Cons: Performance overhead from Python↔C++ boundary crossing on every cycle; may be acceptable for small designs but problematic for large ones (our 64-node NoC runs ~10M cycles)
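The boundary-crossing cost is easy to estimate empirically. The sketch below times a trivial libc call via ctypes as a stand-in for one model `eval()` step; the numbers are machine-dependent and the two-crossings-per-cycle figure is an assumption, but the order of magnitude is the point:

```python
import ctypes
import time

# Rough, illustrative FFI-overhead measurement: one trivial libc call (abs)
# stands in for one Python->C++ model eval() step per cycle.
libc = ctypes.CDLL(None)  # POSIX: symbols of the running process, incl. libc
N = 200_000
t0 = time.perf_counter()
for i in range(N):
    libc.abs(i)
per_call = (time.perf_counter() - t0) / N

# Extrapolate to the 64-node NoC run: ~10M cycles, assuming 2 crossings/cycle
projected = 10_000_000 * 2 * per_call
print(f"~{per_call * 1e9:.0f} ns per FFI call; "
      f"projected crossing overhead over 10M cycles: ~{projected:.1f} s")
```

Even at ~1 µs per crossing, a 10M-cycle run spends tens of seconds purely on the language boundary, before any Python-side stimulus or checking logic runs.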

Option B: Hybrid — pyCircuit DUT + external hand-written TB

pyCircuit focuses on DUT compilation and keeps only an optional lightweight TB for smoke tests. For serious verification, users write their own C++ or SystemVerilog testbench against the generated model.

This requires:

  1. A build command that works without a mandatory @testbench (see the discussion in PR #39, "feat: add drive_when conditional drive API for reactive testbenches")
  2. A well-documented C++ model API so external TBs can instantiate and drive the DUT
  3. Optionally, pyCircuit generates a TB skeleton (clock/reset boilerplate, port declarations) that users extend

Pros:

  • No compiler complexity — leverage existing C++/SV verification ecosystem
  • Full expressiveness: users can use UVM, cocotb, or any framework
  • Natural for agent-assisted workflows: an AI agent can generate C++ TBs using the documented model API

Cons:

  • Two-language workflow (Python for design, C++ for verification)
  • Users need to understand the generated C++ model API

Option C: TB skeleton generation + plugin points

A middle ground: pyCircuit generates a complete TB harness (clock, reset, port bindings, VCD tracing) but provides explicit plugin points where users inject custom stimulus/checking logic.

```python
@testbench
def tb(t: Tb) -> None:
    t.clock("clk")
    t.reset("rst", active_low=True)

    # Built-in: simple drives/expects for smoke test
    t.drive("start", 1, at=0)
    t.expect("done", 1, at=100)

    # Plugin: user-provided C++ function called every cycle
    t.on_cycle("my_stimulus.cpp::inject_traffic")

    # Plugin: user-provided checker
    t.on_cycle("my_checker.cpp::check_output")
```

The generated TB calls user-supplied C++ functions at the appropriate points, with access to the DUT port struct.
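To make the contract concrete, here is a hypothetical model of the plugin mechanism, written in Python for clarity. `DutPorts`, `Harness`, and the hook signature are illustrative, not pyCircuit API; in Option C the generated C++ harness would play the role of `Harness`:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class DutPorts:                 # stand-in for the generated DUT port struct
    start: int = 0
    done: int = 0

@dataclass
class Harness:
    ports: DutPorts = field(default_factory=DutPorts)
    hooks: List[Callable[[DutPorts, int], None]] = field(default_factory=list)

    def on_cycle(self, fn: Callable[[DutPorts, int], None]) -> None:
        self.hooks.append(fn)         # register a stimulus/checker plugin

    def run(self, cycles: int) -> None:
        for cyc in range(cycles):
            # (clocking, reset, and DUT eval would happen here)
            for fn in self.hooks:
                fn(self.ports, cyc)   # each plugin sees ports + cycle count

# Usage: one stimulus plugin, one checker plugin
h = Harness()
h.on_cycle(lambda p, c: setattr(p, "start", 1 if c == 0 else 0))
seen = []
h.on_cycle(lambda p, c: seen.append((c, p.start)))
h.run(3)
print(seen)   # [(0, 1), (1, 0), (2, 0)] — start pulses high on cycle 0 only
```

This also surfaces the open API questions listed below: what the hook receives (ports only, or also simulation time and a way to stop the run) and where plugin state lives.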

Pros:

  • Best of both worlds: pyCircuit handles boilerplate, user handles complex logic
  • Progressive complexity: start with t.drive()/t.expect(), upgrade to plugins when needed
  • Single build flow

Cons:

  • API design complexity (what context to pass to plugins, how to handle state)
  • Still requires C++ for complex verification

Our Experience

In practice, we ended up with Option B organically:

  1. Used @testbench + drive_when for initial smoke tests
  2. Wrote a separate C++ testbench for multi-round all-to-all traffic verification
  3. Used Verilator with a hand-written SV testbench for final validation

The main pain point was that build currently requires @testbench, making it impossible to compile just the DUT without a dummy testbench function.

Questions for Discussion

  1. What is the long-term vision for @testbench — is it intended to grow into a full verification language, or remain a lightweight smoke-test tool?
  2. Is there interest in supporting build without @testbench (DUT-only compilation)?
  3. Would a plugin/hook mechanism (Option C) be worth exploring?
  4. For Python-in-the-loop (Option A2), are there known performance benchmarks for the FFI overhead?
