LOOM simulation is the execution-side validation layer for mapped designs. The normative execution core is a shared cycle-accurate kernel that is reused by:
- standalone mapped simulation
- runtime replay
- gem5 embedded execution
The simulated object is the mapped ADG hardware topology plus mapping overlay, not the original software DFG alone.
Standalone simulation must support synthetic or generated inputs, including:
- scalar and stream test vectors
- pre-filled external memory contents
- golden reference comparison inputs
The simulation interface must also support externally provided data when used by host-side or gem5-driven flows.
The simulation layer provides a session-like abstraction with operations for:
- build from mapped state
- config load
- input injection
- external-memory binding
- invoke or reset
- output retrieval
- trace and performance retrieval
This contract is shared conceptually with
spec-host-accel-interface.md and is realized
today by SimSession.
LOOM's intended simulation model is cycle-accurate with two phases per cycle:
- drive boundary inputs
- evaluate combinational behavior of switches, PEs, FIFOs, and memory fronts
- propagate valid, ready, and data
- iterate until the combinational state converges or a limit is reached
Normative rule:
max_comb_iterations = 4- failure to converge within that bound is a structural simulation error
- commit transfers
- update FIFO state
- update PE or temporal instruction state
- update memory completion state
- collect boundary outputs
The separation between combinational convergence and sequential state advance is normative. Tokens produced during commit are visible starting in the next cycle, not earlier in the same cycle.
The cycle kernel tracks:
donequiescentdeadlock
Normative rules:
quiescentmeans no more transfer or fire can occur in the current mapped hardware statedonemeans all software-visible completion obligations extracted from the mapping overlay are satisfied and the kernel is quiescentdeadlockmeans the kernel is quiescent while completion obligations are still unsatisfied
Completion obligations come from the mapping overlay and include only software-visible outputs and memory side effects. Residual non-obligation control tails do not block completion.
Successful completion is additionally audited as:
- software-visible obligations satisfied
- hardware empty
where "hardware empty" means the mapped ADG has no live channel token and no module-internal pending work remaining.
The simulation architecture should accommodate at least these conceptual module types:
- PE simulation
- switch simulation
- FIFO simulation
- memory and extmemory simulation
- stream and dataflow helper primitives
The simulator's FU execution model must recognize the current validated
fabric.function_unit body op set. Current supported compute ops include
arith.minimumf, math.absf, math.floor, math.fma, math.rsqrt,
math.sqrt, and the other operations listed in
spec-fabric-function_unit-ops.md.
An unsupported FU-body op must be rejected before execution begins rather than
being silently ignored.
The shared cycle kernel currently exposes an interface equivalent in responsibility to:
build(staticModel)configure(configImage)setInputTokens(portIdx, tokens)setMemoryRegionBacking(regionId, bytes, size)runUntilBoundary(maxCycles)getLastBoundaryReason()getOutputTokens(portIdx)getTraceDocument()getCurrentCycle()
The current boundary reasons are:
NeedMemIssueWaitMemRespInvocationDoneDeadlockBudgetHit
In the embedded gem5 path, NeedMemIssue and WaitMemResp are the normative
memory boundaries between the shared kernel and the device-side DMA adapter.
Simulation channels conceptually carry:
- valid
- ready
- data
- optional tag
The exact in-memory C++ representation may evolve, but these fields define the architectural behavior.
LOOM simulation should support:
- event traces
- per-node activity or stall accounting
- invocation start and completion markers
- a machine-readable result artifact containing resolved output tokens and post-execution memory-region snapshots
Current standalone artifact families include:
<mixed>.sim.setup.json<mixed>.sim.result.json<mixed>.sim.report.json<mixed>.sim.trace<mixed>.sim.stat<mixed>.simimage.json<mixed>.simimage.bin
The trace artifact is a versioned JSON document. The top-level fields are:
versiontrace_kindproducerepoch_idinvocation_idcore_idmodulesevents
This versioned trace is the contract consumed by HTML playback.
These outputs are relevant both for debugging and for later visualization or performance analysis.
Simulation must support validation by:
- output-port comparison
- post-execution memory comparison
This is required because many accelerator kernels communicate results primarily through memory side effects.
LOOM now emits a runtime image that captures the mapped static model and decoded control bindings needed by the shared kernel. The runtime manifest records:
sim_image_jsonsim_image_bin
The runtime image is the preferred handoff artifact for gem5 embedded execution.