LOOM targets a split execution model:
- the host program runs on the CPU side
- the selected kernel region is executed by the mapped accelerator
The host-accelerator contract must work in both standalone simulation and gem5.
LOOM may generate a separate host-side file in which the original direct kernel call is replaced by an accelerator invocation sequence.
The host-side flow is:
- prepare input buffers and scalar arguments
- load accelerator configuration
- bind external memory regions
- launch execution
- wait for completion
- read back outputs or memory side effects
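As an illustration only, the flow above can be sketched against a mock accelerator. `MockAccelerator`, its methods, the memory layout, and the hard-coded scaling kernel are all assumptions made for this sketch; they are not LOOM's generated code or its real host API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an accelerator handle; not LOOM's actual API.
// Its "mapped kernel" is hard-coded: out[i] = in[i] * scale.
class MockAccelerator {
public:
    void loadConfig(const std::vector<std::uint32_t>& words) { config_ = words; }
    void bindMemory(std::vector<std::int32_t>* region) { memory_ = region; }
    void setScalar(std::int32_t scale) { scale_ = scale; }

    // Assumed layout: inputs in [0, n), outputs in [n, 2n) of the bound region.
    void launch(std::size_t n) {
        assert(!config_.empty() && memory_ != nullptr);
        assert(memory_->size() >= 2 * n);
        for (std::size_t i = 0; i < n; ++i) {
            (*memory_)[n + i] = (*memory_)[i] * scale_;
        }
        done_ = true;  // the mock completes synchronously
    }
    bool done() const { return done_; }

private:
    std::vector<std::uint32_t> config_;
    std::vector<std::int32_t>* memory_ = nullptr;
    std::int32_t scale_ = 1;
    bool done_ = false;
};

// Host-side replacement for a direct call such as scale(src, dst, n, k).
std::vector<std::int32_t> runScaleOnAccelerator(
        const std::vector<std::int32_t>& src, std::int32_t k) {
    const std::size_t n = src.size();

    // 1. Prepare input buffers and scalar arguments.
    std::vector<std::int32_t> memory(2 * n, 0);
    std::copy(src.begin(), src.end(), memory.begin());

    MockAccelerator acc;
    acc.loadConfig({0x1u});   // 2. Load configuration (placeholder word).
    acc.bindMemory(&memory);  // 3. Bind the external memory region.
    acc.setScalar(k);
    acc.launch(n);            // 4. Launch execution.
    while (!acc.done()) {}    // 5. Wait for completion.

    // 6. Read back the memory side effects.
    return std::vector<std::int32_t>(
        memory.begin() + static_cast<std::ptrdiff_t>(n), memory.end());
}
```

In this sketch the result is only visible through the bound memory region, which is why the final step reads memory rather than a return value.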
The runtime-facing simulation contract is centered on a session abstraction with the following responsibilities:
- build from mapped DFG and ADG state
- load configuration words
- inject scalar or stream inputs
- bind a backing memory region for external memories
- invoke execution
- collect outputs, traces, and statistics
The exact C++ surface may evolve, but these responsibilities are stable.
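One possible shape for such a session, shown purely as a sketch: the interface `SimSession`, the placeholder `MappedDesign`, the `RunResult` struct, and the toy `EchoSession` are hypothetical names chosen here to mirror the listed responsibilities. They are not the actual LOOM C++ surface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Placeholder for mapped DFG and ADG state (hypothetical).
struct MappedDesign { std::string dfgName; std::string adgName; };

// Outputs, traces, and statistics gathered after a run (hypothetical).
struct RunResult {
    std::map<std::string, std::vector<std::int64_t>> outputs;
    std::vector<std::string> trace;
    std::uint64_t cycles = 0;
};

// One virtual method per responsibility; construction covers "build".
class SimSession {
public:
    virtual ~SimSession() = default;
    virtual void loadConfig(const std::vector<std::uint32_t>& words) = 0;
    virtual void setInput(const std::string& port,
                          const std::vector<std::int64_t>& tokens) = 0;
    virtual void bindMemory(const std::string& name,
                            std::uint8_t* base, std::size_t bytes) = 0;
    virtual void invoke() = 0;
    virtual RunResult collect() const = 0;
};

// Toy implementation: echoes every input port to an output port of the
// same name and counts one cycle per token. It only demonstrates the
// call sequence, not real accelerator semantics.
class EchoSession : public SimSession {
public:
    explicit EchoSession(const MappedDesign& design) {
        trace_.push_back("build:" + design.dfgName + "@" + design.adgName);
    }
    void loadConfig(const std::vector<std::uint32_t>& words) override {
        config_ = words;
        trace_.push_back("loadConfig");
    }
    void setInput(const std::string& port,
                  const std::vector<std::int64_t>& tokens) override {
        inputs_[port] = tokens;
        trace_.push_back("setInput:" + port);
    }
    void bindMemory(const std::string& name,
                    std::uint8_t* base, std::size_t bytes) override {
        assert(base != nullptr || bytes == 0);
        regions_[name] = Region{base, bytes};
        trace_.push_back("bindMemory:" + name);
    }
    void invoke() override {
        outputs_ = inputs_;
        cycles_ = 0;
        for (const auto& kv : inputs_) cycles_ += kv.second.size();
        trace_.push_back("invoke");
    }
    RunResult collect() const override {
        RunResult r;
        r.outputs = outputs_;
        r.trace = trace_;
        r.cycles = cycles_;
        return r;
    }

private:
    struct Region { std::uint8_t* base; std::size_t bytes; };
    std::vector<std::uint32_t> config_;
    std::map<std::string, std::vector<std::int64_t>> inputs_;
    std::map<std::string, std::vector<std::int64_t>> outputs_;
    std::map<std::string, Region> regions_;
    std::vector<std::string> trace_;
    std::uint64_t cycles_ = 0;
};
```

Keeping the interface abstract is one way to let the concrete surface evolve while the responsibilities stay fixed.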
The same simulation core should serve both:
- standalone LOOM simulation
- gem5 device-backed execution
The two environments differ only in where inputs, memory, and completion handling come from. The mapped accelerator semantics must stay identical across both.
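A minimal sketch of this separation, assuming a hypothetical `MemoryBackend` interface: the shared core `runCore` touches memory only through that interface, so a standalone backend and a device-style backend can be swapped without changing the core. `CallbackMemory` stands in for a device-backed environment by forwarding accesses to callbacks; it does not use or represent any real gem5 API, and the prefix-sum kernel is just a placeholder workload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical abstraction over "where memory comes from".
class MemoryBackend {
public:
    virtual ~MemoryBackend() = default;
    virtual std::int32_t read(std::size_t index) = 0;
    virtual void write(std::size_t index, std::int32_t value) = 0;
};

// Standalone simulation: memory is a plain in-process buffer.
class StandaloneMemory : public MemoryBackend {
public:
    explicit StandaloneMemory(std::vector<std::int32_t> init)
        : data_(std::move(init)) {}
    std::int32_t read(std::size_t i) override { return data_.at(i); }
    void write(std::size_t i, std::int32_t v) override { data_.at(i) = v; }
    const std::vector<std::int32_t>& data() const { return data_; }

private:
    std::vector<std::int32_t> data_;
};

// Device-style environment: accesses are forwarded to callbacks supplied
// by the surrounding system (a stand-in only, not a gem5 interface).
class CallbackMemory : public MemoryBackend {
public:
    using ReadFn = std::function<std::int32_t(std::size_t)>;
    using WriteFn = std::function<void(std::size_t, std::int32_t)>;
    CallbackMemory(ReadFn r, WriteFn w)
        : read_(std::move(r)), write_(std::move(w)) {
        assert(read_ && write_);
    }
    std::int32_t read(std::size_t i) override { return read_(i); }
    void write(std::size_t i, std::int32_t v) override { write_(i, v); }

private:
    ReadFn read_;
    WriteFn write_;
};

// Shared core: an in-place prefix sum over the first n words.
// It is identical for both backends.
void runCore(MemoryBackend& mem, std::size_t n) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += mem.read(i);
        mem.write(i, acc);
    }
}
```

Running `runCore` against both backends with the same initial contents should leave both memories in the same final state, which is the property the paragraph above requires.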
LOOM must treat memory side effects as first-class outputs. A kernel that produces its final result by storing into external memory is not fully validated by checking return tokens alone.
Therefore the host contract must support:
- output-port comparison
- post-execution memory comparison
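The two checks can be combined into a single validation step. The sketch below assumes hypothetical `Observation` and `ValidationReport` types and a flat byte image of the bound memory region; it is one way to express the requirement, not LOOM's validation code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel meaning "no differing byte was found" (hypothetical).
constexpr std::size_t kNoMismatch = static_cast<std::size_t>(-1);

// What a run exposes for checking: output-port tokens plus the final
// contents of the bound external memory region.
struct Observation {
    std::vector<std::int64_t> outputTokens;
    std::vector<std::uint8_t> memory;
};

struct ValidationReport {
    bool outputsMatch = false;
    bool memoryMatches = false;
    std::size_t firstMemoryMismatch = kNoMismatch;
    // A run passes only if both comparisons succeed.
    bool passed() const { return outputsMatch && memoryMatches; }
};

ValidationReport validate(const Observation& expected,
                          const Observation& actual) {
    ValidationReport r;
    r.outputsMatch = (expected.outputTokens == actual.outputTokens);
    r.memoryMatches = (expected.memory == actual.memory);
    if (!r.memoryMatches) {
        // Report the first differing offset; a pure size difference is
        // reported at the end of the shorter image.
        const std::size_t common =
            std::min(expected.memory.size(), actual.memory.size());
        std::size_t i = 0;
        while (i < common && expected.memory[i] == actual.memory[i]) ++i;
        r.firstMemoryMismatch = i;
    }
    assert(r.memoryMatches == (r.firstMemoryMismatch == kNoMismatch));
    return r;
}
```

For example, a run whose output tokens match the reference but whose memory image differs at offset 2 would report `outputsMatch == true` and still fail `passed()`.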
Related documents:
- Top-level pipeline: spec-loom.md
- Exploration and validation loop: spec-dse.md
- Mapper outputs: spec-mapper-output.md