TensorRT engines support two execution modes: synchronous and asynchronous. Synchronous execution is already supported and does not require a CUDA stream.
Supporting asynchronous execution requires a way to create CUDA streams and pass them to the execution context along with the data to be executed.
We may be able to get this from https://github.com/bheisler/RustaCUDA, since it already wraps the CUDA API.
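
To make the difference between the two modes concrete, here is a rough sketch of what the API surface could look like. Everything in it is hypothetical: `MockStream`, `MockContext`, `Buffer`, and the `execute`/`enqueue` method names are placeholders chosen for illustration, not this crate's API and not RustaCUDA's. The mock stream simply queues closures and runs them on `synchronize`, whereas a real CUDA stream would start the work on the GPU in the background.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a CUDA stream: work is queued in order and,
/// in this mock, only runs when the stream is synchronized. A real binding
/// would wrap a CUDA stream handle instead of a Vec of closures.
struct MockStream {
    pending: Vec<Box<dyn FnOnce()>>,
}

impl MockStream {
    fn new() -> Self {
        MockStream { pending: Vec::new() }
    }

    /// Run all queued work in submission order, then return.
    fn synchronize(&mut self) {
        for job in self.pending.drain(..) {
            job();
        }
    }
}

/// Shared output buffer, standing in for device memory (assumption).
type Buffer = Rc<RefCell<Vec<f32>>>;

/// Hypothetical execution context; the "engine" here just doubles each input.
struct MockContext;

impl MockContext {
    fn infer(input: &[f32]) -> Vec<f32> {
        input.iter().map(|x| x * 2.0).collect()
    }

    /// Sync mode: returns only once the result is ready; no stream needed.
    fn execute(&self, input: &[f32]) -> Vec<f32> {
        Self::infer(input)
    }

    /// Async mode: queue the work on `stream` and return immediately.
    /// `output` is only filled in after the stream has been synchronized.
    fn enqueue(&self, input: Vec<f32>, output: Buffer, stream: &mut MockStream) {
        stream.pending.push(Box::new(move || {
            *output.borrow_mut() = Self::infer(&input);
        }));
    }
}

fn main() {
    let ctx = MockContext;

    // Sync: the result comes straight back from the call.
    let sync_out = ctx.execute(&[1.0, 2.0, 3.0]);
    println!("sync result: {:?}", sync_out);

    // Async: the caller creates a stream, passes it in with the data,
    // and must synchronize before reading the output.
    let mut stream = MockStream::new();
    let output: Buffer = Rc::new(RefCell::new(Vec::new()));
    ctx.enqueue(vec![1.0, 2.0, 3.0], Rc::clone(&output), &mut stream);
    println!("queued jobs before synchronize: {}", stream.pending.len());
    stream.synchronize();
    println!("async result: {:?}", output.borrow());
}
```

The point of the sketch is the shape of the async call: the stream is created by the caller and passed alongside the data, so the same stream can also order memory copies around the inference. If RustaCUDA is used, its stream type would presumably take the place of `MockStream`.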