Code for paper "Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters"
Updated Aug 22, 2025 - Python
Sharding large embedding tables across multiple devices, with attention to the communication costs of parallel inference
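
As a rough illustration of the embedding-table sharding idea described above, here is a minimal sketch in plain Python. It is not taken from the repository: the function names `shard_rows` and `sharded_lookup`, the contiguous row-wise split, and the use of "rows returned per shard" as a proxy for communication volume are all assumptions made for the example, and the devices are only simulated as in-memory lists.

```python
def shard_rows(table, num_devices):
    """Split a table (list of rows) into contiguous row-wise shards.

    Returns a list of (start_row, rows) pairs, one per simulated device.
    Hypothetical helper for illustration only.
    """
    base, extra = divmod(len(table), num_devices)
    shards, start = [], 0
    for d in range(num_devices):
        # The first `extra` devices hold one additional row each.
        size = base + (1 if d < extra else 0)
        shards.append((start, table[start:start + size]))
        start += size
    return shards


def sharded_lookup(shards, ids):
    """Gather embedding rows for `ids` from row-wise shards.

    Returns the gathered rows (in request order) and the number of rows
    each shard had to return, a crude stand-in for communication volume.
    """
    out = [None] * len(ids)
    rows_sent = []
    for start, rows in shards:
        count = 0
        for pos, i in enumerate(ids):
            # Only the shard that owns row `i` answers this request.
            if start <= i < start + len(rows):
                out[pos] = rows[i - start]
                count += 1
        rows_sent.append(count)
    return out, rows_sent


# Toy table: 10 rows, embedding dimension 4 (values are arbitrary).
table = [[float(r * 10 + c) for c in range(4)] for r in range(10)]
shards = shard_rows(table, num_devices=3)
ids = [0, 9, 4, 4, 7]
emb, rows_sent = sharded_lookup(shards, ids)
```

In this toy setup the per-shard counts in `rows_sent` show how unevenly a batch of lookups can load the devices; a real system would also have to account for the cost of sending the ids out and the embeddings back over the interconnect.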