[DistInf] Enable RDMA over Ionic AINICs for MoRI EP disaggregated inference #147
Draft
raviguptaamd wants to merge 2 commits into ROCm:develop
added 2 commits
April 16, 2026 04:11
…erence

Enable MoRI IO KV cache transfer over Ionic RDMA NICs on clusters where public IPs are not routable between compute nodes.

Key changes:
- Mount host RDMA libraries (libionic, libibverbs, librdmacm) and provider directory into the container so MORI IO can discover Ionic NICs
- Set VLLM_HOST_IP to each node's overlay IP so MoRIIO control plane (ZMQ handshake, block allocation notifications, proxy registration) routes through the overlay network instead of unreachable public IPs
- Pass through MORI RDMA env vars (MORI_IB_GID_INDEX, MORI_RDMA_DEVICES, MORI_IO_LOG_LEVEL) from the launcher into the container
- Switch from docker to podman for rootless container execution
- Use --overlap on srun commands to avoid blocking the SLURM job step
- Prefer 10.x.x.x overlay IPs for MASTER_ADDR and inter-node comms
- Prefer MODEL_DIR for model path resolution before standard paths
- Add PYTHONUNBUFFERED=1 for real-time Python log output
- Add launch_mori_1p1d.sh convenience launcher for 1P/1D benchmarks
- Update Dockerfile to install MORI from pinned commit on main

Tested: DeepSeek-V3 1P/1D on 2x MI355X nodes with Ionic AINICs, full benchmark suite (ISL/OSL: 1024/1024, 8192/1024, 1024/8192, concurrency: 8-512), all requests successful with 0 failures.

Made-with: Cursor
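The env-var pass-through described in this commit could look roughly like the sketch below. It is an illustration, not the PR's actual script: `passthrough_flags` is a hypothetical helper name, and it relies on podman's behaviour that `-e NAME` without a value copies `NAME` from the host environment. Only variables set on the launcher side are forwarded, so unset ones keep whatever default applies inside the container.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): emit "-e NAME" for each listed
# variable that is set in the calling shell, and skip the rest.
passthrough_flags() {
    flags=""
    for name in "$@"; do
        # ${VAR+set} expands to "set" only when VAR is set (even if empty).
        eval "is_set=\${$name+set}"
        if [ -n "$is_set" ]; then
            flags="${flags:+$flags }-e $name"
        fi
    done
    printf '%s\n' "$flags"
}

# Usage sketch. IMAGE and the trailing command are placeholders, and the
# variables must be exported for podman to see them:
#   export MORI_IB_GID_INDEX=1
#   podman run $(passthrough_flags MORI_IB_GID_INDEX MORI_RDMA_DEVICES MORI_IO_LOG_LEVEL) \
#       -e PYTHONUNBUFFERED=1 "$IMAGE" "$@"
```

Passing names rather than `NAME=value` pairs keeps values with spaces or commas out of the word-split command line.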
…Ionic AINIC

Extends the Ionic AINIC RDMA support to multi-node disaggregated inference with 2 Prefill + 2 Decode nodes (DP=16).

Key changes:
- Remove 1P/1D restriction from run_xPyD_models.slurm and vllm_disagg_mori_ep.sh to allow xP>1 / yD>1 topologies
- Add --ulimit memlock=-1:-1 to podman for large RDMA memory registrations (>32GB) required by MoRI IO
- Pass NCCL_IB_HCA, NCCL_IB_GID_INDEX, NCCL_NET_GDR_LEVEL, NCCL_CROSS_NIC, and MORI_SOCKET_IFNAME into containers for proper multi-node RCCL and MoRI bootstrap over Ionic AINICs
- Add apply_moriio_2pd_patches.sh for runtime vLLM patches (PR vllm-project/vllm#39276) fixing engine_id collisions and MoRIIO robustness in multi-node DP configurations
- Restrict --kv-transfer-config to master nodes only (child nodes join via --headless and participate in EP all-to-all)
- Add launch_mori_2p2d.sh example launcher for 2P/2D benchmarks

Tested on AAC MI355X cluster with Ionic RDMA NICs achieving balanced RDMA traffic across all 4 nodes and 1,344 tok/s total throughput on DeepSeek-V3-5layer.

Made-with: Cursor
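The master-only `--kv-transfer-config` rule from this commit can be sketched as a small flag selector. This is an illustration under stated assumptions, not the contents of `vllm_disagg_mori_ep.sh`: `node_role_flag` is a hypothetical helper, and the convention that rank 0 within a prefill or decode group is that group's master is assumed.

```shell
#!/bin/sh
# Hypothetical helper: decide which flag a node gets based on its rank
# inside its own prefill or decode group (0 = master, by assumption).
node_role_flag() {
    rank_in_group=$1
    if [ "$rank_in_group" -eq 0 ]; then
        # Master node: the only one that receives the KV transfer config.
        printf '%s\n' '--kv-transfer-config'
    else
        # Child node: joins headless and takes part in EP all-to-all only.
        printf '%s\n' '--headless'
    fi
}

# Usage sketch. RANK_IN_GROUP and KV_TRANSFER_CONFIG are assumed to be
# set elsewhere by the launcher:
#   case $(node_role_flag "$RANK_IN_GROUP") in
#       --kv-transfer-config) set -- "$@" --kv-transfer-config "$KV_TRANSFER_CONFIG" ;;
#       --headless)           set -- "$@" --headless ;;
#   esac
```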
Summary
- Mount host RDMA libraries (`libionic.so`, `libibverbs`, `librdmacm` and provider directories) into the container so MORI IO can discover Ionic NICs
- Set `VLLM_HOST_IP` to each node's overlay IP so the MoRIIO control plane (ZMQ handshake, block allocation notifications, proxy registration) routes through the routable overlay network instead of unreachable public IPs
- Pass through MORI RDMA env vars (`MORI_IB_GID_INDEX`, `MORI_RDMA_DEVICES`, `MORI_IO_LOG_LEVEL`) from the launcher into the container
- Use `--overlap` on `srun` commands to avoid blocking the SLURM job step
- Prefer `10.x.x.x` overlay IPs for `MASTER_ADDR` and inter-node communication
- Prefer `MODEL_DIR` for model path resolution before standard paths
- Add `PYTHONUNBUFFERED=1` for real-time Python log output
- Add `launch_mori_1p1d.sh` convenience launcher for 1P/1D benchmarks

Problem
On clusters with Ionic AINICs (back-end RDMA) and Broadcom NICs (front-end overlay network), the MoRIIO connector's `get_ip()` returns the public IP, which is not routable between compute nodes. As a result, the decode node cannot send block allocation notifications back to the prefill node, and the two sides deadlock: each hangs indefinitely waiting for the KV transfer.

Solution
- Set `VLLM_HOST_IP` per node to the overlay IP (10.x.x.x); `get_ip()` checks this env var first
- Mount host RDMA libraries into the container so `mori::io::RdmaManager` can discover Ionic NICs
- Set `MORI_IB_GID_INDEX=1` to select the correct RoCE v2 GID for Ionic

Test Plan
- `RdmaBackend` logs (`nic=ionic`)

Made with Cursor
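The first Solution step, choosing the overlay address for `VLLM_HOST_IP`, can be sketched as follows. This is a minimal illustration rather than the PR's launcher code: `pick_overlay_ip` is a hypothetical helper, and it assumes the overlay network is the only interface carrying a 10.x.x.x address. If no such address is present, it falls back to the first address it was given.

```shell
#!/bin/sh
# Hypothetical helper: print the first 10.x.x.x address among the
# arguments, or the first argument when none matches.
pick_overlay_ip() {
    fallback=""
    for ip in "$@"; do
        if [ -z "$fallback" ]; then
            fallback=$ip
        fi
        case $ip in
            10.*)
                printf '%s\n' "$ip"
                return 0
                ;;
        esac
    done
    if [ -n "$fallback" ]; then
        printf '%s\n' "$fallback"
    fi
}

# Usage sketch on a Linux node (hostname -I lists the node's addresses):
#   VLLM_HOST_IP=$(pick_overlay_ip $(hostname -I))
#   export VLLM_HOST_IP
```

Because `get_ip()` consults `VLLM_HOST_IP` first, exporting the result before launching vLLM is enough to steer the control plane onto the overlay network.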