feat(megatron): add nccl_comm_warmup to avoid iteration-1 NCCL cudaMalloc OOM (#6387) #9602
Open
yuchenwang3 wants to merge 1 commit into