[https://nvbugs/6050483][fix] Pin diffusers to 0.37.1 to fix UniPC scheduler device mismatch#13017
Conversation
[fix] Pin diffusers to 0.37.1 to fix UniPC scheduler device mismatch

The VBench integration tests install diffusers from git HEAD, which pulled in huggingface/diffusers#13356. That PR changed `UniPCMultistepScheduler.multistep_uni_p_bh_update` to use `torch.stack(rks)` on a list mixing CPU tensors (from `self.sigmas`, intentionally kept on CPU) with a CUDA tensor (`torch.ones((), device=device)`), causing `RuntimeError: Expected all tensors to be on the same device`.

Pin diffusers to 0.37.1 (the latest stable release), which uses the original, correct pattern: `torch.tensor(rks, device=device)`.

Verified on B200:
- diffusers 0.38.0.dev0 → crashes at UniPC step 1
- diffusers 0.37.1 → passes all steps

Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
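The failure mode described above can be illustrated without a GPU. The sketch below uses a minimal stand-in for device-tagged tensors; `FakeTensor`, `stack`, and `tensor` are hypothetical stand-ins that mimic the relevant PyTorch behavior (`torch.stack` refuses mixed-device inputs, while `torch.tensor(list, device=...)` copies every element to the target device), not the real API:

```python
# Illustrative stand-ins, NOT the real PyTorch API: stack() rejects
# mixed-device inputs, tensor() copies everything to one device.
from dataclasses import dataclass

@dataclass
class FakeTensor:
    value: float
    device: str = "cpu"

def stack(tensors):
    """Like torch.stack: refuses to combine tensors from different devices."""
    if len({t.device for t in tensors}) > 1:
        raise RuntimeError("Expected all tensors to be on the same device")
    return [t.value for t in tensors]

def tensor(tensors, device):
    """Like torch.tensor(list, device=...): copies each element to `device`,
    so mixed-device inputs are fine."""
    return [FakeTensor(t.value, device) for t in tensors]

# rks as built by the scheduler: sigma ratios on CPU plus a CUDA scalar
rks = [FakeTensor(0.5), FakeTensor(0.25), FakeTensor(1.0, "cuda:0")]

try:
    stack(rks)          # the pattern introduced by diffusers#13356
except RuntimeError as e:
    print(e)            # Expected all tensors to be on the same device

out = tensor(rks, device="cuda:0")  # the 0.37.1 pattern: always succeeds
print([t.device for t in out])      # ['cuda:0', 'cuda:0', 'cuda:0']
```

The point is that the old pattern re-allocates every element on the target device, so the provenance of the list entries never matters.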
/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Post-Merge-1, DGX_B200-4_GPUs-PyTorch-Post-Merge-2"

PR_Github #43116 [ run ] triggered by Bot. Commit:

PR_Github #43116 [ run ] completed with state
```python
# Pin diffusers to 0.37.1 to avoid device-mismatch regression in
# UniPCMultistepScheduler.multistep_uni_p_bh_update introduced by
# huggingface/diffusers#13356 (torch.stack on mixed CPU/CUDA rks).
llm_venv.run_cmd(["-m", "pip", "install", "diffusers==0.37.1"])
```
Do we need to pin https://github.com/NVIDIA/TensorRT-LLM/blob/main/requirements.txt#L7 as well?
```python
def _visual_gen_deps(llm_venv):
    """Install av + diffusers + ffmpeg once per session (shared by all video-gen fixtures)."""
    llm_venv.run_cmd(["-m", "pip", "install", "av"])
    llm_venv.run_cmd(["-m", "pip", "install", "git+https://github.com/huggingface/diffusers.git"])
```
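Since the bug here came from installing an unpinned git HEAD, a session fixture could sanity-check that an exact pin actually landed in the venv before running anything. The helper below is a hypothetical sketch (it is not part of the test suite) built on the standard-library `importlib.metadata`:

```python
# Hypothetical guard: confirm an exact '==' pin such as "diffusers==0.37.1"
# matches the installed distribution before tests run.
from importlib.metadata import version, PackageNotFoundError

def satisfies_pin(pin: str) -> bool:
    """Return True iff the package named in an '==' pin is installed at
    exactly the pinned version; False if it is missing entirely."""
    name, _, wanted = pin.partition("==")
    try:
        return version(name) == wanted
    except PackageNotFoundError:
        return False

# Demonstrate against whatever happens to be installed in this environment:
import importlib.metadata as md
some_dist = next(iter(md.distributions()))   # any installed package
name = some_dist.metadata["Name"]
print(satisfies_pin(f"{name}=={some_dist.version}"))   # True
print(satisfies_pin("definitely-not-installed==1.0"))  # False
```

A fixture could call `satisfies_pin("diffusers==0.37.1")` right after the `pip install` and fail fast with a clear message instead of crashing mid-inference at UniPC step 1.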
And we need to unwaive the corresponding waived tests.
… waivers

Remove 5 test waivers added for nvbug 6050483 so CI will exercise the previously failing visual gen tests with the pinned diffusers version.

Removed waivers:
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_nvfp4
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_fp8
- examples/test_visual_gen.py::test_vbench_dimension_score_wan
- visual_gen/test_visual_gen_benchmark.py::test_online_benchmark[openai-videos]
- visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark

Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
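Unwaiving amounts to deleting the entries for one bug id from a waiver list. Assuming a simple line-per-waiver format of the shape `<test id> SKIP (<reason>)` (the repo's actual waiver-file format is an assumption here), the operation can be sketched as:

```python
# Hypothetical sketch: drop waiver entries whose reason mentions one bug id.
# The "<test id> SKIP (<reason>)" line format is an assumption, not the
# repo's confirmed waiver syntax.
def drop_waivers(lines, bug_id):
    """Keep every waiver line whose text does not mention bug_id."""
    return [ln for ln in lines if bug_id not in ln]

waivers = [
    "examples/test_visual_gen.py::test_vbench_dimension_score_wan SKIP (https://nvbugs/6050483)",
    "examples/test_other.py::test_unrelated SKIP (https://nvbugs/1234567)",
]
kept = drop_waivers(waivers, "6050483")
print(len(kept))   # 1
print(kept[0])     # the unrelated waiver survives
```

Filtering by bug id rather than by test name keeps the change reviewable: every removed line is traceably tied to the fix being verified.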
Force-pushed from f8abb37 to dc7ae45
Summary
- Pin diffusers to 0.37.1 to avoid the device-mismatch regression in `UniPCMultistepScheduler.multistep_uni_p_bh_update` introduced by huggingface/diffusers#13356
- That PR replaced `torch.tensor(rks, device=device)` with `torch.stack(rks)`, where `rks` is a list mixing CPU tensors (from `self.sigmas`, intentionally kept on CPU) with a CUDA tensor, causing `RuntimeError: Expected all tensors to be on the same device`
- Unwaive the affected tests (`test_vbench_dimension_score_wan22_a14b_nvfp4`, `test_vbench_dimension_score_wan22_a14b_fp8`, `test_vbench_dimension_score_wan`, `test_online_benchmark`, `test_offline_benchmark`)

Root Cause
The tests installed diffusers from `git+https://github.com/huggingface/diffusers.git` (bleeding edge). Diffusers commit b114620 (Apr 3, 2026) changed the UniPC scheduler to use `torch.stack(rks)`; previously it used `torch.tensor(rks, device=device)`.
Test plan
- Verified `diffusers==0.37.1` has `AutoencoderKLWan`, `FlowMatchEulerDiscreteScheduler`, and `UniPCMultistepScheduler` (all needed classes)
- `/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Post-Merge-1, DGX_B200-4_GPUs-PyTorch-Post-Merge-2"`

🤖 Generated with Claude Code