[https://nvbugs/6050483][fix] Pin diffusers to 0.37.1 to fix UniPC scheduler device mismatch #13017

Open
chang-l wants to merge 2 commits into NVIDIA:main from chang-l:fix/nvbug-6050483-pin-diffusers-unipc-scheduler

@chang-l chang-l commented Apr 14, 2026

Summary

Root Cause

The tests installed diffusers from git+https://github.com/huggingface/diffusers.git (bleeding edge). Diffusers commit b114620 (Apr 3, 2026) changed the UniPC scheduler to:

# NEW (broken) - rks has CPU tensors from self.sigmas loop + CUDA tensor from ones()
rks.append(torch.ones((), device=device))
rks = torch.stack(rks)   # CRASH: mixed CPU/CUDA

Previously:

# OLD (correct) - coerces entire list to target device
rks.append(1.0)
rks = torch.tensor(rks, device=device)
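
A quick way to confirm this kind of mismatch before calling `torch.stack` is to collect the devices present in the list. The sketch below is illustrative only: `distinct_devices` is a hypothetical helper, not part of diffusers or TensorRT-LLM, and it uses stand-in objects so it runs without PyTorch or a GPU. It is duck-typed on the `.device` attribute, which `torch.Tensor` also exposes.

```python
from types import SimpleNamespace


def distinct_devices(items):
    """Return the set of device names found in `items`.

    Hypothetical helper: anything with a `.device` attribute (such as a
    torch.Tensor) is reported via str(device); plain Python numbers,
    which have no device, are reported as "python".
    """
    return {str(getattr(item, "device", "python")) for item in items}


# Stand-ins for tensors so the sketch runs without PyTorch or a GPU.
cpu_ratio = SimpleNamespace(device="cpu")    # like a value derived from self.sigmas
cuda_one = SimpleNamespace(device="cuda:0")  # like torch.ones((), device=device)

mixed = distinct_devices([cpu_ratio, cpu_ratio, cuda_one])
uniform = distinct_devices([cpu_ratio, cpu_ratio])

# More than one device in the set means torch.stack would reject the list.
print(sorted(mixed), sorted(uniform))
```

If the set has more than one entry, the list is in the state described above for the new code path.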

Test plan

  • Verified diffusers==0.37.1 has AutoencoderKLWan, FlowMatchEulerDiscreteScheduler, UniPCMultistepScheduler (all needed classes)
  • Reproduced crash with diffusers 0.38.0.dev0 on B200 (UniPC step 1 crashes)
  • Verified fix with diffusers 0.37.1 on B200 (all scheduler steps pass)
  • CI: /bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Post-Merge-1, DGX_B200-4_GPUs-PyTorch-Post-Merge-2"
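
The class-availability check in the first bullet could be scripted roughly as follows. This is a minimal sketch, not the check that was actually run: `missing_attrs` and `REQUIRED` are hypothetical names, and the helper only reports which top-level names a module fails to expose.

```python
import importlib

# Names listed in the test plan as required by the visual-gen tests.
REQUIRED = [
    "AutoencoderKLWan",
    "FlowMatchEulerDiscreteScheduler",
    "UniPCMultistepScheduler",
]


def missing_attrs(module_name, names):
    """Import `module_name` and return the subset of `names` it does not expose."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


# Intended use, in an environment where the pinned release is installed:
#   assert missing_attrs("diffusers", REQUIRED) == []
```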

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Tests
    • Updated test dependency configuration to use a stable released version instead of development source, improving test consistency and reproducibility.

…heduler device mismatch

The VBench integration tests install diffusers from git HEAD, which
pulled in huggingface/diffusers#13356. That PR changed
`UniPCMultistepScheduler.multistep_uni_p_bh_update` to use
`torch.stack(rks)` on a list mixing CPU tensors (from `self.sigmas`,
intentionally kept on CPU) with a CUDA tensor (`torch.ones((), device=device)`),
causing `RuntimeError: Expected all tensors to be on the same device`.

Pin diffusers to 0.37.1 (latest stable release) which uses the
original correct pattern: `torch.tensor(rks, device=device)`.

Verified on B200:
- diffusers 0.38.0.dev0 → crashes at UniPC step 1
- diffusers 0.37.1       → passes all steps

Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
@chang-l chang-l requested a review from a team as a code owner April 14, 2026 00:01
coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

The _visual_gen_deps test fixture dependency installation is modified to pin the diffusers package to a specific released version (0.37.1) instead of installing directly from the Hugging Face GitHub repository development source.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Fixture Configuration: `tests/integration/defs/examples/test_visual_gen.py` | Updated diffusers dependency installation from git source (`git+https://github.com/huggingface/diffusers.git`) to fixed release version (`diffusers==0.37.1`). |
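
Because the fixture now depends on one exact release, a guard that compares the installed version against the pin can catch an environment where a different diffusers build is already present. The following is a sketch under that assumption. `installed_version` and `matches_pin` are hypothetical helpers and are not part of the changed test file. They use only the standard-library `importlib.metadata`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """True only when `package` is installed at exactly the `pinned` version."""
    return installed_version(package) == pinned


# Intended use after the fixture's pip install step:
#   assert matches_pin("diffusers", "0.37.1")
```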

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: pinning diffusers to version 0.37.1 to fix a UniPC scheduler device mismatch issue. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The PR description is comprehensive and follows the template structure with clear sections for Summary, Root Cause, and Test plan. |


chang-l commented Apr 14, 2026

/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Post-Merge-1, DGX_B200-4_GPUs-PyTorch-Post-Merge-2"

@tensorrt-cicd (Collaborator)

PR_Github #43116 [ run ] triggered by Bot. Commit: afe37cb

@tensorrt-cicd (Collaborator)

PR_Github #43116 [ run ] completed with state SUCCESS. Commit: afe37cb
/LLM/main/L0_MergeRequest_PR pipeline #33750 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

# Pin diffusers to 0.37.1 to avoid device-mismatch regression in
# UniPCMultistepScheduler.multistep_uni_p_bh_update introduced by
# huggingface/diffusers#13356 (torch.stack on mixed CPU/CUDA rks).
llm_venv.run_cmd(["-m", "pip", "install", "diffusers==0.37.1"])

def _visual_gen_deps(llm_venv):
    """Install av + diffusers + ffmpeg once per session (shared by all video-gen fixtures)."""
    llm_venv.run_cmd(["-m", "pip", "install", "av"])
    llm_venv.run_cmd(["-m", "pip", "install", "git+https://github.com/huggingface/diffusers.git"])
Review comment (Member):

We also need to unwaive the corresponding waived tests.

… waivers

Remove 5 test waivers added for nvbug 6050483 so CI will exercise the
previously-failing visual gen tests with the pinned diffusers version.

Removed waivers:
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_nvfp4
- examples/test_visual_gen.py::test_vbench_dimension_score_wan22_a14b_fp8
- examples/test_visual_gen.py::test_vbench_dimension_score_wan
- visual_gen/test_visual_gen_benchmark.py::test_online_benchmark[openai-videos]
- visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark
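
Removing every waiver tied to one bug can be expressed as a simple line filter. The sketch below is illustrative. It assumes each waiver line carries the bug URL (for example `... SKIP (https://nvbugs/6050483)`), which is an assumption about the waiver file format rather than something shown in this PR, and `drop_waivers` is a hypothetical helper.

```python
def drop_waivers(lines, bug_id):
    """Return `lines` without the entries that reference nvbugs/<bug_id>.

    Assumes (hypothetically) that each waiver line embeds the bug URL.
    """
    marker = f"nvbugs/{bug_id}"
    return [line for line in lines if marker not in line]


# Sample lines in the assumed format; the second entry is made up.
sample = [
    "examples/test_visual_gen.py::test_vbench_dimension_score_wan SKIP (https://nvbugs/6050483)",
    "examples/test_other.py::test_unrelated SKIP (https://nvbugs/0000000)",
    "visual_gen/test_visual_gen_benchmark.py::test_offline_benchmark SKIP (https://nvbugs/6050483)",
]

remaining = drop_waivers(sample, "6050483")
```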

Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
@chang-l chang-l requested a review from a team as a code owner April 14, 2026 16:33
@chang-l chang-l force-pushed the fix/nvbug-6050483-pin-diffusers-unipc-scheduler branch from f8abb37 to dc7ae45 Compare April 14, 2026 16:35
3 participants