Fix NVFP4 FakeTensor compatibility with PyTorch 2.9.1 by wobba · Pull Request #26 · deepbeepmeep/mmgp

wobba · 2026-01-20T09:07:52Z

Add the @torch.compiler.disable decorator to the _lora_linear_forward method so that torch.compile does not trace through NVFP4 quantized weight operations. This avoids FakeTensor mixing errors in PyTorch 2.9.1+.

This allows NVFP4 4-bit quantization to work correctly with RTX 50xx series GPUs and their optimized lightx2v kernels.


wobba wants to merge 1 commit into deepbeepmeep:main from wobba:fix-nvfp4-pytorch291
