
Fix NVFP4 FakeTensor compatibility with PyTorch 2.9.1#26

Open
wobba wants to merge 1 commit into deepbeepmeep:main from wobba:fix-nvfp4-pytorch291

Conversation

@wobba wobba commented Jan 20, 2026

Add the @torch.compiler.disable decorator to the _lora_linear_forward method to prevent torch.compile from tracing through NVFP4 quantized weight operations, avoiding FakeTensor mixing errors in PyTorch 2.9.1+.

This allows NVFP4 4-bit quantization to work correctly with RTX 50xx series GPUs and their optimized lightx2v kernels.

