nvfp4
Here are 8 public repositories matching this topic...
- A production-ready Docker setup for ComfyUI that unlocks the full potential of NVIDIA Blackwell GPUs (RTX 50 series) through 4-bit quantization with NVFP4. (Dockerfile; updated Jan 28, 2026)
- Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs". (CUDA; updated Feb 6, 2026)
- LLM fine-tuning with LoRA + NVFP4/MXFP8 on NVIDIA DGX Spark (Blackwell GB10). (Python; updated Dec 22, 2025)
- Production LLM deployment specs for NVIDIA Blackwell GPUs (RTX Pro 6000, DGX Spark). Includes vLLM configurations, benchmarks, a load balancer, and throughput calculators for NVFP4/FP8/MoE models. (Python; updated Jan 23, 2026)
- 🔧 Fine-tune large language models efficiently on NVIDIA DGX Spark with LoRA adapters and optimized quantization for high performance. (Python; updated Feb 6, 2026)