nvfp4
Here are 8 public repositories matching this topic...
- A production-ready Docker setup for ComfyUI that unlocks the full potential of NVIDIA Blackwell GPUs (RTX 50 series) through 4-bit quantization with NVFP4. (Dockerfile; updated Jan 28, 2026)
- Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs". (CUDA; updated Feb 6, 2026)
- LLM fine-tuning with LoRA + NVFP4/MXFP8 on NVIDIA DGX Spark (Blackwell GB10). (Python; updated Dec 22, 2025)
- Production LLM deployment specs for NVIDIA Blackwell GPUs (RTX Pro 6000, DGX Spark). Includes vLLM configurations, benchmarks, a load balancer, and throughput calculators for NVFP4/FP8/MoE models. (Python; updated Jan 23, 2026)
- 🔧 Fine-tune large language models efficiently on NVIDIA DGX Spark with LoRA adapters and optimized quantization for high performance. (Python; updated Feb 6, 2026)