Conversation
Pull request overview
This PR adds support for directly loading GPT-OSS models quantized in the MXFP4 format: the loader detects MXFP4 quantization automatically and dequantizes the weights during model loading.
Changes:
- Updated model references in test files from local/unsloth paths to official OpenAI model identifiers
- Added MXFP4 quantization detection and automatic dequantization support in model loading utilities
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/test_cuda/models/test_moe_model.py | Updated GPT-OSS model reference from local path to OpenAI identifier |
| test/test_cpu/models/test_moe_model.py | Updated GPT-OSS model reference from unsloth path to OpenAI identifier |
| auto_round/utils/model.py | Added MXFP4 detection function and integrated dequantization config into model loading |
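For context, the detection side of the change can be sketched roughly like this (the `config.json` layout and the helper name `detect_quant_method` are assumptions for illustration, not the PR's actual code):

```python
import json
import os

# Sketch of MXFP4 detection based on the checkpoint's config.json.
# The "quant_method" key is an assumption about the Hugging Face
# quantization_config layout; the real auto_round helper may inspect
# other fields (e.g. model_type) as well.
def detect_quant_method(model_path: str):
    cfg_file = os.path.join(model_path, "config.json")
    if not os.path.isfile(cfg_file):
        return None
    with open(cfg_file) as f:
        cfg = json.load(f)
    qcfg = cfg.get("quantization_config") or {}
    return qcfg.get("quant_method")
```

A detected `"mxfp4"` result would then trigger building `Mxfp4Config(dequantized=True)` before the model is loaded.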
```python
def _is_mxfp4_model(model_path: str) -> bool:
    """Check if the model is quantized with MXFP4."""
    supported_model_types = ["gpt_oss"]
```
The supported model types are hardcoded in this function. Consider making this a module-level constant or configuration parameter to improve maintainability and make it easier to add support for additional model types in the future.
```python
try:
    quantization_config = Mxfp4Config(dequantized=True)
    logger.info("Detected MXFP4 quantized model, using Mxfp4Config(dequantized=True) for loading.")
except ImportError:
    logger.warning("Mxfp4Config not available in current transformers version, loading without dequantization.")
```
The warning message could be more actionable by suggesting which transformers version is required for MXFP4 support. Consider adding version information to help users understand what upgrade is needed.
Suggested change (replacing the single-line warning):

```python
required_tf_version = "4.46.0"
logger.warning(
    "Mxfp4Config is not available in the current transformers installation "
    f"(transformers=={transformers.__version__}). MXFP4 dequantization requires "
    f"transformers>={required_tf_version}. The model will be loaded without "
    "MXFP4 dequantization. Please upgrade transformers, for example with "
    f'`pip install -U "transformers>={required_tf_version}"`.'
)
```
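If the loader wanted to act on that version hint programmatically rather than only warn, a minimal dotted-version comparison could look like this (a sketch; production code would normally use `packaging.version.parse` instead of hand-rolled parsing):

```python
def meets_min_version(current: str, required: str) -> bool:
    """Compare dotted version strings numerically, so "4.46.0" >= "4.9.1"."""
    def parts(v: str):
        # Keep only numeric components, dropping suffixes like "dev0".
        return [int(p) for p in v.split(".") if p.isdigit()]
    return parts(current) >= parts(required)
```

Comparing component lists avoids the classic string-comparison bug where `"4.9.1" > "4.46.0"` lexicographically.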
Signed-off-by: He, Xin3 <xin3.he@intel.com>
Description
Please briefly describe your main changes and the motivation behind them.
Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting