Feature Description
Currently, Omni models are becoming popular, but AutoRound assumes a traditional LLM structure and cannot handle them.
We need to support this structure, at minimum quantizing Qwen3OmniMoeThinkerTextModel and Qwen3OmniMoeTalkerModel successfully.
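A minimal sketch of why the nested layout matters: in an Omni-style wrapper the decoder blocks live inside nested submodules (thinker/talker) rather than at the top level that AutoRound expects. The class and attribute names below are toy illustrations, not the real Qwen3-Omni module tree.

```python
import torch.nn as nn

# Toy stand-in for an Omni-style wrapper. Attribute names ("thinker",
# "talker", "model", "layers") are illustrative only.
class ToyTextModel(nn.Module):
    def __init__(self, n_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(8, 8) for _ in range(n_layers))

class ToyOmniModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.thinker = nn.Module()
        self.thinker.model = ToyTextModel()  # text decoder nested one level down
        self.talker = nn.Module()
        self.talker.model = ToyTextModel()   # second decoder stack in the same model

def find_decoder_roots(model: nn.Module) -> list[str]:
    """Return paths of submodules owning a `layers` ModuleList --
    the decoder stacks a quantizer would need to target."""
    return [
        name for name, mod in model.named_modules()
        if isinstance(getattr(mod, "layers", None), nn.ModuleList)
    ]

print(find_decoder_roots(ToyOmniModel()))  # ['thinker.model', 'talker.model']
```

A quantizer that only looks for a single top-level `model.layers` misses both nested stacks here, which is the gap this feature request asks AutoRound to close.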
Motivation and Use Case
```python
from auto_round import AutoRound

model_name_or_path = "Qwen/Qwen3-Omni-30B-A3B-Instruct"
ar = AutoRound(
    model=model_name_or_path,
    scheme="W4A16",
    iters=50,
    lr=5e-3,
)
ar.quantize_and_save(format="auto_round", output_dir="tmp_w4a16")
```
Related to: #775, #862, #1186
Alternatives Considered
No response
Definition of Done
No response
Additional Context
No response