[Feature]: Support Omni model quantization #1387

@xin3he

Description

Feature Description

Currently, Omni models are becoming popular, but AutoRound assumes a traditional LLM structure and cannot handle them.
We need to support this structure, at a minimum quantizing Qwen3OmniMoeThinkerTextModel and Qwen3OmniMoeTalkerModel successfully.
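The two target submodules sit nested inside the multimodal wrapper rather than at the top level, which is why tooling that assumes a flat decoder layout fails. A minimal sketch of locating them by class name follows; it uses a toy module tree in place of the real Transformers classes, and the nesting shown (`thinker.model`, `talker`) is an illustrative assumption about the Qwen3-Omni layout, not a confirmed API:

```python
# Toy stand-in for torch.nn.Module's named_modules traversal, so the sketch
# runs without loading the real 30B checkpoint.
class _Module:
    def named_modules(self, prefix=""):
        yield prefix, self
        for attr, child in vars(self).items():
            if isinstance(child, _Module):
                child_prefix = f"{prefix}.{attr}" if prefix else attr
                yield from child.named_modules(child_prefix)

# Hypothetical layout mirroring the class names from the feature description;
# the real attribute paths may differ.
class Qwen3OmniMoeThinkerTextModel(_Module):
    pass

class Qwen3OmniMoeTalkerModel(_Module):
    pass

class Qwen3OmniMoeModel(_Module):
    def __init__(self):
        self.thinker = _Module()
        self.thinker.model = Qwen3OmniMoeThinkerTextModel()
        self.talker = Qwen3OmniMoeTalkerModel()

def find_quant_targets(root, target_class_names):
    """Map each target class name to its dotted submodule path."""
    return {
        type(m).__name__: name
        for name, m in root.named_modules()
        if type(m).__name__ in target_class_names
    }

targets = find_quant_targets(
    Qwen3OmniMoeModel(),
    {"Qwen3OmniMoeThinkerTextModel", "Qwen3OmniMoeTalkerModel"},
)
print(targets)
# {'Qwen3OmniMoeThinkerTextModel': 'thinker.model', 'Qwen3OmniMoeTalkerModel': 'talker'}
```

A real implementation would walk the loaded checkpoint the same way to dispatch each discovered submodule to its own quantization pass.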

Motivation and Use Case

from auto_round import AutoRound

model_name_or_path = "Qwen/Qwen3-Omni-30B-A3B-Instruct"

ar = AutoRound(
    model=model_name_or_path,
    scheme="W4A16",
    iters=50,
    lr=5e-3,
)
ar.quantize_and_save(format="auto_round", output_dir="tmp_w4a16")

Related to: #775, #862, #1186

Alternatives Considered

No response

Definition of Done

No response

Additional Context

No response
