Support Qwen3 Omni model quantization #1404

Draft
lvliang-intel wants to merge 4 commits into main from lvl/support_omni
@lvliang-intel (Contributor) commented Feb 4, 2026

Description

Adds Qwen3-Omni quantization support by integrating a custom MLLM processor and template, special forward handling for thinker/talker calibration, and model-specific block discovery.
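
To illustrate what "block discovery" could mean here, the following is a minimal sketch, not the code in this PR. It assumes the model is a tree of submodules in which repeated blocks live in list-like containers under the thinker and talker. The `Node` class, the `find_block_lists` helper, and the submodule names (`model`, `layers`) are all hypothetical. With a real PyTorch model, the same walk would typically iterate `named_modules()` and check for `torch.nn.ModuleList`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a model submodule (not a torch.nn.Module)."""
    is_block_list: bool = False          # True for a container of repeated blocks
    children: dict = field(default_factory=dict)


def find_block_lists(root, prefix=""):
    """Depth-first search for block containers; returns their dotted names."""
    found = []
    for name, child in root.children.items():
        path = f"{prefix}.{name}" if prefix else name
        if child.is_block_list:
            found.append(path)           # do not descend into the blocks themselves
        else:
            found.extend(find_block_lists(child, path))
    return found


def layers(n):
    """Build a block container holding n placeholder blocks named "0".."n-1"."""
    return Node(is_block_list=True, children={str(i): Node() for i in range(n)})


# Toy layout: the submodule names below are assumptions for illustration only.
toy = Node(children={
    "thinker": Node(children={"model": Node(children={"layers": layers(2)})}),
    "talker": Node(children={"model": Node(children={"layers": layers(1)})}),
})

block_lists = find_block_lists(toy)
print(block_lists)
```

A quantizer can then calibrate each discovered container block by block. Keeping the thinker and talker paths separate makes it possible to give each its own forward handling during calibration.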

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

#1387

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

lvliang-intel and others added 4 commits February 4, 2026 14:50
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
@wenhuach21 (Contributor)

Thank you for the PR! Could you verify inference with vLLM, Transformers 4, and Transformers 5 before merging?