refine qwen3_vl_moe experts forward by WeiweiZhang1 · Pull Request #1413 · intel/auto-round

WeiweiZhang1 · 2026-02-05T14:38:37Z

Description

Please briefly describe your main changes, the motivation.

Type of Change

Related Issues

Fixes or relates to #

Checklist Before Submitting

My code has been tested locally.
Documentation has been updated as needed.
New or updated tests are included where applicable.

The measured accuracy of the quantized model is the same as before.

Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>

Copilot

Pull request overview

This PR refines the expert forward pass logic in the Qwen3 VL MoE (Mixture of Experts) implementation, optimizing how experts are selected and invoked during inference.

Changes:

Wrapped expert mask computation in torch.no_grad() to prevent gradient tracking for this operation
Restructured the expert iteration logic to skip unused experts when calibrate_all_experts is False
Simplified tensor indexing by removing unnecessary .squeeze(0) operation
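The three changes above can be illustrated with a small, self-contained sketch. This is not the code from the PR: `ToyExperts`, its `nn.Linear` experts, and the tensor shapes are hypothetical stand-ins, and only the `calibrate_all_experts` flag name is taken from the overview. It assumes the common routing layout where `top_k_index` and `top_k_weights` have shape `(tokens, top_k)`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyExperts(nn.Module):
    """Hypothetical stand-in for an unfused experts module (not the auto-round class)."""

    def __init__(self, num_experts, hidden, calibrate_all_experts=False):
        super().__init__()
        self.num_experts = num_experts
        self.calibrate_all_experts = calibrate_all_experts
        self.experts = nn.ModuleList(
            nn.Linear(hidden, hidden, bias=False) for _ in range(num_experts)
        )

    def forward(self, hidden_states, top_k_index, top_k_weights):
        # hidden_states: (tokens, hidden); top_k_index, top_k_weights: (tokens, top_k)
        out = torch.zeros_like(hidden_states)
        # Routing bookkeeping is integer-only, so no gradient tracking is needed.
        with torch.no_grad():
            # One-hot mask rearranged to (num_experts, top_k, tokens).
            expert_mask = F.one_hot(top_k_index, num_classes=self.num_experts).permute(2, 1, 0)
            if self.calibrate_all_experts:
                expert_ids = list(range(self.num_experts))
            else:
                # Visit only experts that were assigned at least one token.
                # flatten().tolist() yields plain ints, so no squeeze is required.
                expert_ids = expert_mask.sum(dim=(-1, -2)).nonzero().flatten().tolist()
        for expert_idx in expert_ids:
            k_pos, token_idx = torch.where(expert_mask[expert_idx])
            if self.calibrate_all_experts:
                # Run the expert on every token (so calibration hooks see all inputs),
                # then keep only the rows actually routed to it.
                expert_out = self.experts[expert_idx](hidden_states)[token_idx]
            else:
                expert_out = self.experts[expert_idx](hidden_states[token_idx])
            weights = top_k_weights[token_idx, k_pos].unsqueeze(-1)
            out.index_add_(0, token_idx, expert_out * weights)
        return out


torch.manual_seed(0)
tokens, hidden, num_experts, top_k = 6, 8, 16, 2
x = torch.randn(tokens, hidden)
router_probs = torch.randn(tokens, num_experts).softmax(dim=-1)
top_k_weights, top_k_index = torch.topk(router_probs, top_k, dim=-1)

moe = ToyExperts(num_experts, hidden)
sparse_out = moe(x, top_k_index, top_k_weights)   # skips unused experts
moe.calibrate_all_experts = True
dense_out = moe(x, top_k_index, top_k_weights)    # visits every expert
```

In this sketch both modes produce the same output; the flag only changes which experts are invoked, which matters when calibration hooks must observe every expert.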

auto_round/modeling/fused_moe/qwen3_vl_moe.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

yiliu30

LGTM

auto_round/modeling/fused_moe/qwen3_vl_moe.py

Copilot AI review requested due to automatic review settings February 5, 2026 14:38

WeiweiZhang1 added 2 commits February 5, 2026 22:38

refine_qwen3_vl_moe_experts_forward

58bae4c

Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>

Merge branch 'main' into refine_qwen3_vl_moe_experts_forward

0e09f9f

Copilot AI reviewed Feb 5, 2026

auto_round/modeling/fused_moe/qwen3_vl_moe.py (resolved)

auto_round/modeling/fused_moe/qwen3_vl_moe.py (outdated, resolved)

Update auto_round/modeling/fused_moe/qwen3_vl_moe.py

60b60bf

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

WeiweiZhang1 requested review from wenhuach21 and yiliu30 February 6, 2026 01:38

yiliu30 approved these changes Feb 6, 2026

auto_round/modeling/fused_moe/qwen3_vl_moe.py (resolved)

wenhuach21 approved these changes Feb 6, 2026

WeiweiZhang1 merged commit 2824c37 into main Feb 6, 2026
29 checks passed

WeiweiZhang1 deleted the refine_qwen3_vl_moe_experts_forward branch February 6, 2026 05:40

WeiweiZhang1 added this to the 0.10.0 milestone Feb 6, 2026

WeiweiZhang1 mentioned this pull request Feb 6, 2026

[Bug]: enhance Qwen3-VL-Moe replacemodule forward efficiency #1416

Closed

WeiweiZhang1 merged 3 commits into main from refine_qwen3_vl_moe_experts_forward

3 participants
