Add SDPA decomposition for Metal backend with unsupported head_dim by seyeong-han · Pull Request #221 · huggingface/optimum-executorch

seyeong-han · 2026-03-02T22:14:45Z

The Metal SDPA kernel only supports head_dim in {64, 96, 128}. Models like Gemma3 (head_dim=256) crash at runtime. This decomposes SDPA into matmul + softmax when the model's head_dim is unsupported, following the pattern from voxtral_realtime's Metal export.
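As a rough illustration of what "matmul + softmax" means here, the sketch below rewrites scaled dot-product attention with plain tensor ops and compares it against PyTorch's fused `F.scaled_dot_product_attention` at head_dim=256. The function name `decomposed_sdpa` and the tensor shapes are assumptions made for this example; this is not the `_sdpa_decomposition` code from metal.py, which may handle more arguments (dropout, `is_causal`, GQA).

```python
import math

import torch
import torch.nn.functional as F


def decomposed_sdpa(query, key, value, attn_mask=None, scale=None):
    """Hypothetical helper: softmax(Q @ K^T * scale + mask) @ V with basic ops only."""
    head_dim = query.shape[-1]
    scale = 1.0 / math.sqrt(head_dim) if scale is None else scale

    # First matmul: attention scores of shape (..., q_len, kv_len).
    scores = torch.matmul(query, key.transpose(-2, -1)) * scale

    if attn_mask is not None:
        if attn_mask.dtype == torch.bool:
            # Boolean mask: True means "may attend", False is masked out.
            scores = scores.masked_fill(~attn_mask, float("-inf"))
        else:
            # Float mask: added to the scores before softmax.
            scores = scores + attn_mask

    # Softmax over the key dimension, then the second matmul against V.
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, value)


torch.manual_seed(0)
# (batch, heads, seq_len, head_dim) with head_dim=256, as in the Gemma3 case.
q, k, v = (torch.randn(1, 4, 8, 256) for _ in range(3))
causal = torch.tril(torch.ones(8, 8, dtype=torch.bool))

reference = F.scaled_dot_product_attention(q, k, v, attn_mask=causal)
result = decomposed_sdpa(q, k, v, attn_mask=causal)
max_abs_diff = (result - reference).abs().max().item()
```

The decomposed form materializes the full score matrix, so it trades the memory and speed benefits of a fused kernel for compatibility with any head_dim.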

Changes:

- metal.py: Add _sdpa_decomposition and _linear_bias_decomposition, applied via run_decompositions() before lowering. The decompositions are applied only when head_dim is not in {64, 96, 128}. Force use_custom_sdpa=False for Metal.
- integrations.py: Guard the get_custom_sdpa_for_ring_kv_cache() and RemoveRedundantTransposes imports behind a use_custom_sdpa check so that the torchao import chain is not triggered when custom SDPA is not used.
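The two conditions described above (gating on head_dim, and importing custom-SDPA helpers only when requested) could be sketched roughly as follows. All names here (`METAL_SDPA_SUPPORTED_HEAD_DIMS`, `plan_metal_export`, `resolve_custom_sdpa`) are hypothetical and chosen for illustration; the actual code in metal.py and integrations.py may be structured differently.

```python
import importlib

# Head dimensions the Metal SDPA kernel supports, per the PR description.
METAL_SDPA_SUPPORTED_HEAD_DIMS = frozenset({64, 96, 128})


def plan_metal_export(head_dim: int) -> dict:
    """Hypothetical helper: decide which export options apply for a given head_dim."""
    return {
        # The PR forces custom SDPA off for the Metal backend.
        "use_custom_sdpa": False,
        # Decompose only when the fused Metal kernel cannot handle head_dim.
        "decompose_sdpa": head_dim not in METAL_SDPA_SUPPORTED_HEAD_DIMS,
    }


def resolve_custom_sdpa(use_custom_sdpa: bool, module_name: str):
    """Hypothetical helper: import a module lazily, only if custom SDPA is enabled.

    Keeping the import inside the guarded branch means a heavy or missing
    dependency is never touched when use_custom_sdpa is False.
    """
    if not use_custom_sdpa:
        return None
    return importlib.import_module(module_name)


gemma3_plan = plan_metal_export(head_dim=256)
supported_plan = plan_metal_export(head_dim=128)
# No import is attempted here, so a nonexistent module name does not raise.
skipped = resolve_custom_sdpa(False, "module_that_does_not_exist")
```

With the guard in place, a Metal export that sets use_custom_sdpa=False never reaches the import statement at all.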

This PR was authored with the assistance of Claude. Made-with: Cursor

seyeong-han mentioned this pull request Mar 2, 2026

Add Metal backend support for Gemma3 runner pytorch/executorch#17797 (Open)

seyeong-han wants to merge 1 commit into huggingface:main from seyeong-han:gemma3-metal-sdpa