
Commit 8cfb7c0

Refactor: Qwen3 decode with 3-scope architecture and TILELET rename (#99)
- qwen3_32b_decode.py: Refactored into 3 scopes for better incore
  * Scope 1: Input RMSNorm + Q/K/V projection
  * Scope 2: Attention (K RoPE + cache, QK matmul, softmax, SV matmul)
  * Scope 3: Output projection, residual, RMSNorm, MLP
- Updated HIDDEN size from 5120 to 8192 (64 heads × 128 dim)
- Renamed qwen3_32b_decode_tilelet.py to qwen3_32b_decode_mixed.py for clearer TILELET-aware version naming
- Adjusted tiling constants for each scope
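For readers without the diff at hand, the three-scope split described in the commit message can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in qwen3_32b_decode.py: the toy sizes, the parameter names (`ln1`, `wq`, `w_gate`, ...), the SwiGLU form of the MLP, and applying Q RoPE inside Scope 2 are all assumptions made for the example, and tiling, TILELET handling, and QK normalization are left out entirely.

```python
import numpy as np

# Hypothetical toy sizes. The commit sets HIDDEN = 64 heads x 128 dim = 8192.
NUM_HEADS, HEAD_DIM = 4, 8
HIDDEN = NUM_HEADS * HEAD_DIM
INTER = 2 * HIDDEN  # assumed MLP width, not taken from the commit
EPS = 1e-6


def rms_norm(x, w):
    """RMSNorm over the last axis, scaled by weight w."""
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + EPS) * w


def rope(x, pos):
    """Rotate-half RoPE for x of shape [heads, head_dim] at position pos."""
    half = x.shape[-1] // 2
    ang = pos * (1.0 / 10000.0 ** (np.arange(half) / half))
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


def scope1_qkv(x, p):
    """Scope 1: input RMSNorm followed by the Q/K/V projections."""
    h = rms_norm(x, p["ln1"])
    return h @ p["wq"], h @ p["wk"], h @ p["wv"]


def scope2_attention(q, k, v, k_cache, v_cache, pos):
    """Scope 2: K RoPE + cache write, QK matmul, softmax, SV matmul."""
    q = rope(q.reshape(NUM_HEADS, HEAD_DIM), pos)  # assumption: Q RoPE done here
    k_cache[pos] = rope(k.reshape(NUM_HEADS, HEAD_DIM), pos)
    v_cache[pos] = v.reshape(NUM_HEADS, HEAD_DIM)
    keys, vals = k_cache[: pos + 1], v_cache[: pos + 1]  # [T, heads, dim]
    scores = np.einsum("hd,thd->ht", q, keys) / np.sqrt(HEAD_DIM)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return np.einsum("ht,thd->hd", probs, vals).reshape(HIDDEN)


def scope3_output(x, attn, p):
    """Scope 3: output projection, residual, RMSNorm, MLP (SwiGLU assumed)."""
    h = x + attn @ p["wo"]
    n = rms_norm(h, p["ln2"])
    gate, up = n @ p["w_gate"], n @ p["w_up"]
    silu = gate / (1.0 + np.exp(-gate))
    return h + (silu * up) @ p["w_down"]


def init_params(rng):
    """Random weights with hypothetical names, for illustration only."""
    def w(rows, cols):
        return rng.standard_normal((rows, cols)) / np.sqrt(rows)
    return {
        "ln1": np.ones(HIDDEN), "ln2": np.ones(HIDDEN),
        "wq": w(HIDDEN, HIDDEN), "wk": w(HIDDEN, HIDDEN),
        "wv": w(HIDDEN, HIDDEN), "wo": w(HIDDEN, HIDDEN),
        "w_gate": w(HIDDEN, INTER), "w_up": w(HIDDEN, INTER),
        "w_down": w(INTER, HIDDEN),
    }


def decode_step(x, p, k_cache, v_cache, pos):
    """One decode step for a single token, chaining the three scopes."""
    q, k, v = scope1_qkv(x, p)
    attn = scope2_attention(q, k, v, k_cache, v_cache, pos)
    return scope3_output(x, attn, p)
```

Calling `decode_step` once per generated token with an increasing `pos` and caches of shape `[max_len, NUM_HEADS, HEAD_DIM]` reproduces the data flow between the scopes; each scope only consumes the outputs of the previous one plus the residual input, which is what lets them be tiled independently.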
1 parent 0d48e70 commit 8cfb7c0

2 files changed

Lines changed: 647 additions & 312 deletions


0 commit comments
