opt commonir, change scalar compute to vectorize compute #142

Merged
hellozmz merged 3 commits into main from zmz/common_vector_loop
Feb 27, 2026

Conversation

hellozmz (Collaborator) commented Feb 25, 2026

Test data

[Original test data]
Average execution time over 100 runs:
  TileLang: 3.3752 ms
  Triton:   0.1431 ms
  PyTorch:  0.0164 ms

Throughput Analysis:
  Triton:   87.90 GB/s
  TileLang: 3.73 GB/s
  PyTorch:  765.88 GB/s

Performance comparison relative to PyTorch:
  Triton is 8.71x SLOWER than PyTorch
  TileLang is 205.44x SLOWER than PyTorch
  Triton is 23.58x FASTER than TileLang
==================================================
[Test data after optimization]
Average execution time over 100 runs:
  TileLang: 0.1119 ms
  Triton:   0.1828 ms
  PyTorch:  0.0152 ms

Throughput Analysis:
  Triton:   68.82 GB/s
  TileLang: 112.48 GB/s
  PyTorch:  829.82 GB/s

Performance comparison relative to PyTorch:
  Triton is 12.06x SLOWER than PyTorch
  TileLang is 7.38x SLOWER than PyTorch
  Triton is 1.63x SLOWER than TileLang

Performance change

TileLang speedup: 3.3752 / 0.1119 ≈ 30x (average execution time before vs. after the optimization)
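
The speedup figure can be re-derived from the two benchmark runs quoted in this description. The snippet below is only a small sketch of that arithmetic: the dictionary and variable names are illustrative and not part of the PR, and the timings and throughputs are the rounded values printed in the logs.

```python
# Average execution times (ms) copied from the benchmark output in this PR.
before_ms = {"TileLang": 3.3752, "Triton": 0.1431, "PyTorch": 0.0164}
after_ms = {"TileLang": 0.1119, "Triton": 0.1828, "PyTorch": 0.0152}

# TileLang throughput (GB/s) copied from the same output.
tilelang_gbps_before = 3.73
tilelang_gbps_after = 112.48

# Speedup of the TileLang kernel: old time divided by new time.
tilelang_speedup = before_ms["TileLang"] / after_ms["TileLang"]

# The throughput ratio should tell the same story if the data size is unchanged.
throughput_ratio = tilelang_gbps_after / tilelang_gbps_before

# Remaining gap to PyTorch after the optimization.
gap_to_pytorch = after_ms["TileLang"] / after_ms["PyTorch"]

print(f"TileLang speedup (time):       {tilelang_speedup:.2f}x")
print(f"TileLang speedup (throughput): {throughput_ratio:.2f}x")
print(f"TileLang vs PyTorch (after):   {gap_to_pytorch:.2f}x slower")
```

Both ratios come out at roughly 30x. The PyTorch gap computed here can differ in the second decimal place from the 7.38x in the log, presumably because the log was computed from unrounded timings.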

MLIR changes

image

hellozmz merged commit 0719da5 into main Feb 27, 2026
5 checks passed
hellozmz deleted the zmz/common_vector_loop branch February 27, 2026 08:16