Pinned
- vkdispatch (Public): Python framework for GPU compute, with runtime-generated kernels, FFTs, and reductions.
- VkDispatchPerformanceTests (Public): FFT performance tests for the vkdispatch paper. Language: Cuda.
- MetalThreadGroupCompilerBugRepro (Public): Repro for a Metal threadgroup barrier bug on Apple Silicon, triggered by maxTotalThreadsPerThreadgroup <= 32. Language: Swift.



