CUDA.jl Benchmarks
| Benchmark suite | Current: f90472d | Previous: 9c24e73 | Ratio |
|---|---|---|---|
| latency/precompile | 43977295720.5 ns | 44501126522 ns | 0.99 |
| latency/ttfp | 13145443593 ns | 13149867592 ns | 1.00 |
| latency/import | 3766140874 ns | 3767927411.5 ns | 1.00 |
| integration/volumerhs | 9437395 ns | 9440140.5 ns | 1.00 |
| integration/byval/slices=1 | 146003 ns | 145804 ns | 1.00 |
| integration/byval/slices=3 | 423230 ns | 422996 ns | 1.00 |
| integration/byval/reference | 144129 ns | 143875 ns | 1.00 |
| integration/byval/slices=2 | 284679 ns | 284373 ns | 1.00 |
| integration/cudadevrt | 102728 ns | 102603 ns | 1.00 |
| kernel/indexing | 13691 ns | 13604 ns | 1.01 |
| kernel/indexing_checked | 14437 ns | 14041.5 ns | 1.03 |
| kernel/occupancy | 634.7218934911242 ns | 654.3878787878788 ns | 0.97 |
| kernel/launch | 2272.8888888888887 ns | 2065.9 ns | 1.10 |
| kernel/rand | 17457 ns | 14529 ns | 1.20 |
| array/reverse/1d | 19014 ns | 18833 ns | 1.01 |
| array/reverse/2dL_inplace | 66383 ns | 66297 ns | 1.00 |
| array/reverse/1dL | 69213 ns | 69017 ns | 1.00 |
| array/reverse/2d | 20940 ns | 21208 ns | 0.99 |
| array/reverse/1d_inplace | 9019 ns | 8801 ns | 1.02 |
| array/reverse/2d_inplace | 10640 ns | 10457 ns | 1.02 |
| array/reverse/2dL | 72945 ns | 73233 ns | 1.00 |
| array/reverse/1dL_inplace | 66337 ns | 66238 ns | 1.00 |
| array/copy | 18123 ns | 18166 ns | 1.00 |
| array/iteration/findall/int | 145268.5 ns | 146211.5 ns | 0.99 |
| array/iteration/findall/bool | 130269 ns | 130874 ns | 1.00 |
| array/iteration/findfirst/int | 83676 ns | 84566 ns | 0.99 |
| array/iteration/findfirst/bool | 81053 ns | 81494 ns | 0.99 |
| array/iteration/scalar | 67363 ns | 66998 ns | 1.01 |
| array/iteration/logical | 200756.5 ns | 198961 ns | 1.01 |
| array/iteration/findmin/1d | 83353 ns | 84192 ns | 0.99 |
| array/iteration/findmin/2d | 116414 ns | 117391 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 39232 ns | 38940 ns | 1.01 |
| array/reductions/reduce/Int64/dims=1 | 42353.5 ns | 42402.5 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 58978 ns | 59096.5 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 87227 ns | 87158 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84412 ns | 84522.5 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 33877 ns | 34365.5 ns | 0.99 |
| array/reductions/reduce/Float32/dims=1 | 39302 ns | 49003 ns | 0.80 |
| array/reductions/reduce/Float32/dims=2 | 56362 ns | 56392.5 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 51579 ns | 51750 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 69583 ns | 70137.5 ns | 0.99 |
| array/reductions/mapreduce/Int64/1d | 38982 ns | 38678 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1 | 51400 ns | 49417 ns | 1.04 |
| array/reductions/mapreduce/Int64/dims=2 | 58902 ns | 59199 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1L | 87221 ns | 87193 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 84498 ns | 84515 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 33969 ns | 34022 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1 | 39127.5 ns | 39691.5 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=2 | 56116 ns | 55971 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 51476 ns | 51491 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 68985.5 ns | 68932 ns | 1.00 |
| array/broadcast | 20849 ns | 20437 ns | 1.02 |
| array/copyto!/gpu_to_gpu | 10632 ns | 10680.333333333334 ns | 1.00 |
| array/copyto!/cpu_to_gpu | 212916 ns | 217910 ns | 0.98 |
| array/copyto!/gpu_to_cpu | 282303 ns | 285564 ns | 0.99 |
| array/accumulate/Int64/1d | 118120.5 ns | 118803 ns | 0.99 |
| array/accumulate/Int64/dims=1 | 79213 ns | 79869.5 ns | 0.99 |
| array/accumulate/Int64/dims=2 | 155247 ns | 155687 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1705262 ns | 1694846 ns | 1.01 |
| array/accumulate/Int64/dims=2L | 960497 ns | 961414 ns | 1.00 |
| array/accumulate/Float32/1d | 100730 ns | 101291.5 ns | 0.99 |
| array/accumulate/Float32/dims=1 | 76297 ns | 76639 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 143951 ns | 144870 ns | 0.99 |
| array/accumulate/Float32/dims=1L | 1584441.5 ns | 1584515.5 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 656314 ns | 657191.5 ns | 1.00 |
| array/construct | 1272.3 ns | 1305.2 ns | 0.97 |
| array/random/randn/Float32 | 42774 ns | 37110.5 ns | 1.15 |
| array/random/randn!/Float32 | 29702 ns | 30378 ns | 0.98 |
| array/random/rand!/Int64 | 34748 ns | 31479 ns | 1.10 |
| array/random/rand!/Float32 | 8185.25 ns | 8318 ns | 0.98 |
| array/random/rand/Int64 | 37159.5 ns | 31553 ns | 1.18 |
| array/random/rand/Float32 | 12276 ns | 12376 ns | 0.99 |
| array/permutedims/4d | 54725.5 ns | 51624 ns | 1.06 |
| array/permutedims/2d | 52225 ns | 52729 ns | 0.99 |
| array/permutedims/3d | 52803 ns | 52893 ns | 1.00 |
| array/sorting/1d | 2734309.5 ns | 2734663 ns | 1.00 |
| array/sorting/by | 3303053 ns | 3304625 ns | 1.00 |
| array/sorting/2d | 1067014 ns | 1068662 ns | 1.00 |
| cuda/synchronization/stream/auto | 994.5 ns | 1017.4166666666666 ns | 0.98 |
| cuda/synchronization/stream/nonblocking | 7503.5 ns | 7603.1 ns | 0.99 |
| cuda/synchronization/stream/blocking | 810.6881720430108 ns | 808.4591836734694 ns | 1.00 |
| cuda/synchronization/context/auto | 1151.9 ns | 1174.2 ns | 0.98 |
| cuda/synchronization/context/nonblocking | 7846.4 ns | 7803.2 ns | 1.01 |
| cuda/synchronization/context/blocking | 900.9183673469388 ns | 886.6666666666666 ns | 1.02 |
This comment was automatically generated by workflow using github-action-benchmark.
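The Ratio column is consistent with the current timing divided by the previous one, rounded to two decimals, so values above 1.00 mean the current commit was slower. A minimal sketch under that assumption (the helper name `benchmark_ratio` is hypothetical and not part of github-action-benchmark):

```julia
# Hypothetical helper: current / previous, rounded to two decimals as in the table.
benchmark_ratio(current_ns, previous_ns) = round(current_ns / previous_ns; digits=2)

# Rows taken from the table above.
kernel_rand    = benchmark_ratio(17457, 14529)   # kernel/rand, reported as 1.20
reduce_f32_d1  = benchmark_ratio(39302, 49003)   # reduce/Float32/dims=1, reported as 0.80

println("kernel/rand ratio: ", kernel_rand)
println("reduce/Float32/dims=1 ratio: ", reduce_f32_d1)
```

Read this way, kernel/rand is about 20% slower on f90472d and the Float32 `dims=1` reduction about 20% faster, though single-run differences of this size may also be measurement noise.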

The self-tests fail because the linear algebra functions (e.g. matrix exponential) as implemented in How should this be handled? Rewrite

I think it's JuliaGPU/GPUArrays.jl#679.

The buildkite error is This seems unrelated to my changes, except that I am now running CI tests on Julia 1.12 and Julia 1.13...

I guess #3025 needs to be active for all LLVM versions.

Good news: CUDA.jl now works for Julia 1.12.

I think it's texture interpolation that is broken on 1.13. This line segfaults LLVM: in

We will need to update KernelAbstractions.jl as well: JuliaGPU/KernelAbstractions.jl#679.

Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage Diff | master | #3020 | +/- |
|---|---|---|---|
| Coverage | 89.46% | 89.48% | +0.01% |
| Files | 148 | 148 | |
| Lines | 13047 | 13039 | -8 |
| Hits | 11673 | 11668 | -5 |
| Misses | 1374 | 1371 | -3 |

☔ View full report in Codecov by Sentry.

All green!

It seems this PR has stalled. Can I do something to get a review or to get it merged?

Can we merge either this PR, or #3031 instead?

maleadt left a comment
I'll look into pushing this over the finish line.
.buildkite/pipeline.yml (Outdated)
matrix:
  setup:
    cuda:
      # - "13.1"
@@ -2082,58 +2082,58 @@ end
const nvmlGpuFabricInfoV_t = nvmlGpuFabricInfo_v3_t

@checked function nvmlInit_v2()
    @gcsafe_ccall (libnvml()).nvmlInit_v2()::nvmlReturn_t
All these require matching changes in the wrapper generator.
lib/nvml/NVML.jl (Outdated)
# NVSMI dir isn't added to PATH by the installer; add it to Julia's DLL search path.
nvsmi = joinpath(get(ENV, "ProgramFiles", raw"C:\Program Files"), "NVIDIA Corporation", "NVSMI")
if isdir(nvsmi) && !(nvsmi in Libdl.DL_LOAD_PATH)
    pushfirst!(Libdl.DL_LOAD_PATH, nvsmi)
This is not equivalent. Why can't it be done globally, setting the constant string?
test/core/device/ldg.jl (Outdated)
ir = sprint(io->CUDA.code_llvm(io, CUDA.pointerref_ldg, Tuple{Core.LLVMPtr{Int,AS.Global},Int,Val{1}}))
@test occursin("@llvm.nvvm.ldg", ir)
if Base.libllvm_version >= v"20"
    @test occursin("load i64, ptr addrspace(1)", ir)
This should test for the replacement pattern, which is an !invariant.load
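One way to read this request is that the LLVM 20 branch should assert on the `!invariant.load` metadata instead of only on a plain global load. A minimal sketch of such a check, run against a hand-written IR line rather than real `CUDA.code_llvm` output (the sample string is an assumption; the IR that CUDA.jl actually emits may be formatted differently):

```julia
using Test

# Hypothetical sample standing in for the output of CUDA.code_llvm on LLVM 20+.
# The exact instruction text is an assumption, not captured compiler output.
sample_ir = "%3 = load i64, ptr addrspace(1) %2, align 8, !invariant.load !0"

@testset "ldg lowering (sketch)" begin
    # A plain global load on its own would also match an ordinary pointerref...
    @test occursin("load i64, ptr addrspace(1)", sample_ir)
    # ...so additionally require the invariant-load metadata on that load.
    @test occursin("!invariant.load", sample_ir)
end
```

In the real test, `sample_ir` would be replaced by the `ir` string produced by the `sprint` call in the snippet under review.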
test/base/texture.jl (Outdated)
using Interpolations

# Texture interpolation crashes LLVM in Julia 1.13
VERSION < v"1.13-" && @testset "texture" begin
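The guard compares against `v"1.13-"`, with a trailing hyphen, which sorts below every 1.13 pre-release. A short sketch of what that means for the gate, using only Base version literals (the specific versions compared here are illustrative, not the ones CI ran):

```julia
using Test

# v"1.13-" is the lowest possible 1.13 version, below any pre-release tag.
gate = v"1.13-"

@testset "version gate (sketch)" begin
    @test v"1.12.4" < gate           # 1.12 releases still run the texture tests
    @test !(v"1.13.0-DEV" < gate)    # 1.13 development builds skip them
    @test !(v"1.13.0" < gate)        # so does the final 1.13 release
end
```

Comparing against `v"1.13"` instead would treat `1.13.0-DEV` as older than the gate, so the crashing testset would still run on development builds.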
ff4bccd to 55978dc
Co-authored by: Erik Schnetter <schnetter@gmail.com> Co-authored by: KARLO\karlo <karlo.sepetanc@live.com>
08c9eb6 to f90472d

Could you please edit the commit message? This is my first contribution here. "Co-authored by" should be "Co-authored-by" for GitHub to recognize it.

Co-authored-by: Karlo Sepetanc <karlo.sepetanc@live.com> Co-authored-by: Tim Besard <tim.besard@gmail.com>
Closes #3019.