Add Base.min override for Float16 and extend LLVM version guard to v20 #3038
Merged
Conversation
LLVM 20 lowers `Base.min(::Float16, ::Float16)` to `min.NaN.f16`, a PTX instruction that requires sm_80+, causing failures on Turing (sm_75) GPUs. This PR adds a Julia-level override matching the existing `Base.max` workaround, and extends the version guard from LLVM 18 to LLVM 20, since the upstream fix (llvm/llvm-project@6f318d47) only landed in LLVM 21.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
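For illustration, a minimal sketch of what such a guarded override might look like. This assumes CUDA.jl's `@device_override` mechanism for replacing Base methods in device code; the widen-to-`Float32` body and the exact guard bounds shown here are illustrative, not a copy of the PR's diff:

```julia
using CUDA  # provides @device_override for device-side method replacement

# Guard against the affected LLVM range: the problematic f16 lowering exists
# through LLVM 20, and the upstream fix only landed in LLVM 21.
@static if Base.libllvm_version < v"21"
    # Avoid Base's generic min, whose Float16 path LLVM 20 lowers to
    # min.NaN.f16 (sm_80+ only). Widening to Float32 lets LLVM emit a
    # portable min instruction, then narrows the result back.
    @device_override Base.min(x::Float16, y::Float16) =
        Float16(min(Float32(x), Float32(y)))
end
```

Widening preserves `Base.min`'s NaN-propagating semantics because every `Float16` value, including NaN, round-trips exactly through `Float32`.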
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##           master    #3038      +/-   ##
==========================================
- Coverage   89.49%   89.33%   -0.17%
==========================================
  Files         148      148
  Lines       13047    13047
==========================================
- Hits        11676    11655      -21
- Misses       1371     1392      +21

View full report in Codecov by Sentry.
CUDA.jl Benchmarks
Details
| Benchmark suite | Current: de9be6a | Previous: 1810b7a | Ratio |
|---|---|---|---|
| latency/precompile | 44675226940.5 ns | 44300180944.5 ns | 1.01 |
| latency/ttfp | 13291171512 ns | 13138137112 ns | 1.01 |
| latency/import | 3784128603 ns | 3757487166.5 ns | 1.01 |
| integration/volumerhs | 9440873.5 ns | 9441754.5 ns | 1.00 |
| integration/byval/slices=1 | 145616 ns | 145846 ns | 1.00 |
| integration/byval/slices=3 | 422814 ns | 423265 ns | 1.00 |
| integration/byval/reference | 143792 ns | 143916 ns | 1.00 |
| integration/byval/slices=2 | 284069 ns | 284641 ns | 1.00 |
| integration/cudadevrt | 102357 ns | 102633 ns | 1.00 |
| kernel/indexing | 13245 ns | 13466 ns | 0.98 |
| kernel/indexing_checked | 14083 ns | 13982 ns | 1.01 |
| kernel/occupancy | 635.202380952381 ns | 699.625850340136 ns | 0.91 |
| kernel/launch | 2025.5 ns | 2067.8 ns | 0.98 |
| kernel/rand | 14585 ns | 16244 ns | 0.90 |
| array/reverse/1d | 18615 ns | 18605 ns | 1.00 |
| array/reverse/2dL_inplace | 66177 ns | 66133 ns | 1.00 |
| array/reverse/1dL | 68804 ns | 68870 ns | 1.00 |
| array/reverse/2d | 21266 ns | 20781 ns | 1.02 |
| array/reverse/1d_inplace | 10491 ns | 10493.666666666666 ns | 1.00 |
| array/reverse/2d_inplace | 11367 ns | 10765 ns | 1.06 |
| array/reverse/2dL | 73210 ns | 72777.5 ns | 1.01 |
| array/reverse/1dL_inplace | 66188 ns | 66166 ns | 1.00 |
| array/copy | 18366 ns | 18321 ns | 1.00 |
| array/iteration/findall/int | 145622.5 ns | 145251 ns | 1.00 |
| array/iteration/findall/bool | 130340 ns | 130303 ns | 1.00 |
| array/iteration/findfirst/int | 85134 ns | 83996 ns | 1.01 |
| array/iteration/findfirst/bool | 82631 ns | 81209 ns | 1.02 |
| array/iteration/scalar | 67040 ns | 64953 ns | 1.03 |
| array/iteration/logical | 197058.5 ns | 197334 ns | 1.00 |
| array/iteration/findmin/1d | 83432 ns | 85667.5 ns | 0.97 |
| array/iteration/findmin/2d | 117087 ns | 117130 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 38905 ns | 38913 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1 | 41600 ns | 41855 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2 | 58808 ns | 59043 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 87117 ns | 87102 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84669 ns | 84295 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 34237 ns | 33785 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1 | 43934 ns | 48986 ns | 0.90 |
| array/reductions/reduce/Float32/dims=2 | 56239 ns | 56655 ns | 0.99 |
| array/reductions/reduce/Float32/dims=1L | 51394 ns | 51438 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 69575 ns | 69460.5 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 39210.5 ns | 38699 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1 | 46057 ns | 41686 ns | 1.10 |
| array/reductions/mapreduce/Int64/dims=2 | 58993 ns | 58974 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1L | 87229 ns | 87184 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 84397 ns | 84571 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 34022 ns | 33512 ns | 1.02 |
| array/reductions/mapreduce/Float32/dims=1 | 39843 ns | 47745 ns | 0.83 |
| array/reductions/mapreduce/Float32/dims=2 | 55903 ns | 56241 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1L | 51260 ns | 51435 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 69261 ns | 69604 ns | 1.00 |
| array/broadcast | 20628 ns | 20361 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 10673.333333333334 ns | 10601.666666666666 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 213909 ns | 214964 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 283527 ns | 282717 ns | 1.00 |
| array/accumulate/Int64/1d | 118150.5 ns | 118054 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 79533 ns | 78929 ns | 1.01 |
| array/accumulate/Int64/dims=2 | 155242 ns | 155861 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1697447 ns | 1705368 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 960552 ns | 960330.5 ns | 1.00 |
| array/accumulate/Float32/1d | 100637.5 ns | 100426 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76099 ns | 75943 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 144215 ns | 143974 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1584181 ns | 1584300 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 656485 ns | 658063 ns | 1.00 |
| array/construct | 1291 ns | 1252.6 ns | 1.03 |
| array/random/randn/Float32 | 36310 ns | 35435 ns | 1.02 |
| array/random/randn!/Float32 | 30120 ns | 29972 ns | 1.00 |
| array/random/rand!/Int64 | 34550 ns | 28260 ns | 1.22 |
| array/random/rand!/Float32 | 8320.166666666668 ns | 8310 ns | 1.00 |
| array/random/rand/Int64 | 36976 ns | 29927 ns | 1.24 |
| array/random/rand/Float32 | 12342 ns | 12324 ns | 1.00 |
| array/permutedims/4d | 50805 ns | 51686 ns | 0.98 |
| array/permutedims/2d | 52400 ns | 52279 ns | 1.00 |
| array/permutedims/3d | 52639 ns | 52911 ns | 0.99 |
| array/sorting/1d | 2734832 ns | 2735042.5 ns | 1.00 |
| array/sorting/by | 3304279 ns | 3304486.5 ns | 1.00 |
| array/sorting/2d | 1067131 ns | 1066581 ns | 1.00 |
| cuda/synchronization/stream/auto | 1064.090909090909 ns | 993.5882352941177 ns | 1.07 |
| cuda/synchronization/stream/nonblocking | 7534.299999999999 ns | 7392.700000000001 ns | 1.02 |
| cuda/synchronization/stream/blocking | 821.4470588235295 ns | 811.8282828282828 ns | 1.01 |
| cuda/synchronization/context/auto | 1150.3 ns | 1160.9 ns | 0.99 |
| cuda/synchronization/context/nonblocking | 7125.9 ns | 7875.6 ns | 0.90 |
| cuda/synchronization/context/blocking | 894.469387755102 ns | 899.7058823529412 ns | 0.99 |
This comment was automatically generated by a workflow using github-action-benchmark.
As observed in #3020.