10 changes: 10 additions & 0 deletions lib/cublas/linalg.jl
@@ -37,6 +37,16 @@ end
LinearAlgebra.norm(x::DenseCuArray{<:CublasFloat}) = nrm2(x)
LinearAlgebra.BLAS.asum(x::StridedCuArray{<:CublasFloat}) = asum(length(x), x)

function LinearAlgebra.norm(x::DenseCuArray{<:CublasFloat}, p::Integer)
SomTambe marked this conversation as resolved.
    if p == 1
        return CUBLAS.asum(length(x), x)
    elseif p == 2
        return LinearAlgebra.norm(x)
    else
        return LinearAlgebra.tr(LinearAlgebra.Diagonal(abs.(x))^p)^(1/p)
Member:
The problem is that these operations are implemented using CUBLAS, so they are themselves restricted to the types CUBLAS supports (plain, basic C types). Ideally we'd have something more generic. What about #84 (comment) — is that not valid, or slower?

Member:
Such a generic version could also go into GPUArrays.jl, while a specialized version for CUBLAS-supported types could live in CUDA.jl.

Author:
@maleadt When we use the method you mentioned in #84, it throws a scalar indexing warning, and it wouldn't work when CUDA.allowscalar(false) is set. That is why I implemented it using CUBLAS-based operations.

Author:
@maleadt Could you elaborate on which types CUBLAS does not support?

Member:
mapreduce is supported by CUDA.jl; I don't understand how that would trigger scalar indexing.

If you look at the CUBLAS docs, you'll see it's a C library that only supports a limited set of element types. That's why we have type unions like CublasFloat in CUDA.jl.
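For reference, the union mentioned here looks roughly like the following in CUDA.jl (a sketch; the exact definition may differ by version):

```julia
# Approximate definition of the CUBLAS element-type unions in CUDA.jl.
# CUBLAS kernels only exist for these plain C float/complex types,
# which is why the wrappers above are restricted to CublasFloat.
const CublasReal  = Union{Float64, Float32}
const CublasFloat = Union{Float64, Float32, ComplexF64, ComplexF32}
```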

    end
end
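The generic alternative discussed in the thread above could be sketched with mapreduce instead of CUBLAS (names hypothetical; assumes mapreduce over a CuArray does not fall back to scalar indexing):

```julia
# Hypothetical generic p-norm built on mapreduce rather than CUBLAS,
# so it works for any element type with GPU-supported abs, +, and ^.
function generic_pnorm(x::AbstractArray, p::Real)
    p == 1 && return mapreduce(abs, +, x)
    isinf(p) && return maximum(abs, x)
    return mapreduce(xi -> abs(xi)^p, +, x)^(1/p)
end
```

Such a method could live in GPUArrays.jl, with the CUBLAS-specialized paths kept in CUDA.jl for the supported types.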

function LinearAlgebra.axpy!(alpha::Number, x::StridedCuArray{T}, y::StridedCuArray{T}) where T<:CublasFloat
length(x) == length(y) || throw(DimensionMismatch("axpy arguments have lengths $(length(x)) and $(length(y))"))
axpy!(length(x), alpha, x, y)
9 changes: 9 additions & 0 deletions test/cublas.jl
@@ -1,5 +1,6 @@
using CUDA.CUBLAS
using CUDA.CUBLAS: band, bandex
using GPUArrays

using LinearAlgebra

@@ -84,6 +85,14 @@ end
mul!(y, f(A), x, Ts(1), Ts(2))
@test Array(dy) ≈ y
end

@testset "norm" begin
    x, y, z = CUDA.rand(elty, 1), CUDA.rand(elty, 2), CUDA.rand(elty, m)
    @disallowscalar @test typeof(norm(x, 1)) <: Real
    @disallowscalar @test typeof(norm(y, 2)) <: Real
    @disallowscalar @test typeof(norm(z, m)) <: Real
end
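A value check against the CPU result could strengthen these type-only assertions (a sketch, reusing `elty` and `m` from the surrounding test file):

```julia
# Hypothetical extra assertion: the GPU p-norm should match the CPU one.
@testset "norm values" begin
    z = CUDA.rand(elty, m)
    @disallowscalar @test norm(z, 3) ≈ norm(Array(z), 3)
end
```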

@testset "banded methods" begin
    # bands
    ku = 2