Change request: Give A2D matrices a trivial constructor #125

@A-CGray

I was recently trying to create an array of A2D matrices in shared memory inside a GPU kernel, e.g.:

__shared__ A2D::Mat<double, numNodes, numStates> localNodeStates[elemPerBlock];

However, when I did this I got the following:

../element/FEKernels.h(738): warning #20054-D: dynamic initialization is not supported for a function-scope static __shared__ variable within a __device__/__global__ function

I think the issue is that A2D matrices do not have a trivial constructor: the default constructor sets all entries to zero. So when you try to create one of the matrices in shared memory, CUDA doesn't know which thread should call its constructor.
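The triviality claim can be checked on the host with std::is_trivially_default_constructible. The following is only a minimal sketch: ZeroedMat and TrivialMat are hypothetical stand-ins that mimic the two constructor variants discussed in this issue; they are not the actual A2D types, and the A2D_FUNCTION macro is left out.

```cpp
#include <type_traits>

// Hypothetical stand-in for the current A2D::Mat: the default constructor
// is user-provided and zeroes every entry, so it is NOT trivial.
template <typename T, int M, int N>
struct ZeroedMat {
  ZeroedMat() {
    for (int i = 0; i < M * N; i++) {
      A[i] = T(0);
    }
  }
  T A[M * N];
};

// Hypothetical stand-in for the proposed change: the default constructor
// is defaulted on its first declaration, so it is trivial.
template <typename T, int M, int N>
struct TrivialMat {
  TrivialMat() = default;
  T A[M * N];
};

// Compile-time checks: these fail the build if the assumption is wrong.
static_assert(
    !std::is_trivially_default_constructible<ZeroedMat<double, 4, 3>>::value,
    "a user-provided zeroing constructor is not trivial");
static_assert(
    std::is_trivially_default_constructible<TrivialMat<double, 4, 3>>::value,
    "a defaulted constructor is trivial");
```

A similar static_assert on the real A2D::Mat could guard against a non-trivial constructor being reintroduced later, assuming the type has no other members that prevent triviality.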

CUDA does support dynamically allocated shared memory, which would get around this issue, but IMO it's a pretty nasty approach, as it requires passing in the correct allocation size when you launch the kernel.

I verified that you can successfully create A2D matrices in shared memory if you replace the current default constructor with a defaulted one (= default):

// Current constructor
// A2D_FUNCTION Mat() {
//   for (int i = 0; i < M * N; i++) {
//     A[i] = 0.0;
//   }
// }

// New constructor
A2D_FUNCTION Mat() = default;

This, however, has the downside that we can no longer rely on matrices being zeroed out by default, which might break things elsewhere in A2D.
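One possible mitigation, sketched here under assumptions rather than tested against A2D itself: with a defaulted constructor, code that needs zeros can still ask for them explicitly through value-initialization (empty braces). TrivialMat, allZero, and valueInitIsZero are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for A2D::Mat with the proposed defaulted constructor.
template <typename T, int M, int N>
struct TrivialMat {
  TrivialMat() = default;
  T A[M * N];
};

// Returns true if every entry of the matrix compares equal to zero.
template <typename T, int M, int N>
bool allZero(const TrivialMat<T, M, N>& mat) {
  for (int i = 0; i < M * N; i++) {
    if (mat.A[i] != T(0)) {
      return false;
    }
  }
  return true;
}

// `TrivialMat<...> mat;` would leave the entries uninitialized, whereas
// `TrivialMat<...> mat{};` value-initializes the object, which zeroes the
// entries because the default constructor is not user-provided.
inline bool valueInitIsZero() {
  TrivialMat<double, 4, 3> mat{};
  assert(allZero(mat));
  return allZero(mat);
}
```

This only helps for ordinary (non-shared) variables. A __shared__ declaration cannot carry an initializer, so shared-memory matrices would still have to be zeroed explicitly by the threads in the kernel if zeros are required.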
