Optimize GPU boundary exchanges via NVSHMEM #1

@bentsherman

NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs:

https://developer.nvidia.com/nvshmem
https://docs.nvidia.com/hpc-sdk/nvshmem/api/docs/index.html

It is essentially an alternative to MPI that lets GPUs communicate directly over the interconnect instead of routing MPI communication through the CPU. The API covers much the same ground as MPI but with slightly different terminology (init, finalize, PEs (processing elements), teams, put/get, collective ops). The memory model is also slightly different: communication buffers are allocated from a symmetric heap, so every PE holds the same allocation and can write into another PE's copy with a one-sided put.
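To make the put-based model a bit more concrete, here is a rough host-only sketch of a periodic 1D halo exchange. It does not call NVSHMEM at all: the PEs are emulated as plain vectors in one process, and `SymmetricField`, `put`, and `exchange_halos` are made-up names for illustration. The comments note where real calls such as `nvshmem_malloc`, `nvshmem_float_put`, and `nvshmem_barrier_all` would presumably go.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only emulation of the one-sided "put" model; no NVSHMEM calls are made.
// Each PE owns a buffer laid out as [left halo | interior cells | right halo].
// Giving every PE the same layout mirrors a symmetric allocation
// (nvshmem_malloc in the real library).
struct SymmetricField {
    std::size_t interior;                     // interior cells per PE
    std::vector<std::vector<float>> pe_data;  // one buffer per emulated PE

    SymmetricField(int n_pes, std::size_t interior_cells)
        : interior(interior_cells),
          pe_data(static_cast<std::size_t>(n_pes),
                  std::vector<float>(interior_cells + 2, 0.0f)) {
        assert(n_pes > 0 && interior_cells > 0);
    }

    int n_pes() const { return static_cast<int>(pe_data.size()); }

    // Emulates a one-sided put: the caller writes into the target PE's buffer
    // without the target posting a matching receive (compare nvshmem_float_put).
    void put(int target_pe, std::size_t offset, float value) {
        pe_data[static_cast<std::size_t>(target_pe)][offset] = value;
    }
};

// Periodic 1D halo exchange: each PE pushes its edge cells to its neighbors.
void exchange_halos(SymmetricField& f) {
    const int n = f.n_pes();
    for (int pe = 0; pe < n; ++pe) {
        const int left = (pe + n - 1) % n;
        const int right = (pe + 1) % n;
        const std::vector<float>& mine = f.pe_data[static_cast<std::size_t>(pe)];
        // First interior cell goes into the left neighbor's right halo.
        f.put(left, f.interior + 1, mine[1]);
        // Last interior cell goes into the right neighbor's left halo.
        f.put(right, 0, mine[f.interior]);
    }
    // A real implementation would synchronize here (e.g. nvshmem_barrier_all)
    // before any PE reads its halo cells.
}
```

The main difference from a typical MPI send/recv exchange is that the receiving side does nothing except wait for the synchronization point; the sender decides where the data lands.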

This would be a great way to optimize the boundary exchanges, which currently account for the majority of communication overhead in the multi-GPU scenario. A big downside is that you probably can't use MPI and NVSHMEM in the same program. You might be able to write a wrapper library that defers to either MPI or NVSHMEM depending on whether GPUs are enabled, but more likely you will need separate binaries for CPU and GPU.
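For the wrapper idea, one possible shape is a small backend interface that the boundary-exchange code calls into. This is only a sketch under assumptions: `HaloBackend`, `MpiBackend`, `NvshmemBackend`, and `make_backend` are hypothetical names, not anything in the current code, and the `exchange` bodies are left as placeholders rather than real MPI or NVSHMEM calls.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical interface; none of these names come from the project.
class HaloBackend {
public:
    virtual ~HaloBackend() = default;
    virtual std::string name() const = 0;
    // Exchange the boundary cells of `field` (n values) with neighbors.
    virtual void exchange(float* field, std::size_t n) = 0;
};

class MpiBackend : public HaloBackend {
public:
    std::string name() const override { return "mpi"; }
    void exchange(float* field, std::size_t n) override {
        // Placeholder: the existing MPI send/recv exchange would live here.
        (void)field;
        (void)n;
    }
};

class NvshmemBackend : public HaloBackend {
public:
    std::string name() const override { return "nvshmem"; }
    void exchange(float* field, std::size_t n) override {
        // Placeholder: puts into neighbor halos plus a barrier would live here.
        (void)field;
        (void)n;
    }
};

// Picks the backend based on whether GPUs are enabled.
std::unique_ptr<HaloBackend> make_backend(bool gpu_enabled) {
    if (gpu_enabled) {
        return std::make_unique<NvshmemBackend>();
    }
    return std::make_unique<MpiBackend>();
}
```

A runtime switch like this only works if both libraries can be linked into one program. If they can't, the same interface could be selected at compile time with a preprocessor flag, which amounts to the separate cpu/gpu binaries mentioned above.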
