AMLattanzi/cuda_perf_tests

cuda_perf_tests

Basic performance tests comparing optimized CUDA kernel implementations with naive ones. All tests were run with CUDA Toolkit 12.6, and the reported timings were obtained on the Perlmutter HPC system.

Getting started

Prerequisites

Working device and host compilers (nvcc and g++) are all that is needed to run the examples.
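
A quick way to confirm that both compilers are on the PATH before building is sketched below. The helper name check_tool is hypothetical and is not part of the repository; it only reports whether each command can be found and does not check versions.

```shell
# Hypothetical helper (not part of the repository): report whether a
# command is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# The two compilers named in the prerequisites.
for tool in nvcc g++; do
    check_tool "$tool"
done
```

If either line reports "missing", load or install the corresponding toolchain before running make.sh.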

Getting cuda_perf_tests

Clone the repository with the following command:

git clone https://github.com/AMLattanzi/cuda_perf_tests.git

Running an example

The code tree is shown below. Each subdirectory of src contains one test, a README documenting its timings, and a shell script make.sh for compilation.

cuda_perf_tests/
└── src
    ├── kernel_concur
    ├── lambda_kernel
    ├── matrix_add
    ├── matrix_mult
    └── mpi_host_device

To compile and run one of the examples in the src directory, execute the following commands from the repository root:

cd src/<case_name>
source make.sh
./<case_name>.exe
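
To build and run every case in one pass, the three commands above can be wrapped in a loop. This is a minimal sketch, not a script shipped with the repository: the function name run_all_cases is hypothetical, and it assumes that each subdirectory's make.sh produces an executable named <case_name>.exe, as the pattern above suggests. Some cases (for example mpi_host_device) may need a different launcher; check the README in that subdirectory.

```shell
# Hypothetical helper (not part of the repository): build and run every
# case found under the given src directory.
run_all_cases() {
    src_dir="$1"
    for dir in "$src_dir"/*/; do
        # Skip anything that does not contain a make.sh build script.
        [ -f "${dir}make.sh" ] || continue
        case_name=$(basename "$dir")
        echo "== ${case_name} =="
        # Run in a subshell so the directory change and any environment
        # changes made by make.sh do not leak into the next case.
        # "." is the POSIX equivalent of "source".
        ( cd "$dir" && . ./make.sh && "./${case_name}.exe" )
    done
}

# Run from the repository root.
run_all_cases src
```

Running each case in its own subshell keeps one test's build environment from affecting the next.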
