Handle CUDA 13 CI compatibility in gpu-app-collection#85
Handle CUDA 13 CI compatibility in gpu-app-collection#85SamThe602 wants to merge 4 commits intoaccel-sim:devfrom
Conversation
| config.MEM_CLK_FREQUENCY = deviceProp.memoryClockRate * 1e-3f; | ||
| config.MEM_BITWIDTH = deviceProp.memoryBusWidth; | ||
| config.CLK_FREQUENCY = clockRateKHz * 1e-3f; | ||
| if (memoryClockRateKHz > 0) |
There was a problem hiding this comment.
What is with all the zeros?
There was a problem hiding this comment.
Just to clarify my intent: the > 0 means if CUDA returns positive, we overwrite the default. If it returns 0, it's because the query is unsupported; it leaves it alone. I made this choice because before we used deviceProp.memoryClockRate and deviceProp.memoryBusWidth, this needed to be switched to cudaDeviceGetAttribute()
I don't think its a correctness issue but yea its not the most readable I can change it if you'd like.
|
|
||
| ARCH?=sm_80 sm_90a sm_100a sm_101 sm_120 | ||
| ifneq ($(filter 13 14 15 16,$(CUDA_VERSION_MAJOR)),) | ||
| ARCH?=sm_80 sm_90a sm_100a sm_110 sm_120 |
There was a problem hiding this comment.
both the if and the else set the same thing.
| hid); | ||
|
|
||
| cudaThreadSynchronize(); | ||
| cudaDeviceSynchronize(); |
There was a problem hiding this comment.
Does this exist in the old API?
There was a problem hiding this comment.
Yes. It's in CUDA 11.7.1 headers.
| ifneq ($(filter 13 14 15 16,$(CUDA_VERSION_MAJOR)),) | ||
| ci: rodinia_2.0-ft rodinia-3.1 GPU_Microbenchmark cutlass_mini | ||
| else | ||
| ci: rodinia_2.0-ft rodinia-3.1 GPU_Microbenchmark cutlass_mini cuda_samples |
There was a problem hiding this comment.
Samples should build it; it is an NVIDIA collection and should work at different CUDA versions.
There was a problem hiding this comment.
It fails with nvcc fatal: Unsupported gpu architecture 'compute_50' when in CUDA 13.1.1 environment.
How would you like me to handle this?
| cuda_samples: | ||
| mkdir -p $(BINDIR)/$(BINSUBDIR)/ | ||
| mkdir -p ./cuda/cuda-samples/build && cd ./cuda/cuda-samples/build && cmake .. && $(MAKE) | ||
| if [ ${CUDA_VERSION_MAJOR} -ge 13 ]; then \ |
There was a problem hiding this comment.
If we have the stuff in setup_environment, do we still need this?
There was a problem hiding this comment.
I think that setup_environment is ok for normal Makefile builds.
cuda_samples is a separate case, I believe cause it hardcodes 'CMAKE_CUDA_ARCHITECTURES' not sure.
^ could be completely wrong about this.
| unsigned L2_BANKS = 0; // L2 Cache Banks (LTCs) | ||
| }; | ||
| inline GpuConfig config; | ||
|
|
There was a problem hiding this comment.
If it's 0, the uBench will be wrong. Why is it failing?
If the syntax has changed, just update it to use the new syntax. There must still be a way to query device attributes.
There was a problem hiding this comment.
Wait, I don't think I changed the FBP_COUNT = 0 and L2_BANKS = 0
I think in the simulator path, they are overwritten from gpgpusim.config. In the hardware path filled with queryGrInfo().
There was a problem hiding this comment.
These apps also run on real GPUs, not just in gpgpu-sim. The ubench uses device query to determine kernel parameters. So these value must be correct (matching the HW) for these kernels to perform as expected. This is not the config in the simulation.
There was a problem hiding this comment.
Yes, it seems like a pre-existing issue in commit 4becbe3.
The only changes I made to stuff like this in the file were to the clock rate, memory clock rate, and memory bus width.
There was a problem hiding this comment.
Oh, I think I misunderstood your point, yes, that's true. When I go over this, I can make that fix even though it was pre-existing.
src/Makefile
Outdated
| ci: rodinia_2.0-ft rodinia-3.1 GPU_Microbenchmark cutlass_mini | ||
| else | ||
| ci: rodinia_2.0-ft rodinia-3.1 GPU_Microbenchmark cutlass_mini cuda_samples | ||
| endif |
There was a problem hiding this comment.
we need cuda samples. If you need another commit, checkout in the makefile, or just update the submodule.
Updating the submodule might not be backward compatible. So checking out a newer commit might be better.
There was a problem hiding this comment.
And this is probably the reason device query is failing?
There was a problem hiding this comment.
There are two deviceQuery binaries in the repo. One in cuda-samples and GPU_Microbenchmark.
GPU_Microbenchmark builds under CUDA 13.1.1. The CUDA-samples one is failing under CUDA 13.1.1. For the reason I replied above, because the CMake files are hardcoding compute_50.
I asked Tim, but I would also like to hear your thoughts on how you guys want me to handle this issue.
*** Never mind, I didn't notice, you answered in your original comment yes, I'll update the submodule to avoid this. Wasn't sure if you guys wanted me to touch this.
No description provided.