Add CUDA 13 runtime support and CI lane #134
SamMausberg wants to merge 1 commit into accel-sim:dev
  }
  CUctx_st *context = GPGPUSim_Context(ctx);
  function_info *entry = context->get_kernel(hostFun);
#if CUDART_VERSION < 10000
This is probably here to ensure that CUDA versions below 10 still work.
Before my change, the new CUDA launch path relied on __cudaPopCallConfiguration() being a no-op. The launch config therefore stayed on g_cuda_launch_stack, so cudaLaunchKernelInternal() did not need to call cudaConfigureCallInternal() for CUDART_VERSION >= 10000. You can check this in the pre-change version of libcuda/cuda_runtime_api.cc, at line 2042.
I changed __cudaPopCallConfiguration() so that it actually takes the saved launch config and returns gridDim, blockDim, sharedMem, and stream, which is what newer CUDA expects. The modern pattern is __cudaPushCallConfiguration(...) -> __cudaPopCallConfiguration(...) -> cudaLaunchKernel(...).
To clarify further: once __cudaPopCallConfiguration() consumes the saved config, cudaLaunchKernelInternal() must rebuild the launch state itself, which is why the cudaConfigureCallInternal() call is now unconditional. So we did not remove support for CUDA < 10; we just made the path consistent. I also reran the 11.7 lane after this change, and it passed short-tests-cmake.sh 10/10.
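For readers following the thread, here is a toy model of the push -> pop -> launch sequence described above. It is only a sketch, not the code in libcuda/cuda_runtime_api.cc: `Dim3`, `LaunchConfig`, `launch_stack`, and every function name below are hypothetical stand-ins, and the only thing it aims to show is why the launch step has to rebuild its state once the pop really consumes the saved config.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for dim3 and the simulator's saved launch config.
struct Dim3 { unsigned x, y, z; };
struct LaunchConfig {
  Dim3 grid;
  Dim3 block;
  std::size_t shared_mem;
  void *stream;
};

// Toy stack playing the role of the simulator's launch-config stack.
static std::vector<LaunchConfig> launch_stack;

// Models the push step: save the <<<...>>> arguments.
void push_call_configuration(Dim3 grid, Dim3 block, std::size_t shmem, void *stream) {
  LaunchConfig cfg = {grid, block, shmem, stream};
  launch_stack.push_back(cfg);
}

// Models the reworked pop step: return the saved config AND remove it.
// Returns 0 on success, 1 if nothing was saved.
int pop_call_configuration(Dim3 *grid, Dim3 *block, std::size_t *shmem, void **stream) {
  if (launch_stack.empty()) return 1;
  LaunchConfig cfg = launch_stack.back();
  launch_stack.pop_back();
  *grid = cfg.grid;
  *block = cfg.block;
  *shmem = cfg.shared_mem;
  *stream = cfg.stream;
  return 0;
}

// Models the launch step. Because the pop above emptied the stack, the
// launch cannot assume a config is still sitting there; it rebuilds the
// launch state from its own arguments on every call.
LaunchConfig launch_kernel(Dim3 grid, Dim3 block, std::size_t shmem, void *stream) {
  push_call_configuration(grid, block, shmem, stream);  // unconditional rebuild
  LaunchConfig cfg = launch_stack.back();
  launch_stack.pop_back();  // the launch consumes the rebuilt config
  return cfg;
}

// Walks the full sequence once and checks that the config survives it.
bool demo_modern_launch() {
  Dim3 grid = {4, 1, 1};
  Dim3 block = {128, 1, 1};
  push_call_configuration(grid, block, 256, nullptr);

  Dim3 g, b;
  std::size_t shmem = 0;
  void *stream = nullptr;
  if (pop_call_configuration(&g, &b, &shmem, &stream) != 0) return false;

  LaunchConfig used = launch_kernel(g, b, shmem, stream);
  return used.grid.x == 4 && used.block.x == 128 && used.shared_mem == 256 &&
         launch_stack.empty();
}
```

Calling `demo_modern_launch()` exercises the whole sequence. If `launch_kernel` skipped the rebuild and read the stack directly, it would find it empty after the pop, which is the situation the unconditional call avoids.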