Remove cuda-cudart from run requirements in ucxx-tests conda recipe since static linking embeds the runtime into binaries.
robertmaynard
left a comment
I don't think we can do this at the ucxx level.
UCX depends on libcudart https://github.com/search?q=repo%3Aopenucx%2Fucx%20cudart&type=code
UCXX only links dynamically to UCX. I'm definitely ignorant of the linking details, but if UCXX switches to linking CUDA statically, does that become a problem given that UCXX links dynamically to UCX, which in turn links dynamically to CUDA? That is what the current linkage tree in UCXX looks like.
My concern is this: the goal of this change is to improve minor version compatibility. If UCX loads the shared-library version of cudart, that is a failure point for minor version compatibility. So if UCXX isn't extending the subset of libcudart symbols it uses, it doesn't need to change (since either project will fail). With these changes we will have
UCXX shouldn't be extending symbol usage, except for those indirectly being used via RMM.
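One way to make the linkage question above concrete is to check whether a given binary still lists the shared cudart as a direct dependency. The following is a minimal sketch, not part of this PR: it parses the text produced by `readelf -d <library>` and looks for `NEEDED` entries. The helper names and the sample output are hypothetical and only illustrate the format.

```python
import re

# Matches a DT_NEEDED entry as printed by `readelf -d`, for example:
#  0x0000000000000001 (NEEDED)   Shared library: [libucp.so.0]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_dynamic_output: str) -> list[str]:
    """Return the sonames listed as direct (NEEDED) dependencies."""
    return _NEEDED.findall(readelf_dynamic_output)


def links_cudart_dynamically(readelf_dynamic_output: str) -> bool:
    """True if the shared CUDA runtime appears as a direct dependency."""
    return any(
        name.startswith("libcudart.so")
        for name in needed_libraries(readelf_dynamic_output)
    )


# Hypothetical `readelf -d` excerpt, made up for illustration only.
SAMPLE_DYNAMIC_SECTION = """
 0x0000000000000001 (NEEDED)             Shared library: [libucp.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x000000000000000e (SONAME)             Library soname: [libexample.so]
"""
```

This only inspects direct dependencies of one file. A library that links cudart statically can still pull in the shared cudart transitively through something it loads, which is the situation discussed in this thread, so each library in the chain would need to be checked separately.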
@robertmaynard Are you concluding this PR should be closed with no changes made? I am happy to cut this from the broader scope if that's the right outcome.
I think the real change we need here is to discuss statically linking cudart in UCX. This may also necessitate introducing a new approach in UCX for detecting the availability of the CUDA transport, but I would hope not. I haven't looked in a while, but IIRC the libuct_cuda.so binary is dynamically linked to the driver library libcuda.so, and whether the dlopen of that transport succeeds governs whether CUDA support is available. Beyond that, if UCX also uses cudart symbols but links to cudart statically, then we should be on the minor version compatibility (MVC) happy path.
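The "dlopen governs availability" idea can be illustrated with a small probe. This is a rough Python analogue using `ctypes`, not UCX's actual module-loading code (which is C and was not checked for this sketch). The helper name `can_dlopen` is hypothetical; `libcuda.so.1` is the soname the NVIDIA driver library is normally installed under on Linux.

```python
import ctypes


def can_dlopen(library_name: str) -> bool:
    """Return True if the dynamic loader can load `library_name`.

    ctypes.CDLL calls the platform loader (dlopen on Linux) and raises
    OSError when the library or one of its dependencies cannot be found.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True
```

For example, `can_dlopen("libcuda.so.1")` would return False on a machine without the NVIDIA driver installed. A transport module that links dynamically to the driver fails to load in the same way, which is why a successful load can double as the availability check without a separate detection mechanism.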
Summary
Remove cuda-cudart from run requirements in ucxx-tests conda recipe
With static linking of the CUDA runtime, the runtime is embedded in the binaries and the cuda-cudart package is not needed at runtime.
Part of rapidsai/build-planning#235