[GPU][BSE-5357] Make sure GPU buffers are ready before calling MPI#1075
Merged
scott-routledge2 merged 11 commits into main on Apr 1, 2026
DrTodd13
approved these changes
Mar 31, 2026
// Make sure GPU buffers are ready before passing to MPI
// TODO(BSE-5359): Make this check async
CHECK_CUDA(cudaStreamSynchronize(stream));
Collaborator
Can we just check whether the stream is done and, if it isn't, skip this function or report that we have nothing to send, rather than waiting here? Is there a reason that waiting is required?
Contributor
I don't think there's a reason we can't. I created a followup issue for that (BSE-5359) which can be done next.
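A minimal sketch of the non-blocking variant discussed above, not the code in this PR and not necessarily what BSE-5359 will implement. It assumes the readiness check can be expressed as a predicate; with the CUDA runtime that predicate could be `cudaStreamQuery(stream) == cudaSuccess`, since `cudaStreamQuery` returns `cudaErrorNotReady` while work is still pending. The names `SendStatus` and `try_send` are hypothetical, and the MPI call is replaced by a callback so the example builds without CUDA or MPI.

```cpp
#include <cassert>
#include <functional>

// Hypothetical outcome of a single, non-blocking send attempt.
enum class SendStatus { Sent, NotReady };

// Attempt to send a GPU buffer without waiting on the stream.
//   is_ready: stands in for a stream-completion query, e.g.
//             cudaStreamQuery(stream) == cudaSuccess (assumption).
//   send:     stands in for the MPI call that reads the buffer.
// If the buffer is not ready, nothing is sent and the caller can retry later.
inline SendStatus try_send(const std::function<bool()>& is_ready,
                           const std::function<void()>& send) {
    assert(is_ready && send);  // both callbacks must be set
    if (!is_ready()) {
        return SendStatus::NotReady;  // report "nothing to send" for now
    }
    send();  // safe: pending work on the buffer has completed
    return SendStatus::Sent;
}
```

A caller would invoke `try_send` each time its progress loop runs and keep the buffer queued while the result is `SendStatus::NotReady`, instead of blocking in `cudaStreamSynchronize`.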
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #1075      +/-   ##
==========================================
- Coverage   68.18%   62.13%   -6.06%
==========================================
  Files         195      195
  Lines       68080    68104      +24
  Branches     9708     9713       +5
==========================================
- Hits        46423    42318    -4105
- Misses      18841    22860    +4019
- Partials     2816     2926     +110
IsaacWarren
approved these changes
Apr 1, 2026
Changes included in this PR
Resolves a segfault inside UCX for multi-node / multi-GPU TPCH Q5.
Reference: https://github.com/rapidsai/rapidsmpf/blob/a1580b7f67619e7dcafd32d4a1b933f1fce61a84/cpp/include/rapidsmpf/memory/buffer.hpp#L250
Testing strategy
Multi-GPU TPCH with 2 ranks assigned to the same GPU reliably reproduces the issue, though it can also occur with multiple GPUs and one rank per GPU. Both cases should now pass.
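To make the "2 ranks on the same GPU" setup concrete, here is a small sketch that assumes a round-robin placement of ranks onto devices. The PR does not say how ranks are actually assigned, so `device_for_rank` and `share_device` are hypothetical helpers used only for illustration.

```cpp
#include <cassert>

// Hypothetical round-robin placement: rank r uses GPU (r % device_count).
inline int device_for_rank(int rank, int device_count) {
    assert(rank >= 0 && device_count > 0);
    return rank % device_count;
}

// True when two distinct ranks land on the same GPU under that placement.
inline bool share_device(int rank_a, int rank_b, int device_count) {
    return rank_a != rank_b &&
           device_for_rank(rank_a, device_count) ==
               device_for_rank(rank_b, device_count);
}
```

Under this assumption, running 2 ranks with 1 visible GPU puts both ranks on device 0, which corresponds to the reliably reproducing configuration described above.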
User facing changes
Checklist
[run CI] in your commit message.