Allow explicit data transfers to GPUs #156620
Vendoring llvm/llvm-project#198033 for now.
    cpu_ptr: *const T,
    _marker: PhantomData<&'a T>,
What's the intent behind this? How is it different from just having a &'a T field?
I'm not using a reference, since references require a valid pointer at all times. All writes go to the GPU version, which lives in a different address space, so we treat the CPU address of the pointer merely as a key. I'm also considering storing the GPU address directly, which would make it even clearer that holding it as a reference would be UB.
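The reply above can be illustrated in plain Rust. This is a minimal sketch, not the PR's implementation: the field names and the `preload` signature are taken from the quoted diff, while the `key` method and the function body are assumptions added for illustration. It shows how `PhantomData<&'a T>` keeps the handle tied to the host borrow while the stored raw pointer is only ever used as an opaque key.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the handle discussed in the review thread.
pub struct Preload<'a, T: ?Sized> {
    // Host address, used only as a lookup key and never dereferenced here.
    cpu_ptr: *const T,
    // Ties the handle to the lifetime of the host borrow without
    // storing an actual reference.
    _marker: PhantomData<&'a T>,
}

/// Assumed body: a real implementation would start the host-to-device
/// transfer here. This sketch only records the host address.
pub fn preload<'a, T: ?Sized>(x: &'a T) -> Preload<'a, T> {
    Preload {
        cpu_ptr: x as *const T,
        _marker: PhantomData,
    }
}

impl<'a, T: ?Sized> Preload<'a, T> {
    /// Hypothetical helper: the host address as an opaque integer key.
    pub fn key(&self) -> usize {
        // Casting to a thin pointer first drops any slice or vtable metadata.
        self.cpu_ptr as *const () as usize
    }
}
```

Calling `preload(&data)` on a local array yields a handle whose `key()` equals `data.as_ptr() as usize`. Because of the `PhantomData` marker, the borrow checker still treats the handle as borrowing `data` for `'a`, even though no reference is stored.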
    // This exists so MIR creates Drop terminators for PreloadMut.
    // rustc codegen intercepts those terminators and emits the
    // offload return mapper.
why is this not just an intrinsic call here?
Partly just experimenting, partly because intrinsics recently changed a bit: they were updated for more explicit Place handling, which I didn't want to think about for my MVP. I'll switch these to intrinsics after my deadline.
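For readers unfamiliar with the pattern under discussion, the following is a rough user-level analogy, not the compiler mechanism from the PR. It assumes a `PreloadMut` handle shaped like the `Preload` struct quoted earlier, and uses a hypothetical thread-local counter (`COPY_BACKS`) in place of the real device-to-host copy. The point is only that work attached to `Drop` runs exactly when the handle goes out of scope.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    // Hypothetical counter standing in for the device-to-host copy.
    static COPY_BACKS: Cell<usize> = Cell::new(0);
}

/// Assumed shape of the mutable handle, mirroring the quoted `Preload` fields.
pub struct PreloadMut<'a, T: ?Sized> {
    cpu_ptr: *mut T,
    _marker: PhantomData<&'a mut T>,
}

/// Assumed constructor: records the host address while holding the
/// exclusive borrow for `'a`.
pub fn preload_mut<'a, T: ?Sized>(x: &'a mut T) -> PreloadMut<'a, T> {
    PreloadMut {
        cpu_ptr: x as *mut T,
        _marker: PhantomData,
    }
}

impl<'a, T: ?Sized> Drop for PreloadMut<'a, T> {
    fn drop(&mut self) {
        // Placeholder for the return mapper: the host address is only a key.
        let _key = self.cpu_ptr as *mut () as usize;
        COPY_BACKS.with(|c| c.set(c.get() + 1));
    }
}

/// Number of simulated copy-backs performed on the current thread.
pub fn copy_backs() -> usize {
    COPY_BACKS.with(|c| c.get())
}
```

While the handle is alive, the host buffer stays exclusively borrowed, so CPU code cannot touch it. Once the handle is dropped, the simulated copy-back has run and the buffer is usable again.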
        return false;
    };

    Some(adt_def.did()) == tcx.lang_items().preload_mut_type()
use tcx.is_lang_item
    #[lang = "preload"]
    #[unstable(feature = "offload", issue = "124509")]
    pub fn preload<'a, T: ?Sized>(x: &'a T) -> Preload<'a, T> {
Yeah, I think these should just be intrinsics instead of catching lang item calls during codegen of call terminators.
So far, our offload intrinsics have handled data movement to and from the GPU automatically.
That's convenient (and reasonably fast once our LLVM opts land). However, Rust generally also allows being explicit. Explicit transfers might give perf benefits (where our LLVM opts fail), and they could also be nice for modelling: data can be passed around while still preventing CPU users from accessing it.
Tracking Issue for GPU-offload #131513