Conversation
…t host code doesn't need to be marked appropriately
adamdempsey90
left a comment
There was a problem hiding this comment.
Do you have to do this manual switch on every function that could be called from python?
Every C++ function. The alternative would be to do a for loop in Python directly, which would be a bit of a hack, but maybe worth it. That would prevent the Python from doing efficient EOSPAC calls though. |
|
@adamdempsey90 do you object to this? I don't love it, but I do think it's the simplest/cleanest path to getting things working the way we need them to. |
I don't know, I'm not a huge fan of having to write the same for loops twice. Do you understand the root cause of why the normal portable lambda can't be used? |
Neither am I.
Yes. Because it marks it as |
Maybe an alternative then is to have a small struct that wraps the numpy pointer (?) in an unmanaged host view? |
That imposes a requirement on stride, which we currently don't impose. Though perhaps one could be clever. |
What do you mean? There's |
I should say, you just set the stride at runtime to whatever the numpy array has, but the View would be |
We have to thread the numpy language for stride/layout into the Kokkos unmanaged view. |
Ah. messages crossing in flight. Yeah this is what I mean---the alternative is a bunch of additional helper code in the python module. Does numpy always provide contiguous memory? It never does some weird array of arrays thing? |
From some consulting with GPT/Gemini and looking at https://pybind11.readthedocs.io/en/stable/advanced/pycpp/numpy.html, I think you should be able to write a very minimal array wrapper class that basically copies the strides/offsets/whatever else you need and the pointer and then provide If this works, it seems worth the effort now so that we don't have to always add double loops for every new function we want the python api to have access to. |
PR Summary
This MR is a follow up of #630 and issue #628 . I found I was still seeing compilation errors on GPU builds with python active, and the reason was that the lambda was marked
__host__ __device__even though the execution space was host. This MR adds a constexpr branch to switch to a host-only lambda when appropriate.PR Checklist
make formatcommand after configuring withcmake.plan_historiesfolder, with a filename the same as the MR number.If preparing for a new release, in addition please check the following: