Symplectic integration in a user-defined magnetostatic vector potential #964

cemitch99 wants to merge 64 commits into BLAST-ImpactX:development from
Conversation
for more information, see https://pre-commit.ci
// Evaluate the vector potential and its derivatives
auto const ax = m_dfunc_ax(x, y, z) * m_scale;
auto const ay = m_dfunc_ay(x, y, z) * m_scale;
auto const daxdx = m_dfunc_daxdx(x, y, z) * m_scale;
auto const daxdy = m_dfunc_daxdy(x, y, z) * m_scale;
auto const daydx = m_dfunc_daydx(x, y, z) * m_scale;
auto const daydy = m_dfunc_daydy(x, y, z) * m_scale;
auto const dazdx = m_dfunc_dazdx(x, y, z) * m_scale;
auto const dazdy = m_dfunc_dazdy(x, y, z) * m_scale;
@WeiqunZhang I fear we use too many parsers for the CUDA CI (?) 😢
Possible hack to confirm: replace

// Evaluate the vector potential and its derivatives
auto const ax = m_dfunc_ax(x, y, z) * m_scale;
auto const ay = m_dfunc_ay(x, y, z) * m_scale;
auto const daxdx = m_dfunc_daxdx(x, y, z) * m_scale;
auto const daxdy = m_dfunc_daxdy(x, y, z) * m_scale;
auto const daydx = m_dfunc_daydx(x, y, z) * m_scale;
auto const daydy = m_dfunc_daydy(x, y, z) * m_scale;
auto const dazdx = m_dfunc_dazdx(x, y, z) * m_scale;
auto const dazdy = m_dfunc_dazdy(x, y, z) * m_scale;

with

// Evaluate the vector potential and its derivatives
auto const ax = 0_prt;
auto const ay = 0_prt;
auto const daxdx = 0_prt;
auto const daxdy = 0_prt;
auto const daydx = 0_prt;
auto const daydy = 0_prt;
auto const dazdx = 0_prt;
auto const dazdy = 0_prt;
We probably could try to merge them into one Parser. Let me think about it.
Or maybe it does not help. I think we need to force noinline them.
How about this?
amrex::GpuArray<amrex::ParserExecutor<3>,8> df{m_dfunc_ax, m_dfunc_ay, ....}; // on host
// on device
amrex::Real results[8];
for (int i = 0; i < 8; ++i) {
results[i] = df[i](x,y,z) * m_scale;
}
You might add a pragma to make sure the loop is not unrolled.
As for the noinline approach, you can add a helper in your code that is marked as noinline. Something like
template <int N, typename... T>
AMREX_GPU_HOST_DEVICE AMREX_NO_INLINE
auto call_parser (ParserExecutor<N> const& f, T... xyz)
{
return f(xyz...);
}
I think that might be good!
So far, reducing the CUDA build to -j1 helped, but calling a non-inlined helper for this case would be super helpful.
@WeiqunZhang would you like to push this to the PR or a follow-up?
I can put the helper function in amrex. Then we can use it here or in another PR. (I tested it. It did work.)
Something is still off with the Python tests. Besides the tolerance issues, they seem to run significantly longer than their app/executable counterparts?
This element requires these additional parameters:

* ``<element_name>.ds`` (``float``, in meters) the segment length
* ``<element_name>.unit`` (``integer``) specification of units for the vector potential (default: ``0``)
This is called ``unit`` here, while the inputs file (app API) calls it ``units``.
Was not the same input as used in the app example
@cemitch99 the three Python run files are not yet 100% identical to the app inputs files, which is likely the origin of the failing tests. I fixed the issues I spotted, but some differences remain.
This works on lambdas, functors, and normal functions. It does not work on overloaded functions like std::sin; if needed, one could wrap such functions inside a lambda. It also does not work with normal functions for SYCL, where one would likewise have to wrap them inside a lambda. The motivation behind this PR: in ImpactX PR BLAST-ImpactX/impactx#964, a GPU kernel uses 8 ``amrex::Parser`` executors. The CUDA CI fails if more than one job is used in the build; apparently the kernel is too big because all of those parser functions are inlined. This PR provides a way to reduce the kernel size by forcing noinline.
The three new tests currently fail only in the OpenMP / GCC w/ MPI w/ Python case. Note that the execution time (of the Python test) is very long: 32.42 s for the solenoid example, versus only 0.56 s on macOS / AppleClang (with similar behavior for the other two tests). Also, although the solenoid example runs successfully, the initial and final beam moments agree to all digits (and they should not), which appears to indicate that no tracking push was applied to the beam.
Thanks! If they run a bit on the longer end, let us add the
9a1e4af to fa61eba
This PR adds a new element that allows the user to track through a region with a specified magnetostatic vector potential. Symplectic integration is performed using the exact form of the nonlinear relativistic Hamiltonian. We use the semiexplicit integrator appearing in:
B. Jayawardana and T. Ohsawa, "Semiexplicit symplectic integrators for non-separable Hamiltonian systems," Math. Comput. 92, pp. 251-281 (2022),
https://doi.org/10.1090/mcom/3778
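For context, the exact relativistic Hamiltonian referred to above can, in unscaled lab-frame variables, be written as (ImpactX's internal scaled coordinates will differ):

$$
H(\mathbf{x}, \mathbf{p}) = c \sqrt{m^2 c^2 + \left(\mathbf{p} - q\,\mathbf{A}(\mathbf{x})\right)^2}
$$

where the scalar potential is omitted because the field is magnetostatic, and the user-defined vector potential $\mathbf{A}$ enters through the canonical momentum. Because $\mathbf{x}$ and $\mathbf{p}$ are coupled under the square root, this Hamiltonian is non-separable, which is why a semiexplicit integrator of the Jayawardana-Ohsawa type is needed rather than a standard split-step (leapfrog-style) scheme.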
To do: