More efficient unweighting using the GPU #642
Draft
hageboeck wants to merge 4 commits into madgraph5:master
std::copy implementations are supposed to use memmove where possible (depending on the template parameters). Therefore, a manual check of the copied types is unnecessary. When the Fortran type and the C++ type are identical, std::copy automatically reduces to a memmove.
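A minimal sketch of the point above, not the actual bridge code: `FortranReal`, `CppReal` and `copyFromFortran` are hypothetical names, and both types are assumed to be `double`. No type check is written by hand; if the two types ever differ, the same `std::copy` call falls back to an element-wise converting copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical aliases: the real Fortran-side and C++-side types of the
// bridge are not shown in this PR text. Both are assumed to be double here.
using FortranReal = double;
using CppReal = double;

// Copy n values from a Fortran-owned buffer into a C++-owned vector.
// For identical, trivially copyable types the standard library is free to
// implement this as a single memmove; otherwise each element is converted.
std::vector<CppReal> copyFromFortran(const FortranReal* src, std::size_t n)
{
  std::vector<CppReal> dst(n);
  std::copy(src, src + n, dst.begin());
  return dst;
}
```

Changing `FortranReal` to `float` would keep the call site unchanged; only the code generated for `std::copy` would differ.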
Add kernels and bridge code to compute event weights on the GPU. Using the Jacobian and PDF weights from Fortran, the GPU can compute the total event weight in device memory. A second kernel computes the maximum weight of each batch and returns it to the host.
- For each batch, compute the maximum event weight on the GPU
- Transfer this into a common block for the unweighting steps
- This allows for rejecting events a lot earlier (instead of writing them to tmp)
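The two steps described above (per-event weight, then per-batch maximum) can be sketched on the host in plain C++ as a reference for what the kernels compute. This is an illustration under assumptions, not the kernels from this PR: the function names are hypothetical, the total weight is assumed to be the product of matrix element, Jacobian and PDF weight, and the maximum is assumed to be taken over absolute values.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed formula: total weight = matrix element * Jacobian * PDF weight.
// All three inputs are expected to have the same length (one entry per event).
std::vector<double> computeEventWeights(const std::vector<double>& matrixElements,
                                        const std::vector<double>& jacobians,
                                        const std::vector<double>& pdfWeights)
{
  std::vector<double> weights(matrixElements.size());
  for (std::size_t i = 0; i < weights.size(); ++i)
    weights[i] = matrixElements[i] * jacobians[i] * pdfWeights[i];
  return weights;
}

// Reduction over one batch: largest absolute weight (0 for an empty batch).
// On the GPU this would be a parallel reduction; here it is a simple loop.
double batchMaxWeight(const std::vector<double>& weights)
{
  double maxWeight = 0.0;
  for (double w : weights)
    maxWeight = std::max(maxWeight, std::abs(w));
  return maxWeight;
}
```

Only the single value returned by `batchMaxWeight` needs to travel back to the host per batch, which is what makes the early rejection cheap.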
Now that the maximum event weight can be computed for each batch, the unweighting fudge factor for accepting or rejecting an event can be chosen much closer to one. Here we stay on the conservative side and accept about twice as many events as go into the final sample.
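To illustrate the role of the fudge factor, here is a sketch of a hit-or-miss preselection against the batch maximum. It is not the Fortran unweighting routine: `preselectEvents` is a hypothetical name, and the acceptance rule (keep an event if its absolute weight exceeds a uniform random number times `fudge * batchMax`) is an assumption about how such a factor is typically applied.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Return the indices of events that survive the preselection.
// Assumed rule: accept if |w| > u * fudge * batchMax, with u uniform in [0, 1).
// A fudge factor below one lowers the threshold, so more events than strictly
// necessary are kept; fudge = 1 corresponds to unweighting against batchMax itself.
std::vector<std::size_t> preselectEvents(const std::vector<double>& weights,
                                         double batchMax,
                                         double fudge,
                                         std::mt19937_64& rng)
{
  std::vector<std::size_t> accepted;
  if (batchMax <= 0.0 || fudge <= 0.0)
    return accepted; // nothing sensible to compare against
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  const double threshold = fudge * batchMax;
  for (std::size_t i = 0; i < weights.size(); ++i)
  {
    if (std::abs(weights[i]) > uniform(rng) * threshold)
      accepted.push_back(i);
  }
  return accepted;
}
```

Under this assumed rule, events with zero weight are never kept and events well above the threshold are always kept, so the rejected events never have to be written out.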
Force-pushed from 6aed8a8 to 82963b7
Here, the unweighting in gg --> ggtt is improved. By computing the maximum event weight for each batch on the GPU, the unweighting function can reject candidate events much earlier based on their weights.
This speeds up the FORTRAN part by almost 3x with -O2 and 2x with -O3.
Some details might still need to be ironed out, so keeping this as draft for now.
Here is a diff between before/after on -O2 -g:
And with -O3: