Refactor AdePT into 2 libraries 4: Fix transport lifetime ownership and shutdown #521
Merged
agheata merged 1 commit into apt-sim:master on Mar 19, 2026
Can one of the admins verify this patch?
This PR cleans up the lifetime of the shared `AdePTTransport`: instead of keeping the transport alive through a static owning `shared_ptr`, the static storage now only keeps a `weak_ptr`, while the real ownership remains with the `AdePTTrackingManager` instances.

This avoids destroying the transport very late during process teardown, when CUDA stream cleanup can already be in an invalid shutdown state.
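The ownership pattern described above can be sketched as follows. This is a minimal illustration only: `Transport`, `TrackingManager`, `GetSharedTransport`, and `aliveTransports` are hypothetical stand-ins, not the actual AdePT classes or functions.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Illustration-only counter of live Transport objects (hypothetical).
int aliveTransports = 0;

// Hypothetical stand-in for the shared transport; not the real AdePT class.
struct Transport {
  Transport() { ++aliveTransports; }
  ~Transport() { --aliveTransports; }
};

// Returns the shared transport, creating it if no owner currently holds one.
// The static storage only observes the object and never keeps it alive.
std::shared_ptr<Transport> GetSharedTransport()
{
  static std::mutex mtx;
  static std::weak_ptr<Transport> cache; // non-owning static storage
  std::lock_guard<std::mutex> lock(mtx);
  auto transport = cache.lock();
  if (!transport) {
    transport = std::make_shared<Transport>();
    cache     = transport;
  }
  return transport;
}

// Hypothetical stand-in for a tracking manager that owns the transport.
struct TrackingManager {
  std::shared_ptr<Transport> transport = GetSharedTransport();
};
```

With this layout the transport is destroyed as soon as the last owning manager goes away, rather than during static destruction at process exit.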
An explicit guard was also added in `AsyncAdePTTransport`, in the loop where the GPU steering thread waits for work from the G4 workers.

At shutdown, the GPU steering thread can enter the sleep in this loop because `needTransport` is already false while `runTransport` has not yet been switched to false in the AsyncAdePT destructor. If a worker then switches it to false during shutdown, the thread wakes up and runs one more iteration while CUDA might already be in teardown, which leads to a crash. A simple guard after the wait prevents this.

This PR is needed for the upcoming restructuring, which exposed this lifetime-management problem.
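The shutdown guard described above can be illustrated with a small, self-contained sketch. The class `SteeringLoop` and its members are assumptions made for illustration; only the two flag names echo the description, and the real loop in `AsyncAdePTTransport` may be structured differently.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical steering loop, not the AdePT implementation.
class SteeringLoop {
public:
  SteeringLoop() : fThread([this] { Run(); }) {}
  ~SteeringLoop() { Stop(); }

  // Called by a worker to request one transport iteration.
  void RequestTransport()
  {
    {
      std::lock_guard<std::mutex> lock(fMutex);
      fNeedTransport = true;
    }
    fCv.notify_one();
  }

  // Called at shutdown: clear the run flag and wake the sleeping thread.
  void Stop()
  {
    {
      std::lock_guard<std::mutex> lock(fMutex);
      fRunTransport = false;
    }
    fCv.notify_one();
    if (fThread.joinable()) fThread.join();
  }

  int Iterations() const { return fIterations.load(); }

private:
  void Run()
  {
    std::unique_lock<std::mutex> lock(fMutex);
    while (fRunTransport) {
      // Sleep until there is work or shutdown was requested.
      fCv.wait(lock, [this] { return fNeedTransport || !fRunTransport; });
      // Guard: when woken for shutdown, do not run another iteration.
      if (!fRunTransport) break;
      fNeedTransport = false;
      ++fIterations; // placeholder for the GPU transport step
    }
  }

  std::mutex fMutex;
  std::condition_variable fCv;
  bool fNeedTransport = false;
  bool fRunTransport  = true;
  std::atomic<int> fIterations{0};
  std::thread fThread; // declared last so all other members exist before it starts
};
```

Without the `if (!fRunTransport) break;` line, a thread woken only to shut down would fall through and execute the transport step once more, which is the situation the PR description identifies as unsafe during CUDA teardown.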
It was verified that this PR