Open
Conversation
Collaborator
Author
|
Could we trigger more involved automated testing? |
Collaborator
|
Yes, More involved unit testing would be nice. |
Collaborator
Author
|
Okay, then the other goal is to use this branch as a better base for exampi and openmpi support. @xuyao0127 @JainTwinkle should add |
64562f5 to
567dbd1
Compare
b84a6b3 to
eeeb8a8
Compare
e9a56c5 to
0358def
Compare
6d384c4 to
2dba7c4
Compare
95229dd to
4f1a368
Compare
670cbd3 to
5aacf1f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Intro
This is a branch of the
dev/gc00/improve-virtualidscode intended for longer-term maintainability, review, and integration into main MANA. It is also intended to be the base for OpenMPI, ExaMPI, etc. development.What's different from dev/gc00/improve-virtualids?
get_vcomm_internalbefore, this is a better solution because the semantics match the old style. Seegrant_ggidand its usages.dev/gc00/improve-virtualidshad slightly different ADD_NEW semantics than main, because the equivalent ofrealIdExists(real_id)would be hacky to write. I've written up how thisrealIdExistsfunctionality might look like, but without a two-level table, it is very bad for big apps, because we cannot iterate through only the descriptors of a single type. Please look at this part, @xuyao0127. If we cannot develop and perfect a two-level table like this by camera-ready for SC23, we could revert to the old behavior. I don't think it caused problems, but it is different. Alternatively, as a workaround we could maintain several maps for the different types as before (but do away with string comparison and template class still).FIXMEannotates points of concern.Otherwise, the branches are the same.
Please, read over the FIXME comments I have written in reviewing, they contain points of interest and important questions/concerns.
Contrib
Unfortunately, this is a pretty big PR. I tried to organize the commits so it is clear what commit is in which phase.
This PR does three main things:
1. Replace the old virtualid machinery
(map from ints to ints, written in a C++ template and using dmtcp's
virtualidtable.h) to a mapping from int to struct pointer, with each struct pointer containing both the Real ID for a some Virtual ID, and any metadata information required. I tried to write this part in a C-style. The ugliest part has to be the macros, but I thought that this was the cleanest way to write the code which depends on the particular MPI types, because they are not defined the same way even in the same implementation. For instance,MPI_Fileis a pointer even in MPICH. It also integrates well with the existing macros. Like @xuyao0127 says, the definition and usage of these macros are kind of strange and obtuse, or at least poorly documented (VIRTUAL_TO_REAL(real) == real). However, changing their behavior would mean changing them literally everywhere, making this PR even bigger. It could be a later project.The struct pointer architecture enables objectives 2 and 3 of the PR. It's also the part that enables portability between MPI implementations (with the exception of one FIXME comment in
init_comm_world: @xuyao0127 , @JainTwinkle , I would appreciate your feedback/work here. I think it would be great if the MANA codebase was truly 100% the same between ExaMPI, OpenMPI, MPICH. Therefore, we need to integrate thelh_constants_mapinto every version.If we only want to demonstrate the capability of checkpoint-restart with multiple MPI (without also improving performance in points 2, 3 and really flexing struct VIDs) this is all we have to test and enhance, just the first commit here.
2. Integrate the old ggid machinery into the system in point 1.
This means that we don't have to look up the same virtual id multiple times. An optimization.
3. Replace the log-replay architecture with a decode-recode architecture.
This works readily for communicators, groups, and operators, because they can all be constructed in one go, when given the proper arguments. It is more difficult for datatypes, because datatypes cannot be "fully deserialized" in the same way. While operators also cannot be deserialized, it is not a problem because there is only way to create a new operator, and operators are not recursively defined, so it is trivial to just hook into operator creation. Datatypes that are "doubly-derived", i.e., user datatypes built using other user datatypes, do not immediately break down into MPI primitives.
One way to solve this problem would be to maintain a dependency tree of datatypes: every wrapper that creates a datatype can record the virtual id, and mark that it is a dependency. Another way may be to implement our own full-deserialization, but this is probably technically difficult. Yet another is to preserve the log-replay system for datatypes exclusively. Either way, the point of this part is to enable faster runtime (we don't need to record MPI calls), faster restart (we don't have to replay MPI calls) at the cost of making checkpoint time slightly slower (we have to record the information for everything that's alive at checkpoint time).
Future
Where to go from here:
lh_constants_map