Motivation
In general, implementing shmem_quiet-based memory ordering semantics is expensive. With the introduction of processors with weak memory models and support for multiple NICs per node, the cost of performing remote completion and committing any previously posted RMA and AMO operations keeps growing. Implementations need to perform dummy read-like operations to commit any outstanding operations into the remote target's memory.
Solution
As part of this proposal, we would like to introduce explicit options to perform local completion in OpenSHMEM. To complete the API, we would also like to introduce an option to explicitly perform the remote commit operation. The existing shmem_quiet semantics can then be implemented as a combination of the local completion and remote commit operations.
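The decomposition described above can be illustrated with a toy model. This is only a sketch: it is a single-process bookkeeping model, not OpenSHMEM code, and every name in it (model_ctx, model_post, model_local_complete, model_remote_commit, model_quiet, model_quiet_demo) is hypothetical. It tracks how many operations are pending, locally complete, or visible, and shows quiet-like behavior obtained by running the two steps back to back.

```c
#include <assert.h>

/* Toy single-process model; NOT the OpenSHMEM API. All names are hypothetical. */
typedef struct {
    int pending;          /* issued, not yet locally complete */
    int locally_complete; /* locally complete, not yet visible at the target */
    int visible;          /* committed to the target's memory */
} model_ctx;

/* Models issuing one nonblocking RMA or AMO operation. */
static void model_post(model_ctx *c) { c->pending++; }

/* Models the local completion step: pending operations become locally complete. */
static void model_local_complete(model_ctx *c) {
    c->locally_complete += c->pending;
    c->pending = 0;
}

/* Models the remote commit step: only locally complete operations are committed. */
static void model_remote_commit(model_ctx *c) {
    c->visible += c->locally_complete;
    c->locally_complete = 0;
}

/* Models quiet-like semantics as local completion followed by remote commit. */
static void model_quiet(model_ctx *c) {
    model_local_complete(c);
    model_remote_commit(c);
}

/* Posts n operations, calls model_quiet, and returns the number of visible operations. */
static int model_quiet_demo(int n) {
    model_ctx c = {0, 0, 0};
    for (int i = 0; i < n; i++) model_post(&c);
    model_quiet(&c);
    assert(c.pending == 0 && c.locally_complete == 0);
    return c.visible;
}
```

In this model, nothing is left pending or merely locally complete after model_quiet returns, which is the property the combined operation is meant to preserve.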
Proposed API
The following new routines are proposed:
/* Additions to OpenSHMEM Memory Ordering Operations */
void shmem_local_complete(void);
void shmem_ctx_local_complete(shmem_ctx_t ctx);
void shmem_remote_commit(void);
void shmem_ctx_remote_commit(shmem_ctx_t ctx);
/* Additions to OpenSHMEM Collective Operations */
void shmem_team_remote_commit(shmem_team_t team);
API Semantics
shmem_local_complete and shmem_ctx_local_complete
The shmem_local_complete routine ensures the local completion of all operations on symmetric data objects issued by the calling PE on a given context. Local completion means that all previously posted operations on symmetric data objects have completed locally; the routine does not guarantee any visibility of those operations when it returns. After local completion, the symmetric data objects used by all previously posted operations can be reused for other operations.
shmem_remote_commit and shmem_ctx_remote_commit
The shmem_remote_commit routine ensures the global visibility of all previously locally completed operations. Note that this routine makes globally visible only those operations that were already locally complete when it was called. Local completion can be attained implicitly through OpenSHMEM routines (such as blocking put and AMO) or explicitly by calling shmem_local_complete.
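The rule that a remote commit covers only locally completed operations can be sketched with a small per-operation state model. This is an illustration under stated assumptions, not OpenSHMEM code: the types and functions (op_state, op_log, log_put, log_put_nbi, log_local_complete, log_remote_commit, log_count, visible_after_commit) are hypothetical, and the model assumes that a blocking put is locally complete on return while a nonblocking put is not.

```c
#include <assert.h>

/* Toy per-operation state model; NOT the OpenSHMEM API. All names are hypothetical. */
enum op_state { OP_POSTED, OP_LOCAL, OP_VISIBLE };

#define MAX_OPS 8

typedef struct {
    enum op_state ops[MAX_OPS];
    int count;
} op_log;

/* Records an operation in the given initial state; returns its index or -1 if full. */
static int log_add(op_log *l, enum op_state s) {
    if (l->count >= MAX_OPS) return -1;
    l->ops[l->count] = s;
    return l->count++;
}

/* Blocking put: assumed locally complete on return. */
static int log_put(op_log *l) { return log_add(l, OP_LOCAL); }

/* Nonblocking put: posted, not yet locally complete. */
static int log_put_nbi(op_log *l) { return log_add(l, OP_POSTED); }

/* Models explicit local completion of every posted operation. */
static void log_local_complete(op_log *l) {
    for (int i = 0; i < l->count; i++)
        if (l->ops[i] == OP_POSTED) l->ops[i] = OP_LOCAL;
}

/* Models remote commit: only operations that are already locally complete change state. */
static void log_remote_commit(op_log *l) {
    for (int i = 0; i < l->count; i++)
        if (l->ops[i] == OP_LOCAL) l->ops[i] = OP_VISIBLE;
}

/* Counts the operations currently in state s. */
static int log_count(const op_log *l, enum op_state s) {
    int n = 0;
    for (int i = 0; i < l->count; i++)
        if (l->ops[i] == s) n++;
    return n;
}

/* One blocking and one nonblocking put, then a remote commit, optionally
 * preceded by explicit local completion. Returns the number of visible operations. */
static int visible_after_commit(int complete_first) {
    op_log l;
    l.count = 0;
    log_put(&l);
    log_put_nbi(&l);
    if (complete_first) log_local_complete(&l);
    log_remote_commit(&l);
    return log_count(&l, OP_VISIBLE);
}
```

In this model, committing without an explicit local completion leaves the nonblocking put uncommitted, while calling the local completion step first lets both operations become visible.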
shmem_team_remote_commit
This is a collective variant of the shmem_remote_commit operation. The routine registers the arrival of a PE at a shmem_team_remote_commit operation and blocks the PE until all other PEs arrive at the same shmem_team_remote_commit operation. It also ensures that all locally completed operations on all PEs are made globally visible.