Multi-phased Hierarchical Barrier#1229
Draft
bcmIntc wants to merge 1 commit into
Adds --enable-hierarchical-barrier, a three-phase barrier that keeps intranode traffic off the NIC by using CPU atomics over XPMEM for gather/fanout and restricts NIC puts to the internode phase (node roots only).

Phase 1 (intranode gather): local PEs signal up a k-ary tree. Each PE writes to its OWN up-slot in local_pSync; the parent reads each child's slot individually. Slots are padded to one cache line (HIER_SLOT_STRIDE=8 longs, 64 bytes) so no two PEs share a line, eliminating the MESI serialization that would occur if all children wrote to a single counter. Signal values increment monotonically via hier_sense, avoiding explicit slot resets between calls (sense alternation).

Phase 2 (internode dissemination): node roots run a put-based binary dissemination across the NIC. After each round the slot is reset via a CPU store rather than a self-put, saving ceil(log2(N_nodes)) NIC round-trips per barrier (12 at 4096 nodes).

Phase 3 (intranode fanout): node root CPU-stores an ack into each child's down-slot; children relay down the k-ary tree with reset-before-signal ordering. Down-slots are in the upper half of local_pSync, laid out with the same per-PE cache-line padding as up-slots.

AUTO selection activates when local PE count >= SHMEM_HIER_BARRIER_THRESHOLD (default 2). Also selectable via SHMEM_BARRIER_ALGORITHM=hierarchical.

New infrastructure:
 - src/shr_transport.h: XPMEM CPU pointer mapping; self-access returns the address directly without an XPMEM lookup
 - src/runtime_util.c: global hostname exchange so each PE can identify its node root
 - configure.ac: --enable-hierarchical-barrier requires --with-xpmem and a network transport
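The slot layout and tree shape described above can be sketched roughly as follows. This is an illustration, not the code in this PR: only HIER_SLOT_STRIDE and the "up-slots in the lower half, down-slots in the upper half" arrangement come from the description. The helper names (hier_up_slot, hier_down_slot, hier_parent, hier_children), the exact index arithmetic, and the choice of local rank 0 as the tree root are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* From the PR description: one slot per PE, padded to 8 longs (64 bytes). */
#define HIER_SLOT_STRIDE 8

/* Hypothetical helper: offset (in longs) of a local PE's up-slot.
 * Assumes up-slots fill the lower half of local_pSync in rank order. */
static size_t hier_up_slot(int local_rank)
{
    assert(local_rank >= 0);
    return (size_t)local_rank * HIER_SLOT_STRIDE;
}

/* Hypothetical helper: offset (in longs) of a local PE's down-slot.
 * Assumes down-slots start right after the n_local up-slots. */
static size_t hier_down_slot(int local_rank, int n_local)
{
    assert(local_rank >= 0 && local_rank < n_local);
    return (size_t)(n_local + local_rank) * HIER_SLOT_STRIDE;
}

/* Parent of local_rank in a k-ary tree rooted at local rank 0
 * (assumed numbering); returns -1 for the root. */
static int hier_parent(int local_rank, int k)
{
    assert(k >= 1 && local_rank >= 0);
    return local_rank == 0 ? -1 : (local_rank - 1) / k;
}

/* Children of local_rank are the contiguous ranks starting at
 * k * local_rank + 1, clipped to n_local. Returns the child count and
 * stores the first child rank (or -1 if there are none). */
static int hier_children(int local_rank, int n_local, int k, int *first_child)
{
    long first = (long)local_rank * k + 1;
    if (first >= n_local) {
        *first_child = -1;
        return 0;
    }
    long last = first + k - 1;
    if (last >= n_local)
        last = n_local - 1;
    *first_child = (int)first;
    return (int)(last - first + 1);
}
```

Under these assumptions, a parent in phase 1 would poll hier_up_slot(child) for each of its children in turn, and in phase 3 it would store into hier_down_slot(child, n_local); because every offset is a multiple of HIER_SLOT_STRIDE, no two PEs' slots fall in the same 64-byte line as long as local_pSync itself is cache-line aligned.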
Summary
Adds a three-phase hierarchical barrier (--enable-hierarchical-barrier) that keeps intranode traffic off the NIC: phases 1 and 3 use CPU atomics over XPMEM; phase 2 restricts NIC puts to node roots only.
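- Phase 1 (intranode gather): each PE writes its own cache-line-padded up-slot in local_pSync, avoiding the MESI serialization of a shared counter.
- Phase 2 (internode dissemination): node roots reset each round's slot with a CPU store rather than a self-put, saving ceil(log2(N_nodes)) NIC round-trips per barrier.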
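The round count behind the ceil(log2(N_nodes)) figure can be checked with a small sketch. It assumes the internode phase follows the standard radix-2 dissemination pattern, where in round r a node root signals the root 2^r positions ahead (modulo the node count). The function names are hypothetical and are not taken from this PR.

```c
#include <assert.h>

/* Number of dissemination rounds: ceil(log2(n_nodes)), 0 for one node. */
static int dissem_rounds(int n_nodes)
{
    assert(n_nodes >= 1);
    int rounds = 0;
    while ((1L << rounds) < n_nodes)
        rounds++;
    return rounds;
}

/* Node that `node` signals in round r (standard dissemination pattern). */
static int dissem_send_peer(int node, int r, int n_nodes)
{
    return (int)((node + (1L << r)) % n_nodes);
}

/* Node whose signal `node` waits for in round r. */
static int dissem_recv_peer(int node, int r, int n_nodes)
{
    long dist = (1L << r) % n_nodes;
    return (int)((node - dist + n_nodes) % n_nodes);
}
```

For example, dissem_rounds(4096) evaluates to 12, so replacing one self-put per round with a CPU store removes 12 NIC operations from each barrier at that scale.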
AUTO selection activates when local PE count ≥ SHMEM_HIER_BARRIER_THRESHOLD (default 2). Also selectable via SHMEM_BARRIER_ALGORITHM=hierarchical.
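One way to read the selection rule is sketched below. Only the two environment variable names, the value hierarchical, and the default threshold of 2 come from the description; the function name, the "auto" spelling, treating an unset algorithm as AUTO, and the fallback to the default for an unparsable threshold are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Default from the PR description. */
#define HIER_THRESHOLD_DEFAULT 2

/* Hypothetical decision helper. `algorithm` and `threshold` are the raw
 * values of SHMEM_BARRIER_ALGORITHM and SHMEM_HIER_BARRIER_THRESHOLD
 * (NULL when unset). Returns 1 if the hierarchical barrier would run. */
static int use_hier_barrier(const char *algorithm, const char *threshold,
                            int n_local)
{
    assert(n_local >= 1);

    /* Explicit request always wins. */
    if (algorithm != NULL && strcmp(algorithm, "hierarchical") == 0)
        return 1;

    /* Any other explicit algorithm disables it (assumed behavior). */
    if (algorithm != NULL && strcmp(algorithm, "auto") != 0)
        return 0;

    /* AUTO: compare the local PE count against the threshold. */
    long t = HIER_THRESHOLD_DEFAULT;
    if (threshold != NULL) {
        char *end;
        long v = strtol(threshold, &end, 10);
        if (end != threshold && *end == '\0' && v > 0)
            t = v;  /* otherwise keep the default (assumed fallback) */
    }
    return n_local >= t;
}
```

A caller would pass getenv("SHMEM_BARRIER_ALGORITHM") and getenv("SHMEM_HIER_BARRIER_THRESHOLD") together with the local PE count. With the default threshold, any node hosting two or more PEs takes the hierarchical path.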
Test plan