tests/perf: add shmem_barrier_perf latency benchmark #61

Draft: bcmIntc wants to merge 1 commit into openshmem-org:main from bcmIntc:bcm_barrier_perf_test
bcmIntc (Collaborator) commented Apr 30, 2026

Summary

  • Adds shmem_barrier_perf to test/performance/shmem_perf_suite/: a barrier latency benchmark that gives more diagnostic signal than timing the barrier on PE 0 alone.
  • Each PE times its own wait inside shmem_barrier_all(). Per-iteration max and min reductions across all PEs report the latency of the slowest and fastest PE for each
    call. A sum reduction, divided by the number of PEs, gives the mean PE wait. All three are summarized as min/avg/max over all iterations.
  • After the timing loop, shmem_double_fcollect gathers each PE's average latency (fcollect delivers the full array to every PE). PE 0 scans the array to identify and
    report the slowest and fastest PE by rank. This helps separate load imbalance or NIC locality issues from collective algorithm overhead.
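
The arithmetic described above can be sketched in plain C. This is a serial model, not the code in this PR: the array `wait[]` stands in for the per-PE barrier times that the real benchmark combines with SHMEM reductions, `avg[]` stands in for the array gathered by shmem_double_fcollect, and every name here (`summary_t`, `summary_add`, `reduce_iteration`, `find_extreme_pes`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Running min/avg/max over iterations (hypothetical helper type). */
typedef struct {
    double min, max, sum;
    size_t count;
} summary_t;

void summary_init(summary_t *s) {
    s->min = 0.0;
    s->max = 0.0;
    s->sum = 0.0;
    s->count = 0;
}

/* Fold one per-iteration value into the running summary. */
void summary_add(summary_t *s, double v) {
    if (s->count == 0 || v < s->min) s->min = v;
    if (s->count == 0 || v > s->max) s->max = v;
    s->sum += v;
    s->count++;
}

double summary_avg(const summary_t *s) {
    return s->count ? s->sum / (double)s->count : 0.0;
}

/* Serial stand-in for one iteration's max, min, and sum reductions.
 * wait[pe] is the time PE `pe` spent inside the barrier on this call. */
void reduce_iteration(const double *wait, size_t npes,
                      double *slowest, double *fastest, double *mean) {
    assert(npes > 0);
    double mx = wait[0], mn = wait[0], sum = wait[0];
    for (size_t pe = 1; pe < npes; pe++) {
        if (wait[pe] > mx) mx = wait[pe];
        if (wait[pe] < mn) mn = wait[pe];
        sum += wait[pe];
    }
    *slowest = mx;
    *fastest = mn;
    *mean = sum / (double)npes;   /* sum reduction divided by PE count */
}

/* Scan the gathered per-PE averages for the slowest and fastest rank.
 * Ties resolve to the lowest rank (an arbitrary choice in this sketch). */
void find_extreme_pes(const double *avg, size_t npes,
                      size_t *slow_pe, size_t *fast_pe) {
    assert(npes > 0);
    size_t s = 0, f = 0;
    for (size_t pe = 1; pe < npes; pe++) {
        if (avg[pe] > avg[s]) s = pe;
        if (avg[pe] < avg[f]) f = pe;
    }
    *slow_pe = s;
    *fast_pe = f;
}
```

In this model, each iteration calls `reduce_iteration` once and feeds the three results into three separate `summary_t` accumulators, which yields the slowest-PE, fastest-PE, and mean-PE-wait rows. `find_extreme_pes` runs once after the loop.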

Test plan

  • Build with and without --enable-hierarchical-barrier to confirm it compiles in both configurations
  • Run on 2 PEs (smoke test): verify output is well-formed and min <= avg <= max for all rows
  • Run at target scale (PPN >= 64, multiple nodes): confirm slowest/fastest PE ranks are reported and the gap between slowest-PE and mean-PE-wait rows is sensible
  • Compare slowest-PE avg latency before and after enabling the hierarchical barrier to validate the improvement
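
The smoke-test check that min <= avg <= max holds for every row could be automated along these lines. This is a minimal sketch that assumes the rows have already been parsed into a struct. The `result_row_t` type, the function names, and the `tol` parameter (slack for floating-point rounding in the average) are all hypothetical and not part of this PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* One summary row of benchmark output (hypothetical layout). */
typedef struct {
    const char *label;   /* e.g. "slowest PE" */
    double min, avg, max;
} result_row_t;

/* True when min <= avg <= max, allowing `tol` of rounding slack. */
bool row_is_ordered(const result_row_t *r, double tol) {
    return r->min <= r->avg + tol && r->avg <= r->max + tol;
}

/* Index of the first row that violates the ordering, or -1 if none does. */
long first_bad_row(const result_row_t *rows, size_t nrows, double tol) {
    for (size_t i = 0; i < nrows; i++) {
        if (!row_is_ordered(&rows[i], tol)) return (long)i;
    }
    return -1;
}
```

A return value of -1 from `first_bad_row` means every row passed. Any other value identifies the row to inspect.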

Measures shmem_barrier_all() latency with per-PE timing and global
reductions to identify the slowest and fastest PE by rank. Reports
per-iteration min/avg/max for the slowest PE, fastest PE, and mean
PE wait over all iterations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
bcmIntc force-pushed the bcm_barrier_perf_test branch from b7317a4 to dd78d34 on April 30, 2026 12:46
bcmIntc self-assigned this Apr 30, 2026