Add hybrid MPI+OpenMP support by toretto-uk · Pull Request #826 · hemelb-codes/hemelb

toretto-uk · 2025-08-17T17:04:48Z

Overview

This PR introduces hybrid parallelism by adding OpenMP to the existing MPI-based code. The computationally intensive collision and streaming steps are parallelised with OpenMP loops to better exploit shared-memory parallelism within a node.
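As an illustration of the loop-level approach described above, here is a minimal sketch of an OpenMP-parallelised per-site update. It is not the code in this PR: `CollideSites`, its flat `std::vector<double>` layout and the BGK-style relaxation are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-site collision kernel (not HemeLB's API).
// Relaxes every distribution value towards its equilibrium value, BGK-style:
//   f[i] <- f[i] + omega * (f_eq[i] - f[i])
inline void CollideSites(std::vector<double>& f,
                         const std::vector<double>& f_eq,
                         double omega) {
  assert(f.size() == f_eq.size());
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(f.size());

  // Each iteration reads and writes only index i, so iterations are
  // independent and can be shared among the threads of one MPI rank.
  // When the compiler is not run with OpenMP enabled, the pragma is
  // ignored and the loop simply runs serially.
  #pragma omp parallel for schedule(static)
  for (std::ptrdiff_t i = 0; i < n; ++i) {
    f[i] += omega * (f_eq[i] - f[i]);
  }
}
```

With few sites per rank the loop has few iterations per thread, so the cost of starting and synchronising the parallel region can dominate the work done inside it.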

OpenMP support is controlled by the -DHEMELB_USE_OPENMP=ON/OFF build option and is disabled by default.

Results

The pure MPI reference implementation delivers the best overall performance and scalability across the tested configurations, compilers and platforms. At low node counts, however, the OpenMP version slightly outperforms pure MPI. This suggests that on a larger input geometry, with more lattice sites per rank and therefore more iterations in each OpenMP loop, OpenMP could still be beneficial.

The plots below give the full performance comparison.

ARCHER2

Figure 1: Hybrid parallelism: speedup for the retina dataset (40,000 time steps) on ARCHER2 using GNU compilers, 128 execution units per node.

Figure 2: Hybrid parallelism: speedup for the retina dataset (40,000 time steps) on ARCHER2 using Cray compilers, 128 execution units per node.

Figure 3: Hybrid parallelism: simulation time on 4 nodes on ARCHER2 using GNU compilers, 128 execution units per node.

Cirrus

Figure 4: Hybrid parallelism: speedup for the retina dataset (40,000 time steps) on Cirrus using GNU compilers, 128 execution units per node.

rupertnash

Thanks for the work! I will have to review your dissertation to get the full details of the change in performance, but there are a few minor code problems to fix before this can be considered for a merge. I also notice that you haven't touched the initialisation code (in lb::InitialCondition), which can make a huge difference to performance when NUMA effects are in play (due to the typical first-touch page allocation strategy).
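To make the first-touch point concrete, here is a minimal sketch, assuming a plain array of doubles rather than HemeLB's actual data structures. `AllocateFirstTouch` is a hypothetical helper and does not reflect how lb::InitialCondition is written.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper (not part of HemeLB): allocate n doubles and let each
// thread perform the first write to its own chunk. Under a first-touch policy
// the operating system typically places each page on the NUMA domain of the
// thread that first writes to it.
inline std::unique_ptr<double[]> AllocateFirstTouch(std::ptrdiff_t n,
                                                    double value) {
  assert(n >= 0);

  // new double[n] default-initialises, i.e. leaves the elements unwritten,
  // so the allocating thread does not touch the pages here. By contrast,
  // std::vector<double>(n) would zero every element on the calling thread.
  std::unique_ptr<double[]> data(new double[static_cast<std::size_t>(n)]);

  // Assumption: the later compute loops use the same static schedule, so
  // each thread initialises the range it will subsequently work on.
  #pragma omp parallel for schedule(static)
  for (std::ptrdiff_t i = 0; i < n; ++i) {
    data[i] = value;
  }
  return data;
}
```

If initialisation stays serial while the compute loops are threaded, all pages may end up on one NUMA domain and threads on other domains pay for remote memory access on every time step.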

rupertnash · 2025-09-23T09:51:42Z

Need to use find_package(OpenMP) and then target_link_libraries(... OpenMP::OpenMP_CXX)

rupertnash · 2025-09-23T09:51:49Z

Unacceptable use of ifdef in new code. Should refactor to minimise the code that is different when OpenMP is enabled. If different code is required, use if constexpr.
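One possible shape for this, as a sketch only: a single compile-time constant records whether OpenMP is enabled, and the code branches on it with if constexpr. The names `kUseOpenMP` and `SumSites` are hypothetical, and it is an assumption that the build defines a HEMELB_USE_OPENMP preprocessor macro when the option is ON.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the build system defines HEMELB_USE_OPENMP when the option is
// ON. The only preprocessor conditional is confined to this one constant,
// which could equally come from a generated configuration header.
#ifdef HEMELB_USE_OPENMP
constexpr bool kUseOpenMP = true;
#else
constexpr bool kUseOpenMP = false;
#endif

// Hypothetical example function: sums per-site values, choosing the
// threaded or the serial loop at compile time.
inline double SumSites(const std::vector<double>& values) {
  double total = 0.0;
  if constexpr (kUseOpenMP) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(values.size());
    // Threaded path: each thread accumulates privately, then OpenMP combines.
    #pragma omp parallel for reduction(+ : total)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
      total += values[i];
    }
  } else {
    // Serial path used when OpenMP is disabled.
    for (const double v : values) {
      total += v;
    }
  }
  return total;
}
```

Unlike code hidden behind #ifdef, both branches are parsed and type-checked in every build, so the OpenMP path cannot silently rot while the option is OFF.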

toretto-uk added 2 commits August 12, 2025 21:44

Add hybrid MPI+OpenMP support

3f5442b

Update compile_options.yml

587a4d0

toretto-uk marked this pull request as ready for review September 16, 2025 07:39

rupertnash requested changes Sep 23, 2025

toretto-uk wants to merge 2 commits into hemelb-codes:main from toretto-uk:hybrid-parallelism-pr
