
Cache interactions while walking the tree (dual tree walk)#797

Open
Yrisch wants to merge 9 commits intodanieljprice:masterfrom
Yrisch:cache_SFMM
Conversation


@Yrisch Yrisch commented Feb 11, 2026

Description:
Description:
This PR introduces an optimisation that caches interactions computed during the neighbour search, allowing threads to fetch shared interactions from the cache if they have already been computed. Special care has been taken to synchronise threads with atomic operations when writing to and reading from the cache.

The first tests I performed show a 7 to 8% improvement on a collapsing-cloud simulation with 3M SPH particles. I also tried shuffling leaf nodes to increase cache hits, without any improvement. This should be profiled more extensively in the future, but it should already be useful in the code.
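The claim-compute-publish pattern described above can be sketched as follows. This is a hypothetical analogue in C with C11 atomics, not the actual Fortran/OpenMP implementation in src/main/kdtree.F90: each cache slot carries a state flag, exactly one thread claims an empty slot and computes the interaction, and other threads wait for it and reuse the value. All names (`fnodecache` aside, which is from the PR changelog) are invented for illustration.

```c
/* Hypothetical sketch of caching node interactions with per-slot state flags.
 * States: 0 = empty, 1 = being computed, 2 = ready. */
#include <assert.h>
#include <stdatomic.h>

#define NCACHE 64

static atomic_int state[NCACHE];       /* zero-initialised: all slots empty */
static double     fnodecache[NCACHE];  /* cached interaction values         */
static atomic_int ncompute;            /* counts expensive evaluations      */

/* Stand-in for the expensive multipole (M2L) interaction kernel. */
static double compute_interaction(int i)
{
    atomic_fetch_add(&ncompute, 1);
    return 0.5 * i * i;
}

double cached_interaction(int i)
{
    int expected = 0;
    /* Try to claim this slot: empty -> computing. Exactly one thread wins. */
    if (atomic_compare_exchange_strong(&state[i], &expected, 1)) {
        fnodecache[i] = compute_interaction(i);
        /* Publish: the release store makes the value visible to readers. */
        atomic_store_explicit(&state[i], 2, memory_order_release);
    } else {
        /* Another thread owns the slot: wait until the value is ready. */
        while (atomic_load_explicit(&state[i], memory_order_acquire) != 2)
            ;  /* spin; real code would do other work or yield here */
    }
    return fnodecache[i];
}
```

Repeated calls for the same slot should hit the cache, so the expensive kernel runs only once per slot.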

Components modified:

  • Setup (src/setup)
  • Main code (src/main)
  • Moddump utilities (src/utils/moddump)
  • Analysis utilities (src/utils/analysis)
  • Test suite (src/tests)
  • Documentation (docs/)
  • Build/CI (build/ or GitHub Actions)

Type of change:

  • Bug fix
  • Physics improvements
  • Better initial conditions
  • Performance improvements
  • Documentation update
  • Better testing
  • Code cleanup / refactor
  • Other (please describe)

Testing:
All gravity tests should pass as before...

Did you run the bots? no

Did you update relevant documentation in the docs directory? no

Did you add comments such that the purpose of the code is understandable? yes

Is there a unit test that could be added for this feature/bug? no

@gemini-code-assist
Contributor

Summary of Changes

Hello @Yrisch, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of the K-D tree algorithm by introducing an intelligent caching system for interaction computations. The optimization allows threads to share and retrieve pre-computed interaction data, thereby reducing redundant calculations. This change is particularly beneficial for large-scale simulations, leading to more efficient processing and faster execution times.

Highlights

  • Interaction Caching: Implemented a caching mechanism for interactions computed during the K-D tree neighbor search, allowing threads to reuse previously calculated values.
  • Thread Synchronization: Introduced atomic operations to ensure thread-safe writing and reading of the interaction cache, preventing race conditions in a multi-threaded environment.
  • Performance Improvement: Achieved a reported 7-8% performance improvement in a 3 million SPH particle collapsing cloud simulation, demonstrating the effectiveness of the optimization.


Changelog
  • src/main/dtype_kdtree.F90
    • Added tobecached and cached logical members to the kdnode type to manage caching status.
    • Included an idum integer member for memory alignment purposes.
  • src/main/kdtree.F90
    • Introduced fnodecache, a new allocatable array, to store cached interaction values.
    • Implemented allocation and deallocation routines for the fnodecache array.
    • Modified the construct_node and revtree subroutines to initialize the tobecached and cached flags for new nodes.
    • Updated the getneigh_dual subroutine to incorporate the core caching logic, utilizing atomic operations for thread synchronization when accessing the cache.
    • Adjusted the propagate_fnode_to_node subroutine's fnode argument intent and calculation to align with the new caching strategy.
    • Made the compute_M2L call in node_interaction conditional, skipping computation if the interaction is already cached.
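The two new logicals plus the `idum` member in the changelog above suggest a layout concern: Fortran logicals are typically 4 bytes, so adding two of them leaves the record 4 bytes short of an 8-byte multiple, and an explicit dummy integer restores alignment. A hypothetical C analogue (the field layout is an assumption, not the actual kdnode storage sequence) makes this countable; note also that any member added to kdnode must be reflected in the MPI derived datatype used to exchange nodes, or counts and offsets will mismatch, as the review below points out.

```c
/* Hypothetical C analogue of the extended kdnode type, showing why an
 * explicit pad member (idum) keeps the record size a multiple of 8 bytes
 * after two 4-byte logical flags are added. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double  xcen[3];     /* node centre (illustrative field)              */
    double  size;        /* node size                                     */
    double  mass;        /* node mass                                     */
    int32_t leftchild;   /* tree links (illustrative)                     */
    int32_t rightchild;
    int32_t parent;
    int32_t tobecached;  /* new: mark node for caching (logical ~ 4 B)    */
    int32_t cached;      /* new: interaction already cached               */
    int32_t idum;        /* new: explicit padding to an 8-byte multiple   */
} kdnode_c;
```

Without `idum` the compiler would insert the same 4 bytes as invisible tail padding; making it explicit keeps the layout predictable for MPI datatype definitions.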


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a caching optimization for interactions computed during the dual tree walk to improve performance. However, the implementation contains critical issues: two data races in the OpenMP caching logic, where shared flags are read without adequate memory synchronization (potentially causing incorrect behavior or a failed optimization), and a mismatch in the MPI datatype definition for the kdnode structure, whose new fields are not reflected in the MPI communication routines. Both can lead to data corruption and undefined behavior in parallel simulations.
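The race the review describes is the classic unsynchronized flag read: if a reader polls the `cached` flag with a plain load, it may observe the flag set before the cached value itself is visible. A minimal sketch of the fix, in C11 atomics with pthreads (hypothetical; the PR's actual code is Fortran with OpenMP atomics, and all names here are invented): the writer publishes with a release store, and readers poll with an acquire load, which together guarantee the data write is visible once the flag is observed.

```c
/* Release/acquire pairing for publishing a cached value to other threads. */
#include <pthread.h>
#include <stdatomic.h>

static double     cached_value;
static atomic_int ready;   /* 0 = not published yet, 1 = value visible */

static void *writer(void *arg)
{
    cached_value = 42.0;                                    /* write data */
    atomic_store_explicit(&ready, 1, memory_order_release); /* then flag  */
    return arg;
}

double read_when_ready(void)
{
    pthread_t t;
    pthread_create(&t, 0, writer, 0);
    /* Acquire pairs with the writer's release: once ready == 1 is seen,
     * the earlier write to cached_value is guaranteed visible too. */
    while (atomic_load_explicit(&ready, memory_order_acquire) != 1)
        ;  /* spin */
    pthread_join(t, 0);
    return cached_value;
}
```

With a plain non-atomic flag, both the visibility guarantee and the absence of undefined behaviour are lost, which is consistent with compiler-dependent hangs such as the ifx one discussed below.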

@Yrisch Yrisch marked this pull request as ready for review February 16, 2026 16:02

Yrisch commented Feb 16, 2026

@danieljprice Can you review this PR?
I think the jobs are being cancelled because they exceed the 1-hour threshold...


danieljprice commented Feb 17, 2026

@Yrisch I think you are getting a hang or an extreme slowdown when running with ifx, which likely means something here is not thread-safe (e.g. a race condition). The timeout is 6 hours, so this is well outside the bounds of "normal behaviour".


Yrisch commented Feb 17, 2026

AH... I will check, but it is weird that it only happens with ifx?

@danieljprice
Owner

aocc failure is fixed on master
