symmetric_heap: fix huge pages mmap and silent fallback #1227

Draft
bcmIntc wants to merge 3 commits into Sandia-OpenSHMEM:main from bcmIntc:bcm_hugePT_fix

bcmIntc commented Apr 15, 2026

Addresses two bugs in mmap_alloc() when SHMEM_SYMMETRIC_HEAP_USE_HUGE_PAGES=1:

  1. (HIGH) MAP_ANON|MAP_PRIVATE was used even when a valid hugetlbfs fd was open. On Linux, MAP_ANONYMOUS causes the kernel to ignore the fd, so huge pages were never actually used. Fix: use MAP_SHARED when fd > 0 so the mapping is backed by the hugetlbfs file.

  2. (LOW) When find_hugepage_dir() found no matching hugetlbfs mount, the code silently fell back to regular pages with no diagnostic. Fix: emit RAISE_WARN_MSG so the user knows their request was not honored.

Also moved the Linux-specific variables (file_name, fd, directory) and the mmap dispatch inside the #ifdef __linux__ block so all huge page logic is cleanly scoped to Linux, with a plain anonymous mmap in the #else branch for all other platforms.

Addresses Issue #1207
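
For readers following item 1, the flag selection can be sketched roughly as below. This is a minimal illustration and not the code in this PR: `heap_mmap_flags` and `heap_map` are hypothetical names, and the `fd > 0` test simply mirrors the wording of the description.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not the PR's code: pick mmap flags from the fd state. */
static int heap_mmap_flags(int fd)
{
    /* Mirrors the "fd > 0" test quoted in the description. */
    return (fd > 0) ? MAP_SHARED : (MAP_ANONYMOUS | MAP_PRIVATE);
}

/* Map `length` bytes: file-backed when fd is valid, anonymous otherwise. */
static void *heap_map(size_t length, int fd)
{
    assert(length > 0);
    int flags = heap_mmap_flags(fd);
    /* Anonymous mappings conventionally pass -1 as the fd. */
    int map_fd = (flags & MAP_ANONYMOUS) ? -1 : fd;
    return mmap(NULL, length, PROT_READ | PROT_WRITE, flags, map_fd, 0);
}
```

The point of the split is that MAP_ANONYMOUS and a file-backed mapping are mutually exclusive: once the anonymous flag is set, the fd argument no longer determines what backs the memory.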

bcmIntc self-assigned this Apr 15, 2026
bcmIntc added 2 commits April 17, 2026 14:45
ftruncate() must be called on the hugetlbfs file descriptor before
mmap() to set the file size. Without it, mmap() against a hugetlbfs
fd can fail with EINVAL or produce unexpected behavior.

Also add MAP_HUGETLB alongside MAP_SHARED to make the intent explicit
and match the usage confirmed working on the target system.

On ftruncate failure, unlink the file before falling back to regular
anonymous pages to avoid leaving stale files in /dev/hugepages.
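
The ordering described in this commit message (size the file, then map it, and clean up on failure) might look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: `map_backing_file` and `alloc_with_fallback` are hypothetical names, `extra_flags` is where a caller on a hugetlbfs mount could pass MAP_HUGETLB, and unlinking after a failed mmap() is an extra assumption beyond what the message states.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `length` bytes backed by the file at `path`.
 * Returns MAP_FAILED if any step fails; the file is left in place on success. */
static void *map_backing_file(const char *path, size_t length, int extra_flags)
{
    assert(path != NULL && length > 0);

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return MAP_FAILED;

    /* Set the file size before mapping it. */
    if (ftruncate(fd, (off_t)length) != 0) {
        close(fd);
        unlink(path); /* do not leave a stale file behind */
        return MAP_FAILED;
    }

    void *ptr = mmap(NULL, length, PROT_READ | PROT_WRITE,
                     MAP_SHARED | extra_flags, fd, 0);
    close(fd); /* an established mapping stays valid after close() */
    if (ptr == MAP_FAILED)
        unlink(path); /* assumption: also clean up when mmap() fails */
    return ptr;
}

/* Try the file-backed mapping first, then fall back to anonymous pages. */
static void *alloc_with_fallback(const char *path, size_t length, int extra_flags)
{
    void *ptr = map_backing_file(path, length, extra_flags);
    if (ptr == MAP_FAILED)
        ptr = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    return ptr;
}
```

Passing 0 for `extra_flags` lets the same path be exercised against an ordinary file, which is convenient on machines without a hugetlbfs mount.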
…oc warning

Guard all three DISABLE_NONFETCH_AMO blocks in shmem_comm.h with
USE_OFI to prevent compile errors (FI_ATOMIC_WRITE undeclared) and
runtime errors (transport_none: No path to peer) in non-OFI builds.

Also improve the mmap_alloc file open failure warning to include
strerror(errno) so the cause (e.g. Permission denied) is visible.
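
A warning that carries strerror(errno), as described above, could be sketched as follows. This is a hypothetical stand-in: `open_or_describe` is an invented name, and fprintf to stderr takes the place of the project's RAISE_WARN_MSG macro, whose exact signature is not shown in this conversation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open `path`; on failure, write a warning that includes
 * strerror(errno) into `msg` and return -1. Returns the fd on success. */
static int open_or_describe(const char *path, char *msg, size_t msg_len)
{
    assert(path != NULL && msg != NULL && msg_len > 0);
    msg[0] = '\0';

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0) {
        int saved = errno; /* capture before later calls can overwrite errno */
        snprintf(msg, msg_len, "could not open '%s': %s", path, strerror(saved));
        fprintf(stderr, "WARNING: %s\n", msg); /* stand-in for RAISE_WARN_MSG */
        return -1;
    }
    return fd;
}
```

Saving errno into a local before formatting matters because intervening library calls are allowed to change it.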