
rts: halve max heap, return memory to OS after parsing spikes #59

Closed
ccomb wants to merge 1 commit into main from rts/halve-max-heap-faster-decay


Conversation

@ccomb ccomb (Owner) commented May 16, 2026

Summary

On a 16 GB host, parsing-heavy startups were being killed by the kernel OOM-killer. RTS flags computed by docker/rts-flags.sh left no headroom and held memory long after the parsing spike was over.

  • -M 75 % → 50 % of RAM. The previous 75 % cap competed with MUMPS Fortran allocations (outside the GHC heap), transient parser-intermediate RSS, and kernel/page-cache needs. RSS hit the cgroup limit before -M could trigger a clean Haskell heap-exhaustion exit. 50 % leaves real headroom and lets the runtime fail gracefully on overload.
  • -I30 → -I0.3 (GHC default). A 30 s idle-GC delay hid live-data drops after parsing, so the runtime kept the post-spike heap inflated for far too long.
  • Added -Fd1.0 (GHC 9.10+). Decays free heap blocks back to the OS over ~1 idle period. Without it, the default decay (4.0) pins RSS near the peak for minutes after parsing finishes.

-A, -c, -F1.5, -qg0 are left alone — they trade off with parser throughput and shouldn't move without benchmarking.
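
For concreteness, here is a minimal sketch of the kind of computation `docker/rts-flags.sh` performs under the new policy, assuming a cgroup v2 layout; the variable names and the no-limit fallback are illustrative, not the script's actual code:

```sh
# Hypothetical sketch: derive -M as 50 % of the cgroup memory limit,
# with the 2 GB floor from the test plan. Not the real script.
limit=$(cat /sys/fs/cgroup/memory.max 2>/dev/null || echo max)
if [ "$limit" = "max" ]; then
    # No cgroup limit set: fall back to physical RAM.
    limit=$(awk '/MemTotal/ {print $2 * 1024}' /proc/meminfo)
fi
max_heap_mb=$(( limit / 2 / 1024 / 1024 ))        # 50 % of the limit, in MiB
[ "$max_heap_mb" -lt 2048 ] && max_heap_mb=2048   # 2 GB floor -> -M2048M
echo "+RTS -M${max_heap_mb}M -I0.3 -Fd1.0 -RTS"
```

On a 16 GB host this yields `-M8192M`, consistent with the peak-RSS expectation below.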

Expected effect

On a 16 GB host with a parsing workload:

  • Peak RSS during parsing: ~16 GB+ (OOM) → ~8–10 GB.
  • Steady-state RSS after parsing: stuck near peak → drops to live-data size within seconds.

Test plan

  • Build the Docker image and run with a parsing workload on a 16 GB cgroup; confirm no OOM and post-parse RSS converges to live-data size.
  • Spot-check the printed `RTS: ... -> +RTS ...` summary on stderr to confirm the new flags appear and that -M is half the cgroup limit.
  • Verify on a small (4 GB) host that the 2 GB floor still kicks in (-M2048M).
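
A hypothetical spot-check for the second bullet (the image tag is a placeholder, and it assumes the startup banner echoes the effective flags to stderr):

```sh
# Run under a 16 GB memory limit and grab the first +RTS banner line.
docker run --rm --memory=16g volca:latest 2>&1 | grep -m1 '+RTS'
# Expect -M8192M (half of 16 GB), -I0.3, and -Fd1.0 in the output.
```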

The previous flags caused parsing-heavy startups to be killed by the
kernel OOM-killer on a 16 GB host:

- `-M` was 75 % of RAM. Combined with MUMPS Fortran allocations (outside
  the GHC heap), parser-intermediate RSS overhead, and kernel/page-cache
  needs, RSS reached the cgroup limit before `-M` could trigger a clean
  Haskell heap-exhaustion exit. Now 50 %, leaving real headroom.
- `-I30` deferred idle major GC for 30 s, so live-data drops after a
  parsing spike weren't reflected in heap accounting promptly. Back to
  the GHC default of 0.3 s.
- Added `-Fd1.0` (GHC 9.10+): decays free heap blocks back to the OS
  over ~1 idle period instead of the default 4.0, which pins RSS near
  the peak for minutes after parsing finishes.

`-A`, `-c`, `-F1.5`, `-qg0` left as-is — they trade off with parser
throughput and shouldn't move without measurement.
ccomb added a commit that referenced this pull request May 16, 2026
Two cheap RTS tweaks (cherry-picked from #59, minus the -M change):

- Add -Fd1.0 (GHC 9.10+): decay free heap blocks back to the OS over
  ~1 idle period instead of the default 4.0, which keeps RSS pinned
  near peak for minutes after a parsing spike.
- -I30 -> -I0.3 (GHC default): trigger idle-time major GC promptly.
  The previous 30 s deferral hid live-data drops and starved -Fd of
  free blocks to release.

Keeping -M at 75 % of RAM for now: dropping to 50 % may be the right
call eventually, but the OpenBLAS musl crash that motivated the change
in #59 is fixed independently in the previous commit. Re-evaluate -M
once we have RSS curves on the 8 GB target.
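
A hypothetical way to capture those RSS curves while the workload runs (assumes cgroup v2 at the usual path; adjust for the container's cgroup):

```sh
# Sample resident memory once per second into a plottable log.
while sleep 1; do
    echo "$(date +%s) $(cat /sys/fs/cgroup/memory.current)"
done >> rss-curve.log
```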
@ccomb ccomb (Owner, Author) commented May 16, 2026

Superseded by #60, which cherry-picks the cheap subset (-Fd1.0 + -I0.3) and leaves -M at 75 % pending RSS measurements on the 8 GB target. The OOM-killer that motivated halving -M turned out to be the OpenBLAS musl pthread stack crash (exit 139 / SIGSEGV), now fixed in #60 commit 2.

@ccomb ccomb closed this May 16, 2026
ccomb added a commit that referenced this pull request May 16, 2026
…ory hygiene (#60)

Two independent fixes for running VoLCA on an 8 GB RAM VM.

1) OpenBLAS pthread stack on musl. Static Alpine/musl builds segfaulted
   (exit 139 / SIGSEGV) inside MUMPS factorization on the first dense
   BLAS3 call. musl's hardcoded 128 KB default pthread stack (vs glibc's
   8 MB from RLIMIT_STACK) is overflowed by OpenBLAS DYNAMIC_ARCH
   Fortran kernels with large auto-arrays. Patch driver/others/blas_server.c
   during the Docker build to call pthread_attr_setstacksize(&attr, 8 << 20)
   right after pthread_attr_init. Two grep guards bracket the sed so a
   future upstream refactor fails the Docker build loudly instead of
   silently regressing the runtime (see the sketch after this message).

2) RTS memory return to OS. Add -Fd1.0 (GHC 9.10+) to decay free heap
   blocks back to the OS over ~1 idle period instead of the default 4.0,
   which keeps RSS pinned near peak for minutes after a parsing spike.
   -I30 -> -I0.3 to trigger idle-time major GC promptly; the previous
   30 s deferral hid live-data drops and starved -Fd of free blocks.
   -M kept at 75 % of RAM — re-evaluate once we have RSS curves on the
   8 GB target.

Cherry-picked the cheap RTS subset from #59 (now closed); the OpenBLAS
fix is the load-bearing change for the 8 GB target.
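
A hypothetical shape for the grep-guarded sed in item 1 (patterns are illustrative; the real Docker build step may differ):

```sh
set -e
f=driver/others/blas_server.c
# Guard 1: the anchor call must still exist upstream, or the build fails here.
grep -q 'pthread_attr_init(&attr)' "$f"
# Insert the 8 MB stack-size call right after pthread_attr_init.
sed -i 's/pthread_attr_init(&attr);/&\n  pthread_attr_setstacksize(\&attr, 8 << 20);/' "$f"
# Guard 2: verify the patch actually landed.
grep -q 'pthread_attr_setstacksize' "$f"
```

The point of the two guards is exactly what the message says: if upstream renames or moves the `pthread_attr_init` call, guard 1 fails the build loudly; if the sed pattern rots, guard 2 catches the silent no-op.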
@ccomb ccomb deleted the rts/halve-max-heap-faster-decay branch May 16, 2026 16:03