feat(live-fork, memfd): Back Mem Snapshot with Hugepages #230
theflashwin wants to merge 1 commit into
Hey @theflashwin — read through the full diff. This is solid work, and unusually complete for a first PR: you didn't just ship the memfd flag, you plumbed it end-to-end through the REST API, CLI, Python SDK, TS SDK, doctor check, and a 430-line bench harness. That's senior-level scope. A few things I particularly liked:
A handful of small things — all fine to address in this PR or as follow-ups, none blocking:
Other than items 1–3 (5 minutes of polish), this is mergeable as-is. The architecture is sound, the SAFETY discipline is real, and the bench tells the story. You'll be credited in the v0.5.2 release notes (which this'll likely cut, since it's the first material feature post-v0.5.1). Welcome aboard. 🚀
d89316d to 065c355
Hi @WaylandYang! Thanks for the input, let me know if there's anything else to change! Also, a note on the N=100 benchmarking: I don't have enough compute to test this out, but I'm very curious myself.
@theflashwin nice — thanks for the quick reply! Two things on my side:
Once CI lands and the N=100 numbers are in, I think we can merge as-is (the typo /
PR for #6.
Summary
During the branch command, we want to minimize how long the parent VM is paused while its memory is copied to a new memory snapshot. The parent VM is paused in two places: while the VM state is copied and while the RAM is copied.
Copying the VM state is unavoidable but takes very little time (<10 ms), while copying the RAM is an intensive process. A big contributor to this delay is high TLB pressure, because we have to walk the VM's entire memory. To mitigate this, we back the memory snapshot with huge pages.
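To make the TLB argument concrete, here is a rough back-of-the-envelope sketch of how many page mappings a full copy has to touch. The 4 KiB base page size (typical on x86_64), the 2 MiB huge page size, and the 1 GiB guest size are illustrative assumptions, not numbers from this PR, and `pages_needed` is a hypothetical helper.

```rust
// Assumed page sizes: 4 KiB base pages and 2 MiB huge pages.
const BASE_PAGE: u64 = 4 * 1024;
const HUGE_PAGE: u64 = 2 * 1024 * 1024;

/// Number of pages of `page_size` bytes needed to cover `region_bytes`
/// (rounded up, so a partial trailing page still counts).
fn pages_needed(region_bytes: u64, page_size: u64) -> u64 {
    (region_bytes + page_size - 1) / page_size
}

/// How many times fewer mappings a copy touches with huge pages.
fn mapping_reduction(region_bytes: u64) -> u64 {
    pages_needed(region_bytes, BASE_PAGE) / pages_needed(region_bytes, HUGE_PAGE)
}
```

For a hypothetical 1 GiB guest this gives 262144 base pages versus 512 huge pages, i.e. 512 times fewer translations for the copy loop to miss on.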
Changes:
- use_hugepages boolean flag that enables the memfd_create syscall to be called with libc::MFD_HUGETLB
- copy_via_mmap function, because hugepages cannot be written to using the write() syscall, so we created this function to work around this fact.
Testing
5 new tests added to the existing memfd::tests module:
alloc_size returns an InvalidInput error immediately, without touching any mmap.
copy_via_mmap directly into a plain (non-hugetlb) memfd, reads back through the fd and asserts byte-for-byte equality.
size_bytes() returns the original file size (not the hugepage-aligned alloc_size), and asserts backend_path() has the correct /proc//fd/ format. Skips gracefully if HugePages_Free=0.
but with use_hugepages=true. Verifies the copy_via_mmap path (used for hugetlb memfds) produces identical bytes to the source.
Asserts region.size_bytes() returns 4096, not the 2 MiB-aligned alloc_size.
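For readers who have not seen the pattern, the round trip these tests exercise can be sketched roughly as follows. This is not the PR's implementation: `hugepage_alloc_size`, `create_memfd`, and `read_back` are hypothetical helpers, this `copy_via_mmap` signature is a guess, and the `extern` block hand-declares the libc functions (assuming 64-bit Linux and a 2 MiB huge page size) only so the sketch builds without the `libc` crate.

```rust
use std::ffi::CString;
use std::fs::File;
use std::io;
use std::os::raw::{c_char, c_int, c_uint, c_void};
use std::os::unix::fs::FileExt;
use std::os::unix::io::{AsRawFd, FromRawFd};
use std::ptr;

// Hand-declared bindings (assumption: 64-bit Linux); real code would use `libc`.
extern "C" {
    fn memfd_create(name: *const c_char, flags: c_uint) -> c_int;
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

const MFD_CLOEXEC: c_uint = 0x1;
const MFD_HUGETLB: c_uint = 0x4;
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_SHARED: c_int = 0x1;
const HUGEPAGE_SIZE: usize = 2 * 1024 * 1024; // assumed default huge page size

/// Round `size_bytes` up to a whole number of huge pages (None on overflow).
fn hugepage_alloc_size(size_bytes: usize) -> Option<usize> {
    size_bytes.checked_add(HUGEPAGE_SIZE - 1).map(|n| n / HUGEPAGE_SIZE * HUGEPAGE_SIZE)
}

/// Create a memfd of `alloc_size` bytes, optionally backed by huge pages.
fn create_memfd(alloc_size: usize, use_hugepages: bool) -> io::Result<File> {
    let name = CString::new("snapshot-sketch").expect("no interior NUL");
    let flags = if use_hugepages { MFD_CLOEXEC | MFD_HUGETLB } else { MFD_CLOEXEC };
    let fd = unsafe { memfd_create(name.as_ptr(), flags) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: `fd` is a fresh descriptor that nothing else owns.
    let file = unsafe { File::from_raw_fd(fd) };
    file.set_len(alloc_size as u64)?;
    Ok(file)
}

/// Copy `src` into the memfd through a shared mapping instead of write().
fn copy_via_mmap(file: &File, src: &[u8], alloc_size: usize) -> io::Result<()> {
    if alloc_size == 0 || src.len() > alloc_size {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "bad alloc_size"));
    }
    let addr = unsafe {
        mmap(ptr::null_mut(), alloc_size, PROT_READ | PROT_WRITE, MAP_SHARED,
             file.as_raw_fd(), 0)
    };
    if addr as isize == -1 {
        return Err(io::Error::last_os_error()); // MAP_FAILED
    }
    // SAFETY: the mapping spans `alloc_size` bytes and `src.len() <= alloc_size`.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), addr as *mut u8, src.len()) };
    if unsafe { munmap(addr, alloc_size) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Read the first `len` bytes back through the fd for comparison.
fn read_back(file: &File, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    file.read_exact_at(&mut buf, 0)?;
    Ok(buf)
}
```

With `use_hugepages` set to false the same path runs against a plain memfd, so the copy logic can be checked on machines without a hugepage pool. A 4096-byte source rounds up to a 2 MiB `alloc_size` under these assumptions, which is why the reported size and the allocation size differ.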
The three hugepages tests check HugePages_Free at runtime and return early with an eprintln! hint if the pool isn't available.
Also added a forkd doctor check to verify hugepage allocation.
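A runtime check of this kind could look roughly like the sketch below, which reads the HugePages_Free line from /proc/meminfo. The function names are hypothetical, and the PR's tests and doctor check may gather this differently.

```rust
use std::fs;

/// Extract the HugePages_Free count from /proc/meminfo-style text.
/// Returns None if the line is missing or not a number.
fn parse_hugepages_free(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let rest = line.strip_prefix("HugePages_Free:")?;
        rest.trim().parse().ok()
    })
}

/// Read the live value; None if /proc/meminfo is unreadable or lacks the field.
fn hugepages_free() -> Option<u64> {
    let text = fs::read_to_string("/proc/meminfo").ok()?;
    parse_hugepages_free(&text)
}

/// True when a hugepage test should skip; prints a hint to stderr in that case.
fn should_skip_hugepage_test() -> bool {
    match hugepages_free() {
        Some(n) if n > 0 => false,
        _ => {
            eprintln!("skipping: no free huge pages (HugePages_Free is 0 or unavailable)");
            true
        }
    }
}
```

Keeping the parser separate from the file read means the parsing can be tested with a fixed string, regardless of how the host is configured.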
Benchmarking
pause_ms)
MiB source)
Results: