Implement an alternative layout engine #468 #477
brenoguim wants to merge 1 commit into NixOS:master
[--try-layout-engine]\n\
[--only-layout-engine]\n\
Also this will eventually need some manpage text.
I wonder what the best approach is.
If this is to be merged, perhaps we can leave it undocumented at first, until more testing is done.
Then at some point we enable and document it. And further down the line we remove that fallback.
But before any of this happens, I'm currently working on testing.
I'm running a stress test (running Patchelf on all binaries on the machine) and I found a couple of issues in both the current implementation and the layout engine.
Moreover, I will go through the existing code to make sure that at least one test fails if we disable any of the special ifs in the code.
It's easier to test and to receive feedback if there is documentation. If the interface is not stable, then a better approach is to declare it an unstable interface that may change.
I'm marking this as a draft PR because the main goal is to provide a basis for thinking and discussion. The change is too aggressive, so it's wiser to discuss it before investing more time in this approach.
In this description I'll talk about what is being done here, the pros and cons, known issues (and known bugs it fixes) and then I'll go into the details of the patch.
This patch creates a separate ELF-independent class that is responsible for deciding where to place sections after resizing. The goal is to create a class that is easily testable, meaning that we can manually write inputs to it and verify the output.
I did not get to the part of adding these tests, which would be the highlight of the patch, because I wanted to share what I have so far.
Because this class is elf-independent, you can expect to find the following steps in the connection to Patchelf:
- `elf2layout` function
- `LayoutEngine::resize`
- `LayoutEngine::updateFileLayout`
- `layout2elf`

The `LayoutEngine` methods always return false if any step fails. That allows us to fall back to the current code if the new code can't properly lay something out.

Because of this, this patch introduces two switches: `tryLayoutEngine`, which triggers the new engine and falls back if it fails, and `onlyLayoutEngine`, which errors out if the new engine fails. The regression tests are passing with `onlyLayoutEngine`.

The scary part is that all tests are passing even though some things are not implemented:

- `normalizeNoteSegments` is not called, so some note segments could get out of sync
- The special cases for `.note.gnu.property` and `.MIPS.abiflags` from `writeReplacedSections` are not handled
- `binutilsQuirkPadding` is not added

Now on the flip side:

- I changed the CI to invoke the tests with both the current layout engine and the new one
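The fallback behaviour of the two switches can be sketched roughly as follows. This is only an illustration of the control flow described above, not code from the patch: `LayoutMode`, `runLayout`, and the callback-based step list are hypothetical names, and the steps stand in for `elf2layout`, `LayoutEngine::resize`, `LayoutEngine::updateFileLayout` and `layout2elf`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical names: LayoutMode and runLayout are illustrative only.
enum class LayoutMode { Legacy, TryEngine, OnlyEngine };

using Step = std::function<bool()>;

// Runs the engine steps in order and stops at the first one that fails.
inline bool runSteps(const std::vector<Step>& steps)
{
    for (const auto& step : steps)
        if (!step())
            return false;
    return true;
}

// Returns true if the new engine produced the layout and false if the
// legacy code did. Throws if OnlyEngine is requested and the engine fails.
inline bool runLayout(LayoutMode mode,
                      const std::vector<Step>& engineSteps,
                      const std::function<void()>& legacy)
{
    if (mode != LayoutMode::Legacy) {
        if (runSteps(engineSteps))
            return true;  // every engine step succeeded
        if (mode == LayoutMode::OnlyEngine)
            throw std::runtime_error("layout engine failed and fallback is disabled");
    }
    legacy();  // current implementation, used as the fallback
    return false;
}
```

Because every step reports failure through its return value, the caller can decide in one place whether to fall back or to error out.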
If you made it this far, I'll describe a bit of the idea behind the LayoutEngine.
The interface exposes the following classes and methods:
- `Section`: This is anything that occupies space in the file. So, unlike in ELF, the header, the program header table, and the section header table are also considered sections. Aside from the obvious fields `name`, `type`, `access` and `align`, it also has a `pinned` field to indicate that this section should not be moved in the virtual address space.
- `Segment`: This is a load segment. The engine deals only with segments that make up the address space.
- `Layout`: A group of `Section`s and `Segment`s.
- `LayoutEngine`: This uses the PImpl idiom to hide all implementation details from the user and expose only the methods that are truly needed: the constructor, `resize`, `updateFileLayout`, `layout` and `getVirtualAddress`.

The best way to see how these objects are used is to look at the methods that create the layout from ELF and the ones that read the layout to update the ELF structures.
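To make the shape of this interface more concrete, here is a minimal sketch of how it could look with the PImpl idiom. The class and field names come from the description above; everything else (the field types, the extra `size`, `offset` and `address` fields, the method signatures, and the naive packing inside `updateFileLayout`) is an assumption made for illustration and is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// name/type/access/align/pinned follow the description; the types and the
// size/offset/address fields are assumptions.
struct Section {
    std::string name;
    int type = 0;
    int access = 0;
    uint64_t align = 1;
    bool pinned = false;   // must not move in the virtual address space
    uint64_t size = 0;
    uint64_t offset = 0;
    uint64_t address = 0;
};

struct Segment { uint64_t offset = 0, address = 0, size = 0; };

struct Layout {
    std::vector<Section> sections;
    std::vector<Segment> segments;
};

class LayoutEngine {
public:
    explicit LayoutEngine(Layout layout);
    ~LayoutEngine();
    // All mutating methods return false on failure (assumed signatures).
    bool resize(const std::string& sectionName, uint64_t newSize);
    bool updateFileLayout();
    const Layout& layout() const;
    bool getVirtualAddress(const std::string& sectionName, uint64_t& out) const;
private:
    struct Impl;                  // defined out of sight of the user
    std::unique_ptr<Impl> impl_;
};

// Everything below would normally live in the .cpp file.
struct LayoutEngine::Impl {
    Layout layout;
    Section* find(const std::string& n) {
        for (auto& s : layout.sections)
            if (s.name == n) return &s;
        return nullptr;
    }
};

LayoutEngine::LayoutEngine(Layout l) : impl_(new Impl{std::move(l)}) {}
LayoutEngine::~LayoutEngine() = default;

bool LayoutEngine::resize(const std::string& n, uint64_t newSize) {
    Section* s = impl_->find(n);
    if (!s) return false;
    s->size = newSize;
    return true;
}

// Placeholder policy: pack sections back to back, honouring alignment.
bool LayoutEngine::updateFileLayout() {
    uint64_t cursor = 0;
    for (auto& s : impl_->layout.sections) {
        if (s.align == 0) return false;
        cursor = (cursor + s.align - 1) / s.align * s.align;
        s.offset = cursor;
        cursor += s.size;
    }
    return true;
}

const Layout& LayoutEngine::layout() const { return impl_->layout; }

bool LayoutEngine::getVirtualAddress(const std::string& n, uint64_t& out) const {
    Section* s = impl_->find(n);
    if (!s) return false;
    out = s->address;
    return true;
}
```

Since nothing here depends on ELF structures, a test can build a `Layout` by hand, call `resize` and `updateFileLayout`, and compare the resulting fields against expected values.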
The `LayoutEngine` implementation is based on the following idea:

- Use the `Section`s and `Segment`s to build a data structure that represents the virtual address space.
- When `updateFileLayout` is called, the code iterates over the virtual address space structure and updates the `Section` and `Segment` fields.

The virtual address space structure is just a vector of `VSegment`, which can be thought of as a segment that was mapped into virtual memory, and each `VSegment` contains a vector of `VSection`s, which are the sections that were loaded by the `VSegment`.

The whole idea is to focus on the virtual address space because that is what imposes constraints, while the file layout is just a byproduct.
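As a rough illustration of "file layout as a byproduct", the sketch below models the virtual address space as a vector of `VSegment`s holding `VSection`s and derives each section's file offset from its virtual address. Only the names `VSegment` and `VSection` come from the description; the fields, the `deriveFileOffsets` helper, and the rule used (a section sits at the same distance from the segment start in the file as in memory) are assumptions, and the actual patch may compute this differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical fields; only the type names appear in the description.
struct VSection {
    std::string name;
    uint64_t address = 0;  // virtual address of the section
    uint64_t size = 0;
    uint64_t offset = 0;   // file offset, derived below
};

struct VSegment {
    uint64_t address = 0;  // virtual address where the segment is mapped
    uint64_t offset = 0;   // file offset of the segment start
    std::vector<VSection> sections;
};

using VAddressSpace = std::vector<VSegment>;

// Walks the virtual address space and derives file offsets from addresses.
// Returns false if a section starts before the segment that loads it.
inline bool deriveFileOffsets(VAddressSpace& space)
{
    for (auto& seg : space) {
        for (auto& sec : seg.sections) {
            if (sec.address < seg.address)
                return false;
            sec.offset = seg.offset + (sec.address - seg.address);
        }
    }
    return true;
}
```

With this shape, the placement decisions are made on addresses only, and the file offsets fall out of a single pass over the structure.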