
macOS Mach-O: modern format (LC_MAIN, PIE, ad-hoc signing, libSystem) for amd64 and 6l/7l #5

Open
rafael2knokia wants to merge 6 commits into aryx:master from rafael2knokia:macho-modern

@rafael2knokia rafael2knokia commented Jun 17, 2026

Modernizes the macOS Mach-O output of the amd64 (6l) and arm64 (7l)
linkers, taking -H6 from an "old Mac OS X" image (LC_UNIXTHREAD,
raw syscalls, unsigned, absolute addressing) to a current-macOS one:
LC_MAIN entry, position-independent (PIE), code-signed on arm64, and able
to call libSystem instead of issuing raw syscalls.

The amd64 output is verified running end-to-end under darling
(darling shell ./hello prints "Hello, world"); the arm64 output is only
structurally verified, since it can't be executed on the x86 CI host.

What changed

1. Modern load commands + ad-hoc code signing (8b0a61186)

  • 6l/7l: emit LC_MAIN instead of LC_UNIXTHREAD; flag
    MH_DYLDLINK|MH_TWOLEVEL; link /usr/lib/libSystem.B.dylib; add
    LC_DYLD_INFO_ONLY, LC_BUILD_VERSION, LC_UUID.
  • 7l: emit an ad-hoc LC_CODE_SIGNATURE — an embedded SuperBlob with a
    SHA-256 CodeDirectory over the whole image (new self-contained sha256.c,
    two-pass write), since Apple Silicon's kernel (AMFI) requires every binary
    to be signed.
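
As a rough illustration of why the signature needs a second pass: a CodeDirectory stores one hash per page of the signed range, so the size of the hash array has to be known before the image is hashed. The sketch below assumes 4 KiB signing pages and SHA-256 digests; the names cs_code_slots and cs_hash_bytes are hypothetical and are not taken from 7l.

```c
#include <assert.h>
#include <stdint.h>

enum {
    CS_PAGE_SIZE  = 4096, /* assumption: 4 KiB signing pages */
    CS_SHA256_LEN = 32    /* bytes per SHA-256 digest */
};

/* Number of code slots: one hash per page of the signed range, rounded up. */
static uint32_t cs_code_slots(uint32_t code_limit)
{
    /* Guard the round-up addition against 32-bit overflow. */
    assert(code_limit <= UINT32_MAX - (CS_PAGE_SIZE - 1));
    return (code_limit + CS_PAGE_SIZE - 1) / CS_PAGE_SIZE;
}

/* Bytes occupied by the code-slot hash array inside the CodeDirectory. */
static uint32_t cs_hash_bytes(uint32_t code_limit)
{
    return cs_code_slots(code_limit) * CS_SHA256_LEN;
}
```

Here code_limit stands for the file offset where the signature starts, i.e. the end of the bytes covered by the page hashes.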

2. PIE / RIP-relative addressing — amd64 (d5e9d2389)

  • MOV $sym(SB) is encoded RIP-relative ([rip+disp32], a D_PCREL reloc)
    instead of an absolute disp32; header sets MH_PIE. Gated on
    HEADTYPE==6, so ELF (-H7) and Plan 9 (-H2) keep the proven absolute
    encoding (Linux output is byte-identical).
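
A minimal sketch of the arithmetic behind the RIP-relative encoding, not the actual span.c code: the ModRM byte selects mod=00, rm=101, and the disp32 is measured from the end of the whole instruction, so immediate bytes that follow the displacement have to be subtracted too. The helper names modrm and riprel_disp are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits). */
static uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
    assert(mod < 4 && reg < 8 && rm < 8);
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/*
 * disp32 for a RIP-relative operand. The CPU adds disp32 to the address of
 * the next instruction, so the 4 displacement bytes and any immediate bytes
 * after them are subtracted from the target.
 *   target:  address of the symbol
 *   disp_at: address where the disp32 field itself is stored
 *   imm_len: bytes of immediate following the disp32 (0 for LEA)
 */
static int32_t riprel_disp(uint64_t target, uint64_t disp_at, unsigned imm_len)
{
    int64_t next = (int64_t)disp_at + 4 + (int64_t)imm_len;
    int64_t d = (int64_t)target - next;

    assert(d >= INT32_MIN && d <= INT32_MAX);
    return (int32_t)d;
}
```

With reg=0 (rax), modrm(0, 0, 5) yields 0x05, the ModRM byte of lea rax,[rip+disp32].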

3. -I dynamic imports — call libSystem instead of raw syscalls (21020ce6c)

  • New 6l/7l -I got:remote:lib flag: the program defines an 8-byte GOT slot
    named got in __DATA and calls through it; the linker emits a non-lazy
    bind in LC_DYLD_INFO_ONLY so dyld resolves the symbol remote from lib
    (libSystem, ordinal 1) into that slot at load.
  • tests/s/mini/hello_macos_libc_{amd64,arm64}.s call _write this way.
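
For readers unfamiliar with dyld bind opcodes, here is a sketch of what a single non-lazy bind stream might look like. The opcode values are the ones defined in <mach-o/loader.h>; the helpers put_uleb128 and bind_one are hypothetical stand-ins and are not the PR's binduleb() or adddynimp(). The segment index passed in (2 for __DATA in the usage note) is an assumption about segment order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bind opcodes and pointer type, values as in <mach-o/loader.h>. */
enum {
    BIND_OPCODE_DONE                          = 0x00,
    BIND_OPCODE_SET_DYLIB_ORDINAL_IMM         = 0x10,
    BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM = 0x40,
    BIND_OPCODE_SET_TYPE_IMM                  = 0x50,
    BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB   = 0x70,
    BIND_OPCODE_DO_BIND                       = 0x90,
    BIND_TYPE_POINTER                         = 1
};

/* Write v as unsigned LEB128; returns the number of bytes written. */
static size_t put_uleb128(uint8_t *p, uint64_t v)
{
    size_t n = 0;

    do {
        uint8_t b = (uint8_t)(v & 0x7f);
        v >>= 7;
        if (v != 0)
            b |= 0x80; /* more bytes follow */
        p[n++] = b;
    } while (v != 0);
    return n;
}

/* Emit one pointer bind of `name` from dylib `ordinal` at seg:off. */
static size_t bind_one(uint8_t *out, size_t cap, unsigned ordinal,
                       const char *name, unsigned seg, uint64_t off)
{
    size_t n = 0, len = strlen(name) + 1; /* symbol name plus NUL */

    assert(ordinal >= 1 && ordinal < 16 && seg < 16);
    assert(cap >= len + 16); /* 6 opcode bytes + up to 10 ULEB bytes */

    out[n++] = (uint8_t)(BIND_OPCODE_SET_DYLIB_ORDINAL_IMM | ordinal);
    out[n++] = (uint8_t)BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM;
    memcpy(out + n, name, len);
    n += len;
    out[n++] = (uint8_t)(BIND_OPCODE_SET_TYPE_IMM | BIND_TYPE_POINTER);
    out[n++] = (uint8_t)(BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB | seg);
    n += put_uleb128(out + n, off);
    out[n++] = (uint8_t)BIND_OPCODE_DO_BIND;
    out[n++] = (uint8_t)BIND_OPCODE_DONE;
    return n;
}
```

Calling bind_one(buf, sizeof buf, 1, "_write", 2, 0x10) would describe "bind _write from dylib ordinal 1 into the pointer at offset 0x10 of segment 2".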

4. PIE / ADR addressing — arm64 (6c256ad67)

  • omovlit() materialises a symbol address with a PC-relative ADR (±1 MB)
    instead of loading its absolute value from the read-only literal pool, so
    MOV $sym(SB),R and the address half of MOV sym(SB),R become
    slide-independent (the R28/SB base is set via ADR too); header sets
    MH_PIE.
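
To make the ±1 MB limit concrete, this is a sketch of how an A64 ADR instruction packs its 21-bit signed byte offset (immlo in bits 30:29, immhi in bits 23:5). It follows the architectural encoding rather than 7l's omovlit(); the function name adr_encode is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode "ADR Xd, label" where delta = label - pc (in bytes).
 * The immediate is a 21-bit signed value, hence the +/-1 MB reach.
 */
static uint32_t adr_encode(unsigned rd, int64_t delta)
{
    uint32_t imm, immlo, immhi;

    assert(rd < 32);
    assert(delta >= -(1 << 20) && delta < (1 << 20));

    imm = (uint32_t)delta & 0x1fffff; /* two's-complement, 21 bits */
    immlo = imm & 3;                  /* low 2 bits  -> bits 30:29 */
    immhi = imm >> 2;                 /* high 19 bits -> bits 23:5 */
    return 0x10000000u | (immlo << 29) | (immhi << 5) | rd;
}
```

A symbol further than 1 MB from the instruction would trip the range assertion here; a real linker would need a different sequence (for example ADRP plus ADD) in that case.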

Verification

  • amd64: built with the toolchain and run under darling — both the
    raw-syscall and the libSystem variants load; the libSystem one prints
    Hello, world and exits 0. Disassembly confirms RIP-relative addressing;
    file reports Mach-O 64-bit x86_64 executable, flags:<…|PIE>.
  • arm64: disassembly confirms all addressing is PC-relative (ADR); the
    ad-hoc CodeDirectory page hashes match the file image (AMFI would accept
    it); LC_DYLD_INFO bind opcodes are well-formed.
  • Full tests/s/mini suite (mk all) builds clean; Linux ELF and Plan 9
    targets are unaffected.

Known limitations

  • arm64 is not executed here — darling on the x86 host has no arm64
    emulation. It's structurally complete but should be confirmed on Apple
    Silicon.
  • __PAGEZERO is 1 MB, not Apple's 4 GB. 6l's text layout truncates
    addresses to 32 bits at a 4 GB load address (a separate bug), and 7l's
    INITTEXT is an int32. With MH_PIE the loader can slide the image
    regardless, but it's a deviation from a stock macOS binary.
  • Code-only PIE. A program with absolute pointers in __DATA would still
    need rebase opcodes, which aren't emitted yet (the hello-world tests have
    none).
  • Raw-syscall Mach-O binaries run on real macOS (XNU) but not under darling,
    which emulates Darwin in userspace via libSystem — hence the -I/libSystem
    variant for darling.
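
On the __PAGEZERO limitation above: the sketch below only illustrates generic 32-bit narrowing, under the assumption that the address passes through a 32-bit field somewhere. It does not reproduce the actual 6l bug, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Does the address fit in a signed 32-bit field such as an int32 INITTEXT? */
static int fits_int32(uint64_t addr)
{
    return addr <= (uint64_t)INT32_MAX;
}

/* What survives when a 64-bit address is narrowed to 32 bits. */
static uint32_t low32(uint64_t addr)
{
    return (uint32_t)addr; /* keeps only the low 32 bits */
}

/* Store a load address in an int32 field, refusing values that do not fit. */
static int32_t as_inittext(uint64_t addr)
{
    assert(fits_int32(addr));
    return (int32_t)addr;
}
```

The 1 MB base 0x100000 passes fits_int32, while Apple's 4 GB base 0x100000000 does not, and low32 of it is 0.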

rafael2knokia and others added 3 commits May 21, 2026 16:22
Ships the full toolchain (compilers, assemblers, linkers, mk, rc, yacc,
acid, emulators) under /usr/lib/goken9cc. The install prefix is baked
into the binaries at build time via -DGOROOT=/usr/lib/goken9cc so the
runtime does not need to export GOROOT, which would otherwise clash
with the Go toolchain.

debian/rules patches lib_core/lib9/mkfile in place to drop the
unconditional ROOTDIR= assignment (mk's parser and recursive MKFLAGS
do not let us override it cleanly from the command line), restores it
on clean, and keeps a .debian-orig backup; both the backup pattern and
the dh build artifacts are gitignored.

/etc/profile.d/goken9cc.sh only appends the host arch's bin/ to PATH
(so Plan 9 cat/ls/grep/... do not shadow the system ones) and sets
MKSHELL, INCLUDE, ccroot to point under the install prefix.

Built and tested on amd64; arm64 is declared but not yet exercised.

Add arm64 Mach-O output to 7l (-H6, previously stubbed out) and modernize
the macOS format produced by both linkers so it targets current macOS
instead of old Mac OS X.

- 6l (amd64) and 7l (arm64): use an LC_MAIN entry instead of LC_UNIXTHREAD,
  flag the binary MH_DYLDLINK|MH_TWOLEVEL, link /usr/lib/libSystem.B.dylib,
  and add LC_DYLD_INFO_ONLY, LC_BUILD_VERSION and LC_UUID.
- 7l: emit an ad-hoc LC_CODE_SIGNATURE (embedded SuperBlob + SHA-256
  CodeDirectory over the whole image, computed in a second pass), since
  Apple Silicon's kernel requires every binary to be signed. Adds a
  self-contained sha256.c. The signing identifier is the output basename.
- Load at a 1MB __PAGEZERO (__TEXT at 0x100000), non-PIE: high enough to
  clear Linux's mmap_min_addr so the image can be mapped by Linux-based
  loaders (verified loading+executing under darling), low enough (< 2GB) for
  the amd64 32-bit-absolute addressing and the classic linker's int32
  INITTEXT. A real 0x100000000 __PAGEZERO would need 64-bit address types.
- tests/s/mini: add hello_macos_arm64.s and build hello_macos_{amd64,arm64}
  with -H6 (without it a non-darwin host emitted ELF, not Mach-O).

Verified structurally with a Mach-O parser; the arm64 CodeDirectory page
hashes match the file image (AMFI would accept it). darling's loader maps
and runs the amd64 binary; full execution there needs libSystem calls
rather than the raw Darwin syscalls these hello-world tests use.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rafael2knokia and others added 3 commits June 18, 2026 20:43
Make the amd64 macho output position-independent. A symbol memory
reference (LEAQ name(SB), ...) is now encoded RIP-relative ([rip+disp32]
via a D_PCREL reloc) instead of an absolute disp32, and the Mach-O header
sets MH_PIE. The change is gated on HEADTYPE==6, so ELF (-H7) and Plan 9
(-H2) keep the proven absolute SIB encoding (verified: Linux output is
byte-identical).

- span.c asmandsz(): for HEADTYPE==6, emit mod=00,rm=101 (RIP-relative)
  and mark the reloc D_PCREL. doasm() then corrects the addend for any
  immediate emitted after the disp32 (RIP is relative to the end of the
  whole instruction), via the new riprelfix.
- macho.c: OR MH_PIE into the header flags (now 0x200085).
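
The header flag value quoted above can be decomposed as follows. The bit values are the standard ones from <mach-o/loader.h>; that the low bit of 0x200085 is MH_NOUNDEFS is an inference from the number, since the commit text names only MH_DYLDLINK, MH_TWOLEVEL and MH_PIE. The function macho_flags is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mach-O header flag bits, values as in <mach-o/loader.h>. */
enum {
    MH_NOUNDEFS = 0x1,     /* inferred from the quoted value, not named in the commit */
    MH_DYLDLINK = 0x4,
    MH_TWOLEVEL = 0x80,
    MH_PIE      = 0x200000
};

/* Header flags for the modern image, with or without MH_PIE. */
static uint32_t macho_flags(int pie)
{
    uint32_t f = MH_NOUNDEFS | MH_DYLDLINK | MH_TWOLEVEL;

    assert(pie == 0 || pie == 1);
    if (pie)
        f |= MH_PIE;
    return f;
}
```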

This is code-only PIE: a program with absolute pointers in __DATA would
still need rebase opcodes (not emitted). __PAGEZERO stays 1MB rather than
Apple's 4GB because INITTEXT>=4GB hits a separate 32-bit truncation in
6l's text layout (no code emitted); MH_PIE lets the loader slide the image
regardless.

Verified: macOS entry disassembles to lea rax,[rip+0x3f5] resolving to the
real __DATA address; a no-syscall PIE binary (mov eax,42; ret) loads and
runs under darling and exits 42 via dyld's LC_MAIN return->exit path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add "6l/7l -I got:remote:lib" to declare a Mach-O dynamic import. The program
defines a GOT pointer slot `got` (8 bytes) in __DATA and calls through it; the
linker emits an LC_DYLD_INFO non-lazy bind opcode stream so dyld resolves
`remote` from `lib` (libSystem, ordinal 1) into that slot at load. This lets
hand-written Mach-O binaries call libc functions rather than issuing raw Darwin
syscalls.

- src/cmd/ld/macho.c (6l): adddynimp() + binduleb(); build the bind stream in
  asmbmacho, append it to __LINKEDIT, and point LC_DYLD_INFO_ONLY bind_off/size
  at it (cflush before the direct write so it doesn't race machowrite's buffer).
- linkers/7l/asm.c (7l): same, placed before the ad-hoc code signature so the
  bind data is covered by the CodeDirectory hashes.
- obj.c (both): -I flag -> adddynimp().
- tests/s/mini/hello_macos_libc_{amd64,arm64}.s: call _write via the GOT slot.

Verified: the amd64 binary runs under darling and prints "Hello, world" (darling
emulates Darwin via libSystem and cannot run the raw-syscall variant). The arm64
binary is structurally complete — bind opcodes well-formed, ad-hoc signature
still valid over the image including the binds — but is runtime-untested here
(no arm64 execution on an x86 host) and still non-PIE (7l uses a literal pool),
so real Apple Silicon would likely need ADRP-based PIE codegen first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Make arm64 macho output position-independent, the analog of 6l's RIP-relative
change. Under HEADTYPE==6, omovlit() materialises a symbol address with a
PC-relative ADR (dr <- pc + (symaddr-pc), ±1MB) instead of loading its absolute
value from the read-only literal pool (which dyld can't rebase). This covers
both MOV $sym(SB),R (ADR dr,sym) and the address half of MOV sym(SB),R
(ADR REGTMP,sym; LDR R,[REGTMP]); the static base in R28/SB is itself set via
ADR, so SB-relative loads are slide-independent too. The header now sets MH_PIE.

Verified by disassembly: hello_macos_libc_arm64 materialises setSB, msg and
writep entirely PC-relative, and the ad-hoc code signature is still valid over
the image. Still runtime-untested here (no arm64 execution on an x86 host) -
needs Apple Silicon. ELF/Plan 9 are unaffected (gated on HEADTYPE==6).

Also removes a stray debug fprint left in omovlit().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@rafael2knokia rafael2knokia changed the title from "macos: emit modern Mach-O (LC_MAIN + ad-hoc signing) from 6l and 7l" to "macOS Mach-O: modern format (LC_MAIN, PIE, ad-hoc signing, libSystem) for amd64 and 6l/7l" Jun 19, 2026
@rafael2knokia rafael2knokia marked this pull request as ready for review June 19, 2026 08:15