sgkdev/page_inject

page_inject - AF_ALG aead cross-container escape

AF_ALG aead vulnerability cross-container exploit -- pivot from one compromised container into every sibling container that shares the same libc.so.6 image layer.

This is an escape primitive: it runs from inside an unprivileged container that the attacker has already compromised, and uses the AF_ALG authencesn ESN-rotation 4-byte arbitrary-write bug (CVE-2026-31431) to plant a persistent read() hook in the page-cache pages of libc.so.6. Because Docker / containerd back overlayfs lower-layer files with shared inodes, those pages are visible to every sibling container instantiated from the same image -- the hook fires in their processes too, and the attacker gets command execution inside each one.

Threat model

  • Attacker has shell access to a single container (call it victim) on a host that runs other containers (siblings) from the same image as victim.
  • victim runs with default Docker/k8s posture: unprivileged uid inside the container's user namespace, default seccomp profile, default AppArmor profile, no special capabilities, no host bind mounts.
  • victim has only:
    • read access to its own libc (/usr/lib/x86_64-linux-gnu/libc.so.6 or wherever the distro installs it)
    • the standard socket(AF_ALG, ...) syscall family
    • the standard splice / vmsplice syscalls
    • write access to a directory it can chmod +x (e.g. /tmp)
  • The kernel must be vulnerable to CVE-2026-31431 (any algif_aead + authencesn build prior to the upstream revert fix).

That is all. No special CAP_*, no host filesystem access. The attacker drops a self-contained statically-linked binary inside the container, runs it, and the page-cache corruption -- and therefore the hook -- becomes visible to every sibling.

How the exploit chains together

  1. Page-cache page identity. Inside an overlayfs container, /usr/lib/.../libc.so.6 is served by the lower image layer's ext4 inode. Every container started from the same image shares that backing inode, and the kernel's page cache is keyed by the underlying inode -- not by the overlay or by the namespace. So a single 4-byte write into a page-cache page is visible to all sibling containers' processes that have that page mmap'd.

  2. AF_ALG aead vuln turns one such write into many. algif_aead chains the user RX iovec with the trailing authsize bytes of the spliced TX SGL, and authencesn's ESN rotation parks 4 bytes of the AAD's seq_high field at dst[assoclen + cryptlen] -- which is the first byte of that chained foreign tail. The spliced page is a page-cache page of a file the attacker only has read access to, but the cipher copies bytes into it anyway, with no dirty bookkeeping. (See crypto/algif_aead.c and crypto/authencesn.c for the underlying mechanics.)

  3. Bootstrapping a callable primitive. The first thing page_inject does is bootstrap Zone A -- an asm-encoded re-implementation of the same AF_ALG dance (write_cache.asm), placed inside libc's .text cave. This makes the 4-byte write a regular call from inside any future hook payload, no per-call socket setup needed.

  4. Installing the hook. The injector then writes Zone C (zone_c.asm) into libc's .text cave and patches the first 7-12 bytes of read() with an E9 disp32 jump to it. The prologue's displaced bytes are emulated faithfully in Zone C's fast-path (three different glibc prologues are recognised -- see "Prologue handling" below). The hook is now live in libc's page cache.

  5. Hook propagation. Every sibling container runs processes that call read() constantly (logging daemons, healthchecks, cat /etc/hostname, anything). On the first such call inside a sibling container, the hijacked prologue jumps into Zone C, which:

    • stat("/")s the container's root inode (a stable per-namespace ID), uses it as the container's slot key,
    • scans the slot table for an existing entry with that key,
    • if absent, registers the key and fork()s a long-lived command-loop child that polls the CMD area for orders,
    • returns to read()+N so the caller is none the wiser. The original sibling process keeps running. From now on the attacker has a daemon inside that container.
  6. Command channel. The attacker uses the same page_inject binary in --shell mode to write commands into the slot region's CMD area. Each registered sibling container's hook child polls, forks /bin/sh -c <cmd>, captures stdout/stderr into the OUTPUT area, signals completion, and goes back to polling. The shell shows the output. Because every CMD/OUTPUT write also goes through the vuln primitive, no special privilege is needed.

  7. Unhook. When done, unhook restores read()'s original prologue bytes and zeros the slot table; hook children see an empty slot on their next iteration and self-terminate. The page-cache modifications themselves are clean (the kernel never marked the modified pages dirty), so once every container that has libc mmap'd is stopped, a drop_caches reverts the cache fully -- no on-disk artefact remains.
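Step 1 depends on two paths resolving to the same backing inode. A minimal sketch of that identity check follows; the helper name `same_backing_inode` is hypothetical and not part of page_inject. It only compares `(st_dev, st_ino)`, so it is meaningful when both paths are stat'ed on the underlying filesystem (for example the image layer directory as seen from the host), since overlayfs may report its own `st_dev` for files viewed through a mount.

```python
import os


def same_backing_inode(path_a: str, path_b: str) -> bool:
    """Return True when both paths resolve to the same (device, inode) pair.

    os.stat() follows symlinks, so the comparison is made on the final
    target of each path, not on the link itself.
    """
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
```

Two files that compare equal here are served from the same page-cache pages, which is the property the step describes; files that compare unequal are backed by separate pages even if their contents are identical.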

Building

The injector is built outside the victim container -- typically on the attacker's own development machine -- because most production container images don't ship a compiler. A standard Linux x86_64 dev environment with gcc (with static-linking support, i.e. -static) and nasm is enough.

make            # assembles .asm sources via gen_arrays.sh, links static page_inject
make shellcode  # also produces inspectable .bin flat binaries
make clean      # removes generated files and the binary

The output is a single statically-linked ELF (./page_inject) that runs on any modern x86_64 Linux kernel.

Delivery and usage (from the victim container)

Once the attacker has shell on victim, they upload the binary to a writable directory (typically /tmp):

# inside the compromised container, attacker session
victim$ ./page_inject

With no arguments, page_inject defaults to /usr/lib/x86_64-linux-gnu/libc.so.6 (the post-usrmerge Debian/Ubuntu location). Other distros install libc at a different path; either pass it explicitly or use --root / to scan the built-in lookup table from the container's root:

# Fedora / Rocky / CentOS
victim$ ./page_inject /usr/lib64/libc.so.6

# Arch
victim$ ./page_inject /usr/lib/libc.so.6

# Auto-detect, regardless of distro:
victim$ ./page_inject --root /

Every invocation above does the same thing: ELF-parse the in-container libc, install the hook in its page cache, monitor the slot table for ~30 s while siblings register, and run a one-shot id against the first sibling that registered as a sanity check.

After bootstrap, drop into the command shell to drive any registered sibling:

victim$ ./page_inject --shell --no-bootstrap
=== page-cache shell ===
Containers (3):
  [0] 0x0018598d  <- target
  [1] 0x001859ab
  [2] 0x001859cd

inject:0018598d> exec id
uid=0(root) gid=0(root) groups=0(root)
inject:0018598d> target 0x001859ab
inject:001859ab> exec hostname
1ccd66abee9d
inject:001859ab> exec cat /etc/shadow
root:$6$.....
inject:001859ab> unhook
... read() prologue restored, slot table zeroed ...

unhook cleans the hook out of every sibling container in one shot and lets the hook children self-terminate.

Usage: page_inject [OPTIONS] [LIBC_PATH]

Options:
  --root <prefix>   Auto-resolve libc.so.6 under <prefix> using the
                    built-in fixed-path lookup table. Inside the
                    victim container that's normally --root / .
  --shell [0xKEY]   Drop into interactive command shell after
                    injection. Optional KEY pre-selects the target.
  --no-bootstrap    Skip injection (shell-only; hook must already
                    be live in the page cache).
  --timeout SEC     Slot monitoring timeout in --shell mode
                    (default 30 s).
  --help, -h        Show help.

Default libc (when no --root and no LIBC_PATH given):
  /usr/lib/x86_64-linux-gnu/libc.so.6

Dual injection path

Different glibc builds leave different amounts of .text cave space between the executable LOAD segment and the next read-only LOAD. page_inject selects between two layouts at inject time:

  • Path A -- libc-only (default). Both Zone C and Zone A live in libc's .text cave. The slot table + CMD + OUTPUT areas live in libc's .hash section -- legacy SysV hash data that ld.so doesn't read at runtime since it uses .gnu.hash instead. When .hash is absent (Arch's modern toolchain), page_inject carves the slot region out of the tail of .eh_frame_hdr instead, after first shrinking the fde_count field so the unwinder no longer considers the freed bytes part of the FDE binary-search index (the unwinder transparently falls through to a linear scan of .eh_frame for any IP whose FDE used to be in the truncated range -- LSB-mandated behaviour).

  • Path B -- libc trampoline + ld.so payload. Some glibc builds shrink the libc cave below the size needed for the full Zone C + Zone A payload (Ubuntu 24.04 / glibc 2.39 ships a 711 B cave). In that case page_inject writes a 36 B trampoline into libc's cave -- it does the fast-path .bss-key gate intra-libc -- and on the slow path it computes ld.so's runtime base from libc's GOT slot for _rtld_global (an ld.so-side symbol every glibc imports privately) and jumps into a base-register variant of Zone C in ld.so's .text cave. The slot table + CMD + OUTPUT + .bss key all stay in libc; the ld.so-side Zone C reaches them through rbp + offset after the trampoline seeds rbp = libc_base.

If neither layout fits, page_inject refuses cleanly without writing anything to libc or ld.so on disk or in the page cache.

read() prologue handling

Different glibc versions emit different opening sequences in read(). The injector recognises each one, reads back the bytes that the hook displaces, and emulates them in Zone C's fast-path so single-threaded read() resumes correctly at read+N:

| Glibc range | Prologue (after optional endbr64) | Notes |
| --- | --- | --- |
| 2.36 / 2.39 | cmpb $0x0, __libc_single_threaded(%rip) | 7 bytes; emulated cmpb sets ZF for the original jne .Lthreaded. |
| 2.43 | push rbp; movsxd rdi,edi; xor r9d,r9d | 7 bytes; emulated byte-for-byte. |
| 2.31 / 2.35 | mov eax, fs:[0x18] | 8 bytes; emulated byte-for-byte (FS-prefixed [disp32] is absolute, not RIP-relative, so byte-copy is faithful). |

The fast-path emulation slot in Zone C is sized for the longest known prologue (8 bytes) plus the 5-byte rel32 jmp; shorter prologues fill the trailing slot byte with a NOP filler so the total slot length is constant.

File layout

page_inject/
  page_inject.c           Main injector: ELF parsing, vuln primitive,
                          dual-path layout selection, inject + unhook.
  zone_c.asm              Path-A hook dispatcher shellcode.
  zone_c_ld.asm           Path-B hook dispatcher (rbp-base variant).
  trampoline.asm          Path-B 36-byte libc-side stub.
  write_cache.asm         Zone A (vuln write primitive shellcode).
  gen_arrays.sh           Assemble .asm -> asm_bytecode.c.
  asm_bytecode.c          [generated] shellcode byte arrays.
  Makefile                Build system.

Tested-and-supported matrix

The exploit has been verified end-to-end on the following container images. For each entry, page_inject was run from inside one container, its hook was observed firing in a sibling container started from the same image, commands executed correctly via the page-cache channel, and unhook restored the libc page state cleanly.

| Image | glibc | Inject path | read() prologue | Slot region |
| --- | --- | --- | --- | --- |
| debian:bookworm | 2.36 | A | cmpb | .hash |
| ubuntu:24.04 | 2.39 | B | cmpb | .hash (libc-side, addressed via rbp from ld.so) |
| ubuntu:22.04 | 2.35 | A | TLS-fs | .hash |
| fedora:40 | 2.39 | A | cmpb | .hash |
| archlinux:latest | 2.43 | A | push-rbp | .eh_frame_hdr (truncated tail) |

Operational notes

  • page_inject is statically linked deliberately so the attacker's own process is unaffected by the hook it installs.
  • page_inject recognises an "already hooked" libc (E9 + nops at read()'s prologue) and refuses to re-inject. If you are on a test env and your page cache is stuck in that state, stop all containers using the image and drop_caches to reset.
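The "already hooked" check and the three known prologues can be told apart from the first bytes of read(). The sketch below is a hedged illustration, not page_inject's own logic: the function name `classify_read_prologue` is hypothetical, the byte patterns are the usual x86-64 encodings of the instructions named in the prologue table (an assumption, since a toolchain could pick another encoding), and "hooked" is assumed to mean an E9 rel32 jump followed by at least two NOPs, either at the function start or right after endbr64. Obtaining the bytes (for example from libc at read()'s file offset) is left to the caller.

```python
ENDBR64 = bytes.fromhex("f30f1efa")            # optional CET landing pad
PUSH_RBP = bytes.fromhex("554863ff4531c9")     # push rbp; movsxd rdi,edi; xor r9d,r9d
TLS_FS = bytes.fromhex("648b042518000000")     # mov eax, fs:[0x18]


def classify_read_prologue(code: bytes) -> str:
    """Label the opening bytes of read() using the names from the test matrix."""
    # Skip the optional endbr64 so the patterns line up with the table.
    body = code[len(ENDBR64):] if code.startswith(ENDBR64) else code

    # Assumed hook signature: jmp rel32 (E9 + 4 bytes) followed by NOP padding.
    if len(body) >= 7 and body[0] == 0xE9 and body[5:7] == b"\x90\x90":
        return "hooked"
    # cmpb $0x0, disp32(%rip): opcode 80, ModRM 3D, 4-byte disp, imm8 of 0.
    if len(body) >= 7 and body[:2] == b"\x80\x3d" and body[6] == 0x00:
        return "cmpb"
    if body.startswith(PUSH_RBP):
        return "push-rbp"
    if body.startswith(TLS_FS):
        return "TLS-fs"
    return "unknown"
```

A result of "hooked" on a test host corresponds to the stuck state described above; "unknown" simply means the bytes match none of the patterns assumed here.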
