Pull Request: Implement Dynamic Runtime JIT Compiler Core by westkevin12 · Pull Request #15 · DigitalServerHost/ORCHID

westkevin12 · 2026-06-05T17:41:04Z

Description

This PR closes #4, transitioning the ORCHID execution daemon from a static, disk-bound Ahead-of-Time (AOT) compiler to a dynamic, memory-resident Just-In-Time (JIT) compilation subsystem. It eliminates the runtime dependency on the GCC toolchain and the need to write temporary files to disk, and it introduces multi-tier SIMD hardware dispatch to maximize performance on edge hardware.

Technical Details & Architecture

1. In-Memory JIT Compilation Subsystem (`jit/`)

W^X Security Implementation: Memory pages are allocated dynamically via syscall.Mmap with read-write permissions (PROT_READ|PROT_WRITE). Once the instruction bytes are written, the page protection is switched to read-execute (PROT_READ|PROT_EXEC) via syscall.Mprotect before the kernel is called, ensuring no page is ever writable and executable at the same time.
ABI Bridge (jit_amd64.s): Maps standard Go parameter structs directly to AMD64 SysV ABI registers (RDI, RSI, RDX) and jumps execution directly to the dynamically allocated page pointer.
Dynamic Sizing Integration: The target matrix dimensions ($N$) are programmatically patched directly into the generated binary templates at runtime, allowing zero-stall adaptive compilation.
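The patch-then-protect lifecycle described above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the code in `jit/`: the six-byte template (`mov eax, imm32; ret`), the names `kernelTemplate`, `immOffset`, `patchDimension`, and `loadExecutable` are all hypothetical, and the sketch assumes a Linux host. It stops short of calling the page, since that requires the assembly bridge.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"syscall"
)

// kernelTemplate is a hypothetical x86-64 template, not ORCHID's real kernel:
// "mov eax, imm32; ret". The four zero bytes are the slot that receives N.
var kernelTemplate = []byte{0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3}

// immOffset is the byte offset of the imm32 slot inside kernelTemplate.
const immOffset = 1

// patchDimension returns a copy of template with n written into the
// 32-bit little-endian immediate at offset off. The template is not modified.
func patchDimension(template []byte, off int, n uint32) ([]byte, error) {
	if off < 0 || off+4 > len(template) {
		return nil, fmt.Errorf("immediate offset %d out of range", off)
	}
	code := make([]byte, len(template))
	copy(code, template)
	binary.LittleEndian.PutUint32(code[off:off+4], n)
	return code, nil
}

// loadExecutable copies code into an anonymous mapping that is writable
// first and executable afterwards, never both at once (W^X).
func loadExecutable(code []byte) ([]byte, error) {
	if len(code) == 0 {
		return nil, fmt.Errorf("no code to load")
	}
	pageSize := syscall.Getpagesize()
	size := (len(code) + pageSize - 1) / pageSize * pageSize // round up to whole pages

	page, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, fmt.Errorf("mmap: %w", err)
	}
	copy(page, code)

	// Drop write permission before granting execute permission.
	if err := syscall.Mprotect(page, syscall.PROT_READ|syscall.PROT_EXEC); err != nil {
		syscall.Munmap(page)
		return nil, fmt.Errorf("mprotect: %w", err)
	}
	return page, nil
}

func main() {
	code, err := patchDimension(kernelTemplate, immOffset, 512)
	if err != nil {
		panic(err)
	}
	page, err := loadExecutable(code)
	if err != nil {
		panic(err)
	}
	defer syscall.Munmap(page)
	fmt.Printf("loaded %d code bytes into a %d-byte read-execute page\n", len(code), len(page))
}
```

Copying the template before patching keeps the original reusable for other values of N, which is one plausible way to support repeated compilation without re-deriving the instruction bytes.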

2. Multi-Tiered Hardware Emitters (`jit_amd64.go`)

Tier 1 (AVX-512): Vectorizes matrix math using 512-bit vector registers (zmm) and 16-way integer strides when native AVX-512 capability is detected.
Tier 2 (AVX2): Vectorizes matrix math using 256-bit vector registers (ymm) and 8-way integer strides. It uses memory-source broadcasts (vpbroadcastd (%rdi,%rax,4), %ymm0) so the emitted code stays entirely VEX-encoded, preventing illegal-instruction crashes (SIGILL) on host CPUs lacking AVX-512 support.
Tier 3 (Scalar): A standard optimized pointer loop, used when neither vector tier is available.
Cross-Platform Fallbacks (jit_arm64.go / jit_other.go): Implements Go-native fallback runners to guarantee execution stability and mathematical parity on non-x86_64 target architectures.
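The tier ordering above amounts to a priority check over CPU capabilities. The sketch below shows one way to express it; `Tier`, `Features`, and `SelectTier` are illustrative names, not identifiers from `jit_amd64.go`, and how ORCHID actually detects features is not shown in this PR. On a real host the flags could be filled from `cpu.X86.HasAVX2` and `cpu.X86.HasAVX512F` in `golang.org/x/sys/cpu`; here they are passed in by hand so the example stays dependency-free.

```go
package main

import "fmt"

// Tier identifies which emitter would be used (hypothetical type).
type Tier int

const (
	TierScalar Tier = iota
	TierAVX2
	TierAVX512
)

func (t Tier) String() string {
	switch t {
	case TierAVX512:
		return "AVX-512"
	case TierAVX2:
		return "AVX2"
	default:
		return "Scalar"
	}
}

// Lanes returns how many 32-bit integers one vector register holds,
// which matches the stride each tier processes per iteration.
func (t Tier) Lanes() int {
	switch t {
	case TierAVX512:
		return 16 // 512-bit zmm / 32-bit elements
	case TierAVX2:
		return 8 // 256-bit ymm / 32-bit elements
	default:
		return 1
	}
}

// Features holds capability flags supplied by the caller (assumed shape).
type Features struct {
	HasAVX2    bool
	HasAVX512F bool
}

// SelectTier picks the widest tier the reported features allow.
func SelectTier(f Features) Tier {
	switch {
	case f.HasAVX512F:
		return TierAVX512
	case f.HasAVX2:
		return TierAVX2
	default:
		return TierScalar
	}
}

func main() {
	hosts := []Features{
		{HasAVX2: true, HasAVX512F: true},
		{HasAVX2: true},
		{},
	}
	for _, f := range hosts {
		t := SelectTier(f)
		fmt.Printf("%+v -> %s (%d lanes)\n", f, t, t.Lanes())
	}
}
```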

Verification & Performance Metrics

1. Compilation Overhead

The JIT compiler emits and loads the executable kernel in memory in ~20–55 microseconds, down from the hundreds of milliseconds taken by the previous AOT compile and plugin-loading pipeline.

=== RUN   TestJITCompilationTime
    jit_test.go:98: JIT emission overhead for 256x256 target: 21.77µs
--- PASS: TestJITCompilationTime (0.00s)
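A measurement like the one logged above can be taken with a small wall-clock wrapper around the emitter. The sketch below is an assumption about the general approach, not the contents of `jit_test.go`; `timeEmission` and `stubEmit` are hypothetical names, and the stub returns a single `ret` byte instead of a real kernel.

```go
package main

import (
	"fmt"
	"time"
)

// timeEmission runs emit once and reports the elapsed wall-clock time
// together with the bytes the emitter produced.
func timeEmission(emit func() []byte) (time.Duration, []byte) {
	start := time.Now()
	code := emit()
	return time.Since(start), code
}

// stubEmit stands in for a real emitter (hypothetical); it returns a
// one-byte x86-64 "ret" so the example has something to measure.
func stubEmit() []byte {
	return []byte{0xC3}
}

func main() {
	elapsed, code := timeEmission(stubEmit)
	fmt.Printf("JIT emission overhead: %v (%d bytes)\n", elapsed, len(code))
}
```

A single run at microsecond scale is noisy, which may explain why the two figures reported in this PR (21.77µs and 52.611µs) differ; repeating the measurement and reporting a median would give a steadier number.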

2. Locality Benchmark Performance (AVX2 Mode)

Across eight paired runs, the locality cache benchmark under the vectorized JIT engine yields a median speedup of ~11.5x (range 8.96x to 12.46x):

Running locality cache benchmark...
HARDWARE TELEMETRY: JIT compiled kernels in 52.611µs. Executing bare-metal blocks via W^X function pointers.
VERIFY equal N=512 operations=134217728 cache_flush_bytes=67108864
PAIR 1 order=flat-first flat_sec=0.227122801 locality_sec=0.025338454 speedup=8.964x
PAIR 2 order=locality-first flat_sec=0.232138852 locality_sec=0.018635167 speedup=12.457x
PAIR 3 order=flat-first flat_sec=0.224651400 locality_sec=0.019496409 speedup=11.523x
PAIR 4 order=locality-first flat_sec=0.221163157 locality_sec=0.018989242 speedup=11.647x
PAIR 5 order=flat-first flat_sec=0.220411676 locality_sec=0.018827020 speedup=11.707x
PAIR 6 order=locality-first flat_sec=0.228881128 locality_sec=0.019836704 speedup=11.538x
PAIR 7 order=flat-first flat_sec=0.243956936 locality_sec=0.021685850 speedup=11.250x
PAIR 8 order=locality-first flat_sec=0.246389530 locality_sec=0.024793514 speedup=9.938x
FLUSH sink=159383552
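The reported median can be rechecked from the PAIR lines directly. The sketch below copies the eight `flat_sec` and `locality_sec` values from the log, computes each speedup as flat time divided by locality time, and takes the median; the type and function names are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// pair holds one benchmark pair, with timings in seconds from the log above.
type pair struct {
	flatSec     float64
	localitySec float64
}

var pairs = []pair{
	{0.227122801, 0.025338454},
	{0.232138852, 0.018635167},
	{0.224651400, 0.019496409},
	{0.221163157, 0.018989242},
	{0.220411676, 0.018827020},
	{0.228881128, 0.019836704},
	{0.243956936, 0.021685850},
	{0.246389530, 0.024793514},
}

// speedups returns flat time divided by locality time for each pair.
func speedups(ps []pair) []float64 {
	out := make([]float64, len(ps))
	for i, p := range ps {
		out[i] = p.flatSec / p.localitySec
	}
	return out
}

// median sorts a copy of xs and returns the middle value, averaging the
// two middle values when the count is even. It returns 0 for empty input.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	fmt.Printf("median speedup: %.3fx\n", median(speedups(pairs)))
}
```

With eight samples the median is the mean of the fourth and fifth sorted speedups (pairs 3 and 6), which comes to about 11.53x and agrees with the ~11.5x quoted above.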

3. Repository Integration & Checks

Verified that all project unit tests pass: go test -v ./...
Standardized all code header blocks, structs, and methods using Doxygen/Javadoc comment format.
Centralized architectural descriptions within docs/ARCHITECTURE.md and root README.md.

feat: introduce dynamic JIT compiler subsystem for W^X memory-resident matrix kernels

4c537ce

westkevin12 self-assigned this Jun 5, 2026

westkevin12 added the patch label Jun 5, 2026

westkevin12 merged commit 0953573 into main Jun 5, 2026
2 checks passed

westkevin12 merged 1 commit into main from feat/dynamic_runtime_compiler
