
Pull Request: Implement Dynamic Runtime JIT Compiler Core #15

Merged: westkevin12 merged 1 commit into main from feat/dynamic_runtime_compiler on Jun 5, 2026
Conversation

@westkevin12
Member

Description

This PR closes #4, transitioning the ORCHID execution daemon from a static, disk-bound Ahead-of-Time (AOT) compiler to a dynamic, memory-resident Just-In-Time (JIT) compilation subsystem. It removes the runtime dependency on the GCC compiler toolchain, no longer writes temporary files to disk, and introduces multi-tier SIMD hardware dispatch to maximize performance on edge hardware.


Technical Details & Architecture

1. In-Memory JIT Compilation Subsystem (jit/)

  • W^X Security Implementation: Memory pages are allocated dynamically via syscall.Mmap as read-write (PROT_READ|PROT_WRITE). Once the instruction bytes are written, the page protection is switched to read-execute (PROT_READ|PROT_EXEC) via syscall.Mprotect before the kernel is called, so no page is ever writable and executable at the same time.
  • ABI Bridge (jit_amd64.s): Maps standard Go parameter structs directly to AMD64 SysV ABI registers (RDI, RSI, RDX) and jumps execution directly to the dynamically allocated page pointer.
  • Dynamic Sizing Integration: The target matrix dimensions ($N$) are programmatically patched directly into the generated binary templates at runtime, allowing zero-stall adaptive compilation.

2. Multi-Tiered Hardware Emitters (jit_amd64.go)

  • Tier 1 (AVX-512): Vectorizes matrix math using 512-bit vector registers (zmm) and 16-way integer strides when native AVX-512 capability is detected.
  • Tier 2 (AVX2): Vectorizes matrix math using 256-bit vector registers (ymm) and 8-way integer strides. It uses direct memory broadcasts (vpbroadcastd (%rdi,%rax,4), %ymm0) so that the emitted code stays entirely VEX-encoded, preventing illegal-instruction crashes (SIGILL) on host CPUs that lack AVX-512 support.
  • Tier 3 (Scalar): Standard optimized pointer loop instructions.
  • Cross-Platform Fallbacks (jit_arm64.go / jit_other.go): Implements Go-native fallback runners to guarantee execution stability and mathematical parity on non-x86_64 target architectures.
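
The tier ordering above amounts to picking the widest instruction set the host supports and falling back otherwise. A minimal sketch of that selection logic is shown here; the `Tier` type, the `cpuFeatures` struct, and `selectTier` are hypothetical names for this example and are not taken from jit_amd64.go. Real feature detection is left out; one option would be the `cpu.X86.HasAVX2` and `cpu.X86.HasAVX512F` flags from golang.org/x/sys/cpu.

```go
package main

import "fmt"

// Tier identifies which emitter would be used (names are illustrative).
type Tier int

const (
	TierScalar Tier = iota
	TierAVX2
	TierAVX512
)

func (t Tier) String() string {
	switch t {
	case TierAVX512:
		return "AVX-512"
	case TierAVX2:
		return "AVX2"
	default:
		return "Scalar"
	}
}

// Lanes returns how many 32-bit integers one vector operation covers:
// 512/32 = 16 for zmm, 256/32 = 8 for ymm, 1 for the scalar loop.
func (t Tier) Lanes() int {
	switch t {
	case TierAVX512:
		return 16
	case TierAVX2:
		return 8
	default:
		return 1
	}
}

// cpuFeatures is a hypothetical snapshot of host capabilities. In a real
// build these flags would come from CPUID-based detection.
type cpuFeatures struct {
	HasAVX2    bool
	HasAVX512F bool
}

// selectTier prefers the widest supported tier and never selects AVX-512
// on a host that does not report it, which avoids SIGILL.
func selectTier(f cpuFeatures) Tier {
	switch {
	case f.HasAVX512F:
		return TierAVX512
	case f.HasAVX2:
		return TierAVX2
	default:
		return TierScalar
	}
}

func main() {
	hosts := []cpuFeatures{
		{HasAVX2: true, HasAVX512F: true},
		{HasAVX2: true},
		{},
	}
	for _, h := range hosts {
		t := selectTier(h)
		fmt.Printf("%+v -> %s (%d lanes)\n", h, t, t.Lanes())
	}
}
```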

Verification & Performance Metrics

1. Compilation Overhead

The JIT compiler emits and loads the executable kernel in memory in ~20–55 microseconds, down from the hundreds of milliseconds taken by the AOT compile and plugin-loading pipeline.

=== RUN   TestJITCompilationTime
    jit_test.go:98: JIT emission overhead for 256x256 target: 21.77µs
--- PASS: TestJITCompilationTime (0.00s)

2. Locality Benchmark Performance (AVX2 Mode)

Running the locality cache benchmark sweeps under the vectorized JIT engine yields a median speedup of ~11.5x across the eight pairs (range 8.96x to 12.46x):

Running locality cache benchmark...
HARDWARE TELEMETRY: JIT compiled kernels in 52.611µs. Executing bare-metal blocks via W^X function pointers.
VERIFY equal N=512 operations=134217728 cache_flush_bytes=67108864
PAIR 1 order=flat-first flat_sec=0.227122801 locality_sec=0.025338454 speedup=8.964x
PAIR 2 order=locality-first flat_sec=0.232138852 locality_sec=0.018635167 speedup=12.457x
PAIR 3 order=flat-first flat_sec=0.224651400 locality_sec=0.019496409 speedup=11.523x
PAIR 4 order=locality-first flat_sec=0.221163157 locality_sec=0.018989242 speedup=11.647x
PAIR 5 order=flat-first flat_sec=0.220411676 locality_sec=0.018827020 speedup=11.707x
PAIR 6 order=locality-first flat_sec=0.228881128 locality_sec=0.019836704 speedup=11.538x
PAIR 7 order=flat-first flat_sec=0.243956936 locality_sec=0.021685850 speedup=11.250x
PAIR 8 order=locality-first flat_sec=0.246389530 locality_sec=0.024793514 speedup=9.938x
FLUSH sink=159383552
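
The median quoted above can be re-derived from the PAIR lines. The sketch below parses the `speedup=` field from each line and takes the median; `parseSpeedups` and `median` are helper names introduced for this example and are not part of the benchmark tool.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// benchLog holds the PAIR lines copied from the benchmark output above.
const benchLog = `PAIR 1 order=flat-first flat_sec=0.227122801 locality_sec=0.025338454 speedup=8.964x
PAIR 2 order=locality-first flat_sec=0.232138852 locality_sec=0.018635167 speedup=12.457x
PAIR 3 order=flat-first flat_sec=0.224651400 locality_sec=0.019496409 speedup=11.523x
PAIR 4 order=locality-first flat_sec=0.221163157 locality_sec=0.018989242 speedup=11.647x
PAIR 5 order=flat-first flat_sec=0.220411676 locality_sec=0.018827020 speedup=11.707x
PAIR 6 order=locality-first flat_sec=0.228881128 locality_sec=0.019836704 speedup=11.538x
PAIR 7 order=flat-first flat_sec=0.243956936 locality_sec=0.021685850 speedup=11.250x
PAIR 8 order=locality-first flat_sec=0.246389530 locality_sec=0.024793514 speedup=9.938x`

// parseSpeedups extracts the speedup= value from every PAIR line.
func parseSpeedups(log string) ([]float64, error) {
	var out []float64
	for _, line := range strings.Split(log, "\n") {
		if !strings.HasPrefix(line, "PAIR ") {
			continue
		}
		for _, field := range strings.Fields(line) {
			if !strings.HasPrefix(field, "speedup=") {
				continue
			}
			raw := strings.TrimSuffix(strings.TrimPrefix(field, "speedup="), "x")
			v, err := strconv.ParseFloat(raw, 64)
			if err != nil {
				return nil, fmt.Errorf("parse %q: %w", field, err)
			}
			out = append(out, v)
		}
	}
	return out, nil
}

// median returns the middle value, averaging the two central values when
// the count is even. It sorts a copy so the input is left untouched.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	speedups, err := parseSpeedups(benchLog)
	if err != nil {
		panic(err)
	}
	// Prints: pairs=8 median=11.53x
	fmt.Printf("pairs=%d median=%.2fx\n", len(speedups), median(speedups))
}
```

With eight pairs the median is the average of the fourth and fifth sorted values (11.523 and 11.538), which is consistent with the ~11.5x figure.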

3. Repository Integration & Checks

  • Verified that all project unit tests pass: go test -v ./...
  • Standardized all code header blocks, structs, and methods using Doxygen/Javadoc comment format.
  • Centralized architectural descriptions within docs/ARCHITECTURE.md and root README.md.

@westkevin12 westkevin12 self-assigned this Jun 5, 2026
@westkevin12 westkevin12 merged commit 0953573 into main Jun 5, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Dynamic Runtime Assembly Generation (JIT Compiler Core)
