Project ORCHID is the low-level micro-architectural execution core of the RAMNET protocol. It provides the mathematical proofs of concept, dynamic assembly generators, and scheduling blueprints required to bypass the digital memory wall and run bare-metal computation at zero-stall efficiency.
Note
Standalone Architecture: While ORCHID was intentionally designed and optimized as the foundational low-level execution engine for the decentralized compute mesh of the RAMNET Protocol, it is engineered as a completely decoupled, standalone layer. Its core scheduler, cache-line saturation modules, and micro-kernel code emitters can be utilized independently across the industry for high-concurrency systems and bare-metal orchestration.
- Concept originator: Teppei Oohira / 大平鉄兵 (@gatchimuchio)
- Designed the initial CPU cache line locality proofs, assembly code generation matrices, and parallel multi-memory bank role-scheduling modules.
- Core Architecture & Maintainer: Kevin West / @westkevin12
- Directs overall system integration, maintains the execution environments, and manages the architectural roadmap for deployment within the RAMNET distributed compute mesh.
The absolute base foundation, research primitives, and original codebase layout can be found preserved on the legacy archive branch:
👉 View the Baseline Concept Code (tree/gatchimuchio-original)
Under identical, mathematically verified logical execution constraints (512x512 matrix size and double-triplicate verification), ORCHID executes in two timing configurations. Standard Mode prioritizes raw bare-metal machine code throughput, while Trace Mode instruments execution boundaries for out-of-band ZK verification (Project VALKYRIE).
| Metric | Standard Mode (Raw Execution) | Trace Mode (Verification Hook Active) |
|---|---|---|
| Minimum Speedup | ||
| Median Speedup | ||
| Maximum Speedup | ||
| Mean Speedup | ||
The parallel role scheduler (scheduler.go) partitions memory operations into three distinct logical streams (B-read, C-read, A-write) using a simulated STREAM-Triad memory controller queue. Scheduling these operations onto three independent hardware memory banks achieves the absolute theoretical parallel saturation limit:
- Theoretical Maximum: 3.0x cycle reduction, reached when the three memory roles run fully in parallel instead of being serialized on one bank.
- Reproduced Efficiency: The Go scheduling model reproduces a 3.000x parallel speedup.
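The 3.0x figure can be sanity-checked with a small model. The sketch below is not the code in scheduler.go; it is a minimal Python illustration that assumes every memory operation costs exactly one cycle and that the three banks never contend with each other. The names `Op`, `triad_ops`, `serialized_cycles`, and `banked_cycles` are hypothetical.

```python
from dataclasses import dataclass

# The three STREAM-Triad memory roles named in the text.
ROLES = ("B-read", "C-read", "A-write")


@dataclass(frozen=True)
class Op:
    role: str   # logical stream the operation belongs to
    index: int  # element index within that stream


def triad_ops(n: int) -> list[Op]:
    """Memory operations for a[i] = b[i] + s * c[i], for i in [0, n)."""
    return [Op(role, i) for i in range(n) for role in ROLES]


def serialized_cycles(ops: list[Op]) -> int:
    """One shared bank: every operation occupies its own cycle (assumption)."""
    return len(ops)


def banked_cycles(ops: list[Op]) -> int:
    """One bank per role: banks drain in parallel, so the busiest bank sets the time."""
    per_bank = {role: 0 for role in ROLES}
    for op in ops:
        per_bank[op.role] += 1
    return max(per_bank.values())


def speedup(n: int) -> float:
    ops = triad_ops(n)
    return serialized_cycles(ops) / banked_cycles(ops)


if __name__ == "__main__":
    print(f"{speedup(512):.3f}x")
```

Because each role contributes the same number of operations, the busiest bank holds exactly one third of the work, which is where the 3.0x ceiling comes from under these assumptions.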
Project ORCHID features a Heterogeneous Hardware Dispatch Plane to scale execution guarantees across multiple architectures:
- Static AOT Assembly Emitters (`orchid/assembler.py`): Generates target-specific optimized assembly source code:
  - `x86_64` (AVX-512): 512-bit vector registers with active `prefetcht0` preloading.
  - `arm64` (NEON / SVE): NEON registers (`v0`-`v31`) with `prfm pldl1keep` software lookahead prefetching offsets.
  - `apple_amx` (Apple Silicon): Low-level matrix coprocessor wrapper via `amxinit`/`amxstop` instructions.
- Dynamic JIT Compiler Core (`jit/`): Executed natively by the Go daemon, compiling matrix sizes ($N$) into memory-resident machine code at runtime. It checks host capabilities to select the optimal path:
  - `AVX-512` JIT Path: Vectorized 16-way integer strides when native AVX-512 is supported.
  - `AVX2` JIT Path: Vectorized 8-way VEX-encoded SIMD utilizing memory-resident broadcasts (`vpbroadcastd`) to avoid EVEX instruction page collisions on non-AVX-512 x86_64 CPUs.
  - `Scalar` AMD64 JIT Path: Standard pointer execution loops.
  - `ARM64/Other` Fallback: Native Go reference model to maintain execution stability.
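The capability check described for the JIT core amounts to a priority-ordered selection. The following is a minimal Python sketch of that decision only, not the Go daemon's implementation. The function name `select_jit_path` and the returned labels are hypothetical; the feature names `avx512f` and `avx2` follow the flag spelling used in Linux `/proc/cpuinfo`, which is an assumption about how capabilities would be reported.

```python
def select_jit_path(arch: str, features: set[str]) -> str:
    """Pick the widest supported path, in the priority order listed above."""
    if arch in ("x86_64", "amd64"):
        if "avx512f" in features:
            return "avx512"        # 16-way vectorized path
        if "avx2" in features:
            return "avx2"          # 8-way VEX-encoded path
        return "scalar-amd64"      # plain pointer loops
    return "go-reference"          # ARM64 and everything else


if __name__ == "__main__":
    print(select_jit_path("x86_64", {"sse2", "avx2"}))
```

Checking the widest vector extension first keeps the fallback chain in one place, so adding a new path only means adding one more branch ahead of the narrower ones.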
The JIT compiler strictly enforces Write-XOR-Execute (W^X) memory constraints: a page is never writable and executable at the same time. Page memory is first allocated with write permission (syscall.PROT_WRITE), the machine code is generated into it, and the page is then transitioned to read-execute (syscall.PROT_EXEC) via syscall.Mprotect before execution.
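The W^X lifecycle can be illustrated with a toy model. The sketch below does not map or protect real memory and is not the daemon's code; it only tracks permission bits in Python to show the ordering of write, transition, and execute. The class `JitPage` and the `Prot` flags are hypothetical stand-ins for the real page and the `syscall.PROT_*` constants.

```python
from enum import Flag


class Prot(Flag):
    READ = 1
    WRITE = 2
    EXEC = 4


class JitPage:
    """Toy model of a JIT code page: tracks permissions only, no real memory."""

    def __init__(self, size: int) -> None:
        self.code = bytearray(size)
        self.prot = Prot.READ | Prot.WRITE  # freshly allocated: writable, not executable

    def mprotect(self, prot: Prot) -> None:
        # The W^X invariant: WRITE and EXEC may never be set together.
        if Prot.WRITE in prot and Prot.EXEC in prot:
            raise PermissionError("W^X violation: WRITE and EXEC requested together")
        self.prot = prot

    def write(self, offset: int, data: bytes) -> None:
        if Prot.WRITE not in self.prot:
            raise PermissionError("page is not writable")
        if offset < 0 or offset + len(data) > len(self.code):
            raise ValueError("write outside the page")
        self.code[offset:offset + len(data)] = data

    def can_execute(self) -> bool:
        return Prot.EXEC in self.prot


if __name__ == "__main__":
    page = JitPage(4096)
    page.write(0, b"\xc3")                   # emit code while the page is writable
    page.mprotect(Prot.READ | Prot.EXEC)     # seal the page before running it
    print(page.can_execute())
```

Once the page is sealed, any further `write` raises `PermissionError` in this model, which mirrors the fault a real read-execute page would produce on a store.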
Project ORCHID is designed to be fully standalone; developers can run the core JIT compiler and parallel scheduling runtime completely independent of any blockchain or verification logic.
To maintain raw execution performance, cryptographic proof generation is decoupled from the hot path. The codebase exports runtime execution statistics and memory pointers via a lightweight, zero-overhead tracing interface. Developers can register their own custom verification layers or plug into Project VALKYRIE, which is ORCHID's default recommended open-source ZK-proving and verification layer.
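One way to picture a decoupled tracing interface is a callback registry that the hot path notifies after a kernel finishes. The sketch below is an assumption about the general shape of such a hook, not ORCHID's actual interface: `TraceEvent`, `register_verifier`, and `emit` are hypothetical names, and the event fields are illustrative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TraceEvent:
    kernel: str      # hypothetical kernel label
    n: int           # matrix dimension
    elapsed_ns: int  # measured execution time


Verifier = Callable[[TraceEvent], None]
_verifiers: list[Verifier] = []


def register_verifier(fn: Verifier) -> Verifier:
    """Attach a verification layer; returns fn so it can be used as a decorator."""
    _verifiers.append(fn)
    return fn


def emit(event: TraceEvent) -> None:
    """Called after a kernel completes; with no verifiers registered the loop body never runs."""
    for fn in _verifiers:
        fn(event)


if __name__ == "__main__":
    @register_verifier
    def log_event(event: TraceEvent) -> None:
        print(f"{event.kernel} n={event.n} took {event.elapsed_ns} ns")

    emit(TraceEvent(kernel="matmul", n=512, elapsed_ns=1_000_000))
```

Keeping proof generation behind callbacks like this means the execution path only pays for iterating the registry, and heavier verification work can be queued elsewhere by the registered function.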
To keep this quickstart guide clean and readable, Project ORCHID's deep technical designs, mathematical formulations, and nested folder blueprints have been centralized:
👉 Read the Master Architecture Blueprint (docs/ARCHITECTURE.md)
- The Go/Python Hybrid Split: Understanding how the Python client SDK prepares/decomposes graphs and the native Go daemon schedules execution payloads.
- Mathematical Formulations: Technical detail on why loop striding swap-layouts (`I-K-J` vs `I-J-K`) saturate CPU caches, alongside the CADENCE parallel banking role-routing models.
- Repository File Blueprint: A detailed responsibility description of every single directory, file, and utility script.
- Continuous Quality Orchestration: How Docker Compose, Astral `uv` virtual environments, and SonarQube static analyzer suites interact to verify system integrity.
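The difference between the `I-J-K` and `I-K-J` loop orders mentioned above comes down to how the innermost loop walks matrix B. The sketch below is a plain Python illustration of the two access patterns, assuming row-major storage; it is not ORCHID's kernel. Python lists are not contiguous arrays, so the code shows the access order rather than the cache effect itself. All function names are hypothetical.

```python
def matmul_ijk(a, b, n):
    """I-J-K order: the inner loop reads b[k][j] down a column (stride n in row-major)."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c


def matmul_ikj(a, b, n):
    """I-K-J order: the inner loop reads b[k][j] and writes c[i][j] along a row (stride 1)."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(n):
            aik = a[i][k]
            row_b = b[k]
            for j in range(n):
                row_c[j] += aik * row_b[j]
    return c


def inner_stride_of_b(order: str, n: int) -> int:
    """Element distance between successive B accesses in the innermost loop.

    B[k][j] sits at offset k * n + j in row-major layout.
    """
    return {"I-J-K": n, "I-K-J": 1}[order]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_ijk(a, b, 2) == matmul_ikj(a, b, 2))
    print(inner_stride_of_b("I-J-K", 512), inner_stride_of_b("I-K-J", 512))
```

Both orders compute the same product; only the traversal changes. With a unit stride, consecutive inner-loop accesses fall in the same cache line, whereas a stride of N elements touches a new line on almost every access once N exceeds the line size.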
Project ORCHID features a top-level Makefile acting as the central developer control panel. Instead of navigating subfolders and invoking standalone shell scripts, use these standardized commands:
Automatically provisions the sandboxed Python 3.10 virtual environment, installs the modular orchid Python SDK in editable development mode (`uv pip install -e .`), and runs first-run diagnostic verification checks.
`make setup`
Executes concurrent Go scheduling unit tests, compiles x86-64 assembly locality cache-line saturation benchmarks, and generates parallel banked STREAM-Triad simulation logs.
`make test`
Compiles the high-concurrency Go node scheduler daemon into a standalone, bare-metal native binary at `build/orchid-daemon`.
`make build`
Builds, spins up, and executes the entire multi-language ORCHID stack in isolated Docker containers, volume-syncing generated benchmarks back to your local host filesystem.
`make docker-up`
Tip
To run the container network in the background (detached mode), use the `-d` flag:
`docker compose up -d --build`
You can follow and stream the logs live by executing:
`docker compose logs -f`
Or isolate output to a single service (e.g., the cache locality timings):
`docker compose logs -f orchid-locality-benchmark`
Instantly purges temporary compile targets (`locality/build/`), telemetry traces (`evidence/`), and Python `__pycache__` artifacts.
`make clean`
Project ORCHID publishes two distinct, optimized container flavors to the GitHub Container Registry under a single repository space to meet different operational environments:
- Target Stage: `release-hardened`
- Zero-Dependency Go-Native Architecture: Package holds ONLY the compiled native Go daemon (`orchid-daemon`) on top of a minimal, hardened `distroless` Debian environment (`base-debian12:nonroot`).
- No Python Dependency: Completely eliminates the Python interpreter runtime, virtual environment setup, standard library headers, and Nuitka modules to minimize runtime footprints.
- Maximum Security & Performance: The image runs under a non-privileged user space and loads execution kernels at native compiled CPU speeds with minimized startup latency.
- Target Stage: `developer`
- Raw Python SDK: Features standard, raw Python code inside the package structure.
- Developer Toolset: Includes the full Astral `uv` package manager, volume mount options, and system diagnostic sweeps for active engineering.
To ensure a deterministic, high-performance workspace out-of-the-box, Project ORCHID coordinates the following enterprise-grade tooling layers:
The Python control plane is structured as a modular, distributable Python package using the hatchling build-backend. You can build it into wheels (uv build) or import modules programmatically:
- `from orchid.assembler import Spec, emit_locality` - x86-64 micro-kernel code emitter.
- `from orchid.simulator import BankedMemoryScheduler` - STREAM-Triad memory bank role simulator.
- `from orchid.aggregator import parse_and_summarize` - Statistical result parser.
We use Astral uv for lightning-fast Python version lock-in and virtual environment sandboxing. It guarantees that the correct minimum Python version (>= 3.10) is isolated and executed in .venv/ without polluting your global system.
- VS Code Settings: Opening this folder in VS Code automatically reads the pre-configured `.vscode/settings.json`, instantly targeting the `.venv/bin/python` interpreter.
- Multi-Language Quality Gates (SonarQube): We use SonarQube for enterprise-grade quality gates and security audits across all of ORCHID's modules (Python, Go, C, and Bash). Standard configuration properties are loaded from `sonar-project.properties`. Developers are highly encouraged to install the SonarLint extension in their IDE for live analysis feedback.
"Intelligence requires every available joule."