Decouple Node Execution Plane to Zero-Dependency Go Daemon #11

Merged: westkevin12 merged 1 commit into main from feat/remove_python_from_execution_path on Jun 4, 2026
@westkevin12
Member

Description

This PR transitions the Project ORCHID production node execution environment from a Python/Nuitka-based runtime to a hardened, zero-dependency Go/C execution plane, fully resolving the architectural goals in Issue #8.

By replacing the sandboxed Python control plane inside the release-hardened container stage with a compiled Go binary (orchid-daemon), this PR shrinks the container image, cuts startup latency to under a second (the hardened container boots in <0.1s), and hardens the runtime sandbox by running the daemon as a non-privileged user.

Closes #8


Key Changes

1. Zero-Dependency Go Daemon Execution Core (cmd/orchid-daemon/)

  • Ingestion Plane (main.go): Replaced the legacy Python script bootloader with a native Go TCP listener on port 9000 that ingests JSON planning payloads directly.
  • Math Parity & Simulation: Re-implemented the parallel STREAM-Triad simulation benchmark in Go and verified that its cycle counters and event logs match the legacy Python scheduler bit for bit.
  • Dynamic SIMD Telemetry (matmul_wrapper.go): Added a runtime CPUID check that dispatches to native AVX-512 vector assembly when the CPU supports it, and otherwise to an optimized scalar fallback kernel (a C-level matrix multiply with a contiguous I-K-J loop order).
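
For reviewers unfamiliar with the ingestion pattern, here is a minimal sketch of a Go TCP listener that decodes one JSON payload per connection. It is not the code in main.go: the PlanRequest fields, the acknowledgement line, and the helper names (handleConn, serve, roundTrip) are assumptions made for illustration, and the demo binds an ephemeral loopback port instead of port 9000 so it can run anywhere.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log"
	"net"
)

// PlanRequest is a hypothetical payload shape; the PR does not specify the schema.
type PlanRequest struct {
	Workload string `json:"workload"`
	Elements int    `json:"elements"`
}

// handleConn decodes a single JSON object from the connection and
// answers with a one-line acknowledgement (or a one-line error).
func handleConn(conn net.Conn) {
	defer conn.Close()
	var req PlanRequest
	if err := json.NewDecoder(conn).Decode(&req); err != nil {
		fmt.Fprintf(conn, "error: %v\n", err)
		return
	}
	fmt.Fprintf(conn, "accepted workload=%s elements=%d\n", req.Workload, req.Elements)
}

// serve accepts connections until the listener is closed.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go handleConn(conn)
	}
}

// roundTrip starts a listener, sends payload to it, and returns the reply line.
// A daemon would instead call net.Listen("tcp", ":9000") and block in serve.
func roundTrip(payload string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serve(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprint(conn, payload); err != nil {
		return "", err
	}
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	reply, err := roundTrip(`{"workload":"triad","elements":16384}`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(reply)
}
```

The sketch uses only the standard library, which is in keeping with the zero-dependency goal; a real daemon would likely also want read deadlines and a payload size limit.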

2. Multi-Stage Container Hardening (Dockerfile)

  • Production Stage (release-hardened): Completely removed Python, virtual environments, and compiler tooling. The final image builds on gcr.io/distroless/base-debian12:nonroot and houses only the static orchid-daemon executable.
  • Developer Sandbox Stage (developer): Preserved the full Python 3.10 + Astral uv environment for local developers to test raw Python SDK packages (orchid/) or bundle distribution wheel artifacts (make dist).
  • Linter Warnings Resolved: Pinned base images to specific tags and wrapped long commands to satisfy strict Hadolint and code quality pipelines.

Verification Logs & Proof of Correctness

1. Go Unit Tests (go test -v ./scheduler/...)

=== RUN   TestBankedSchedulerTriad
    scheduler_test.go:129: VERIFY: Mathematical calculations are 100% identical!
    scheduler_test.go:130: Deterministic Serial Cycles: 4925668
    scheduler_test.go:131: Deterministic Parallel Cycles: 1666401
    scheduler_test.go:132: Theoretical Parallel Speedup achieved in Go: 2.956x
--- PASS: TestBankedSchedulerTriad (0.03s)
=== RUN   TestPhysicalNUMAAllocation
--- PASS: TestPhysicalNUMAAllocation (0.00s)
PASS
ok  	ORCHID/scheduler	(cached)
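
As a sanity check on the log above, the reported speedup is consistent with a plain ratio of serial to parallel cycles. The sketch below recomputes it; the function names are hypothetical and this is not the assertion used in scheduler_test.go.

```go
package main

import "fmt"

// speedup returns serial cycles divided by parallel cycles.
func speedup(serialCycles, parallelCycles uint64) float64 {
	return float64(serialCycles) / float64(parallelCycles)
}

// formatSpeedup renders the ratio with three decimals, like the log line.
func formatSpeedup(serialCycles, parallelCycles uint64) string {
	return fmt.Sprintf("%.3fx", speedup(serialCycles, parallelCycles))
}

func main() {
	// Cycle counts copied from the TestBankedSchedulerTriad output.
	fmt.Println(formatSpeedup(4925668, 1666401)) // prints 2.956x
}
```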

2. Go Native Diagnostics Sweep (./build/orchid-daemon --mode all)

==========================================
Running parallel bank simulation...
workload: A[i] = B[i] + 3 * C[i] | total_elements=16384
latency_cycles: service=100 | compute=1
verification: identical outputs validated | checksum=-1573459

case                                               cycles      speedup        requests/bank
serial_single_memory                              4931584       1.000x              [49152]
parallel_two_memory_role_split                    3293184       1.498x        [32768,16384]
parallel_three_memory_role_split                  1638501       3.010x  [16384,16384,16384]
parallel_two_memory_conflicted_control            4931584       1.000x            [49152,0]

==========================================
Running CPU locality matrix benchmark...
HARDWARE TELEMETRY: AVX-512 not supported. Dispatching to optimized scalar fallback kernel.
VERIFY equal N=512 operations=134217728 cache_flush_bytes=67108864
PAIR 1 order=flat-first flat_sec=0.233877943 locality_sec=0.058633280 speedup=3.989x
...
PAIR 8 order=locality-first flat_sec=0.225897422 locality_sec=0.058633809 speedup=3.853x
FLUSH sink=159383552
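
To illustrate why the I-K-J loop order named for the scalar fallback is cache-friendly, here is a small Go sketch comparing it with the textbook I-J-K order. This is an assumption-laden illustration, not the C kernel from this PR: the function names are hypothetical, and the matrices hold small integer values so both orders produce exactly equal results regardless of floating-point fusing.

```go
package main

import "fmt"

// matmulIJK is the textbook loop order. Its inner loop reads b with
// stride n, so consecutive reads touch different cache lines for large n.
func matmulIJK(a, b []float64, n int) []float64 {
	c := make([]float64, n*n)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			var sum float64
			for k := 0; k < n; k++ {
				sum += a[i*n+k] * b[k*n+j]
			}
			c[i*n+j] = sum
		}
	}
	return c
}

// matmulIKJ hoists a[i][k] and sweeps row k of b and row i of c
// contiguously, so the inner loop is unit-stride on both arrays.
func matmulIKJ(a, b []float64, n int) []float64 {
	c := make([]float64, n*n)
	for i := 0; i < n; i++ {
		for k := 0; k < n; k++ {
			aik := a[i*n+k]
			for j := 0; j < n; j++ {
				c[i*n+j] += aik * b[k*n+j]
			}
		}
	}
	return c
}

// equal reports whether two slices match element for element.
func equal(x, y []float64) bool {
	if len(x) != len(y) {
		return false
	}
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

// fill builds an n*n row-major matrix of small integer values.
func fill(n, seed int) []float64 {
	m := make([]float64, n*n)
	for i := range m {
		m[i] = float64((i*seed + 1) % 7)
	}
	return m
}

func main() {
	const n = 64
	a, b := fill(n, 3), fill(n, 5)
	fmt.Println("outputs equal:", equal(matmulIJK(a, b, n), matmulIKJ(a, b, n)))
}
```

Both functions accumulate each output element in the same k order, so only the memory access pattern differs; timing them at a larger n would be the natural next step for reproducing a locality speedup.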

3. Hardened Docker Run (docker run --rm orchid-production:latest)

The hardened image boots in <0.1s and reproduces the timing traces and cycle calculations inside the container environment.

westkevin12 self-assigned this on Jun 4, 2026
westkevin12 merged commit 56e5d9a into main on Jun 4, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Remove the Python Interpreter Dependency from the Node Execution Path
