64 changes: 64 additions & 0 deletions .github/workflows/harness.yml
@@ -0,0 +1,64 @@
name: isolation-harness

on:
  push:
    branches: ["**"]
  pull_request:

jobs:
  gates:
    # Skip forks to avoid secrets leaking to untrusted PRs
    if: github.event.pull_request.head.repo.full_name == github.repository || github.event_name == 'push'
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - uses: Swatinem/rust-cache@v2
        with:
          key: harness-v1

      - name: build larql-server
        run: cargo build --release -p larql-server

      - name: checkout isolation-harness
        uses: actions/checkout@v4
        with:
          repository: Divinci-AI/larql-isolation-harness
          token: ${{ secrets.HARNESS_REPO_TOKEN }}
          path: harness
          ref: main

      - name: build isolation-harness
        run: cargo build --release --manifest-path harness/Cargo.toml

      - name: start larql-server
        run: |
          ./target/release/larql-server testdata/tiny-vindex --port 8787 &
          echo $! > larql.pid
          for i in {1..40}; do
            curl -sf http://localhost:8787/v1/health && break
            sleep 0.5
          done
          curl -sf http://localhost:8787/v1/health || (echo "server failed to start" && exit 1)

      - name: T2 — concurrent read-lock (no serialization)
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness concurrent --iterations 200

      - name: T3 — session global-leak isolation
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness global-leak

      - name: T5 — patch revert down/up override leak
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness revert

      - name: stop larql-server
        if: always()
        run: kill $(cat larql.pid) 2>/dev/null || true
5 changes: 4 additions & 1 deletion .gitignore
@@ -30,4 +30,7 @@ build/
# output
output/
data/
experiments/
experiments/
vindexes/
.pids/
docs/replay/*.bak.bak
26 changes: 20 additions & 6 deletions AGENTS.md
@@ -13,6 +13,7 @@ Three extraction levels gate which LQL statements work: `browse` (DESCRIBE/WALK/
Cargo workspace at repo root with a strict dependency chain — respect this when adding modules:

```
# LARQL-specific (depend on vindex, LQL, etc.)
larql-models model config, architecture traits, weight loading, quant/dequant
larql-compute CPU/Metal matmul backends, pipeline
@@ -28,8 +29,21 @@ larql-server HTTP + gRPC server serving vindexes
larql-cli top-level `larql` binary (every subcommand lives in commands/)
larql-python PyO3 bindings (maturin-built, module name `larql._native`)
kv-cache-benchmark standalone benchmark crate

# Portable (no LARQL deps; extract to sibling repo later, name stable)
model-compute bounded native kernels (arithmetic/datetime) and optional
wasmtime-hosted WASM modules (features: `native`/`wasm`)
```

**`model-compute` never imports `larql-*`.** Dependency flow is one-way:
LARQL may consume it (e.g. for compile-time `sum(1..100)` resolution); it
knows nothing about vindex or LQL. When it moves to a sibling repo, the
name stays the same so imports don't churn. The `install_edge` primitive
that stamps a compiled edge into gate/up/down tensors lives at
[crates/larql-cli/src/commands/extraction/compile_cmd/edge.rs](crates/larql-cli/src/commands/extraction/compile_cmd/edge.rs) —
it's the lowest-level step of the `COMPILE` verb and isn't a separate crate
until a second consumer needs it.
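
The one-way dependency shape can be sketched with a bounded kernel of the kind `model-compute` is described as providing. This is a hypothetical illustration, not the crate's real API: `sum_range` and its `MAX_SPAN` bound are invented names standing in for an arithmetic kernel that LARQL could call to resolve `sum(1..100)` at compile time.

```rust
// Hypothetical sketch of a bounded arithmetic kernel in the style the text
// describes for `model-compute`. All names here are illustrative.
fn sum_range(lo: u64, hi: u64) -> Option<u64> {
    // "Bounded": refuse degenerate or oversized ranges instead of looping.
    const MAX_SPAN: u64 = 1_000_000;
    if lo > hi || hi - lo > MAX_SPAN {
        return None;
    }
    // Closed-form sum of the inclusive range lo..=hi.
    let n = hi - lo + 1;
    Some(n * (lo + hi) / 2)
}

fn main() {
    // A COMPILE-time caller could fold sum(1..100) — read here as the
    // inclusive range 1..=100 — down to a constant.
    assert_eq!(sum_range(1, 100), Some(5050));
    // Out-of-bounds requests fail loudly rather than hanging compilation.
    assert_eq!(sum_range(0, 10_000_000), None);
}
```

The point of the sketch is the direction of the call: LARQL invokes the kernel, and nothing in the kernel mentions vindexes or LQL.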

The CLI is a thin dispatcher: each `larql <cmd>` lives in [crates/larql-cli/src/commands/extraction/](crates/larql-cli/src/commands/extraction/) or [crates/larql-cli/src/commands/query/](crates/larql-cli/src/commands/query/) and is wired into the `Commands` enum in [crates/larql-cli/src/main.rs](crates/larql-cli/src/main.rs). `larql serve` exec's into `larql-server`. `larql repl` and `larql lql` delegate to `larql_lql::run_repl`/`run_statement`.
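
The thin-dispatcher pattern can be sketched as follows. The variants and handler bodies are illustrative stand-ins, not the real `Commands` enum from [crates/larql-cli/src/main.rs](crates/larql-cli/src/main.rs) (which is presumably clap-based); the sketch only shows the shape — each arm forwards immediately to logic that lives elsewhere.

```rust
// Simplified, hypothetical stand-in for the CLI's `Commands` enum.
// Real subcommands live in commands/ and are only *wired* here.
enum Commands {
    Describe { target: String },
    Repl,
}

fn dispatch(cmd: Commands) -> String {
    // The dispatcher owns no logic: every arm is a single forwarding call.
    match cmd {
        Commands::Describe { target } => format!("describe {target}"),
        Commands::Repl => "delegating to larql_lql::run_repl".to_string(),
    }
}

fn main() {
    let out = dispatch(Commands::Describe { target: "tiny-vindex".to_string() });
    assert_eq!(out, "describe tiny-vindex");
}
```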

LQL parser and executor are split symmetrically: [crates/larql-lql/src/parser/](crates/larql-lql/src/parser/) and [crates/larql-lql/src/executor/](crates/larql-lql/src/executor/) both have matching `lifecycle.rs`, `query.rs`, `mutation.rs`, `introspection.rs`, `trace.rs`. When adding a statement, touch the AST in [crates/larql-lql/src/ast.rs](crates/larql-lql/src/ast.rs), then both sides.
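
The "touch the AST, then both sides" workflow can be sketched like this. `Statement`, `parse`, and `execute` are hypothetical miniatures, not the real types in [crates/larql-lql/src/ast.rs](crates/larql-lql/src/ast.rs); the sketch shows why a new statement forces symmetric edits.

```rust
// Hypothetical miniature of the AST: adding a statement starts here...
enum Statement {
    Describe(String),
    Explain(String), // ...the new variant...
}

// ...then the parser side grows a matching arm...
fn parse(input: &str) -> Option<Statement> {
    let (verb, rest) = input.split_once(' ')?;
    match verb {
        "DESCRIBE" => Some(Statement::Describe(rest.to_string())),
        "EXPLAIN" => Some(Statement::Explain(rest.to_string())),
        _ => None,
    }
}

// ...and the executor side grows its mirror arm, keeping the halves symmetric.
fn execute(stmt: &Statement) -> String {
    match stmt {
        Statement::Describe(t) => format!("describing {t}"),
        Statement::Explain(t) => format!("explaining {t}"),
    }
}

fn main() {
    let stmt = parse("EXPLAIN feature:42").unwrap();
    assert_eq!(execute(&stmt), "explaining feature:42");
}
```

Forgetting either side is a compile error in the real crate too, as long as the `match` over the AST stays exhaustive — which is the practical value of the symmetric layout.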
@@ -68,15 +82,15 @@ Or via the Makefile: `make python-setup | python-build | python-test | python-cl
- **Storage is mmap-first.** Gate vectors, embeddings, down weights are zero-copy `mmap`'d. f16 is the default dtype (`--f16` halves size with negligible accuracy loss). Don't load entire tensors into RAM unless an operation requires it.
- **Three extraction levels, not features.** `browse` (~3 GB), `inference` (~6 GB), `all` (~10 GB) — gated by `ExtractLevel` enum in [crates/larql-vindex/src/config/types.rs](crates/larql-vindex/src/config/types.rs). Check level before attempting an operation; fail loudly if weights aren't present.
- **Walk FFN is sparse-by-design and can beat dense** (517ms vs 535ms on Gemma 4B) because gate KNN (K≈10) skips most of the 10,240 features per layer. If you touch FFN code, preserve this invariant — see [docs/ffn-graph-layer.md](docs/ffn-graph-layer.md).
- **MXFP4 quantized MoE (GPT-OSS) has degraded DESCRIBE/WALK** due to 4-bit precision; `INFER` is the supported path. Don't assume all model families are equivalent — see [docs/vindex-operations-spec.md](docs/vindex-operations-spec.md).
- **MXFP4 quantized MoE (GPT-OSS) has degraded DESCRIBE/WALK** due to 4-bit precision; `INFER` is the supported path. Don't assume all model families are equivalent — see [docs/specs/vindex-operations-spec.md](docs/specs/vindex-operations-spec.md).
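
The mmap-first rule above amounts to: address one element by offset, never materialize the tensor. A std-only sketch of that access pattern, under stated assumptions — the real code memory-maps the file (e.g. via a crate like memmap2) rather than seeking, `read_gate` is an invented name, and the bits-to-f32 conversion below is a minimal f16 decode (subnormals omitted), not the project's dtype handling:

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};

// Minimal f16 -> f32 decode: rebias the 5-bit exponent (15 -> 127, i.e. +112)
// and widen the 10-bit fraction to 23 bits. Subnormals are skipped for brevity.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = ((bits >> 15) as u32) << 31;
    let exp = ((bits >> 10) & 0x1f) as u32;
    let frac = (bits & 0x3ff) as u32;
    let f32_bits = if exp == 0 && frac == 0 {
        sign // ±0
    } else {
        sign | ((exp + 112) << 23) | (frac << 13)
    };
    f32::from_bits(f32_bits)
}

// Fetch one f16 gate value by element index without loading the file.
fn read_gate(path: &str, index: u64) -> std::io::Result<f32> {
    let mut f = File::open(path)?;
    f.seek(SeekFrom::Start(index * 2))?; // 2 bytes per f16 element
    let mut buf = [0u8; 2];
    f.read_exact(&mut buf)?;
    Ok(f16_bits_to_f32(u16::from_le_bytes(buf)))
}

fn main() -> std::io::Result<()> {
    // Write two f16 values (1.0 = 0x3c00, 2.0 = 0x4000), then read the second.
    let mut f = File::create("/tmp/gates.bin")?;
    f.write_all(&0x3c00u16.to_le_bytes())?;
    f.write_all(&0x4000u16.to_le_bytes())?;
    assert_eq!(read_gate("/tmp/gates.bin", 1)?, 2.0);
    Ok(())
}
```

With a real mmap the seek disappears too: the OS pages in only the bytes touched, which is what makes f16-by-default and zero-copy access compose.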

## Where to find things

- LQL language spec: [docs/lql-spec.md](docs/lql-spec.md) (v0.3)
- Vindex file format: [docs/vindex-format-spec.md](docs/vindex-format-spec.md)
- Operations + patches: [docs/vindex-operations-spec.md](docs/vindex-operations-spec.md)
- Ecosystem (HF publish, Vindexfile): [docs/vindex-ecosystem-spec.md](docs/vindex-ecosystem-spec.md)
- LQL language spec: [docs/specs/lql-spec.md](docs/specs/lql-spec.md) (v0.3)
- Vindex file format: [docs/specs/vindex-format-spec.md](docs/specs/vindex-format-spec.md)
- Operations + patches: [docs/specs/vindex-operations-spec.md](docs/specs/vindex-operations-spec.md)
- Ecosystem (HF publish, Vindexfile): [docs/specs/vindex-ecosystem-spec.md](docs/specs/vindex-ecosystem-spec.md)
- Inference engine internals: [docs/inference-engine.md](docs/inference-engine.md), [docs/ffn-graph-layer.md](docs/ffn-graph-layer.md)
- Trace format (.bin/.bndx/.ctxt): [docs/trace-format-spec.md](docs/trace-format-spec.md), [docs/residual-trace.md](docs/residual-trace.md)
- Trace format (.bin/.bndx/.ctxt): [docs/specs/trace-format-spec.md](docs/specs/trace-format-spec.md), [docs/residual-trace.md](docs/residual-trace.md)
- Experimental work: [experiments/](experiments/) — numbered 01-07, each self-contained
- Python bindings docs: [crates/larql-python/README.md](crates/larql-python/README.md), [docs/larql-python.md](docs/larql-python.md)
8 changes: 8 additions & 0 deletions Cargo.toml
@@ -1,6 +1,7 @@
[workspace]
resolver = "2"
members = [
# larql-specific
"crates/larql-models",
"crates/larql-compute",
"crates/larql-core",
@@ -9,8 +10,12 @@ members = [
"crates/larql-lql",
"crates/larql-cli",
"crates/larql-server",
"crates/larql-router",
"crates/larql-router-protocol",
"crates/larql-python",
"crates/kv-cache-benchmark",
# portable (extract to sibling repos later, names stable)
"crates/model-compute",
]
default-members = [
"crates/larql-models",
@@ -21,7 +26,10 @@ default-members = [
"crates/larql-lql",
"crates/larql-cli",
"crates/larql-server",
"crates/larql-router",
"crates/larql-router-protocol",
"crates/kv-cache-benchmark",
"crates/model-compute",
]

[workspace.package]