24 commits
074d512
RFC-0001: Mechanistic Fact Editing Commands (crown, edit, memit) (#2)
mikeumus Apr 17, 2026
2324af4
feat(cli): add larql crown command for crown-layer discovery (#3)
mikeumus Apr 18, 2026
7c597f8
feat(cli): larql edit + apply-patch — rank-1 fact editing (Phase B of…
mikeumus Apr 18, 2026
ed369cb
feat(cli): larql memit — batch fact editing (Phase C of RFC-0001) (#8)
mikeumus Apr 18, 2026
186019c
feat(python): PyO3 bindings for crown/edit/apply_patch/memit (Phase D…
mikeumus Apr 18, 2026
44d549b
feat(models): per-layer intermediate_size for Gemma 4 double-wide MLP…
mikeumus Apr 18, 2026
3266558
ci: isolation-harness gates + Gemma4 per-layer intermediate_size + bu…
mikeumus Apr 19, 2026
bdf7e88
Sync with upstream Architecture B (chrishayuk/larql#30) (#13)
mikeumus Apr 22, 2026
845537a
docs(readme): Divinci-AI fork header + badges; test(vindex): regressi…
mikeumus Apr 22, 2026
758a052
feat(server): read --api-key from LARQL_API_KEY env var (#15)
mikeumus Apr 22, 2026
01e5f0d
feat(safetensors): support F8_E4M3 / F8_E5M2 / F8_E8M0 / I8 dtypes
mikeumus Apr 26, 2026
75cc955
chore: remove sed leftover .bak.bak files + gitignore them
mikeumus Apr 26, 2026
ba1cd0c
Merge feat/safetensors-mxfp4-dtypes: F8_E8M0/E4M3/E5M2/I8 dtype support
mikeumus Apr 26, 2026
d724bd4
feat(hf): try Dataset → fall back to Model repo type when fetching vi…
mikeumus Apr 26, 2026
6ee6dfe
Merge feat/hf-resolver-model-fallback: support model-type vindex repos
mikeumus Apr 26, 2026
c376c7a
feat(mxfp4): per-expert dequantization for DeepSeek-V4 layout
mikeumus Apr 26, 2026
a509590
Merge feat/mxfp4-per-expert-dequant: DeepSeek-V4 per-expert MXFP4 unpack
mikeumus Apr 26, 2026
7ebfa8d
feat(hf): metadata-only resolve_hf_vindex (no eager binary downloads)
mikeumus Apr 26, 2026
93b22e4
Merge feat/show-metadata-only-resolve
mikeumus Apr 26, 2026
63b74f3
feat(arch): DeepSeekV4Arch — V4 tensor naming (no model. prefix, ffn,…
mikeumus Apr 26, 2026
c02c3c7
Merge feat/deepseek-v4-arch: DeepSeekV4Arch with V4 tensor naming
mikeumus Apr 26, 2026
64b323f
feat(extract): MXFP4-aware streaming gate_vectors path
mikeumus Apr 26, 2026
4214793
Merge feat/streaming-extract-mxfp4: MXFP4-aware streaming gate path
mikeumus Apr 26, 2026
e591e86
feat(extract): per-expert top-K SVD summary tier for many-experts MoE
mikeumus Apr 26, 2026
64 changes: 64 additions & 0 deletions .github/workflows/harness.yml
```yaml
name: isolation-harness

on:
  push:
    branches: ["**"]
  pull_request:

jobs:
  gates:
    # Skip forks to avoid secrets leaking to untrusted PRs
    if: github.event.pull_request.head.repo.full_name == github.repository || github.event_name == 'push'
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - uses: Swatinem/rust-cache@v2
        with:
          key: harness-v1

      - name: build larql-server
        run: cargo build --release -p larql-server

      - name: checkout isolation-harness
        uses: actions/checkout@v4
        with:
          repository: Divinci-AI/larql-isolation-harness
          token: ${{ secrets.HARNESS_REPO_TOKEN }}
          path: harness
          ref: main

      - name: build isolation-harness
        run: cargo build --release --manifest-path harness/Cargo.toml

      - name: start larql-server
        run: |
          ./target/release/larql-server testdata/tiny-vindex --port 8787 &
          echo $! > larql.pid
          for i in {1..40}; do
            curl -sf http://localhost:8787/v1/health && break
            sleep 0.5
          done
          curl -sf http://localhost:8787/v1/health || (echo "server failed to start" && exit 1)

      - name: T2 — concurrent read-lock (no serialization)
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness concurrent --iterations 200

      - name: T3 — session global-leak isolation
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness global-leak

      - name: T5 — patch revert down/up override leak
        env:
          LARQL_URL: http://localhost:8787
        run: ./harness/target/release/isolation-harness revert

      - name: stop larql-server
        if: always()
        run: kill $(cat larql.pid) 2>/dev/null || true
```
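The wait-for-health step in the workflow above can be factored into a generic polling helper. A minimal sketch; the `wait_for` name and retry budget are illustrative, not part of the workflow:

```shell
#!/usr/bin/env bash
# Poll a command until it succeeds or the retry budget runs out.
# Mirrors the curl loop in the "start larql-server" step.
wait_for() {
  local tries=$1; shift
  for ((i = 1; i <= tries; i++)); do
    "$@" && return 0    # command succeeded: server is up
    sleep 0.5
  done
  return 1              # budget exhausted: treat as startup failure
}

# Example (hypothetical local run of the server from this workflow):
# wait_for 40 curl -sf http://localhost:8787/v1/health
```

Keeping the final `curl -sf ... || exit 1` outside the loop, as the workflow does, makes the failure mode explicit rather than relying on the loop's exit status.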
5 changes: 4 additions & 1 deletion .gitignore
```diff
@@ -30,4 +30,7 @@ build/
 # output
 output/
 data/
-experiments/
+experiments/
+vindexes/
+.pids/
+docs/replay/*.bak.bak
```
26 changes: 20 additions & 6 deletions AGENTS.md
````diff
@@ -13,6 +13,7 @@ Three extraction levels gate which LQL statements work: `browse` (DESCRIBE/WALK/
 Cargo workspace at repo root with a strict dependency chain — respect this when adding modules:
 
 ```
+# LARQL-specific (depend on vindex, LQL, etc.)
 larql-models          model config, architecture traits, weight loading, quant/dequant
 larql-compute         CPU/Metal matmul backends, pipeline
@@ -28,8 +29,21 @@ larql-server HTTP + gRPC server serving vindexes
 larql-cli             top-level `larql` binary (every subcommand lives in commands/)
 larql-python          PyO3 bindings (maturin-built, module name `larql._native`)
 kv-cache-benchmark    standalone benchmark crate
+
+# Portable (no LARQL deps; extract to sibling repo later, name stable)
+model-compute         bounded native kernels (arithmetic/datetime) and optional
+                      wasmtime-hosted WASM modules (features: `native`/`wasm`)
 ```
 
+**`model-compute` never imports `larql-*`.** Dependency flow is one-way:
+LARQL may consume it (e.g. for compile-time `sum(1..100)` resolution); it
+knows nothing about vindex or LQL. When it moves to a sibling repo, the
+name stays the same so imports don't churn. The `install_edge` primitive
+that stamps a compiled edge into gate/up/down tensors lives at
+[crates/larql-cli/src/commands/extraction/compile_cmd/edge.rs](crates/larql-cli/src/commands/extraction/compile_cmd/edge.rs) —
+it's the lowest-level step of the `COMPILE` verb and isn't a separate crate
+until a second consumer needs it.
+
 The CLI is a thin dispatcher: each `larql <cmd>` lives in [crates/larql-cli/src/commands/extraction/](crates/larql-cli/src/commands/extraction/) or [crates/larql-cli/src/commands/query/](crates/larql-cli/src/commands/query/) and is wired into the `Commands` enum in [crates/larql-cli/src/main.rs](crates/larql-cli/src/main.rs). `larql serve` exec's into `larql-server`. `larql repl` and `larql lql` delegate to `larql_lql::run_repl`/`run_statement`.
 
 LQL parser and executor are split symmetrically: [crates/larql-lql/src/parser/](crates/larql-lql/src/parser/) and [crates/larql-lql/src/executor/](crates/larql-lql/src/executor/) both have matching `lifecycle.rs`, `query.rs`, `mutation.rs`, `introspection.rs`, `trace.rs`. When adding a statement, touch the AST in [crates/larql-lql/src/ast.rs](crates/larql-lql/src/ast.rs), then both sides.
@@ -68,15 +82,15 @@ Or via the Makefile: `make python-setup | python-build | python-test | python-cl
 - **Storage is mmap-first.** Gate vectors, embeddings, down weights are zero-copy `mmap`'d. f16 is the default dtype (`--f16` halves size with negligible accuracy loss). Don't load entire tensors into RAM unless an operation requires it.
 - **Three extraction levels, not features.** `browse` (~3 GB), `inference` (~6 GB), `all` (~10 GB) — gated by `ExtractLevel` enum in [crates/larql-vindex/src/config/types.rs](crates/larql-vindex/src/config/types.rs). Check level before attempting an operation; fail loudly if weights aren't present.
 - **Walk FFN is sparse-by-design and can beat dense** (517ms vs 535ms on Gemma 4B) because gate KNN (K≈10) skips most of the 10,240 features per layer. If you touch FFN code, preserve this invariant — see [docs/ffn-graph-layer.md](docs/ffn-graph-layer.md).
-- **MXFP4 quantized MoE (GPT-OSS) has degraded DESCRIBE/WALK** due to 4-bit precision; `INFER` is the supported path. Don't assume all model families are equivalent — see [docs/vindex-operations-spec.md](docs/vindex-operations-spec.md).
+- **MXFP4 quantized MoE (GPT-OSS) has degraded DESCRIBE/WALK** due to 4-bit precision; `INFER` is the supported path. Don't assume all model families are equivalent — see [docs/specs/vindex-operations-spec.md](docs/specs/vindex-operations-spec.md).
 
 ## Where to find things
 
-- LQL language spec: [docs/lql-spec.md](docs/lql-spec.md) (v0.3)
-- Vindex file format: [docs/vindex-format-spec.md](docs/vindex-format-spec.md)
-- Operations + patches: [docs/vindex-operations-spec.md](docs/vindex-operations-spec.md)
-- Ecosystem (HF publish, Vindexfile): [docs/vindex-ecosystem-spec.md](docs/vindex-ecosystem-spec.md)
+- LQL language spec: [docs/specs/lql-spec.md](docs/specs/lql-spec.md) (v0.3)
+- Vindex file format: [docs/specs/vindex-format-spec.md](docs/specs/vindex-format-spec.md)
+- Operations + patches: [docs/specs/vindex-operations-spec.md](docs/specs/vindex-operations-spec.md)
+- Ecosystem (HF publish, Vindexfile): [docs/specs/vindex-ecosystem-spec.md](docs/specs/vindex-ecosystem-spec.md)
 - Inference engine internals: [docs/inference-engine.md](docs/inference-engine.md), [docs/ffn-graph-layer.md](docs/ffn-graph-layer.md)
-- Trace format (.bin/.bndx/.ctxt): [docs/trace-format-spec.md](docs/trace-format-spec.md), [docs/residual-trace.md](docs/residual-trace.md)
+- Trace format (.bin/.bndx/.ctxt): [docs/specs/trace-format-spec.md](docs/specs/trace-format-spec.md), [docs/residual-trace.md](docs/residual-trace.md)
 - Experimental work: [experiments/](experiments/) — numbered 01-07, each self-contained
 - Python bindings docs: [crates/larql-python/README.md](crates/larql-python/README.md), [docs/larql-python.md](docs/larql-python.md)
````
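The one-way dependency rule added in this hunk can be sketched as follows. Names here are hypothetical stand-ins, not the real `model-compute` API: a bounded kernel that knows nothing about vindex or LQL, called from a LARQL-side consumer.

```rust
// Hypothetical sketch of the dependency direction described above.
// `sum_range` stands in for a bounded model-compute kernel; the real
// crate's API may differ.
fn sum_range(lo: u64, hi: u64) -> u64 {
    // Closed range: sum over lo..=hi, computed eagerly and bounded.
    (lo..=hi).sum()
}

fn main() {
    // A LARQL-side caller may depend on the kernel; the kernel never
    // imports larql-* crates, so the flow stays one-way.
    let resolved = sum_range(1, 100);
    assert_eq!(resolved, 5050);
    println!("resolved sum(1..100) -> {resolved}");
}
```

Keeping the kernel free of `larql-*` imports is what makes the later extraction to a sibling repo a move rather than a refactor.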
8 changes: 8 additions & 0 deletions Cargo.toml
```diff
@@ -1,6 +1,7 @@
 [workspace]
 resolver = "2"
 members = [
+    # larql-specific
     "crates/larql-models",
     "crates/larql-compute",
     "crates/larql-core",
@@ -9,8 +10,12 @@ members = [
     "crates/larql-lql",
     "crates/larql-cli",
     "crates/larql-server",
+    "crates/larql-router",
+    "crates/larql-router-protocol",
     "crates/larql-python",
     "crates/kv-cache-benchmark",
+    # portable (extract to sibling repos later, names stable)
+    "crates/model-compute",
 ]
 default-members = [
     "crates/larql-models",
@@ -21,7 +26,10 @@ default-members = [
     "crates/larql-lql",
     "crates/larql-cli",
     "crates/larql-server",
+    "crates/larql-router",
+    "crates/larql-router-protocol",
     "crates/kv-cache-benchmark",
+    "crates/model-compute",
 ]
 
 [workspace.package]
```