feat: bundled engine runner and model library #217

Merged
quiet-node merged 13 commits into main from feat/engine-runner-and-model-library
Jun 13, 2026

quiet-node commented Jun 12, 2026

Owner

Overview

Phase 2 of the local inference engine, part 1 of 3 (stacked: this PR, then #218, then #219). Thuki gains a bundled llama.cpp llama-server sidecar and a local model library, so the app can serve chat without any external runtime. This PR lands the engine process management and the model storage/download machinery; nothing routes chat through it yet (that is the next PR in the stack), so user-visible behavior is unchanged.

What changed

  • Engine runner (src-tauri/src/engine/): a pure residency state machine (Stopped/Starting/Loaded/Stopping/Failed) driven by an actor over a bounded command channel. At most one llama-server process ever exists; a model or context-size change is a new Target and is always handled as kill-then-spawn; on app quit the process is killed and its exit awaited. Health is polled over loopback; an optional idle_unload_minutes config field (new, default off) unloads the model after inactivity.
  • Model library (src-tauri/src/models/ split into submodules): SQLite manifest of installed models, content-addressed blob store (blobs/<sha256>), resumable downloader (Range requests against pinned Hugging Face revisions, sha256 verification, disk-full/offline/checksum error classification, single download slot), and a curated starters registry (Gemma 3 4B/12B, Phi-4 14B; pinned repo revisions and digests; RAM-fit tiers).
  • Nine Tauri commands for the model library (list/download/cancel/resume/discard/delete and friends) emitting a typed DownloadEvent channel contract.
  • Build wiring: scripts/ensure-llama-server.ts fetches the pinned llama.cpp release (b9590), verifies its sha256, rewrites rpaths, ad-hoc re-signs, and installs the binary plus its dylib closure; wired as externalBin + bundled Frameworks in tauri.conf.json, cached in CI, gitignored.
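
For reviewers who want a mental model of the residency state machine, here is a minimal sketch of a pure transition function over the five states named above. It is not the code in src-tauri/src/engine/: the `Event` and `Action` enums, the `step` function, the fields of `Target`, and the `next` field on `Stopping` are all hypothetical names chosen for illustration. The actor would own the real process and act on the returned `Action`.

```rust
// Hypothetical sketch of a pure residency state machine; names are assumptions.

/// What the engine should be serving: a model plus a context size.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Target {
    pub model: String,
    pub ctx_size: u32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Stopped,
    Starting(Target),
    Loaded(Target),
    /// The process is being killed; `next` is the target to spawn once it exits.
    Stopping { next: Option<Target> },
    Failed(String),
}

/// Inputs to the machine (assumed set, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    EnsureLoaded(Target),
    Healthy,
    Unload,
    Exited,
}

/// Side effects the actor should perform after a transition.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Spawn(Target),
    Kill,
    Nothing,
}

/// Pure transition: no I/O, so it can be tested without a real process.
pub fn step(state: State, event: Event) -> (State, Action) {
    use State::*;
    match (state, event) {
        // Nothing is running: spawn directly.
        (Stopped | Failed(_), Event::EnsureLoaded(t)) => (Starting(t.clone()), Action::Spawn(t)),
        // Same target already requested: no-op.
        (Loaded(cur), Event::EnsureLoaded(t)) if cur == t => (Loaded(cur), Action::Nothing),
        (Starting(cur), Event::EnsureLoaded(t)) if cur == t => (Starting(cur), Action::Nothing),
        // Different target: kill first, spawn only after the old process exits.
        (Loaded(_) | Starting(_), Event::EnsureLoaded(t)) => {
            (Stopping { next: Some(t) }, Action::Kill)
        }
        // A newer request while stopping replaces the pending target.
        (Stopping { .. }, Event::EnsureLoaded(t)) => (Stopping { next: Some(t) }, Action::Nothing),
        (Starting(t), Event::Healthy) => (Loaded(t), Action::Nothing),
        (Loaded(_) | Starting(_), Event::Unload) => (Stopping { next: None }, Action::Kill),
        (Stopping { next: Some(t) }, Event::Exited) => (Starting(t.clone()), Action::Spawn(t)),
        (Stopping { next: None }, Event::Exited) => (Stopped, Action::Nothing),
        // An exit nobody asked for is treated as a crash.
        (Starting(_) | Loaded(_), Event::Exited) => {
            (Failed("engine exited unexpectedly".to_string()), Action::Nothing)
        }
        // Any other combination leaves the state unchanged.
        (s, _) => (s, Action::Nothing),
    }
}
```

In this sketch `Spawn` is only ever returned from a state with no live process (Stopped, Failed, or Stopping after Exited), which is one way to keep the single-process invariant checkable in unit tests.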

How it works

The runner owns the process lifecycle through message passing; subsystems hold an EngineHandle and await ensure_loaded(target). Downloads write to tmp/<sha256>.partial, resume via Range, verify the digest, then move into the blob store and insert a manifest row; blobs are refcounted across rows so shared files (for example a vision projector) survive deleting one model. The sha256 check is an integrity check (truncation, bit rot, resume corruption); provenance comes from downloading only pinned repo revisions.
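
The refcounting and resume behavior described above can be sketched roughly as follows. This is an in-memory illustration only: the PR stores the manifest in SQLite, and `Manifest`, `install`, `delete`, and `range_header` are hypothetical names, not the actual API in src-tauri/src/models/. The only external fact relied on is the standard HTTP `Range: bytes=<offset>-` request syntax.

```rust
// Hypothetical in-memory sketch; the real manifest is a SQLite table.
use std::collections::HashMap;

#[derive(Default)]
pub struct Manifest {
    /// Model id -> digests of the blobs that model uses.
    rows: HashMap<String, Vec<String>>,
    /// Blob digest -> number of manifest rows referencing it.
    refs: HashMap<String, usize>,
}

impl Manifest {
    /// Record an installed model. Returns false if the model already has a row,
    /// so reference counts are never double-counted.
    pub fn install(&mut self, model: &str, digests: &[&str]) -> bool {
        if self.rows.contains_key(model) {
            return false;
        }
        for d in digests {
            *self.refs.entry(d.to_string()).or_insert(0) += 1;
        }
        self.rows
            .insert(model.to_string(), digests.iter().map(|d| d.to_string()).collect());
        true
    }

    /// Remove a model's row and return the digests that no other row references,
    /// i.e. the blob files that are now safe to delete from disk.
    pub fn delete(&mut self, model: &str) -> Vec<String> {
        let mut orphaned = Vec::new();
        for d in self.rows.remove(model).unwrap_or_default() {
            let remaining = match self.refs.get_mut(&d) {
                Some(n) => {
                    *n -= 1;
                    *n
                }
                None => continue,
            };
            if remaining == 0 {
                self.refs.remove(&d);
                orphaned.push(d);
            }
        }
        orphaned
    }
}

/// Value of the HTTP Range header for resuming a download whose .partial file
/// already holds `partial_len` bytes; None means start from the beginning.
pub fn range_header(partial_len: u64) -> Option<String> {
    if partial_len == 0 {
        None
    } else {
        Some(format!("bytes={}-", partial_len))
    }
}
```

With two models sharing one blob (for example a vision projector), deleting the first model returns only its private blob; the shared one is reported as orphaned only when the last referencing model is deleted.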

Testing

bun run test:all:coverage and bun run validate-build both exit 0 at the branch head (100% coverage gates, frontend and backend). The runner is tested under paused tokio time (idle, restart, crash recovery, single-process invariant); the downloader against local mock servers (resume, cancel during stall, checksum failure, disk-full classification).

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>