feat: bundled engine runner and model library by quiet-node · Pull Request #217 · quiet-node/thuki

quiet-node · 2026-06-12T21:56:36Z

Overview

Phase 2 of the local inference engine, part 1 of 3 (stacked: this PR, then #218, then #219). Thuki gains a bundled llama.cpp llama-server sidecar and a local model library, so the app can serve chat without any external runtime. This PR lands the engine process management and the model storage/download machinery; nothing routes chat through it yet (that is the next PR in the stack), so user-visible behavior is unchanged.

What changed

Engine runner (src-tauri/src/engine/): a pure residency state machine (Stopped/Starting/Loaded/Stopping/Failed) driven by an actor over a bounded command channel. At most one llama-server process ever exists; a model or context-size change counts as a new Target and is always handled as kill-then-spawn; on app quit the process is killed and its exit awaited. Health is polled over loopback; an optional idle_unload_minutes config field (new, default off) unloads the model after inactivity.
Model library (src-tauri/src/models/ split into submodules): SQLite manifest of installed models, content-addressed blob store (blobs/<sha256>), resumable downloader (Range requests against pinned Hugging Face revisions, sha256 verification, disk-full/offline/checksum error classification, single download slot), and a curated starters registry (Gemma 3 4B/12B, Phi-4 14B; pinned repo revisions and digests; RAM-fit tiers).
Nine Tauri commands for the model library (list/download/cancel/resume/discard/delete and friends) emitting a typed DownloadEvent channel contract.
Build wiring: scripts/ensure-llama-server.ts fetches the pinned llama.cpp release (b9590), verifies its sha256, rewrites rpaths, ad-hoc re-signs, and installs the binary plus its dylib closure; wired as externalBin + bundled Frameworks in tauri.conf.json, cached in CI, gitignored.
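
To make the residency state machine described under "Engine runner" more concrete, here is a minimal sketch of a pure transition function. The five state names come from this PR; everything else (the Target fields, the Event and Effect enums, the step function, the exact transitions) is an assumption made for illustration and may not match the code in src-tauri/src/engine/.

```rust
/// What the engine should be serving. Field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Target {
    pub model: String,
    pub ctx_size: u32,
}

/// Residency states named in the PR; the payloads are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Residency {
    Stopped,
    Starting(Target),
    Loaded(Target),
    /// `next` remembers a target to spawn once the old process has exited.
    Stopping { next: Option<Target> },
    Failed(String),
}

/// Inputs the actor would feed in (hypothetical names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    EnsureLoaded(Target),
    Healthy,
    Exited,
    IdleTimeout,
    SpawnFailed(String),
}

/// Side effects the actor would perform after a transition (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Effect {
    Spawn(Target),
    Kill,
    Nothing,
}

/// Pure transition: no I/O, so it can be tested without a real process.
pub fn step(state: Residency, event: Event) -> (Residency, Effect) {
    use Residency::*;
    match (state, event) {
        // No process alive: spawning is safe.
        (Stopped | Failed(_), Event::EnsureLoaded(t)) => (Starting(t.clone()), Effect::Spawn(t)),
        // Same target already starting or loaded: nothing to do.
        (Starting(cur), Event::EnsureLoaded(t)) if cur == t => (Starting(cur), Effect::Nothing),
        (Loaded(cur), Event::EnsureLoaded(t)) if cur == t => (Loaded(cur), Effect::Nothing),
        // Different target: kill first, spawn only after the exit is observed.
        (Starting(_) | Loaded(_), Event::EnsureLoaded(t)) => (Stopping { next: Some(t) }, Effect::Kill),
        // A request that arrives mid-stop replaces the pending target.
        (Stopping { .. }, Event::EnsureLoaded(t)) => (Stopping { next: Some(t) }, Effect::Nothing),
        (Starting(t), Event::Healthy) => (Loaded(t), Effect::Nothing),
        (Loaded(_), Event::IdleTimeout) => (Stopping { next: None }, Effect::Kill),
        // The old process is gone: spawn the pending target, or rest.
        (Stopping { next: Some(t) }, Event::Exited) => (Starting(t.clone()), Effect::Spawn(t)),
        (Stopping { next: None }, Event::Exited) => (Stopped, Effect::Nothing),
        // An exit nobody asked for is treated as a crash.
        (Starting(_) | Loaded(_), Event::Exited) => {
            (Failed("engine exited unexpectedly".to_string()), Effect::Nothing)
        }
        (Starting(_), Event::SpawnFailed(msg)) => (Failed(msg), Effect::Nothing),
        // Every other combination leaves the state unchanged.
        (s, _) => (s, Effect::Nothing),
    }
}
```

In this sketch Effect::Spawn is only produced from Stopped, Failed, or Stopping after Exited, which is one way to get the "at most one process" invariant from the transition table itself rather than from runtime checks.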

How it works

The runner owns the process lifecycle through message passing; subsystems hold an EngineHandle and await ensure_loaded(target). Downloads write to tmp/<sha256>.partial, resume via Range, verify the digest, then move into the blob store and insert a manifest row; blobs are refcounted across rows so shared files (for example a vision projector) survive deleting one model. The sha256 check is an integrity check (truncation, bit rot, resume corruption); provenance comes from downloading only pinned repo revisions.
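
The refcounting rule above (a shared blob survives until the last row that uses it is deleted) can be sketched with an in-memory stand-in for the manifest. The Library type, its methods, and the model ids in the usage note are hypothetical; the PR stores this state in SQLite, and only the blobs/<sha256> layout is taken from the description.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the manifest (the PR uses SQLite).
#[derive(Default)]
pub struct Library {
    /// model id -> digests of the blobs that row references
    rows: HashMap<String, Vec<String>>,
    /// digest -> number of row references
    refcounts: HashMap<String, usize>,
}

impl Library {
    /// Record a model row and take one reference on each of its blobs.
    pub fn install(&mut self, model_id: &str, digests: &[&str]) {
        if self.rows.contains_key(model_id) {
            return; // already installed: do not double-count references
        }
        for d in digests {
            *self.refcounts.entry(d.to_string()).or_insert(0) += 1;
        }
        let owned = digests.iter().map(|d| d.to_string()).collect();
        self.rows.insert(model_id.to_string(), owned);
    }

    /// Remove a model row and return the digests whose blob files are no
    /// longer referenced by any row, so the caller may delete them.
    pub fn delete(&mut self, model_id: &str) -> Vec<String> {
        let mut orphaned = Vec::new();
        for d in self.rows.remove(model_id).unwrap_or_default() {
            let remaining = match self.refcounts.get_mut(&d) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => continue,
            };
            if remaining == 0 {
                self.refcounts.remove(&d);
                orphaned.push(d);
            }
        }
        orphaned
    }

    /// Content-addressed location of a blob, following blobs/<sha256>.
    pub fn blob_path(digest: &str) -> String {
        format!("blobs/{digest}")
    }
}
```

For example, if two hypothetical rows "model-a" and "model-b" both reference a projector blob, deleting "model-a" returns only its own weights digest; the projector digest is returned only when "model-b" is deleted as well.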

Testing

bun run test:all:coverage and bun run validate-build both exit 0 at the branch head (100% coverage gates, frontend and backend). The runner is tested under paused tokio time (idle, restart, crash recovery, single-process invariant); the downloader against local mock servers (resume, cancel during stall, checksum failure, disk-full classification).
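
The resume case that the downloader tests exercise rests on two small decisions: which Range header to send for a partial file, and how to interpret the server's status code. The sketch below shows one plausible shape for those decisions using standard HTTP semantics (206 for a honored range, 200 for a full body). The function and type names are hypothetical, and the actual downloader's handling of other statuses may differ.

```rust
/// What to do with the partial file once the response status is known.
#[derive(Debug, PartialEq, Eq)]
pub enum ResumeAction {
    /// The server honored the range: append the body to the partial file.
    Append,
    /// The server sent the whole file: truncate and write from byte zero.
    Restart,
    /// Any other status is surfaced as an error in this sketch.
    Fail(u16),
}

/// Range header value for a partial file of `partial_len` bytes,
/// or None when there is nothing to resume from.
pub fn range_header(partial_len: u64) -> Option<String> {
    if partial_len == 0 {
        None
    } else {
        Some(format!("bytes={partial_len}-"))
    }
}

/// Interpret the response status given whether a Range header was sent.
pub fn on_status(sent_range: bool, status: u16) -> ResumeAction {
    match (sent_range, status) {
        (true, 206) => ResumeAction::Append,
        (_, 200) => ResumeAction::Restart,
        (_, other) => ResumeAction::Fail(other),
    }
}
```

Whichever path is taken, the sha256 verification described above still runs over the completed file, so a server that mishandles the range is caught as a checksum failure rather than installed silently.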

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node added 11 commits June 10, 2026 13:22

chore: ignore fetched engine binaries and model files

5ba2f2f

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add idle_unload_minutes config field for the built-in engine

eea89b8

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add engine residency state machine

dbcc876

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add engine runner actor with process trait seam

233e73f

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

refactor: make models a directory module

3f3b30a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add installed-models manifest table

55622ee

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add content-addressed model blob store

0f3f3c6

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add resumable model downloader with typed progress events

09917b5

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add curated starter model registry with RAM-fit hints

e6f6090

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add model download and library Tauri commands

7b45ba7

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: bundle llama-server sidecar with pinned fetch script and CI cache

d292632

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node mentioned this pull request Jun 13, 2026

Phase 2 follow-ups (deferred cleanups and decisions) #220

Closed

3 tasks

quiet-node added 2 commits June 13, 2026 18:45

docs: explain the llama.cpp version pin and how to bump it

bae191f

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

ci: fetch the llama-server sidecar in PR backend and build jobs

dee074c

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node merged commit faa82ca into main Jun 13, 2026
3 checks passed

quiet-node deleted the feat/engine-runner-and-model-library branch June 13, 2026 23:40

github-actions Bot mentioned this pull request Jun 13, 2026

chore(main): release 0.15.0 #221

Open

quiet-node merged 13 commits into main from feat/engine-runner-and-model-library
