feat: bundled engine runner and model library#217
Merged
Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>
This was referenced Jun 13, 2026
Overview
Phase 2 of the local inference engine, part 1 of 3 (stacked: this PR, then #218, then #219). Thuki gains a bundled llama.cpp `llama-server` sidecar and a local model library, so the app can serve chat without any external runtime. This PR lands the engine process management and the model storage/download machinery; nothing routes chat through it yet (that is the next PR in the stack), so user-visible behavior is unchanged.
What changed
- Engine runner (`src-tauri/src/engine/`): a pure residency state machine (`Stopped`/`Starting`/`Loaded`/`Stopping`/`Failed`) driven by an actor over a bounded command channel. At most one `llama-server` process ever exists; a model or context-size change is a new `Target` and always kill-then-spawn; the process is killed and its exit awaited on app quit. Health is polled over loopback; an optional `idle_unload_minutes` config field (new, default off) unloads after inactivity.
- Model library (`src-tauri/src/models/`, split into submodules): SQLite manifest of installed models, content-addressed blob store (`blobs/<sha256>`), resumable downloader (Range requests against pinned Hugging Face revisions, sha256 verification, disk-full/offline/checksum error classification, single download slot), and a curated starters registry (Gemma 3 4B/12B, Phi-4 14B; pinned repo revisions and digests; RAM-fit tiers).
- `DownloadEvent` channel contract.
- `scripts/ensure-llama-server.ts` fetches the pinned llama.cpp release (b9590), verifies its sha256, rewrites rpaths, ad-hoc re-signs, and installs the binary plus its dylib closure; wired as `externalBin` + bundled Frameworks in `tauri.conf.json`, cached in CI, gitignored.
How it works
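The residency state machine listed under What changed can be sketched as a pure transition function that the actor calls for each command or process event. This is a minimal sketch, not the code in this PR: the `Event` and `Action` enums, the `step` function, and the fields on `Target` are hypothetical names chosen for illustration; only the five state names and the kill-then-spawn rule come from the description.

```rust
/// What the engine should be serving. Field names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Target {
    pub model_sha256: String,
    pub context_size: u32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Stopped,
    Starting(Target),
    Loaded(Target),
    /// The old process is being killed; `next` is spawned once it has exited.
    Stopping { next: Option<Target> },
    Failed(String),
}

/// Inputs to the state machine (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    EnsureLoaded(Target),
    HealthOk,
    Unload, // idle timeout or app quit
    ProcessExited,
    SpawnFailed(String),
}

/// Side effect the actor performs after a transition (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    None,
    Spawn(Target),
    Kill,
}

/// Pure transition: no I/O, so it can be tested without a real process.
/// `Spawn` is only ever returned from states in which no process is alive.
pub fn step(state: State, event: Event) -> (State, Action) {
    use State::*;
    match (state, event) {
        // Nothing running: spawn directly.
        (Stopped | Failed(_), Event::EnsureLoaded(t)) => (Starting(t.clone()), Action::Spawn(t)),
        // Same target already starting or loaded: nothing to do.
        (Loaded(cur), Event::EnsureLoaded(t)) if cur == t => (Loaded(cur), Action::None),
        (Starting(cur), Event::EnsureLoaded(t)) if cur == t => (Starting(cur), Action::None),
        // Different target: kill first, spawn only after the exit is observed.
        (Starting(_) | Loaded(_), Event::EnsureLoaded(t)) => {
            (Stopping { next: Some(t) }, Action::Kill)
        }
        // Retarget while the old process is still exiting: latest request wins.
        (Stopping { .. }, Event::EnsureLoaded(t)) => (Stopping { next: Some(t) }, Action::None),
        (Starting(t), Event::HealthOk) => (Loaded(t), Action::None),
        (Starting(_) | Loaded(_), Event::Unload) => (Stopping { next: None }, Action::Kill),
        (Stopping { .. }, Event::Unload) => (Stopping { next: None }, Action::None),
        (Stopping { next: Some(t) }, Event::ProcessExited) => {
            (Starting(t.clone()), Action::Spawn(t))
        }
        (Stopping { next: None }, Event::ProcessExited) => (Stopped, Action::None),
        // An exit nobody asked for is treated as a crash.
        (Starting(_) | Loaded(_), Event::ProcessExited) => {
            (Failed("process exited unexpectedly".to_string()), Action::None)
        }
        (Starting(_), Event::SpawnFailed(e)) => (Failed(e), Action::None),
        // Any other combination leaves the state unchanged.
        (s, _) => (s, Action::None),
    }
}
```

Keeping the transition pure is what makes the single-process invariant easy to check: a context-size change on a loaded engine yields `Kill` and moves to `Stopping`, and `Spawn` for the new `Target` is only returned once `ProcessExited` arrives.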
The runner owns the process lifecycle through message passing; subsystems hold an `EngineHandle` and await `ensure_loaded(target)`. Downloads write to `tmp/<sha256>.partial`, resume via Range, verify the digest, then move into the blob store and insert a manifest row; blobs are refcounted across rows so shared files (for example a vision projector) survive deleting one model. The sha256 check is an integrity check (truncation, bit rot, resume corruption); provenance comes from downloading only pinned repo revisions.
Testing
`bun run test:all:coverage` and `bun run validate-build` both exit 0 at the branch head (100% coverage gates, frontend and backend). The runner is tested under paused tokio time (idle, restart, crash recovery, single-process invariant); the downloader against local mock servers (resume, cancel during stall, checksum failure, disk-full classification).
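The blob refcounting described under How it works could look roughly like the sketch below. It is an in-memory stand-in, assuming a hypothetical `Manifest` type with `insert` and `remove` methods; the PR itself keeps the manifest in SQLite, and none of these names are taken from its code.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the manifest (hypothetical; the PR uses SQLite).
#[derive(Default)]
pub struct Manifest {
    /// Model name -> digests of the blobs that model uses.
    rows: HashMap<String, Vec<String>>,
    /// Blob digest -> number of references held by manifest rows.
    refcounts: HashMap<String, usize>,
}

impl Manifest {
    /// Record an installed model and add one reference per blob it uses.
    /// Assumes `model` is not already present in the manifest.
    pub fn insert(&mut self, model: &str, blobs: &[&str]) {
        for digest in blobs {
            *self.refcounts.entry(digest.to_string()).or_insert(0) += 1;
        }
        self.rows
            .insert(model.to_string(), blobs.iter().map(|d| d.to_string()).collect());
    }

    /// Delete a model row and return the digests that no row references any
    /// more. Only those are safe to delete from `blobs/`.
    pub fn remove(&mut self, model: &str) -> Vec<String> {
        let mut orphaned = Vec::new();
        for digest in self.rows.remove(model).unwrap_or_default() {
            let remaining = match self.refcounts.get_mut(&digest) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => continue,
            };
            if remaining == 0 {
                self.refcounts.remove(&digest);
                orphaned.push(digest);
            }
        }
        orphaned
    }
}
```

With two models that share a projector blob, removing the first returns only its own weights; the projector is returned (and can be deleted from disk) only when the second model is removed as well.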