You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Tracking issue for follow-ups surfaced while building and reviewing the Phase 2 local inference engine stack (#217, #218, #219). These are deliberately kept out of those PRs to avoid scope creep; each can land on its own. The whole feature is unreleased (release held until Phase 3), so config-shape changes captured here are still cheap and migration-free.
Append new items to the checklist as they come up during review.
Follow-ups
1. Unify the two model-residency knobs into one field (details below)
2. Hide the OpenAI provider kind behind a compile-time flag, off by default, before release (details below)
3. Remove internal phase jargon (Phase 1/2/3 and Phase A/B) from shipping code, comments, and tests (details below)
1. Unify the two model-residency knobs into one field
Problem. The residency timeout is configured by two separate fields that mean the same thing but behave inconsistently:
Field
Provider
0 means
-1 means
Enforced by
keep_warm_inactivity_minutes (i32)
Ollama
defer to Ollama's own default (~5 min)
keep resident forever
per-request keep_alive pass-through
idle_unload_minutes (u32)
built-in
keep resident forever
n/a (unsigned)
engine runner idle timer
The same value 0 does opposite things across the two fields, and neither name says which provider it governs. The schema even needs a comment on one field pointing at the other just to disambiguate them.
Proposed fix. Collapse onto the better-designed scheme (the Ollama one):
Keep a single field, keep_warm_inactivity_minutes (already on main, already signed with the sentinel scheme), and remove idle_unload_minutes.
Make the built-in engine obey the same field. Translate at the boundary that already forwards config changes to the runner: -1 to no idle timer (forever), 0 to the natural short default (~5 min), N to N minutes. The runner's internal "0 = disabled" convention stays an implementation detail behind that one small, testable translator. The Ollama path does not change.
Collapse the two conditional Settings controls into one "keep model warm" control, shown for built-in and Ollama, not applicable for OpenAI (remote server; Thuki does not manage its residency).
Update docs/configurations.md: merge the two rows, update the example TOML.
Unified semantics: -1 keep resident forever; 0 use the provider's natural short default (about 5 minutes: for Ollama this defers to Ollama, for built-in Thuki applies its own 5-minute timer); N keep warm for N minutes of inactivity, then unload.
Decision needed: the built-in default. Today built-in defaults to "resident forever" (idle_unload_minutes = 0). Under the unified scheme 0 means "~5 min", so a 0 default would change built-in to unload after about 5 minutes idle by default. Recommendation: default to 0 (~5 min). Built-in is now the default provider on every fresh install, and pinning multiple GB of RAM resident forever with no setting touched is too aggressive on a 16 GB Mac; ~5 minutes keeps it instant during an active session and frees RAM when idle, and power users can set -1. Alternative: default -1 to preserve the current "always resident" behavior.
Tradeoff. A single field cannot express different residency for Ollama vs built-in at the same time. Judged YAGNI for a single-active-provider app.
Ollama consumer (keep_alive): src-tauri/src/lib.rs and src-tauri/src/commands.rs (the if keep_warm_inactivity_minutes == 0 branches), warmup::keep_alive_string.
Built-in consumer (runner timer): src-tauri/src/settings_commands.rs (forward_idle_unload_minutes, idle_unload_minutes_changed) to engine.set_idle_minutes, plus startup seeding in lib.rs.
Settings UI: src/settings/tabs/ProviderCards.tsx and ModelTab.tsx.
Why deferred: it is a clean, self-contained change, but it threads through all three stacked PRs; keeping it separate keeps those diffs focused on the feature. Best done before the Phase 3 release, while the idle_unload_minutes config value has not yet shipped to users.
2. Hide the OpenAI provider kind behind a compile-time flag before release
Problem. PR #218 and #219 fully build and 100%-test the openai provider kind: the ability to connect Thuki to any OpenAI-compatible server (LM Studio, Jan, a remote llama-server), including the /v1 routing arm, keychain.rs API-key storage, and the Settings add/edit/remove-provider UI (ProviderCards.tsx). Product decision (2026-06-15): Thuki ships local-only, the bundled built-in engine plus local or remote Ollama. We do not want the OpenAI path visible to end users at ship time, but the code is built, reviewed, and fully tested, so we keep it rather than removing it (removal would be real surgery, mostly across #219's Settings UI).
Gate the path behind a compile-time, dev-only flag, default OFF. Shipped builds exclude the affordance entirely; only developers can enable it at build time. Nothing appears in config.toml or the docs.
Scope: UI only. Gate only the user-facing Settings affordance(s) that let a user create or manage an openai provider. The backend (config acceptance of the kind, the /v1 routing arm, keychain.rs) stays active and tested. Rationale: with no UI to create an openai provider, no end user can ever reach one, so nothing downstream (model picker, chat) surfaces it; the dormant, tested backend stays as-is.
Required before release. Must land before the Phase 3 release that would otherwise expose the path to end users.
Note: this gates the openaiprovider kind, not the /v1 client itself. The bundled built-in engine speaks /v1 and depends on that client, so it must stay fully active; built-in chat is unaffected.
Implementation sketch.
A frontend build-time constant (for example a Vite define global or an import.meta.env flag), tree-shaken out when false, gating the "add OpenAI provider" affordance in src/settings/tabs/ProviderCards.tsx and any openai-only controls (base-URL edit, API-key field, vision flag). Default false in production builds.
No Rust changes required under the UI-only scope.
Coverage: the 100% gate must still hold with the flag off. Decide the approach when implementing (for example, drive the constant so tests exercise the enabled branch, or cover both branches).
Accepted edge. Under a UI-only gate, a hand-edited config.toml that already contains an openai provider would still be honored by the backend. Acceptable: an end user cannot reach that state (no UI to create one), and it is a developer-only scenario.
Backend that stays live (do not gate): src-tauri/src/openai.rs (the /v1 client, shared with built-in), src-tauri/src/keychain.rs, the openai routing arm in src-tauri/src/commands.rs, the openai kind in src-tauri/src/config/{schema,loader}.rs.
Backend CRUD for openai providers: src-tauri/src/settings_commands.rs.
Why deferred: the code is intentionally kept whole and tested in #218/#219; the flag is a small, focused frontend change best made on its own. Must land before the Phase 3 release so the path is never exposed to end users.
3. Remove internal phase jargon from shipping code, comments, and tests
Problem. The codebase leaks two internal phasing vocabularies that mean nothing to a developer reading the repo cold:
"Phase 1/2/3": the local-inference-engine project phases (e.g. "Phase 2 bundles the engine", "Phase 3 onboarding gate").
"Phase A/B": an older capability-feature phasing (compose vs history capability checks), unrelated to the engine.
Comments and section headers should describe the functionality or technical behavior, not the internal delivery phase. The docs (docs/configurations.md, CLAUDE.md) already drop the phase wording on the #219 branch; this item covers the code, comments, and tests, which no PR cleans.
Scope. Reword these to describe the behavior instead of the phase:
src/utils/capabilityConflicts.ts (post-Phase-A, Phase B) and src/utils/__tests__/capabilityConflicts.test.ts.
src/utils/sanitizeAssistantContent.ts (pre-Phase-B) and its test.
src-tauri/src/models/mod.rs ("Phase 3 onboarding gate", "full detection is Phase 3").
src-tauri/src/settings_commands/tests.rs ("builtin active since Phase 2").
src-tauri/src/screenshot.rs: uses "Phase 1 / Phase 2" as local algorithm steps; reword to "Step 1 / Step 2" to avoid the vocabulary clash.
Leave alone (legitimate, not jargon): the animation phase in src/App.tsx (morph state) and the download mmproj phase in DownloadProgress.test.tsx (a real download stage).
Why one pass. The references span pre-existing main code and the stacked PRs and are purely cosmetic (comments, a couple of section headers, test labels). Cleanest as one focused sweep landed around the rest of the stack. Verify the docs are phase-free once #219 merges.
Tracking issue for follow-ups surfaced while building and reviewing the Phase 2 local inference engine stack (#217, #218, #219). These are deliberately kept out of those PRs to avoid scope creep; each can land on its own. The whole feature is unreleased (release held until Phase 3), so config-shape changes captured here are still cheap and migration-free.
Append new items to the checklist as they come up during review.
Follow-ups
1. Unify the two model-residency knobs into one field
Problem. The residency timeout is configured by two separate fields that mean the same thing but behave inconsistently:
0means-1meanskeep_warm_inactivity_minutes(i32)keep_alivepass-throughidle_unload_minutes(u32)The same value
0does opposite things across the two fields, and neither name says which provider it governs. The schema even needs a comment on one field pointing at the other just to disambiguate them.Proposed fix. Collapse onto the better-designed scheme (the Ollama one):
keep_warm_inactivity_minutes(already onmain, already signed with the sentinel scheme), and removeidle_unload_minutes.-1to no idle timer (forever),0to the natural short default (~5 min),Nto N minutes. The runner's internal "0 = disabled" convention stays an implementation detail behind that one small, testable translator. The Ollama path does not change.docs/configurations.md: merge the two rows, update the example TOML.Unified semantics:
-1keep resident forever;0use the provider's natural short default (about 5 minutes: for Ollama this defers to Ollama, for built-in Thuki applies its own 5-minute timer);Nkeep warm for N minutes of inactivity, then unload.Decision needed: the built-in default. Today built-in defaults to "resident forever" (
idle_unload_minutes = 0). Under the unified scheme0means "~5 min", so a0default would change built-in to unload after about 5 minutes idle by default. Recommendation: default to0(~5 min). Built-in is now the default provider on every fresh install, and pinning multiple GB of RAM resident forever with no setting touched is too aggressive on a 16 GB Mac; ~5 minutes keeps it instant during an active session and frees RAM when idle, and power users can set-1. Alternative: default-1to preserve the current "always resident" behavior.Tradeoff. A single field cannot express different residency for Ollama vs built-in at the same time. Judged YAGNI for a single-active-provider app.
Pointers.
src-tauri/src/config/schema.rs(both fields),src-tauri/src/config/defaults.rs(DEFAULT_*/BOUNDS_*,ALLOWED_FIELDS),src-tauri/src/config/loader.rs(clamps).src-tauri/src/lib.rsandsrc-tauri/src/commands.rs(theif keep_warm_inactivity_minutes == 0branches),warmup::keep_alive_string.src-tauri/src/settings_commands.rs(forward_idle_unload_minutes,idle_unload_minutes_changed) toengine.set_idle_minutes, plus startup seeding inlib.rs.src/settings/tabs/ProviderCards.tsxandModelTab.tsx.Why deferred: it is a clean, self-contained change, but it threads through all three stacked PRs; keeping it separate keeps those diffs focused on the feature. Best done before the Phase 3 release, while the
idle_unload_minutesconfig value has not yet shipped to users.2. Hide the OpenAI provider kind behind a compile-time flag before release
Problem. PR #218 and #219 fully build and 100%-test the
openaiprovider kind: the ability to connect Thuki to any OpenAI-compatible server (LM Studio, Jan, a remotellama-server), including the/v1routing arm,keychain.rsAPI-key storage, and the Settings add/edit/remove-provider UI (ProviderCards.tsx). Product decision (2026-06-15): Thuki ships local-only, the bundled built-in engine plus local or remote Ollama. We do not want the OpenAI path visible to end users at ship time, but the code is built, reviewed, and fully tested, so we keep it rather than removing it (removal would be real surgery, mostly across #219's Settings UI).Decision (ratified 2026-06-15).
config.tomlor the docs.openaiprovider. The backend (config acceptance of the kind, the/v1routing arm,keychain.rs) stays active and tested. Rationale: with no UI to create anopenaiprovider, no end user can ever reach one, so nothing downstream (model picker, chat) surfaces it; the dormant, tested backend stays as-is.Note: this gates the
openaiprovider kind, not the/v1client itself. The bundled built-in engine speaks/v1and depends on that client, so it must stay fully active; built-in chat is unaffected.Implementation sketch.
defineglobal or animport.meta.envflag), tree-shaken out when false, gating the "add OpenAI provider" affordance insrc/settings/tabs/ProviderCards.tsxand any openai-only controls (base-URL edit, API-key field, vision flag). Default false in production builds.Accepted edge. Under a UI-only gate, a hand-edited
config.tomlthat already contains anopenaiprovider would still be honored by the backend. Acceptable: an end user cannot reach that state (no UI to create one), and it is a developer-only scenario.Pointers.
src/settings/tabs/ProviderCards.tsx(add/edit/remove openai provider, API-key field, vision flag).src-tauri/src/openai.rs(the/v1client, shared with built-in),src-tauri/src/keychain.rs, theopenairouting arm insrc-tauri/src/commands.rs, theopenaikind insrc-tauri/src/config/{schema,loader}.rs.src-tauri/src/settings_commands.rs.Why deferred: the code is intentionally kept whole and tested in #218/#219; the flag is a small, focused frontend change best made on its own. Must land before the Phase 3 release so the path is never exposed to end users.
3. Remove internal phase jargon from shipping code, comments, and tests
Problem. The codebase leaks two internal phasing vocabularies that mean nothing to a developer reading the repo cold:
Comments and section headers should describe the functionality or technical behavior, not the internal delivery phase. The docs (
docs/configurations.md,CLAUDE.md) already drop the phase wording on the #219 branch; this item covers the code, comments, and tests, which no PR cleans.Scope. Reword these to describe the behavior instead of the phase:
src/utils/capabilityConflicts.ts(post-Phase-A, Phase B) andsrc/utils/__tests__/capabilityConflicts.test.ts.src/utils/sanitizeAssistantContent.ts(pre-Phase-B) and its test.src/components/ChatBubble.tsx(pre-Phase-B history).src/App.tsx("Phase B history-based capability strip").src-tauri/src/commands.rs("Phase B picker hint", "pre-routing Phase 1 payload").src-tauri/src/config/defaults.rs,loader.rs,tests.rs(Phase 1/2 provider-seeding comments).src-tauri/src/models/mod.rs("Phase 3 onboarding gate", "full detection is Phase 3").src-tauri/src/settings_commands/tests.rs("builtin active since Phase 2").src-tauri/src/screenshot.rs: uses "Phase 1 / Phase 2" as local algorithm steps; reword to "Step 1 / Step 2" to avoid the vocabulary clash.Leave alone (legitimate, not jargon): the animation
phaseinsrc/App.tsx(morph state) and the downloadmmproj phaseinDownloadProgress.test.tsx(a real download stage).Why one pass. The references span pre-existing
maincode and the stacked PRs and are purely cosmetic (comments, a couple of section headers, test labels). Cleanest as one focused sweep landed around the rest of the stack. Verify the docs are phase-free once #219 merges.