feat: OpenAI-compatible /v1 client and provider routing by quiet-node · Pull Request #218 · quiet-node/thuki

quiet-node · 2026-06-12T21:56:46Z

Overview

This PR makes Thuki's bundled inference engine reachable and adds support for OpenAI-compatible servers: a generic OpenAI-compatible /v1 chat client, a new openai provider kind, and routing of every LLM consumer (chat, search, title generation, warmup, capabilities) by the active provider's kind. Stacked on the engine-runner PR (#217). Fresh installs still default to Ollama at this point in the stack; the default flips in #219.

What changed

openai provider kind: [[inference.providers]] entries with kind = "openai" (label, base_url, model, and a manual vision flag). API keys are stored write-only in the macOS Keychain (keychain.rs), never in config.toml and never returned over IPC.
Legacy-config pin: configs written before the providers list existed are pinned to Ollama on load, regardless of what the default is or becomes. Existing users keep their working setup; only fresh installs follow the compiled default.
/v1 SSE client (openai.rs): streaming chat with content parts, bounded SSE line buffering, cancellation, and typed error classification (EngineUnreachable, EngineStartFailed, model-not-found and auth variants).
Routing by kind (commands.rs): ask_model dispatches builtin chat through the engine runner plus /v1, Ollama through native /api/chat, and openai through /v1 against the provider's base_url. Search and title generation run through a provider-kind LlmTransport; warmup, eviction, the VRAM poller, and capability detection all branch by kind, with a /props probe gating vision for builtin models.
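
As a rough illustration of the routing-by-kind item above, the chat dispatch could be reduced to a single match on the provider's kind. This is a minimal sketch and not the code in commands.rs: `ProviderKind`, `ChatRoute`, `Provider`, and `route_chat` are hypothetical names, and the rule that an openai provider without a base_url is rejected is an assumption.

```rust
/// Hypothetical provider kinds; the real enum in the codebase may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderKind {
    Builtin,
    Ollama,
    OpenAi,
}

/// Where a chat request ends up (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum ChatRoute {
    /// Engine runner first, then the local /v1 server.
    BuiltinV1,
    /// Ollama's native /api/chat endpoint.
    OllamaNative,
    /// /v1 against the provider's own base_url.
    OpenAiV1 { base_url: String },
}

/// Hypothetical slice of a provider entry: only what routing needs.
pub struct Provider {
    pub kind: ProviderKind,
    pub base_url: Option<String>,
}

/// Pick the chat route from the provider kind alone.
/// Assumption: an openai provider with no base_url is a config error.
pub fn route_chat(provider: &Provider) -> Result<ChatRoute, String> {
    match provider.kind {
        ProviderKind::Builtin => Ok(ChatRoute::BuiltinV1),
        ProviderKind::Ollama => Ok(ChatRoute::OllamaNative),
        ProviderKind::OpenAi => provider
            .base_url
            .clone()
            .map(|base_url| ChatRoute::OpenAiV1 { base_url })
            .ok_or_else(|| "openai provider requires base_url".to_string()),
    }
}
```

Keeping the decision in one pure function like this would let each kind be unit-tested without a running server, which fits the per-kind coverage described under Testing.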

How it works

The active provider is resolved from managed config per request; builtin requests first await ensure_loaded on the engine runner, then stream from the local server like any other /v1 endpoint. Capabilities are cached per (provider_id, model). All three kinds emit the same StreamChunk channel contract, so the frontend streaming path is provider-agnostic.
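
The per-(provider_id, model) capability cache described above might look roughly like the following. It is a sketch under stated assumptions: `Capabilities`, `CapabilityCache`, and `get_or_probe` are hypothetical names, the only capability modeled is vision, and the probe is passed in as a closure so the example stays independent of any real /props or Ollama call.

```rust
use std::collections::HashMap;

/// Hypothetical capability record; only the vision flag is modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    pub vision: bool,
}

/// Cache keyed by (provider_id, model), as the description states.
#[derive(Default)]
pub struct CapabilityCache {
    entries: HashMap<(String, String), Capabilities>,
}

impl CapabilityCache {
    /// Return the cached entry, running `probe` only on a cache miss.
    pub fn get_or_probe(
        &mut self,
        provider_id: &str,
        model: &str,
        probe: impl FnOnce() -> Capabilities,
    ) -> Capabilities {
        let key = (provider_id.to_string(), model.to_string());
        *self.entries.entry(key).or_insert_with(probe)
    }
}
```

Keying on both fields means switching models within one provider, or reusing a model name across providers, triggers a fresh probe instead of reusing a stale answer.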

Testing

Both gates exit 0 at the branch head. The /v1 client is covered against mock servers (SSE framing, oversized lines, missing done markers, cancellation, auth and 404 classification); routing, warmup, and capability branching are covered per kind in backend unit tests.
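
To make the SSE framing and oversized-line cases concrete, here is a minimal sketch of a bounded line buffer of the kind such tests could exercise. It is not the implementation in openai.rs: `SseLineBuffer`, `SseError`, and the byte limit are hypothetical, and only `data:` fields are handled.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum SseError {
    LineTooLong { limit: usize },
}

/// Accumulates bytes until a newline, refusing lines above `max_line` bytes.
pub struct SseLineBuffer {
    buf: Vec<u8>,
    max_line: usize,
}

impl SseLineBuffer {
    pub fn new(max_line: usize) -> Self {
        Self { buf: Vec::new(), max_line }
    }

    /// Feed one network chunk and return the `data:` payloads of every
    /// line it completes. A line may span several chunks.
    pub fn push(&mut self, chunk: &[u8]) -> Result<Vec<String>, SseError> {
        let mut out = Vec::new();
        for &byte in chunk {
            if byte == b'\n' {
                let line = std::mem::take(&mut self.buf);
                let text = String::from_utf8_lossy(&line);
                let text = text.trim_end_matches('\r');
                if let Some(data) = text.strip_prefix("data:") {
                    // SSE strips a single leading space after the colon.
                    out.push(data.strip_prefix(' ').unwrap_or(data).to_string());
                }
            } else {
                if self.buf.len() >= self.max_line {
                    // Assumption: an oversized line fails the whole stream.
                    self.buf.clear();
                    return Err(SseError::LineTooLong { limit: self.max_line });
                }
                self.buf.push(byte);
            }
        }
        Ok(out)
    }
}
```

A caller would then treat a `[DONE]` payload as the end of the stream, and would need its own policy for a connection that closes without one.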

This was referenced Jun 12, 2026

feat: bundled engine runner and model library #217

Merged

Phase 2 follow-ups (deferred cleanups and decisions) #220

Closed

Base automatically changed from feat/engine-runner-and-model-library to main June 13, 2026 23:40

quiet-node added 9 commits June 13, 2026 19:40

feat: accept openai-compatible providers in config

2b18725

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: pin legacy configs to the Ollama provider during migration

c077212

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: store provider API keys in the macOS Keychain

704924b

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add generic OpenAI-compatible /v1 client

826499b

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: add EngineStartFailed engine error kind

595037a

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: route chat by provider kind with built-in engine support

329c791

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: run search and title generation through the provider-kind trans…

7fd1118

…port Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

feat: branch warmup, eviction, and capabilities by provider kind

967a5fa

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

fix: resolve the chat model from the route and sync the active-model …

a6eaf2d

…mirror Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>

quiet-node force-pushed the feat/v1-client-and-routing branch from f8d5db6 to a6eaf2d Compare June 13, 2026 23:40

quiet-node mentioned this pull request Jun 13, 2026

feat: built-in engine onboarding, model downloads, Settings providers, and default flip #219

Merged

quiet-node merged commit 25fe634 into main Jun 15, 2026
3 checks passed

quiet-node deleted the feat/v1-client-and-routing branch June 15, 2026 22:35

github-actions Bot mentioned this pull request Jun 13, 2026

chore(main): release 0.15.0 #221

Open

This was referenced Jun 16, 2026

chore: remove the llm-box (Ollama-in-Docker) sandbox #223

Closed

refactor: unify model residency control, gate the OpenAI UI, and remove phase jargon #236

Merged

quiet-node merged 9 commits into main from feat/v1-client-and-routing
