feat: custom OpenAI-compatible providers (rapid-MLX + Ollama Cloud) with dynamic model discovery#6
feat: custom OpenAI-compatible providers (rapid-MLX + Ollama Cloud) with dynamic model discovery#6cbrown350 wants to merge 11 commits into
Conversation
Adds two new providers to the OpenFusion model selector:
- **rapid-mlx**: Local MLX inference server for Apple Silicon.
Runs on localhost:1234, no API key required.
Ships with 4 popular MLX model entries (Llama 3.2, Qwen 2.5,
Mistral Nemo, DeepSeek R1 Distill).
- **ollama-cloud**: Ollama's hosted cloud API at api.ollama.com.
Requires an API key. Ships with 6 common model entries
(Llama 3.3/3.1, Mistral, Qwen 2.5, DeepSeek R1, Gemma 2).
Architecture:
- src/providers/custom-providers.ts defines provider configs with
model descriptors, baseUrl, api type, and keyless flag.
- pi-ai-bridge.ts augments listProviders() and listModels() to
include custom providers, and registerCustomProviders() registers
model descriptors with pi-ai for resolveModel() to work at fusion
time. Called at startup from both index.ts and ui-only.ts.
- A sentinel API key ('no-key') is injected for keyless providers
via effectiveApiKey(), since pi-ai's OpenAI completions provider
throws on falsy apiKeys. The completeness gate skips keyless
providers when checking for missing API keys.
- Provider API endpoint now returns enriched metadata (name,
description, keyless flag) so the UI can display friendly names
and hide the key input for local providers.
- ApiKeys page shows 'not required' badge for keyless providers
instead of 'missing'.
- Candidates and Judge pages show friendly provider names in
dropdowns instead of raw provider ids.
- 17 new tests covering registration, listing, resolution, keyless
handling, and model descriptor shape.
…ders The initial implementation had fabricated model lists that didn't match reality. This commit replaces them with a proper dynamic discovery system: - Removed all hardcoded model entries from rapid-mlx and ollama-cloud. Models depend on what's actually running/available, not a static list. - Fixed rapid-mlx base URL: localhost:8000 (not :1234, which was wrong). - Fixed ollama-cloud base URL: https://ollama.com/v1 (not api.ollama.com). - Added 'discoverable' flag to CustomProviderDefinition — both custom providers are discoverable via the /v1/models endpoint. - Added GET /api/providers/:provider/discover endpoint that dynamically fetches models from the provider's /v1/models at runtime. - Added registerCustomModel() to pi-ai-bridge for registering discovered or user-typed models on the fly, so resolveModel() works at fusion time. - Candidates and Judge pages now show a text input + datalist for discoverable providers (with a 'Discover' button) instead of a static dropdown. Built-in providers still use the static dropdown. - listModels() returns empty for discoverable providers (their models come from runtime discovery, not a static registry). - 21 tests updated to cover dynamic registration, discovery, and corrected base URLs.
Candidates and Judge pages now show a text input with datalist for discoverable providers (rapid-mlx, ollama-cloud) instead of a static dropdown. A 'Discover' button fetches available models from the provider's /v1/models endpoint at runtime. - ProviderInfo type gains 'discoverable' boolean - Candidates/Judge: discoverable providers get input+datalist+Discover button; built-in providers keep the static select dropdown - Model IDs from discovery are registered with the bridge so they resolve correctly at fusion time
… discovery Both rapid-mlx and ollama-cloud now exclusively use /v1/models discovery to populate their model lists. No hardcoded model entries. - rapid-mlx: models depend on what the user has loaded locally - ollama-cloud: models come from ollama.com's /v1/models endpoint The UI auto-discovers models for providers referenced in the saved config on page load, so the model dropdown shows the saved model immediately instead of 'Focus to load…'.
…or ollama-cloud - Add 'local' field to CustomProviderDefinition: rapid-mlx is local (may be unreachable), ollama-cloud is not (cloud API, always reachable) - UI shows free-text input + Discover button only for local providers - Cloud discoverable providers (ollama-cloud) get a normal model dropdown that fetches models via /v1/models automatically - Eager-load models for all providers on page load so saved models appear immediately instead of showing 'Focus to load…' - Always include saved model as a dropdown option even before model list loads - Backend /models endpoint now returns discovered models for ALL discoverable providers (not empty), so cloud providers show a proper dropdown - /discover endpoint restricted to local providers only (cloud providers don't need a manual retry) - Added test for RAPID_MLX.local === true, OLLAMA_CLOUD.local === false
- Add loadingProvider state to both Candidates and Judge pages - Replace broken Map constructor with typed mergeModelLists helper - Fix ProviderModel typing for saved model options and contextWindow - All 22 tests pass, build clean
Custom providers (rapid-mlx, ollama-cloud) failed at fusion time because resolveModel() couldn't find them — models were only registered when the UI called /api/providers/:provider/models, but a fresh server start had no registrations. - Add registerConfigModels() to pi-ai-bridge: reads candidates + judges from the config and calls registerCustomModel() for each custom provider entry, so resolveModel() works without any prior UI call - Call registerConfigModels() at startup in index.ts and ui-only.ts - Call registerConfigModels() after every config save (PUT /api/config) so newly-added models are immediately resolvable - Add 3 tests covering: model resolution, empty model skip, built-in skip
Addresses the pre-PR review of the custom-providers feature. No engine
(fusion) changes; only the provider bridge, API routes, UI, and tests.
- server/api/providers.ts: surface discovery failures as {models:[],error}
instead of silently returning [] (Constitution V — no silent failures);
route discovery auth through a new discoveryApiKey() helper using
KEYLESS_PROVIDERS, dropping the cross-module `=== "no-key"` sentinel
comparison.
- ui/pages/Candidates + Judge: load models lazily on focus, not eagerly on
every mount (the eager fetch hung the UI up to 10s per unreachable local
provider on each page visit); don't cache an empty list on error so a
retry can fire; restore the "Focus to load…" placeholder; surface the
discovery error via setMsg.
- providers/pi-ai-bridge: remove the registerCustomProviders() no-op
(YAGNI / Constitution VII); document that the keyless sentinel is
module-private and callers must treat effectiveApiKey()'s value as opaque.
- index + ui-only: drop the silent startup try/catch — loadConfig() already
returns emptyConfig for a missing file, so a corrupt config.json now
fails loudly instead of being swallowed.
- Extract mergeModelLists to ui/src/lib/models.ts (was duplicated in
Candidates + Judge).
- config/completeness: document the keyless-provider exemption of the
Constitution VI key check; add isConfigured tests covering keyless-pass,
keyed-fail, and keyed-with-key.
- providers/custom-providers: document the hardcoded contextWindow/
maxTokens/cost defaults, and that the feature covers BOTH a local server
(rapid-mlx) AND a cloud provider (ollama-cloud).
Verified: pnpm typecheck, pnpm build, pnpm test (199 passed, incl. 28
custom-provider tests) all green.
There was a problem hiding this comment.
Code Review
This pull request introduces support for custom providers (specifically the local, keyless rapid-mlx and the cloud-based ollama-cloud) with dynamic model discovery, keyless provider handling in the completeness gate, and automatic registration of referenced models at startup. The UI has been updated to support model discovery and free-text inputs for local providers. The review feedback highlights several critical issues: a potential server crash on startup in registerConfigModels when the configuration is empty, missing defensive checks when parsing model discovery responses, and race conditions in both the candidates and judge pages caused by using a single string state to track concurrent provider loading.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| export function registerConfigModels(config: { candidates: Array<{ provider: string; model: string }>; judges: Array<{ provider: string; model: string }> }): void { | ||
| const entries = [...config.candidates, ...config.judges]; | ||
| for (const { provider, model } of entries) { | ||
| if (CUSTOM_PROVIDERS[provider] && model) { | ||
| registerCustomModel(provider, model); | ||
| } | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
The registerConfigModels function destructures config.candidates and config.judges directly. However, on a fresh install or first run, loadConfig() can return an empty config object where these properties are undefined. This will cause a TypeError: config.candidates is not iterable and crash the server on startup. Use optional chaining and nullish coalescing to safely handle missing or empty configuration arrays.
| export function registerConfigModels(config: { candidates: Array<{ provider: string; model: string }>; judges: Array<{ provider: string; model: string }> }): void { | |
| const entries = [...config.candidates, ...config.judges]; | |
| for (const { provider, model } of entries) { | |
| if (CUSTOM_PROVIDERS[provider] && model) { | |
| registerCustomModel(provider, model); | |
| } | |
| } | |
| } | |
| export function registerConfigModels(config: { candidates?: Array<{ provider: string; model: string }>; judges?: Array<{ provider: string; model: string }> }): void { | |
| const entries = [...(config?.candidates ?? []), ...(config?.judges ?? [])]; | |
| for (const { provider, model } of entries) { | |
| if (CUSTOM_PROVIDERS[provider] && model) { | |
| registerCustomModel(provider, model); | |
| } | |
| } | |
| } |
| const json = (await resp.json()) as DiscoveryResponse; | ||
| const models = Array.isArray(json.data) ? json.data : []; | ||
| return models.map((m) => m.id).sort(); |
There was a problem hiding this comment.
When parsing the response from the /v1/models endpoint, json or json.data could be null or contain unexpected non-object elements if the custom/local provider returns an invalid or non-compliant response. To prevent runtime crashes, add defensive checks to ensure json is valid and filter the elements to guarantee they are objects with a string id.
const json = (await resp.json()) as DiscoveryResponse | null;
const models = json && Array.isArray(json.data) ? json.data : [];
return models
.filter((m) => m && typeof m === "object" && typeof m.id === "string")
.map((m) => m.id)
.sort();| const [providers, setProviders] = useState<string[]>([]); | ||
| const [providerList, setProviderList] = useState<ProviderInfo[]>([]); | ||
| const [modelsByProvider, setModelsByProvider] = useState<Record<string, ProviderModel[]>>({}); | ||
| const [loadingProvider, setLoadingProvider] = useState<string | null>(null); |
There was a problem hiding this comment.
Using a single string loadingProvider to track the loading state of providers introduces a race condition. If multiple providers are loading concurrently, they will overwrite each other's state, and the first one to finish will set loadingProvider to null, prematurely clearing the loading indicator for others. Additionally, there is no guard to prevent duplicate concurrent requests for the same provider. Using a Record<string, boolean> to track loading states per provider solves both issues.
| const [loadingProvider, setLoadingProvider] = useState<string | null>(null); | |
| const [loadingProviders, setLoadingProviders] = useState<Record<string, boolean>>({}); |
| const loadModels = async (provider: string) => { | ||
| if (modelsByProvider[provider]) return; | ||
| if (modelsByProvider[provider] !== undefined) return; | ||
| setLoadingProvider(provider); | ||
| try { | ||
| const r = await api.getModels(provider); | ||
| // On a discovery failure (auth rejected / server unreachable) the server | ||
| // returns { models: [], error }. Surface the error but DON'T cache the | ||
| // empty list — leaving modelsByProvider[provider] undefined lets a later | ||
| // focus retry, instead of permanently poisoning the cache (a [] is truthy | ||
| // and would short-circuit the guard above forever). | ||
| if (r.error) { | ||
| setMsg(r.error); | ||
| return; | ||
| } | ||
| setModelsByProvider((m) => ({ ...m, [provider]: r.models })); | ||
| } catch { | ||
| /* ignore */ | ||
| } catch (e) { | ||
| setMsg(`Failed to load models for ${provider}: ${(e as Error).message}`); | ||
| } finally { | ||
| setLoadingProvider(null); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Update loadModels to use the new loadingProviders record. This prevents concurrent duplicate requests for the same provider and avoids race conditions when multiple providers are loading.
const loadModels = async (provider: string) => {
if (modelsByProvider[provider] !== undefined || loadingProviders[provider]) return;
setLoadingProviders((lp) => ({ ...lp, [provider]: true }));
try {
const r = await api.getModels(provider);
if (r.error) {
setMsg(r.error);
return;
}
setModelsByProvider((m) => ({ ...m, [provider]: r.models }));
} catch (e) {
setMsg(`Failed to load models for ${provider}: ${(e as Error).message}`);
} finally {
setLoadingProviders((lp) => ({ ...lp, [provider]: false }));
}
};
| const isLocal = pInfo?.local ?? false; | ||
| const models = modelsByProvider[c.provider] ?? []; | ||
| const loaded = modelsByProvider[c.provider] !== undefined; | ||
| const isLoading = loadingProvider === c.provider; |
| const [providers, setProviders] = useState<string[]>([]); | ||
| const [providerList, setProviderList] = useState<ProviderInfo[]>([]); | ||
| const [modelsByProvider, setModelsByProvider] = useState<Record<string, ProviderModel[]>>({}); | ||
| const [loadingProvider, setLoadingProvider] = useState<string | null>(null); |
There was a problem hiding this comment.
Using a single string loadingProvider to track the loading state of providers introduces a race condition. If multiple providers are loading concurrently, they will overwrite each other's state, and the first one to finish will set loadingProvider to null, prematurely clearing the loading indicator for others. Additionally, there is no guard to prevent duplicate concurrent requests for the same provider. Using a Record<string, boolean> to track loading states per provider solves both issues.
| const [loadingProvider, setLoadingProvider] = useState<string | null>(null); | |
| const [loadingProviders, setLoadingProviders] = useState<Record<string, boolean>>({}); |
| const loadModels = async (provider: string) => { | ||
| if (modelsByProvider[provider]) return; | ||
| if (modelsByProvider[provider] !== undefined) return; | ||
| setLoadingProvider(provider); | ||
| try { | ||
| const r = await api.getModels(provider); | ||
| // On a discovery failure the server returns { models: [], error }. Surface | ||
| // the error but DON'T cache the empty list — leaving the cache undefined | ||
| // lets a later focus retry, instead of permanently poisoning it ([] is | ||
| // truthy and would short-circuit the guard above forever). | ||
| if (r.error) { | ||
| setMsg(r.error); | ||
| return; | ||
| } | ||
| setModelsByProvider((m) => ({ ...m, [provider]: r.models })); | ||
| } catch { | ||
| /* ignore */ | ||
| } catch (e) { | ||
| setMsg(`Failed to load models for ${provider}: ${(e as Error).message}`); | ||
| } finally { | ||
| setLoadingProvider(null); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Update loadModels to use the new loadingProviders record. This prevents concurrent duplicate requests for the same provider and avoids race conditions when multiple providers are loading.
const loadModels = async (provider: string) => {
if (modelsByProvider[provider] !== undefined || loadingProviders[provider]) return;
setLoadingProviders((lp) => ({ ...lp, [provider]: true }));
try {
const r = await api.getModels(provider);
if (r.error) {
setMsg(r.error);
return;
}
setModelsByProvider((m) => ({ ...m, [provider]: r.models }));
} catch (e) {
setMsg(`Failed to load models for ${provider}: ${(e as Error).message}`);
} finally {
setLoadingProviders((lp) => ({ ...lp, [provider]: false }));
}
};
| const isLocal = pInfo?.local ?? false; | ||
| const models = modelsByProvider[j.provider] ?? []; | ||
| const loaded = modelsByProvider[j.provider] !== undefined; | ||
| const isLoading = loadingProvider === j.provider; |
Addresses gemini-code-assist review on PR hashangit#6. - registerConfigModels: tolerate a partial/empty config (optional chaining + `?? []`) so a caller passing undefined candidates/judges can't crash the export. NOTE: the described startup crash can't actually occur — loadConfig()'s RawConfigSchema defaults candidates/judges to [] (schema.ts :85-86) — but the hardening is still worth it for a public export. - discoverModels: defensively parse the /v1/models response — tolerate a null body, missing/non-array `data`, and non-object entries; keep only elements shaped like { id: string }. Prevents a malformed/non-compliant provider response from crashing discovery. - Candidates/Judge: replace the single `loadingProvider: string | null` slot with a per-provider `loadingProviders: Record<string, boolean>` so concurrent loads don't clobber each other's loading indicator, and add an in-flight guard so rapid focus/switch can't fire duplicate requests for the same provider. Verified: pnpm typecheck, pnpm build, pnpm test (199 passed) all green.
|
Thanks for the review @gemini-code-assist — all three points addressed in 70c0ab6. Typecheck, build, and the full suite (199 tests) are green. 1. 2. Defensive parsing of 3. No engine ( |
|
Thanks for the update, @cbrown350. The hardening of |
…ager installs) This branch accidentally added package-lock.json (npm's lockfile) in bdcc7f9. The project is pnpm-only (pnpm-lock.yaml + pnpm-workspace.yaml; CONTRIBUTING says 'use pnpm, not npm'). Shipping an npm lockfile misleads users into running `npm install`, which (a) can resolve @types/node differently because transitive deps declare "@types/node": "*", and (b) uses npm's cache, which on Windows can serve a truncated/corrupt file (e.g. perf_hooks.d.ts cut off mid-line, causing tsc TS1010). pnpm's content-addressable store integrity-checks every file, so it won't ship a truncated copy. Also gitignore package-lock.json + ui/package-lock.json so it can't return.
|
Windows build: Root cause was a corrupt/truncated That surfaced a deeper issue: this branch had accidentally committed Fixed in 1bef7ae: removed To recover on Windows (from the repo root): Remove-Item -Recurse -Force node_modules, ui\node_modules
pnpm install
pnpm buildIf a truncated file reappears (corrupt pnpm store entry), clear it first: |
…RE drift Addresses the maintainer's fix-then-merge review on PR hashangit#6 / issue hashangit#5. Major — discoverModels + server error shapes now tested: - tests/custom-providers.test.ts: vi.spyOn(globalThis, "fetch") unit tests for discoverModels — (a) happy 200 {data:[{id}]} sorted + auth header for keyed / none for keyless, (b) data:null / missing data -> [], (c) non-200 throws with status, (d) fetch rejection (abort/timeout) propagates; plus non-object / non-string-id entries are filtered. - tests/providers-api.test.ts: route-level tests for the two server error shapes — GET /models returns {models:[], error} on discovery failure (e.g. 401) and {models:[]} with no error for a 200 with non-array data; GET /discover returns 502 {error} on failure and 404 for a non-local provider. fetch is mocked with URL-based dispatch so the in-process Express test server is hit for real while provider /v1/models URLs return the per-test mock (no real network calls). Nits: - Remove CONTEXT.md from the repo (stray session-notes file, never meant to be committed). Kept locally + gitignored so it can't return. - ARCHITECTURE.md: correct the Constitution VI drift — isConfigured() now reads "every referenced provider that needs a key" with keyless providers exempt (was "all keys present" / "every referenced provider has a key"); update the /api/providers + /api/providers/:p/models rows and add /api/providers/:p/discover. Nit 3 (soft in-flight guard via useRef<Set>) deferred per the owner's guidance — the redundant GET it would prevent is idempotent and the trigger paths don't batch in the same tick. Verified: pnpm typecheck, pnpm build, pnpm test (211 passed, 22 files) green.
|
Thanks @hashangit for the thorough merge-gate review — all three items addressed in f23b903. typecheck, build, and the full suite (211 passed, 22 files) are green. 🟠 Major —
🟡 Nit 1 — 🟡 Nit 2 — 🟡 Nit 3 — deferred per your guidance Ready for re-review when you are. |
Closes #5.
Summary
Adds two OpenAI-compatible providers not in
@earendil-works/pi-ai's static registry, each discovered at runtime via its/v1/modelsendpoint (no hardcoded model lists):rapid-mlx— local MLX inference server on Apple Silicon (http://localhost:8000/v1), keyless. UI: free-text model input + Discover button (server may be down).ollama-cloud— Ollama's hosted cloud API (https://ollama.com/v1), key required. UI: normal dropdown from the live cloud catalog.What changed
New
src/providers/custom-providers.ts— provider definitions (RAPID_MLX,OLLAMA_CLOUD),KEYLESS_PROVIDERS,buildModelDescriptor,discoverModels.ui/src/lib/models.ts— sharedmergeModelListshelper.tests/custom-providers.test.ts— 28 tests (registration, discovery, keyless handling,isConfiguredgate,registerConfigModels).Server
providers/pi-ai-bridge.ts— module-levelmodelOverridesmap +registerCustomModel/registerConfigModels/resolveModeloverride path;effectiveApiKey(keyless sentinel for the completion path only);listProvidersmerges custom ids.server/api/providers.ts— enriched/api/providersmetadata;/modelsdoes live discovery for discoverable providers and returns{models, error?}on failure;/discoverfor local manual retry.config/completeness.ts— Constitution VI gate exempts keyless providers from the key check.server/api/secrets.ts— annotateskeylessprovider ids (masked presence only, unchanged).server/api/test.ts— keyless providers don't require a posted key.server/api/config.ts— re-registers custom models after a config save.index.ts/ui-only.ts— register config-referenced custom models at startup.UI
Candidates.tsx/Judge.tsx— provider dropdowns show names; local providers get free-text + Discover; cloud/built-in get a dropdown with the saved model always visible; models load lazily on focus.api.ts/ApiKeys.tsx—ProviderInfotype; keyless providers show "not required" and hide the key input.Constitution considerations
KEYLESS_PROVIDERSand sends noAuthorizationheader for keyless providers.fusion.tsrouteapiKeythrougheffectiveApiKey).fetch, no new runtime deps; shared helper extracted instead of duplicated.Known limitation
buildModelDescriptorbakes in defaultcontextWindow(131072) /maxTokens(8192) /cost: 0for discovered/typed models — the OpenAI/v1/modelsresponse doesn't carry those fields. The dashboard's per-model context badge may be inaccurate for custom models; fusion correctness is unaffected. Documented in the module header.Verification
pnpm typecheck(strict): ✅pnpm build(tsc + vite): ✅pnpm test: ✅ 199 passed (incl. 28 new custom-provider tests)Review process
This branch went through two rounds of multi-model architectural review (OpenFusion's own fusion panel) plus my own audit. The first pass surfaced must-fix items (eager-load regression, silent discovery errors, no-op dead code, DRY duplication, unverified cloud endpoint) and should-fix items (sentinel leak, silent startup catch, Constitution VI comment/test, undocumented defaults). All were addressed; the second pass found one regression introduced by the error-surfacing fix (caching an empty list on error, blocking retry) which was fixed before opening this PR.