fix(vscode): show model select for Kilo indexing provider#10174
fix(vscode): show model select for Kilo indexing provider#10174marius-kilocode wants to merge 12 commits into
Conversation
The Kilo embedding model catalog is Cloud-managed; the indexing settings panel should pick from a Select, never a free-text input. Hide the generic 'Embedding model' TextField and 'Dimension' row when the provider is Kilo, and keep the Select rendered (disabled, 'Loading models…') while the catalog is empty so users never see a freetext fallback. Also harden the catalog request path so transient failures don't poison the UI: the webview retries while the catalog stays empty (mirrors the indexing.tsx pattern), and the extension only caches non-empty catalogs and replays the last good one before re-fetching.
| let retries = 0 | ||
| const timer = setIntervalFn(() => { | ||
| retries++ | ||
| if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries >= maxRetries) { |
There was a problem hiding this comment.
WARNING: Off-by-one in retry count — KILO_EMBEDDING_MAX_RETRIES retries are not actually sent
The interval callback increments retries before the guard check. When retries reaches maxRetries (5) on the 5th tick, the condition is immediately true and the timer is cleared without posting. So only 4 retry requests are sent (ticks 1–4), not 5 as the constant name implies.
Trace:
- Tick 1: retries=1, 1≥5? No → post (total=2)
- Tick 2: retries=2, 2≥5? No → post (total=3)
- Tick 3: retries=3, 3≥5? No → post (total=4)
- Tick 4: retries=4, 4≥5? No → post (total=5)
- Tick 5: retries=5, 5≥5? Yes → clear, return (no post)
If the intent is exactly MAX_RETRIES retry requests, the guard should be retries > maxRetries (or post before incrementing). If the intent is "fire the interval at most MAX_RETRIES times regardless", renaming to KILO_EMBEDDING_MAX_RETRY_TICKS would clarify, but the comment in indexing.tsx (the mirrored pattern) also implies 5 actual requests.
| if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries >= maxRetries) { | |
| if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries > maxRetries) { |
| for (let i = 0; i < KILO_EMBEDDING_MAX_RETRIES + 5; i++) timers.tick() | ||
|
|
||
| // Initial request + at most MAX_RETRIES re-posts. | ||
| expect(h.posted.length).toBeLessThanOrEqual(1 + KILO_EMBEDDING_MAX_RETRIES) |
There was a problem hiding this comment.
WARNING: Weak assertion — toBeLessThanOrEqual would pass even with zero retries
toBeLessThanOrEqual(6) is satisfied by any value from 0 to 6, meaning a regression that eliminates all retries would still pass this test. Since the behavior is deterministic with fake timers, use an exact equality assertion. Based on the current implementation (4 retries actually sent), the correct value is 1 + KILO_EMBEDDING_MAX_RETRIES - 1. If the off-by-one above is fixed to send exactly MAX_RETRIES retries, it becomes 1 + KILO_EMBEDDING_MAX_RETRIES.
| expect(h.posted.length).toBeLessThanOrEqual(1 + KILO_EMBEDDING_MAX_RETRIES) | |
| expect(h.posted.length).toBe(1 + KILO_EMBEDDING_MAX_RETRIES) |
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Previously Flagged — All Fixed ✅
Incremental Review (commit d53e31d)Only change since last review: cosmetic style fix replacing an em dash with a comma in a JSX comment in Files Reviewed (18 files)
Reviewed by claude-sonnet-4.6 · 284,720 tokens |
Kilo-hosted embedding models use Cloud catalog dimensions for vector-store setup, but the embedding request itself should not forward a dimensions override through the gateway. Some upstream embedding APIs reject that parameter even when it matches the native dimension. Also move the Kilo embedding catalog retry helper into a JSX-free module so unit tests can import it in CI without requiring the Solid JSX runtime, and tighten the retry-cap assertion.
|
Pushed a follow-up commit that covers the CI failure, the review nits, and the runtime embedding error:
I checked the Cloud repo too. I don't see a recent Cloud-side change to dimension handling after the embedding endpoint work. The relevant Cloud code still forwards Validation run locally:
|
| * function cleans up the subscription and timer. | ||
| * | ||
| * Lives in a JSX-free module so unit tests can import it without pulling in | ||
| * the SolidJS render pipeline (the `.tsx` provider also re-exports it). |
There was a problem hiding this comment.
SUGGESTION: Stale comment — the .tsx provider no longer re-exports from this module
The JSDoc says "the .tsx provider also re-exports it", but kilo-embedding-models.tsx now only imports subscribeKiloEmbeddingModels for internal use and does not re-export any symbols from this file. The comment should be removed or corrected to avoid misleading future callers into expecting a re-export that doesn't exist.
| * the SolidJS render pipeline (the `.tsx` provider also re-exports it). | |
| * Lives in a JSX-free module so unit tests can import it without pulling in |
Switching to the Kilo indexing provider must delete any previously configured manual dimension. Kilo still passes configured dimensions through to the embedder when users intentionally set one, but the Kilo model Select uses catalog-native dimensions and should not inherit stale values from another provider.
|
Pushed another follow-up for the remaining The issue was not that Kilo can never pass dimensions. The issue is that the UI was preserving a stale manual This commit fixes that by writing
Added regression coverage in config-utils and the IndexingTab source-level test. Validation:
|
|
Updated the client PR again after narrowing down the backend behavior. The current client-side Kilo indexing dimension change is no longer correct. We do want the client to keep sending the catalog So I removed the client-side
The remaining client changes are now only the VS Code settings/model-picker behavior and stale UI config cleanup. Cloud PR Kilo-Org/cloud#3196 now handles the backend side:
|
Show structured provider response bodies in Kilo indexing embedder failures instead of collapsing OpenAI SDK errors to generic Bad Request messages. Also avoid sending invalid null indexing dimensions in the settings save payload while keeping Kilo provider selections free of stale manual dimensions in local UI state.
|
Pushed a follow-up for the config save failure and for better indexing error visibility. Root cause of the config failure: Fix in this PR:
For the vague
Validation:
|
|
Pushed a follow-up that logs/surfaces the exact Qdrant failure. I reproduced the current local indexing failure path:
Client PR update:
Cloud PR update:
|
| if (!isRecord(value)) return { global: value, project: undefined } | ||
| const global = Object.fromEntries(Object.entries(value).filter(([key]) => !PROJECT_INDEXING_KEYS.has(key))) | ||
| const project = Object.fromEntries(Object.entries(value).filter(([key]) => PROJECT_INDEXING_KEYS.has(key))) | ||
| const entries = Object.entries(value).map(([key, item]) => [key, key === "dimension" ? null : item] as const) |
There was a problem hiding this comment.
WARNING: dimension is always replaced with null regardless of provider, silently deleting non-Kilo users' custom dimension settings
The mapping key === "dimension" ? null : item fires unconditionally for every call to splitConfigByScope. For a non-Kilo provider with a user-configured dimension (e.g. { provider: "openai", dimension: 1536 }):
entries→[["provider", "openai"], ["dimension", null]]project→{ dimension: null }— the saved value1536is silently discarded
The null sentinel was originally introduced to delete stale Kilo dimensions from the project config. Now that IndexingTab.tsx uses delete next.dimension before calling updateConfig for Kilo, the dimension key is never even sent for Kilo, so this transform is no longer needed and only breaks non-Kilo providers.
The null replacement should be guarded by the provider or removed entirely:
| const entries = Object.entries(value).map(([key, item]) => [key, key === "dimension" ? null : item] as const) | |
| const entries = Object.entries(value) |
The codebase indexing settings panel was rendering a free-text "Embedding model" input and a manual "Dimension" field even when the provider was set to Kilo. The Kilo embedding catalog is Cloud-managed (
/api/gateway/embedding-models) and contains the model id, dimension, alias mapping, and similarity-score threshold — none of which a user should have to type by hand. When the catalog response is missing or invalid, the free-text row was the only thing rendered, leaving users staring at a literalprovider/modelplaceholder with no way to pick a model.This was originally surfaced when Cloud PR Kilo-Org/cloud#3109 changed the embedding-models endpoint to proxy OpenRouter's response shape directly. The CLI's gateway client validates the response with a zod schema that requires
defaultModel,models[].dimension,models[].scoreThreshold, andaliases— fields OpenRouter does not return — so the catalog always failed validation and fell back to empty. That contract break is reverted in Kilo-Org/cloud#3193, but the UX failure mode here is real regardless: any future transient empty/failed catalog response would land users in the same broken state.What this PR changes
UX correctness (the actual visible fix)
In
IndexingTab.tsx, the kilo branch now renders only the model preset Select. The free-text "Embedding model" override and the "Dimension" row are hidden entirely when provider iskilo. The Select stays mounted while the catalog is loading (disabled, placeholder "Loading models…") so users never see a free-text box for the kilo provider.For other providers (OpenAI, Ollama, Voyage, etc.), nothing changes — they still have the TextField + dimension row because their catalogs are not server-managed.
Defensive recovery (so this category of bug recovers without a webview reload)
kilo-embedding-models.tsxretriesrequestKiloEmbeddingModelswhile the catalog stays empty, capped at 5 attempts × 500 ms. Mirrors the existing pattern inindexing.tsx. The handler also ignores empty payloads delivered after a non-empty catalog so a late "fail" message can't clobber a good one in webview state.KiloProvider.fetchAndSendKiloEmbeddingModelsno longer caches empty catalogs, and replays the last cached non-empty catalog to the webview before re-fetching. A single transient failure can no longer poison the cache for the lifetime of the extension host.Tests
tests/unit/indexing-tab-kilo-no-textfield.test.ts: source-level regression asserting the kilo branch contains the Select and no longer references the overridemodel/dimensionrows or the literalprovider/modelplaceholder.tests/unit/kilo-embedding-models-retry.test.ts: webview retry loop, retry cap, stop-on-success, ignore-late-empty, cleanup.tests/unit/kilo-provider-embedding-models-cache.test.ts: extension caches real catalogs only, recovers across calls, replays cached catalog before re-fetching.