Skip to content

fix(vscode): show model select for Kilo indexing provider#10174

Open
marius-kilocode wants to merge 12 commits into
mainfrom
hulking-light
Open

fix(vscode): show model select for Kilo indexing provider#10174
marius-kilocode wants to merge 12 commits into
mainfrom
hulking-light

Conversation

@marius-kilocode
Copy link
Copy Markdown
Collaborator

@marius-kilocode marius-kilocode commented May 12, 2026

The codebase indexing settings panel was rendering a free-text "Embedding model" input and a manual "Dimension" field even when the provider was set to Kilo. The Kilo embedding catalog is Cloud-managed (/api/gateway/embedding-models) and contains the model id, dimension, alias mapping, and similarity-score threshold — none of which a user should have to type by hand. When the catalog response is missing or invalid, the free-text row was the only thing rendered, leaving users staring at a literal provider/model placeholder with no way to pick a model.

This was originally surfaced when Cloud PR Kilo-Org/cloud#3109 changed the embedding-models endpoint to proxy OpenRouter's response shape directly. The CLI's gateway client validates the response with a zod schema that requires defaultModel, models[].dimension, models[].scoreThreshold, and aliases — fields OpenRouter does not return — so the catalog always failed validation and fell back to empty. That contract break is reverted in Kilo-Org/cloud#3193, but the UX failure mode here is real regardless: any future transient empty/failed catalog response would land users in the same broken state.

image

What this PR changes

UX correctness (the actual visible fix)

In IndexingTab.tsx, the kilo branch now renders only the model preset Select. The free-text "Embedding model" override and the "Dimension" row are hidden entirely when provider is kilo. The Select stays mounted while the catalog is loading (disabled, placeholder "Loading models…") so users never see a free-text box for the kilo provider.

For other providers (OpenAI, Ollama, Voyage, etc.), nothing changes — they still have the TextField + dimension row because their catalogs are not server-managed.

Defensive recovery (so this category of bug recovers without a webview reload)

  • kilo-embedding-models.tsx retries requestKiloEmbeddingModels while the catalog stays empty, capped at 5 attempts × 500 ms. Mirrors the existing pattern in indexing.tsx. The handler also ignores empty payloads delivered after a non-empty catalog so a late "fail" message can't clobber a good one in webview state.
  • KiloProvider.fetchAndSendKiloEmbeddingModels no longer caches empty catalogs, and replays the last cached non-empty catalog to the webview before re-fetching. A single transient failure can no longer poison the cache for the lifetime of the extension host.

Tests

  • tests/unit/indexing-tab-kilo-no-textfield.test.ts: source-level regression asserting the kilo branch contains the Select and no longer references the override model / dimension rows or the literal provider/model placeholder.
  • tests/unit/kilo-embedding-models-retry.test.ts: webview retry loop, retry cap, stop-on-success, ignore-late-empty, cleanup.
  • tests/unit/kilo-provider-embedding-models-cache.test.ts: extension caches real catalogs only, recovers across calls, replays cached catalog before re-fetching.

The Kilo embedding model catalog is Cloud-managed; the indexing settings
panel should pick from a Select, never a free-text input. Hide the
generic 'Embedding model' TextField and 'Dimension' row when the
provider is Kilo, and keep the Select rendered (disabled, 'Loading
models…') while the catalog is empty so users never see a freetext
fallback.

Also harden the catalog request path so transient failures don't
poison the UI: the webview retries while the catalog stays empty
(mirrors the indexing.tsx pattern), and the extension only caches
non-empty catalogs and replays the last good one before re-fetching.
let retries = 0
const timer = setIntervalFn(() => {
retries++
if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries >= maxRetries) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Off-by-one in retry count — KILO_EMBEDDING_MAX_RETRIES retries are not actually sent

The interval callback increments retries before the guard check. When retries reaches maxRetries (5) on the 5th tick, the condition is immediately true and the timer is cleared without posting. So only 4 retry requests are sent (ticks 1–4), not 5 as the constant name implies.

Trace:

  • Tick 1: retries=1, 1≥5? No → post (total=2)
  • Tick 2: retries=2, 2≥5? No → post (total=3)
  • Tick 3: retries=3, 3≥5? No → post (total=4)
  • Tick 4: retries=4, 4≥5? No → post (total=5)
  • Tick 5: retries=5, 5≥5? Yes → clear, return (no post)

If the intent is exactly MAX_RETRIES retry requests, the guard should be retries > maxRetries (or post before incrementing). If the intent is "fire the interval at most MAX_RETRIES times regardless", renaming to KILO_EMBEDDING_MAX_RETRY_TICKS would clarify, but the comment in indexing.tsx (the mirrored pattern) also implies 5 actual requests.

Suggested change
if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries >= maxRetries) {
if (!isEmptyKiloEmbeddingCatalog(opts.getCatalog()) || retries > maxRetries) {

for (let i = 0; i < KILO_EMBEDDING_MAX_RETRIES + 5; i++) timers.tick()

// Initial request + at most MAX_RETRIES re-posts.
expect(h.posted.length).toBeLessThanOrEqual(1 + KILO_EMBEDDING_MAX_RETRIES)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Weak assertion — toBeLessThanOrEqual would pass even with zero retries

toBeLessThanOrEqual(6) is satisfied by any value from 0 to 6, meaning a regression that eliminates all retries would still pass this test. Since the behavior is deterministic with fake timers, use an exact equality assertion. Based on the current implementation (4 retries actually sent), the correct value is 1 + KILO_EMBEDDING_MAX_RETRIES - 1. If the off-by-one above is fixed to send exactly MAX_RETRIES retries, it becomes 1 + KILO_EMBEDDING_MAX_RETRIES.

Suggested change
expect(h.posted.length).toBeLessThanOrEqual(1 + KILO_EMBEDDING_MAX_RETRIES)
expect(h.posted.length).toBe(1 + KILO_EMBEDDING_MAX_RETRIES)

@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented May 12, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Previously Flagged — All Fixed ✅
File Issue
packages/kilo-vscode/webview-ui/src/utils/config-scope.ts dimension always replaced with null — fixed by removing null-replacement logic and removing dimension from PROJECT_INDEXING_KEYS; values now pass through unchanged
packages/kilo-vscode/webview-ui/src/context/kilo-embedding-models.tsx Off-by-one in retry count — fixed in kilo-embedding-models-subscribe.ts
packages/kilo-vscode/tests/unit/kilo-embedding-models-retry.test.ts Weak toBeLessThanOrEqual assertion — upgraded to toBe(1 + KILO_EMBEDDING_MAX_RETRIES)
packages/kilo-vscode/webview-ui/src/context/kilo-embedding-models-subscribe.ts Stale comment claiming .tsx re-exports helpers — inaccurate clause removed
Incremental Review (commit d53e31d)

Only change since last review: cosmetic style fix replacing an em dash with a comma in a JSX comment in IndexingTab.tsx. No logic changes.

Files Reviewed (18 files)
  • .changeset/indexing-kilo-model-select.md
  • packages/kilo-indexing/src/indexing/embedders/openai-compatible.ts
  • packages/kilo-indexing/src/indexing/vector-store/qdrant-client.ts
  • packages/kilo-indexing/test/kilocode/indexing/embedders/openai-compatible.test.ts
  • packages/kilo-indexing/test/kilocode/indexing/vector-store/qdrant-client.test.ts
  • packages/kilo-vscode/src/KiloProvider.ts
  • packages/kilo-vscode/tests/unit/config-scope.test.ts
  • packages/kilo-vscode/tests/unit/config-utils.test.ts
  • packages/kilo-vscode/tests/unit/indexing-tab-kilo-no-textfield.test.ts
  • packages/kilo-vscode/tests/unit/kilo-embedding-models-retry.test.ts
  • packages/kilo-vscode/tests/unit/kilo-provider-embedding-models-cache.test.ts
  • packages/kilo-vscode/webview-ui/src/components/settings/IndexingTab.tsx
  • packages/kilo-vscode/webview-ui/src/context/kilo-embedding-models-subscribe.ts
  • packages/kilo-vscode/webview-ui/src/context/kilo-embedding-models.tsx
  • packages/kilo-vscode/webview-ui/src/types/messages/config.ts
  • packages/kilo-vscode/webview-ui/src/utils/config-scope.ts

Reviewed by claude-sonnet-4.6 · 284,720 tokens

Kilo-hosted embedding models use Cloud catalog dimensions for vector-store setup, but the embedding request itself should not forward a dimensions override through the gateway. Some upstream embedding APIs reject that parameter even when it matches the native dimension.

Also move the Kilo embedding catalog retry helper into a JSX-free module so unit tests can import it in CI without requiring the Solid JSX runtime, and tighten the retry-cap assertion.
@marius-kilocode
Copy link
Copy Markdown
Collaborator Author

Pushed a follow-up commit that covers the CI failure, the review nits, and the runtime embedding error:

  • CI failure was from the new retry test importing kilo-embedding-models.tsx, which pulls in JSX under Bun's unit-test tsconfig and failed on react/jsx-dev-runtime in CI. I moved the pure retry orchestration into a JSX-free kilo-embedding-models-subscribe.ts module and import that from tests and the Solid provider.
  • Accepted the useful review feedback: fixed the off-by-one so KILO_EMBEDDING_MAX_RETRIES sends exactly that many retry requests, and tightened the retry-cap test to exact equality instead of <=.
  • Fixed the HTTP 422 extra_forbidden body.dimensions error by reverting the Kilo embedder path added in fix(indexing): allow custom dimensions with kilo embedding provider #10045: the Kilo gateway request no longer forwards dimensions. Vector-store sizing still uses the Cloud catalog dimension separately, but the request body should not send an override that upstream embedding APIs can reject.

I checked the Cloud repo too. I don't see a recent Cloud-side change to dimension handling after the embedding endpoint work. The relevant Cloud code still forwards dimensions through buildUpstreamBody, and that behavior dates back to the original embedding proxy work. The regression lines up with kilocode #10045, which started passing config.modelDimension into the Kilo embedder request path.

Validation run locally:

  • packages/kilo-vscode: bun run typecheck, bun run lint, targeted regression tests
  • packages/kilo-indexing: bun run typecheck, Kilo embedder/service-factory targeted tests

* function cleans up the subscription and timer.
*
* Lives in a JSX-free module so unit tests can import it without pulling in
* the SolidJS render pipeline (the `.tsx` provider also re-exports it).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: Stale comment — the .tsx provider no longer re-exports from this module

The JSDoc says "the .tsx provider also re-exports it", but kilo-embedding-models.tsx now only imports subscribeKiloEmbeddingModels for internal use and does not re-export any symbols from this file. The comment should be removed or corrected to avoid misleading future callers into expecting a re-export that doesn't exist.

Suggested change
* the SolidJS render pipeline (the `.tsx` provider also re-exports it).
* Lives in a JSX-free module so unit tests can import it without pulling in

Switching to the Kilo indexing provider must delete any previously configured manual dimension. Kilo still passes configured dimensions through to the embedder when users intentionally set one, but the Kilo model Select uses catalog-native dimensions and should not inherit stale values from another provider.
@marius-kilocode
Copy link
Copy Markdown
Collaborator Author

Pushed another follow-up for the remaining body.dimensions 422.

The issue was not that Kilo can never pass dimensions. The issue is that the UI was preserving a stale manual indexing.dimension value when switching to the Kilo provider / selecting a catalog model. Because undefined is stripped before save, my previous UI change did not actually delete the stale dimension from config. If a user previously had dimension: 256 or 1024, it stayed in kilo.json and kept getting passed through to the Kilo embedder request path.

This commit fixes that by writing dimension: null when:

  • switching provider to kilo
  • enabling indexing with implicit kilo
  • selecting a Kilo catalog model

null is the config delete sentinel, so the stale dimension is actually removed on save. I also restored the Kilo embedder's ability to accept an explicitly configured dimension (#10045 behavior) for users who intentionally set one in config. The model Select path now clears stale UI/provider dimensions instead of globally disabling that functionality.

Added regression coverage in config-utils and the IndexingTab source-level test.

Validation:

  • packages/kilo-vscode: impacted tests, typecheck, lint
  • packages/kilo-indexing: Kilo embedder/service-factory tests, typecheck

@marius-kilocode
Copy link
Copy Markdown
Collaborator Author

Updated the client PR again after narrowing down the backend behavior.

The current client-side Kilo indexing dimension change is no longer correct. We do want the client to keep sending the catalog dimensions value, because it is part of the Kilo gateway/client contract after #10045 and is also useful for Cloud to know what the client sized the vector store for. The provider-specific problem is in Cloud: those control-plane/SDK-only fields must not be forwarded unchanged to OpenRouter/upstream providers.

So I removed the client-side @kilocode/kilo-indexing patch from this PR:

  • restored KiloEmbedder forwarding dimensions
  • removed the obsolete "does not forward dimensions" test
  • removed the @kilocode/kilo-indexing changeset

The remaining client changes are now only the VS Code settings/model-picker behavior and stale UI config cleanup.

Cloud PR Kilo-Org/cloud#3196 now handles the backend side:

  • strips dimensions before upstream requests
  • also strips encoding_format, which appears to be the next cause of provider Bad Request
  • preserves the client contract by converting numeric upstream embeddings back to base64 when the client requested encoding_format: "base64"

kilo-code-bot Bot and others added 3 commits May 12, 2026 14:14
Show structured provider response bodies in Kilo indexing embedder failures instead of collapsing OpenAI SDK errors to generic Bad Request messages. Also avoid sending invalid null indexing dimensions in the settings save payload while keeping Kilo provider selections free of stale manual dimensions in local UI state.
@marius-kilocode
Copy link
Copy Markdown
Collaborator Author

Pushed a follow-up for the config save failure and for better indexing error visibility.

Root cause of the config failure: dimension: null is a valid delete sentinel for the config writer, but the HTTP API validates the update body against the typed config schema first, where indexing.dimension only accepts a number. So sending { indexing: { dimension: null } } from the webview is rejected before the writer can treat it as deletion.

Fix in this PR:

  • The Kilo indexing UI now deletes stale manual dimensions from local UI state before saving, instead of sending invalid null in the typed payload.
  • The provider/model update remains valid (provider, model only), so the save request passes schema validation.
  • Added regression coverage that no dimension: null / dimension: undefined payload is emitted from IndexingTab.

For the vague Bad Request issue:

  • Added structured extraction of OpenAI SDK error payloads in OpenAICompatibleEmbedder.
  • Indexing failures now include error.response.data / error.response.body / error.error if present, so the UI/logs should show the exact provider response instead of collapsing to just Bad Request.
  • Added a regression test with a simulated SDK response payload containing an upstream raw error.

Validation:

  • packages/kilo-vscode: targeted tests, typecheck, lint
  • packages/kilo-indexing: targeted error tests, typecheck

@marius-kilocode
Copy link
Copy Markdown
Collaborator Author

Pushed a follow-up that logs/surfaces the exact Qdrant failure.

I reproduced the current local indexing failure path:

  • /api/gateway/embedding-models advertises mistralai/codestral-embed-2505 as dimension: 256.
  • A real authenticated Kilo embedding request for that model currently returns vectors with length 1536.
  • Qdrant collections are created from the catalog dimension, so we create a 256-dimensional collection and then try to upsert 1536-dimensional vectors.
  • Local Qdrant reproduces the raw response: Wrong input: Vector dimension error: expected dim: 256, got 1536.

Client PR update:

  • QdrantVectorStore.upsertPoints now checks vector lengths before calling Qdrant and throws a clear error like Qdrant vector dimension mismatch before upsert: expected 256, got 1536.
  • If Qdrant itself rejects an upsert, we now log and throw the structured Qdrant response (error.response.data / error.response.body / error.data / error.body) instead of reducing it to Bad Request.
  • Added regression tests for both pre-upsert dimension mismatch and raw Qdrant response extraction.

Cloud PR update:

if (!isRecord(value)) return { global: value, project: undefined }
const global = Object.fromEntries(Object.entries(value).filter(([key]) => !PROJECT_INDEXING_KEYS.has(key)))
const project = Object.fromEntries(Object.entries(value).filter(([key]) => PROJECT_INDEXING_KEYS.has(key)))
const entries = Object.entries(value).map(([key, item]) => [key, key === "dimension" ? null : item] as const)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: dimension is always replaced with null regardless of provider, silently deleting non-Kilo users' custom dimension settings

The mapping key === "dimension" ? null : item fires unconditionally for every call to splitConfigByScope. For a non-Kilo provider with a user-configured dimension (e.g. { provider: "openai", dimension: 1536 }):

  • entries[["provider", "openai"], ["dimension", null]]
  • project{ dimension: null } — the saved value 1536 is silently discarded

The null sentinel was originally introduced to delete stale Kilo dimensions from the project config. Now that IndexingTab.tsx uses delete next.dimension before calling updateConfig for Kilo, the dimension key is never even sent for Kilo, so this transform is no longer needed and only breaks non-Kilo providers.

The null replacement should be guarded by the provider or removed entirely:

Suggested change
const entries = Object.entries(value).map(([key, item]) => [key, key === "dimension" ? null : item] as const)
const entries = Object.entries(value)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant