add client-side embeddings cached in IndexedDB #2432
if (missingTexts.length === 0) return result;

const embeddings = await fetchEmbeddings(missingTexts, options);
🤖 This is an AI-generated code review comment.
[MEDIUM] The full unfiltered index is embedded on every search, and embedTextsWithCache sends all cache-missing texts in a single fetch body with no chunking. On a cold cache for a large library this exceeds OpenAI embeddings limits (2048 inputs / ~300k tokens) → 400, and the whole semantic layer silently returns [] (swallowed by the host catch). Consider batching missing texts (e.g. 256–512/request) and merging, embedding only the candidate set, or capping N.
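A minimal sketch of the batching suggestion: split the cache-missing texts into fixed-size chunks, embed each chunk, and concatenate the results in order. `chunkTexts`, `embedInBatches`, and the `EmbedBatch` callback type are hypothetical names; the callback stands in for the PR's fetchEmbeddings, whose exact signature is not shown here. The default of 256 is just the low end of the range suggested above, not a measured value.

```typescript
// Hypothetical stand-in for the PR's fetchEmbeddings(texts, options):
// takes one batch of texts and resolves to one vector per text, in order.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Split an array into consecutive chunks of at most `size` items.
function chunkTexts<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Embed texts one batch at a time and concatenate the results, so that
// result[i] corresponds to texts[i].
async function embedInBatches(
  texts: readonly string[],
  embedBatch: EmbedBatch,
  batchSize = 256, // assumption, see the note above
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunkTexts(texts, batchSize)) {
    const batchVectors = await embedBatch(batch);
    if (batchVectors.length !== batch.length) {
      throw new Error("embedding count does not match batch size");
    }
    vectors.push(...batchVectors);
  }
  return vectors;
}
```

Requests are issued sequentially here to keep the sketch simple. A count-based batch size does not by itself bound the token total per request, so a token-aware split may still be needed for long texts.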
}

function cacheKey(text: string): string {
  return `${CACHE_SCHEMA_VERSION}:${COMPONENT_SEARCH_EMBEDDING_MODEL}:${hashText(text)}`;
🤖 This is an AI-generated code review comment.
[LOW/MEDIUM] The cache key and the textHash validation field both derive from the same 32-bit FNV-1a hash, so a key collision (~50% likelihood near ~77k distinct texts) returns a stale-but-"validated" vector with no detection. Consider storing the full text (or a wider hash, e.g. SHA-256, or including text length) as the validation field so a key collision is caught as a miss.
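To make the collision figure and the suggested fix concrete, here is a rough sketch. It shows a 32-bit FNV-1a hash, the birthday approximation behind the ~77k estimate, and a lookup that validates against the stored full text. `fnv1a32`, `collisionProbability`, `CachedEmbedding`, and `readValidated` are hypothetical names, and the PR's hashText may hash differently (for example over UTF-16 code units rather than UTF-8 bytes).

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of the text.
function fnv1a32(text: string): number {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept unsigned
  }
  return hash >>> 0;
}

// Birthday approximation: probability that at least two of n distinct
// texts share a hash in a space of 2^bits values.
function collisionProbability(n: number, bits: number): number {
  return 1 - Math.exp(-(n * (n - 1)) / (2 * Math.pow(2, bits)));
}

// Hypothetical row shape: the full text is stored as the validation field.
interface CachedEmbedding {
  key: string;
  text: string;
  vector: number[];
}

// Return the cached vector only if the stored text matches exactly;
// a key collision with a different text is treated as a cache miss.
function readValidated(
  store: Map<string, CachedEmbedding>,
  key: string,
  text: string,
): number[] | undefined {
  const row = store.get(key);
  return row !== undefined && row.text === text ? row.vector : undefined;
}
```

Storing the full text costs more space per row than a wider hash would, but it makes a false hit impossible rather than merely unlikely.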
const embeddings = await fetchEmbeddings(missingTexts, options);
const now = Date.now();
await componentSearchEmbeddingDb.embeddings.bulkPut(
🤖 This is an AI-generated code review comment.
[LOW] Dexie schema is pinned at version(1); cache busting relies on CACHE_SCHEMA_VERSION embedded in the key, so bumping it orphans all prior rows permanently (no prune). Also, there's no QuotaExceededError handling on this bulkPut — the rejection is swallowed by the host catch, silently disabling caching. Consider pruning rows whose key lacks the current schemaVersion:model: prefix on a schema bump, and catching quota errors to trigger a prune.
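A possible shape for the prune and quota checks, kept independent of Dexie so it can be read in isolation. `currentKeyPrefix`, `selectStaleKeys`, and `isQuotaExceeded` are hypothetical helpers. The key layout follows the cacheKey template shown in this PR, and the `inner` check is an assumption about how a wrapping library might expose the underlying error.

```typescript
// Keys are assumed to look like `${schemaVersion}:${model}:${hash}`,
// matching the cacheKey template in this PR.
function currentKeyPrefix(schemaVersion: number | string, model: string): string {
  return `${schemaVersion}:${model}:`;
}

// Keys that do not start with the current prefix belong to an older
// schema version or a different model and are candidates for deletion.
function selectStaleKeys(keys: readonly string[], prefix: string): string[] {
  return keys.filter((key) => !key.startsWith(prefix));
}

// Recognise a storage quota failure, either thrown directly or wrapped
// with the original error exposed as `inner` (assumption).
function isQuotaExceeded(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const e = error as { name?: unknown; inner?: { name?: unknown } | null };
  return e.name === "QuotaExceededError" || e.inner?.name === "QuotaExceededError";
}
```

The stale keys would then be passed to whatever bulk delete the store offers, once on startup after a schema bump and again when a write fails with a quota error.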
  limit: number,
): LexicalMatch[] {
  const merged: LexicalMatch[] = [];
  const seen = new Set<string>();
🤖 This is an AI-generated code review comment.
[LOW] This mergeUniqueMatches (dedupe-by-digest) is duplicated verbatim in useComponentSearchV2State.ts (~line 53). Consider exporting it once (e.g. from componentSearchIndex.ts) and importing it in both hosts.
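For reference, a single shared helper could look roughly like the sketch below. Only the `limit` parameter, the `LexicalMatch[]` return type, and the `merged`/`seen` locals are visible in the diff above; the `groups` parameter and the minimal `LexicalMatch` shape are assumptions made for illustration.

```typescript
// Hypothetical minimal shape; the PR's LexicalMatch likely has more fields.
interface LexicalMatch {
  digest: string;
  score: number;
}

// Merge several ranked lists into one, keeping the first occurrence of
// each digest and stopping once `limit` matches have been collected.
// In the shared module this function would be exported and imported by
// both hosts.
function mergeUniqueMatches(
  groups: readonly (readonly LexicalMatch[])[],
  limit: number,
): LexicalMatch[] {
  const merged: LexicalMatch[] = [];
  const seen = new Set<string>();
  for (const group of groups) {
    for (const match of group) {
      if (merged.length >= limit) return merged;
      if (seen.has(match.digest)) continue;
      seen.add(match.digest);
      merged.push(match);
    }
  }
  return merged;
}
```

Earlier groups take priority, so the order in which lexical and embedding results are passed decides which copy of a duplicated digest survives.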
startAiSearch(aiCandidateMatches);
void startAiSearch(aiCandidateMatches, aiCandidateMatches.length);
};
🤖 This is an AI-generated code review comment.
maxTypoDistance allows edit distance 1 on 4-char tokens, so short generic IO names collide: data<->date, path<->bath, list<->last. Fuzzy is name/io-only and scored at 0.75x so exact hits still win, but a typo-free data could pull in date-named IO. Consider raising the distance-1 floor to length >= 5.
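The effect of raising the floor can be illustrated with a small sketch. The PR's actual maxTypoDistance tiers are not shown, so the two-tier version here (0 below length 5, 1 otherwise) is an assumption, and `editDistance` and `fuzzyMatches` are hypothetical helpers using plain Levenshtein distance.

```typescript
// Assumed thresholds: no typos allowed below 5 characters, one edit otherwise.
function maxTypoDistance(token: string): number {
  return token.length >= 5 ? 1 : 0;
}

// Levenshtein distance computed with two rolling rows.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// A candidate matches if it is within the distance allowed for the query token.
function fuzzyMatches(queryToken: string, candidate: string): boolean {
  return editDistance(queryToken, candidate) <= maxTypoDistance(queryToken);
}
```

With this floor, data and date are still one edit apart but no longer match, while a five-letter token with one typo still does.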
Description
Adds embedding-based semantic search as a pre-processing step before LLM reranking. When an AI provider API base is configured, the component search now fetches text embeddings for both the search query and all indexed components, ranks them by cosine similarity, and merges those results with the top lexical matches before passing candidates to the LLM reranker. This improves the quality of candidates surfaced to the reranker, particularly for queries that don't share keywords with component names or descriptions.
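As an illustration of the ranking step described above, the following sketch scores each component vector against the query vector by cosine similarity and keeps the top results. `cosineSimilarity`, `rankBySimilarity`, and the `EmbeddedComponent` shape are hypothetical names and not necessarily those used in the service.

```typescript
// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical shape: a component identified by digest with its embedding.
interface EmbeddedComponent {
  digest: string;
  vector: number[];
}

// Score every component against the query and return the best `limit`,
// highest similarity first.
function rankBySimilarity(
  query: readonly number[],
  components: readonly EmbeddedComponent[],
  limit: number,
): { digest: string; score: number }[] {
  return components
    .map((c) => ({ digest: c.digest, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```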
A new componentSearchEmbeddings service handles embedding generation, cosine similarity scoring, and persistent caching of embeddings in IndexedDB (keyed by a FNV-1a hash of the text and model name) to avoid redundant API calls across searches. A mergeUniqueMatches helper deduplicates and merges lexical and embedding results by digest before reranking.
The embedding fetch is tracked with an isEmbeddingSearchPending flag, which gates the rerank active state and disables the AI search buttons while in progress, keeping the spinner and disabled states consistent with the existing reranking UX.
Related Issue and Pull requests
Type of Change
Checklist
Screenshots (if applicable)
Test Instructions
https://api.openai.com/v1) in AI provider settings.
Additional Comments
The embedding model is hardcoded to text-embedding-3-small. If no apiBase is set, the embedding step is skipped entirely and behaviour is identical to before this change.