⚡ Bolt: Cache numpy arrays in embedding search #105
daggerstuff wants to merge 2 commits into staging
Conversation
Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
The latest updates on your projects. Learn more about Vercel for GitHub.
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Reviewer's Guide
Implements lazy, per-item caching of normalized NumPy embeddings during knowledge item similarity search to avoid reconstructing arrays on every query, along with a Bolt notebook note describing the optimization.

Sequence diagram for cached embedding similarity search

```mermaid
sequenceDiagram
    actor Client
    participant EmbeddingAgentService
    participant KnowledgeItem
    Client->>EmbeddingAgentService: search_similar(query_embedding, knowledge_items)
    EmbeddingAgentService->>EmbeddingAgentService: build query_np from query_embedding
    loop for each item in knowledge_items
        EmbeddingAgentService->>KnowledgeItem: get _cached_np_embedding
        alt cache hit
            KnowledgeItem-->>EmbeddingAgentService: item_np
        else cache miss
            EmbeddingAgentService->>KnowledgeItem: get embedding
            alt embedding is None
                EmbeddingAgentService-->>EmbeddingAgentService: skip item
            else embedding exists
                EmbeddingAgentService->>EmbeddingAgentService: item_np = np.array(embedding)
                alt normalize_embeddings is True and norm > 0
                    EmbeddingAgentService->>EmbeddingAgentService: item_np = item_np / norm
                end
                EmbeddingAgentService->>KnowledgeItem: set _cached_np_embedding = item_np
            end
        end
        alt NUMPY_AVAILABLE and query_np and item_np not None
            EmbeddingAgentService->>EmbeddingAgentService: similarity = dot(query_np, item_np)
        else
            EmbeddingAgentService->>KnowledgeItem: get embedding
            EmbeddingAgentService->>EmbeddingAgentService: similarity = cosine(query_embedding, embedding)
        end
    end
    EmbeddingAgentService-->>Client: similar_items
```
Class diagram for embedding search caching changes

```mermaid
classDiagram
    class EmbeddingAgentService {
        config
        _search_knowledge_items(query_embedding, knowledge_items, knowledge_types)
    }
    class KnowledgeItem {
        embedding
        _cached_np_embedding
    }
    EmbeddingAgentService --> KnowledgeItem : iterates_and_reads
```
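The cache hit/miss flow in the diagrams above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: `KnowledgeItem` and `get_normalized_embedding` are stand-in names, and numpy is assumed to be installed.

```python
import numpy as np

class KnowledgeItem:
    """Stand-in for the real model; holds a raw embedding list."""
    def __init__(self, embedding):
        self.embedding = embedding

def get_normalized_embedding(item, normalize=True):
    # Cache hit: reuse the array built on a previous query.
    cached = getattr(item, "_cached_np_embedding", None)
    if cached is not None:
        return cached
    if item.embedding is None:
        return None  # skip items without embeddings
    # Cache miss: build and normalize once, then store on the item.
    item_np = np.array(item.embedding, dtype=float)
    if normalize:
        norm = np.linalg.norm(item_np)
        if norm > 0:
            item_np = item_np / norm
    item._cached_np_embedding = item_np
    return item_np

item = KnowledgeItem([3.0, 4.0])
first = get_normalized_embedding(item)
# The second lookup returns the same cached array object.
assert get_normalized_embedding(item) is first
```

With normalized vectors cached, each subsequent query reduces to one dot product per item instead of an array allocation plus a norm computation.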
File-Level Changes
📝 Walkthrough
Refactored similarity computation in the embedding agent service by introducing a `_compute_similarity` helper.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Hey - I've left some high level feedback:
- The new flow no longer short-circuits when `item.embedding` is `None` in the non-numpy path, so `zip(query_embedding, item_embedding)` can now throw if `embedding` is missing; consider restoring an early `continue` when `item_embedding is None` before both similarity branches.
- Because `_cached_np_embedding` is computed using `self.config.normalize_embeddings`, a runtime change to this flag will leave stale cached vectors with incorrect normalization; if config can change, key the cache by normalization mode or avoid sharing cached arrays across modes.
- Attaching `_cached_np_embedding` directly to `item` instances may have concurrency or lifecycle implications if these objects are shared between requests or processes; consider using an external cache keyed by item ID or ensuring that item instances are request-scoped.
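The first point, restoring the early `continue`, can be illustrated with a small sketch of the pure-Python fallback loop (a hypothetical shape, not the repository's actual method):

```python
from types import SimpleNamespace

def search_similar(query_embedding, items):
    """Pure-Python cosine path with the early skip restored."""
    results = []
    for item in items:
        item_embedding = getattr(item, "embedding", None)
        if item_embedding is None:
            continue  # guard before any similarity branch touches the value
        dot = sum(a * b for a, b in zip(query_embedding, item_embedding))
        norm_q = sum(x * x for x in query_embedding) ** 0.5
        norm_i = sum(x * x for x in item_embedding) ** 0.5
        sim = dot / (norm_q * norm_i) if norm_q * norm_i > 0 else 0.0
        results.append((item, sim))
    return results

# The item with embedding=None is skipped instead of crashing zip().
items = [SimpleNamespace(embedding=[1.0, 0.0]), SimpleNamespace(embedding=None)]
matches = search_similar([1.0, 0.0], items)
```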
🧹 Nitpick comments (2)
api/embedding_agent/service.py (2)

552-564: Redundant re-fetch of `item.embedding` in fallback path. When the fallback cosine similarity path is taken (numpy unavailable), `item_embedding` was already retrieved at line 540 but goes out of scope before line 555. Consider hoisting `item_embedding` to avoid the redundant `getattr` call.

♻️ Proposed refactor to avoid redundant fetch
```diff
   # Get item embedding if cache misses
+  item_embedding = None
   if item_np is None:
       item_embedding = getattr(item, "embedding", None)
       if item_embedding is None:
           continue
       if NUMPY_AVAILABLE and query_np is not None:
           item_np = np.array(item_embedding)
           if self.config.normalize_embeddings:
               norm = np.linalg.norm(item_np)
               if norm > 0:
                   item_np = item_np / norm
-          setattr(item, "_cached_np_embedding", item_np)
+          item._cached_np_embedding = item_np

   # Calculate similarity
   if NUMPY_AVAILABLE and query_np is not None and item_np is not None:
       similarity = float(np.dot(query_np, item_np))
   else:
-      item_embedding = getattr(item, "embedding", None)
+      if item_embedding is None:
+          item_embedding = getattr(item, "embedding", None)
       # Simple cosine similarity
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/embedding_agent/service.py` around lines 552 - 564, Hoist the retrieval of item.embedding out of the numpy/non-numpy conditional so the fallback path reuses the same value instead of calling getattr again; specifically, set item_embedding = getattr(item, "embedding", None) before the block that checks NUMPY_AVAILABLE and query_np/item_np, then use item_embedding in the fallback cosine-similarity calculation (references: NUMPY_AVAILABLE, query_np, item_np, item_embedding, query_embedding).
528-549: Thread safety consideration for cache population. This method modifies shared state (`item._cached_np_embedding`) without holding `self._lock`. Concurrent calls to `search_similar` could race on populating the cache for the same item. In this case, the race is benign since all threads compute and store the same normalized array value. However, if `KnowledgeItem` is not thread-safe for attribute writes, or if future changes make the cached value non-deterministic, this could become problematic. Consider documenting this as an intentional benign race, or use the lock if strict thread safety is required.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/embedding_agent/service.py` around lines 528 - 549, The loop in search_similar modifies shared state by setting item._cached_np_embedding without synchronization; either document this as an intentional benign race or make it thread-safe by using self._lock around the cache write: compute item_np (and normalization) outside the lock, then acquire self._lock, re-check getattr(item, "_cached_np_embedding", None) to avoid duplicate work, and set setattr(item, "_cached_np_embedding", item_np) if still missing; reference the search_similar method, self._lock, item._cached_np_embedding, and KnowledgeItem when making the change or adding the comment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 05da8d9c-84aa-4b1e-80b0-313841b4f656
📒 Files selected for processing (2)
.Jules/bolt.md
api/embedding_agent/service.py
1 issue found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="api/embedding_agent/service.py">
<violation number="1" location="api/embedding_agent/service.py:549">
P2: The new `_cached_np_embedding` cache bypasses cache lifecycle controls, so `clear_cache()` no longer clears all embedding caches and cached vectors persist unexpectedly.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
api/embedding_agent/service.py (Outdated)

```python
        norm = np.linalg.norm(item_np)
        if norm > 0:
            item_np = item_np / norm
    setattr(item, "_cached_np_embedding", item_np)
```
P2: The new `_cached_np_embedding` cache bypasses cache lifecycle controls, so `clear_cache()` no longer clears all embedding caches and cached vectors persist unexpectedly.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At api/embedding_agent/service.py, line 549:
<comment>The new `_cached_np_embedding` cache bypasses cache lifecycle controls, so `clear_cache()` no longer clears all embedding caches and cached vectors persist unexpectedly.</comment>
<file context>
@@ -532,20 +532,27 @@ def _search_knowledge_items(
+ norm = np.linalg.norm(item_np)
+ if norm > 0:
+ item_np = item_np / norm
+ setattr(item, "_cached_np_embedding", item_np)
# Calculate similarity
</file context>
Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7dc1fdadc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
    setattr(item, "_cached_np_embedding", item_np)
except AttributeError:
    pass  # Skip caching if item is a strict class (e.g., uses slots)
```
Handle cache-write failures beyond AttributeError
In `_compute_similarity`, the new cache path writes `item_np` back onto each knowledge item with `setattr`, but it only catches `AttributeError`. Strict model objects (for example, classes that reject unknown attributes via `__setattr__`) commonly raise `ValueError` or `TypeError` instead, which will now bubble up and fail the entire similarity search request for that item set. This optimization path should degrade gracefully (skip caching) rather than turning unsupported item types into request-time exceptions.
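A widened guard along the lines Codex suggests might look like this sketch. The item classes here are hypothetical; the point is catching the exception types a strict `__setattr__` can raise:

```python
import numpy as np

class SlottedItem:
    """A strict item type: __slots__ rejects unknown attributes."""
    __slots__ = ("embedding",)
    def __init__(self, embedding):
        self.embedding = embedding

class PlainItem:
    """An ordinary item that accepts new attributes."""
    def __init__(self, embedding):
        self.embedding = embedding

def try_cache(item, item_np):
    """Best-effort cache write that degrades gracefully on strict classes."""
    try:
        item._cached_np_embedding = item_np
        return True
    except (AttributeError, TypeError, ValueError):
        # __slots__ raises AttributeError; validating __setattr__ hooks
        # (e.g. strict model classes) may raise TypeError or ValueError.
        return False

assert try_cache(PlainItem([1.0]), np.array([1.0])) is True
assert try_cache(SlottedItem([1.0, 2.0]), np.array([1.0, 2.0])) is False
```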
Actionable comments posted: 2
🧹 Nitpick comments (1)
api/embedding_agent/service.py (1)
532: Add `strict=True` to `zip()` for mismatched-length detection. Per Python 3.11+ (the target version per guidelines), `zip()` supports `strict=True` to raise `ValueError` if iterables have different lengths. Since embeddings should always have matching dimensions, this would catch data corruption early.

♻️ Proposed fix
```diff
- dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding))
+ dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding, strict=True))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/embedding_agent/service.py` at line 532, The dot_product calculation currently uses zip(query_embedding, item_embedding) which silently truncates mismatched-length embeddings; update this to use zip(query_embedding, item_embedding, strict=True) so a ValueError is raised on dimension mismatch (catch/log upstream if needed). Modify the line that computes dot_product (referencing query_embedding, item_embedding and zip) to pass strict=True to detect corrupted or mis-sized embeddings early.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f7abb299-06fe-4e52-b368-eab719898ce0
📒 Files selected for processing (1)
api/embedding_agent/service.py
```python
    def _compute_similarity(
        self, item: Any, query_embedding: List[float], query_np: Any
    ) -> Optional[float]:
        # ⚡ Bolt: Cache normalized numpy arrays safely to avoid regenerating on query
        item_np = getattr(item, "_cached_np_embedding", None)

        if item_np is None:
            item_embedding = getattr(item, "embedding", None)
            if item_embedding is None:
                return None
            if NUMPY_AVAILABLE and query_np is not None:
                item_np = np.array(item_embedding)
                if self.config.normalize_embeddings:
                    norm = np.linalg.norm(item_np)
                    if norm > 0:
                        item_np = item_np / norm
                try:
                    setattr(item, "_cached_np_embedding", item_np)
                except AttributeError:
                    pass  # Skip caching if item is a strict class (e.g., uses slots)

        if NUMPY_AVAILABLE and query_np is not None and item_np is not None:
            return float(np.dot(query_np, item_np))

        item_embedding = getattr(item, "embedding", None)
        if item_embedding is None:
            return None
        dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding))
        norm_q = sum(x**2 for x in query_embedding) ** 0.5
        norm_i = sum(x**2 for x in item_embedding) ** 0.5
        return dot_product / (norm_q * norm_i) if norm_q * norm_i > 0 else 0.0
```
Stale cache risk when `normalize_embeddings` config changes.
The cached `_cached_np_embedding` is computed based on the value of `self.config.normalize_embeddings` at cache time. Since `EmbeddingAgentConfig` is mutable (it's a Pydantic `BaseModel` without `frozen=True`), if `normalize_embeddings` changes after items are cached, subsequent similarity computations will use incorrectly normalized/unnormalized embeddings.
Consider one of:
- Include the normalization state in the cache (e.g., `_cached_np_embedding_normalized: bool`)
- Make the config immutable (`frozen=True`)
- Invalidate item caches when config changes
🛡️ Proposed fix: track normalization state with the cache

```diff
-        item_np = getattr(item, "_cached_np_embedding", None)
+        cached = getattr(item, "_cached_np_embedding", None)
+        cached_normalized = getattr(item, "_cached_np_normalized", None)
+
+        # Invalidate cache if normalization setting changed
+        if cached is not None and cached_normalized != self.config.normalize_embeddings:
+            cached = None
+        item_np = cached
         if item_np is None:
             item_embedding = getattr(item, "embedding", None)
             if item_embedding is None:
                 return None
             if NUMPY_AVAILABLE and query_np is not None:
                 item_np = np.array(item_embedding)
                 if self.config.normalize_embeddings:
                     norm = np.linalg.norm(item_np)
                     if norm > 0:
                         item_np = item_np / norm
                 try:
                     setattr(item, "_cached_np_embedding", item_np)
+                    setattr(item, "_cached_np_normalized", self.config.normalize_embeddings)
                 except AttributeError:
-                    pass  # Skip caching if item is a strict class (e.g., uses slots)
+                    # Item class restricts attribute assignment (e.g., __slots__)
+                    logger.debug("Cannot cache embedding on item: attribute assignment not allowed")
```

🧰 Tools
🪛 Ruff (0.15.7)
[warning] 522-522: Do not call setattr with a constant attribute value. It is not any safer than normal property access.
Replace setattr with assignment
(B010)
[warning] 532-532: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/embedding_agent/service.py` around lines 505 - 535, The cached numpy
embedding in _compute_similarity uses self.config.normalize_embeddings at cache
time, so if that flag changes later cached embeddings become stale; update the
caching to also record the normalization state (e.g., store a companion
attribute like _cached_np_embedding_normalized alongside _cached_np_embedding)
and when reading the cache in _compute_similarity verify that getattr(item,
"_cached_np_embedding_normalized", None) matches
self.config.normalize_embeddings and, if not, recompute (and overwrite) the
cached embedding using the current self.config.normalize_embeddings; reference
_compute_similarity, _cached_np_embedding, and self.config.normalize_embeddings
when implementing this change.
```python
try:
    setattr(item, "_cached_np_embedding", item_np)
except AttributeError:
    pass  # Skip caching if item is a strict class (e.g., uses slots)
```
🛠️ Refactor suggestion | 🟠 Major
Replace `pass` with explicit logging or structured handling.
The `pass` statement violates coding guidelines which prohibit "stubs, placeholders, pass statements." While the `except AttributeError` is a valid fallback, silently swallowing it obscures when caching fails. Log at debug level so operators can diagnose performance issues if caching is unexpectedly disabled for certain item types.
♻️ Proposed fix

```diff
 try:
     setattr(item, "_cached_np_embedding", item_np)
 except AttributeError:
-    pass  # Skip caching if item is a strict class (e.g., uses slots)
+    logger.debug(
+        "Cannot cache numpy embedding on item %s: attribute assignment restricted",
+        type(item).__name__,
+    )
```

As per coding guidelines: "Absolutely prohibit stubs, placeholders, pass statements, TODO comments, NotImplementedError, or FIXME comments; every implementation must be production-ready."
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 522-522: Do not call setattr with a constant attribute value. It is not any safer than normal property access.
Replace setattr with assignment
(B010)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/embedding_agent/service.py` around lines 521 - 524, Replace the silent
except block that swallows AttributeError when running setattr(item,
"_cached_np_embedding", item_np) with an explicit debug log so caching failures
are observable; in the except AttributeError handler call a logger.debug (e.g.,
logger.debug or get module logger via logging.getLogger(__name__)) and include
the item repr/type and the exception message to show why caching was skipped,
rather than using pass. Ensure the logger reference you use is consistent with
the module (or import logging if needed) and keep the handler limited to
debug-level output.
💡 What: Implemented lazy caching of normalized numpy arrays for knowledge base items.
🎯 Why: The original code reconstructed and normalized numpy arrays on every query for every item, creating a massive CPU bottleneck.
📊 Impact: Drastically reduces latency for vector similarity searches by skipping repetitive array allocation and math.
🔬 Measurement: Benchmark `search_similar` before and after.
PR created automatically by Jules for task 2310443046821552135 started by @daggerstuff
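One hedged way to run that benchmark is to compare a rebuild-every-call loop against the cached loop with `timeit`. The helper names and item shapes below are illustrative, not the service's real API:

```python
import timeit
from types import SimpleNamespace

import numpy as np

def search_uncached(query_np, items):
    # Rebuilds and renormalizes every array on every call (old behavior).
    sims = []
    for item in items:
        v = np.array(item.embedding)
        v = v / np.linalg.norm(v)
        sims.append(float(np.dot(query_np, v)))
    return sims

def search_cached(query_np, items):
    # Reuses the per-item cached array after the first call (new behavior).
    sims = []
    for item in items:
        v = getattr(item, "_cached_np_embedding", None)
        if v is None:
            v = np.array(item.embedding)
            v = v / np.linalg.norm(v)
            item._cached_np_embedding = v
        sims.append(float(np.dot(query_np, v)))
    return sims

rng = np.random.default_rng(0)
items = [SimpleNamespace(embedding=rng.random(384).tolist()) for _ in range(1000)]
query = rng.random(384)
query /= np.linalg.norm(query)

search_cached(query, items)  # warm the cache before timing
t_old = timeit.timeit(lambda: search_uncached(query, items), number=20)
t_new = timeit.timeit(lambda: search_cached(query, items), number=20)
print(f"uncached: {t_old:.3f}s  cached: {t_new:.3f}s")
```

Both paths must return identical similarities; only the cached path skips the per-query allocation and norm.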
Summary by Sourcery
Cache normalized numpy embeddings on knowledge items to remove repeated per-query array reconstruction during similarity search.
Enhancements:
Summary by cubic
Cache a normalized `numpy` embedding per knowledge item and reuse it across queries to remove a CPU bottleneck in vector search. Also extracted `_compute_similarity` to reduce method complexity and gracefully handle items without embeddings or that can't store `_cached_np_embedding`, while keeping the non-`numpy` cosine path unchanged.
Written for commit a7dc1fd. Summary will update on new commits.
Summary by CodeRabbit