
⚡ Bolt: Cache numpy arrays in embedding search#105

Open
daggerstuff wants to merge 2 commits into staging from
perf/embedding-caching-numpy-2310443046821552135

Conversation

@daggerstuff
Owner

@daggerstuff daggerstuff commented Mar 30, 2026

💡 What: Implemented lazy caching of normalized numpy arrays for knowledge base items.
🎯 Why: The original code reconstructed and normalized numpy arrays on every query for every item, creating a massive CPU bottleneck.
📊 Impact: Drastically reduces latency for vector similarity searches by skipping repetitive array allocation and math.
🔬 Measurement: Benchmark search_similar before and after.
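The optimization described above amounts to a lazy read-through cache. A minimal sketch, assuming a free-function form and the helper name `get_item_vector` for illustration (the actual logic lives inside the service's search loop):

```python
import numpy as np

def get_item_vector(item, normalize=True):
    """Illustrative read-through cache: convert and normalize once, reuse after."""
    vec = getattr(item, "_cached_np_embedding", None)
    if vec is None:
        raw = getattr(item, "embedding", None)
        if raw is None:
            return None
        vec = np.asarray(raw, dtype=float)
        if normalize:
            norm = np.linalg.norm(vec)
            if norm > 0:
                vec = vec / norm
        try:
            item._cached_np_embedding = vec  # pay the conversion cost only once
        except AttributeError:
            pass  # strict classes (e.g. __slots__) simply recompute each query
    return vec
```

With normalized vectors cached this way, each subsequent query reduces to a single `np.dot` per item instead of an allocation plus a norm computation.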


PR created automatically by Jules for task 2310443046821552135 started by @daggerstuff

Summary by Sourcery

Cache normalized numpy embeddings on knowledge items to remove repeated per-query array reconstruction during similarity search.

Enhancements:

  • Introduce lazy per-item caching of normalized numpy arrays for embeddings in knowledge search to reduce CPU overhead.
  • Document the performance optimization and its rationale in the Bolt metadata file.

Summary by cubic

Cache a normalized numpy embedding per knowledge item and reuse it across queries to remove a CPU bottleneck in vector search. Also extracts _compute_similarity to reduce method complexity and gracefully handles items without embeddings, or items that cannot store _cached_np_embedding, while keeping the non-numpy cosine path unchanged.

Written for commit a7dc1fd. Summary will update on new commits.

Summary by CodeRabbit

  • Refactor
    • Optimized embedding similarity computation to improve search performance and reduce resource usage.

Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>
@vercel

vercel bot commented Mar 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
ai Error Error Mar 30, 2026 11:02pm

@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai

sourcery-ai bot commented Mar 30, 2026

Reviewer's Guide

Implements lazy, per-item caching of normalized NumPy embeddings during knowledge item similarity search to avoid reconstructing arrays on every query, along with a Bolt notebook note describing the optimization.

Sequence diagram for cached embedding similarity search

sequenceDiagram
    actor Client
    participant EmbeddingAgentService
    participant KnowledgeItem

    Client->>EmbeddingAgentService: search_similar(query_embedding, knowledge_items)
    EmbeddingAgentService->>EmbeddingAgentService: build query_np from query_embedding
    loop for each item in knowledge_items
        EmbeddingAgentService->>KnowledgeItem: get _cached_np_embedding
        alt cache hit
            KnowledgeItem-->>EmbeddingAgentService: item_np
        else cache miss
            EmbeddingAgentService->>KnowledgeItem: get embedding
            alt embedding is None
                EmbeddingAgentService-->>EmbeddingAgentService: skip item
            else embedding exists
                EmbeddingAgentService->>EmbeddingAgentService: item_np = np.array(embedding)
                alt normalize_embeddings is True and norm > 0
                    EmbeddingAgentService->>EmbeddingAgentService: item_np = item_np / norm
                end
                EmbeddingAgentService->>KnowledgeItem: set _cached_np_embedding = item_np
            end
        end
        alt NUMPY_AVAILABLE and query_np and item_np not None
            EmbeddingAgentService->>EmbeddingAgentService: similarity = dot(query_np, item_np)
        else
            EmbeddingAgentService->>KnowledgeItem: get embedding
            EmbeddingAgentService->>EmbeddingAgentService: similarity = cosine(query_embedding, embedding)
        end
    end
    EmbeddingAgentService-->>Client: similar_items

Class diagram for embedding search caching changes

classDiagram
    class EmbeddingAgentService {
        config
        _search_knowledge_items(query_embedding, knowledge_items, knowledge_types)
    }

    class KnowledgeItem {
        embedding
        _cached_np_embedding
    }

    EmbeddingAgentService --> KnowledgeItem : iterates_and_reads

File-Level Changes

Change Details Files
Add lazy caching of normalized NumPy embeddings on knowledge items to avoid repeated array creation and normalization inside the search loop.
  • Introduce a _cached_np_embedding attribute on knowledge items to store the parsed (and optionally normalized) NumPy array representation of the embedding.
  • On cache miss, fetch the raw embedding, convert it to a NumPy array, optionally normalize it based on configuration, and store it back on the item for future queries.
  • Reuse the cached NumPy embedding for similarity computation when NumPy and the query array are available, falling back to Python-based cosine similarity when not.
  • Ensure the non-NumPy similarity path still retrieves the embedding attribute directly when needed.
api/embedding_agent/service.py
Document the performance optimization in the Bolt metadata file.
  • Add a Bolt log entry describing the lazy caching of normalized embeddings and the motivation behind it.
.Jules/bolt.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

Refactored similarity computation in the embedding agent service by introducing a _compute_similarity helper method that caches NumPy embeddings per item, reducing redundant array conversions and normalizations during knowledge item searches while maintaining fallback cosine similarity computation.

Changes

Cohort / File(s) Summary
Embedding Similarity Optimization
api/embedding_agent/service.py
Extracted similarity calculation into a _compute_similarity helper that caches per-item NumPy embeddings and their normalized forms. Updated _search_knowledge_items to delegate similarity computation, avoiding repeated array conversions and normalization steps. Handles items with restricted attributes (e.g., __slots__) by skipping cache storage.
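The "restricted attributes" case mentioned above can be reproduced in isolation. `SlottedItem` below is a made-up stand-in for any strict item class; it shows why the cache write has to be wrapped in a try/except:

```python
class SlottedItem:
    """Made-up strict item class: __slots__ forbids adding new attributes."""
    __slots__ = ("embedding",)

    def __init__(self, embedding):
        self.embedding = embedding

item = SlottedItem([1.0, 0.0])
try:
    item._cached_np_embedding = list(item.embedding)
    cached = True
except AttributeError:
    cached = False  # the service skips caching and recomputes per query
```

For such items the optimization degrades gracefully: the search still works, it just does not get the caching speedup.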

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A cache-fueled hop through embeddings so fine,
No more rebuilding arrays each time—
NumPy embeddings saved for the win,
Similarity searches now faster within!
Optimization magic, a rabbit's delight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning — Docstring coverage is 66.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
Description Check — ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
Title Check — ✅ Passed — The title directly addresses the main change: caching NumPy arrays in embedding search to improve performance, which matches the core optimization described in the PR objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • The new flow no longer short-circuits when item.embedding is None in the non-numpy path, so zip(query_embedding, item_embedding) can now throw if embedding is missing; consider restoring an early continue when item_embedding is None before both similarity branches.
  • Because _cached_np_embedding is computed using self.config.normalize_embeddings, a runtime change to this flag will leave stale cached vectors with incorrect normalization; if config can change, key the cache by normalization mode or avoid sharing cached arrays across modes.
  • Attaching _cached_np_embedding directly to item instances may have concurrency or lifecycle implications if these objects are shared between requests or processes; consider using an external cache keyed by item ID or ensuring that item instances are request-scoped.
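The external-cache alternative from the last bullet could look roughly like the sketch below. The module-level dict and key scheme are illustrative, not the service's actual API; keying on normalization mode also addresses the stale-flag concern from the second bullet:

```python
import numpy as np

# Illustrative external cache keyed by (item id, normalization mode), so
# vectors live outside the item objects and a normalize_embeddings toggle
# can never serve a stale vector.
_np_cache = {}

def cached_vector(item_id, embedding, normalize):
    key = (item_id, normalize)
    vec = _np_cache.get(key)
    if vec is None:
        vec = np.asarray(embedding, dtype=float)
        if normalize:
            norm = np.linalg.norm(vec)
            if norm > 0:
                vec = vec / norm
        _np_cache[key] = vec
    return vec
```

In production this dict would need bounding and invalidation (for instance, being dropped by the service's clear_cache()), but it avoids mutating shared item instances entirely.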


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
api/embedding_agent/service.py (2)

552-564: Redundant re-fetch of item.embedding in fallback path.

When the fallback cosine similarity path is taken (numpy unavailable), item_embedding was already retrieved at line 540 but goes out of scope before line 555. Consider hoisting item_embedding to avoid the redundant getattr call.

♻️ Proposed refactor to avoid redundant fetch
         # Get item embedding if cache misses
+        item_embedding = None
         if item_np is None:
             item_embedding = getattr(item, "embedding", None)
             if item_embedding is None:
                 continue
             if NUMPY_AVAILABLE and query_np is not None:
                 item_np = np.array(item_embedding)
                 if self.config.normalize_embeddings:
                     norm = np.linalg.norm(item_np)
                     if norm > 0:
                         item_np = item_np / norm
-                    setattr(item, "_cached_np_embedding", item_np)
+                item._cached_np_embedding = item_np

         # Calculate similarity
         if NUMPY_AVAILABLE and query_np is not None and item_np is not None:
             similarity = float(np.dot(query_np, item_np))
         else:
-            item_embedding = getattr(item, "embedding", None)
+            if item_embedding is None:
+                item_embedding = getattr(item, "embedding", None)
             # Simple cosine similarity
528-549: Thread safety consideration for cache population.

This method modifies shared state (item._cached_np_embedding) without holding self._lock. Concurrent calls to search_similar could race on populating the cache for the same item.

In this case, the race is benign since all threads compute and store the same normalized array value. However, if KnowledgeItem is not thread-safe for attribute writes, or if future changes make the cached value non-deterministic, this could become problematic.

Consider documenting this as an intentional benign race, or use the lock if strict thread safety is required.
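A lock-guarded variant along the lines described would look roughly like this sketch. The class is a stand-in; `self._lock` and the attribute names mirror the comment, but the rest is an assumption about the surrounding service:

```python
import threading
import numpy as np

class SearchService:
    """Sketch of lock-guarded cache population (double-checked locking)."""

    def __init__(self, normalize=True):
        self._lock = threading.Lock()
        self.normalize = normalize

    def _get_cached_np(self, item):
        vec = getattr(item, "_cached_np_embedding", None)
        if vec is not None:
            return vec  # fast path: no lock on cache hits
        raw = getattr(item, "embedding", None)
        if raw is None:
            return None
        # Compute outside the lock, then re-check before storing.
        vec = np.asarray(raw, dtype=float)
        if self.normalize:
            norm = np.linalg.norm(vec)
            if norm > 0:
                vec = vec / norm
        with self._lock:
            existing = getattr(item, "_cached_np_embedding", None)
            if existing is not None:
                return existing  # another thread won the race
            item._cached_np_embedding = vec
        return vec
```

The lock is only taken on cache misses, so the steady-state read path stays lock-free.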


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 05da8d9c-84aa-4b1e-80b0-313841b4f656

📥 Commits

Reviewing files that changed from the base of the PR and between 5e91c81 and 7cd0e2a.

📒 Files selected for processing (2)
  • .Jules/bolt.md
  • api/embedding_agent/service.py


@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 2 files


Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

norm = np.linalg.norm(item_np)
if norm > 0:
    item_np = item_np / norm
setattr(item, "_cached_np_embedding", item_np)

@cubic-dev-ai cubic-dev-ai bot Mar 30, 2026


P2: The new _cached_np_embedding cache bypasses cache lifecycle controls, so clear_cache() no longer clears all embedding caches and cached vectors persist unexpectedly.


Co-authored-by: daggerstuff <261005129+daggerstuff@users.noreply.github.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7dc1fdadc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +522 to +524
    setattr(item, "_cached_np_embedding", item_np)
except AttributeError:
    pass  # Skip caching if item is a strict class (e.g., uses slots)


P1 Badge Handle cache-write failures beyond AttributeError

In _compute_similarity, the new cache path writes item_np back onto each knowledge item with setattr, but it only catches AttributeError. Strict model objects (for example, classes that reject unknown attributes via __setattr__) commonly raise ValueError or TypeError instead, which will now bubble up and fail the entire similarity search request for that item set. This optimization path should degrade gracefully (skip caching) rather than turning unsupported item types into request-time exceptions.
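A broader best-effort guard in the spirit of this suggestion might look like the sketch below. Which exceptions strict model classes actually raise varies by library, so the exception tuple is an assumption:

```python
def try_cache(item, vec):
    """Best-effort cache write: never let an unsupported item type fail the search."""
    try:
        item._cached_np_embedding = vec
        return True
    except (AttributeError, TypeError, ValueError):
        # __slots__ classes raise AttributeError; frozen/validated model
        # classes may raise TypeError or ValueError instead.
        return False
```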



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
api/embedding_agent/service.py (1)

532-532: Add strict=True to zip() for mismatched-length detection.

Per Python 3.11+ (the target version per guidelines), zip() supports strict=True to raise ValueError if iterables have different lengths. Since embeddings should always have matching dimensions, this would catch data corruption early.

♻️ Proposed fix
-        dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding))
+        dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding, strict=True))

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7abb299-06fe-4e52-b368-eab719898ce0

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd0e2a and a7dc1fd.

📒 Files selected for processing (1)
  • api/embedding_agent/service.py

Comment on lines +505 to +535
def _compute_similarity(
    self, item: Any, query_embedding: List[float], query_np: Any
) -> Optional[float]:
    # ⚡ Bolt: Cache normalized numpy arrays safely to avoid regenerating on query
    item_np = getattr(item, "_cached_np_embedding", None)

    if item_np is None:
        item_embedding = getattr(item, "embedding", None)
        if item_embedding is None:
            return None
        if NUMPY_AVAILABLE and query_np is not None:
            item_np = np.array(item_embedding)
            if self.config.normalize_embeddings:
                norm = np.linalg.norm(item_np)
                if norm > 0:
                    item_np = item_np / norm
            try:
                setattr(item, "_cached_np_embedding", item_np)
            except AttributeError:
                pass  # Skip caching if item is a strict class (e.g., uses slots)

    if NUMPY_AVAILABLE and query_np is not None and item_np is not None:
        return float(np.dot(query_np, item_np))

    item_embedding = getattr(item, "embedding", None)
    if item_embedding is None:
        return None
    dot_product = sum(a * b for a, b in zip(query_embedding, item_embedding))
    norm_q = sum(x**2 for x in query_embedding) ** 0.5
    norm_i = sum(x**2 for x in item_embedding) ** 0.5
    return dot_product / (norm_q * norm_i) if norm_q * norm_i > 0 else 0.0


⚠️ Potential issue | 🟠 Major

Stale cache risk when normalize_embeddings config changes.

The cached _cached_np_embedding is computed based on the value of self.config.normalize_embeddings at cache time. Since EmbeddingAgentConfig is mutable (it's a Pydantic BaseModel without frozen=True), if normalize_embeddings changes after items are cached, subsequent similarity computations will use incorrectly normalized/unnormalized embeddings.

Consider one of:

  1. Include the normalization state in the cache (e.g., _cached_np_embedding_normalized: bool)
  2. Make the config immutable (frozen=True)
  3. Invalidate item caches when config changes
🛡️ Proposed fix: track normalization state with the cache
-        item_np = getattr(item, "_cached_np_embedding", None)
+        cached = getattr(item, "_cached_np_embedding", None)
+        cached_normalized = getattr(item, "_cached_np_normalized", None)
+        
+        # Invalidate cache if normalization setting changed
+        if cached is not None and cached_normalized != self.config.normalize_embeddings:
+            cached = None
+        item_np = cached

         if item_np is None:
             item_embedding = getattr(item, "embedding", None)
             if item_embedding is None:
                 return None
             if NUMPY_AVAILABLE and query_np is not None:
                 item_np = np.array(item_embedding)
                 if self.config.normalize_embeddings:
                     norm = np.linalg.norm(item_np)
                     if norm > 0:
                         item_np = item_np / norm
                 try:
                     setattr(item, "_cached_np_embedding", item_np)
+                    setattr(item, "_cached_np_normalized", self.config.normalize_embeddings)
                 except AttributeError:
-                    pass  # Skip caching if item is a strict class (e.g., uses slots)
+                    # Item class restricts attribute assignment (e.g., __slots__)
+                    logger.debug("Cannot cache embedding on item: attribute assignment not allowed")
🧰 Tools
🪛 Ruff (0.15.7)

[warning] 522-522: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


[warning] 532-532: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


Comment on lines +521 to +524
try:
    setattr(item, "_cached_np_embedding", item_np)
except AttributeError:
    pass  # Skip caching if item is a strict class (e.g., uses slots)


🛠️ Refactor suggestion | 🟠 Major

Replace pass with explicit logging or structured handling.

The pass statement violates coding guidelines which prohibit "stubs, placeholders, pass statements." While the except AttributeError is a valid fallback, silently swallowing it obscures when caching fails. Log at debug level so operators can diagnose performance issues if caching is unexpectedly disabled for certain item types.

♻️ Proposed fix
                 try:
                     setattr(item, "_cached_np_embedding", item_np)
                 except AttributeError:
-                    pass  # Skip caching if item is a strict class (e.g., uses slots)
+                    logger.debug(
+                        "Cannot cache numpy embedding on item %s: attribute assignment restricted",
+                        type(item).__name__,
+                    )

As per coding guidelines: "Absolutely prohibit stubs, placeholders, pass statements, TODO comments, NotImplementedError, or FIXME comments; every implementation must be production-ready."

🧰 Tools
🪛 Ruff (0.15.7)

[warning] 522-522: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)

