Add lexical fallback for conversation search #7421
Greptile Summary
This PR adds a lightweight lexical fallback to
Confidence Score: 4/5
Safe to merge; the existing vector search path is untouched and the new lexical branch only fires on a miss.
The core fallback logic is correct and well-tested. The main friction points are a redundant Firestore re-fetch in the lexical branch, a 200-conversation candidate cap that silently limits recall for users with large histories, and a large commented-out block of the old implementation. None of these affect correctness in normal operation.
backend/utils/retrieval/tool_services/conversations.py: the lexical fallback section (lines ~297–319) has the redundant DB fetch and dead-code comment block worth cleaning up before merge.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[search_conversations_text called] --> B[Parse date filters]
B --> C[Call vector_db.query_vectors]
C --> D{conversation_ids empty?}
D -- No --> G[retrieval_mode = semantic]
D -- Yes --> E[Call get_conversations\nlimit=200, start_dt/end_dt]
E --> F[_rank_conversations_lexically\nscores title x6 overview x4 transcript x1.5\nphrase boost +12/+8/+4]
F --> H{ranked IDs empty?}
H -- Yes --> I[Return: No conversations found]
H -- No --> J[retrieval_mode = lexical]
G --> K[Call get_conversations_by_id]
J --> K
K --> L[Filter locked conversations]
L --> M[Sort by retrieval rank]
M --> N[Deserialize + format]
N --> O[Return: Found N conversations via MODE retrieval]
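The last steps of the flowchart (filter locked conversations, then sort by retrieval rank) can be sketched as below. This is only an illustration: the `is_locked` field name and the helper `order_by_retrieval_rank` are assumptions, not names confirmed by the PR.

```python
from typing import Dict, List


def order_by_retrieval_rank(conversations: List[dict], ranked_ids: List[str]) -> List[dict]:
    """Drop locked conversations, then restore the order produced by retrieval.

    `is_locked` is an assumed field name used for illustration only.
    """
    # Map each conversation ID to its position in the retrieval ranking.
    rank: Dict[str, int] = {cid: i for i, cid in enumerate(ranked_ids)}
    visible = [c for c in conversations if not c.get("is_locked", False)]
    # Conversations missing from the ranking sort last.
    return sorted(visible, key=lambda c: rank.get(c["id"], len(rank)))
```

The re-sort matters because a bulk fetch by ID does not necessarily return documents in the order the IDs were passed.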
# Existing vector-only implementation:
# conversation_ids = vector_db.query_vectors(query=query, uid=uid, starts_at=starts_at, ends_at=ends_at, k=limit)
#
# if not conversation_ids:
#     date_info = ""
#     if starts_at and ends_at:
#         date_info = " in the specified date range"
#     elif starts_at:
#         date_info = " after the specified start date"
#     elif ends_at:
#         date_info = " before the specified end date"
#     return f"No conversations found matching '{query}'{date_info}."
#
# conversations_data = conversations_db.get_conversations_by_id(uid, conversation_ids)
The old vector-only implementation is left as a 14-line comment block. This is pure noise — the git history already preserves the original logic. The comment also risks confusion: a future reader may assume the block is meant to be re-enabled or may accidentally un-comment it.
logger = logging.getLogger(__name__)

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200
Silent recall cap for users with large conversation histories
LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200 means the fallback searches only the 200 most-recently-created conversations. A user who had a relevant conversation #201 or older will silently get "No conversations found" even though lexical scoring would have ranked it. Consider logging when len(candidate_conversations) == LEXICAL_FALLBACK_CANDIDATE_LIMIT to make this ceiling observable in production.
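One way the suggested logging could look is sketched below. The wrapper `fetch_lexical_candidates` and its `fetch` callable are hypothetical stand-ins for however the PR calls get_conversations; only the constant name and the logger come from the diff.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200


def fetch_lexical_candidates(uid: str, fetch: Callable[..., List[dict]]) -> List[dict]:
    """Fetch fallback candidates and log when the cap is reached.

    `fetch` is a hypothetical stand-in for the real candidate query.
    """
    candidates = fetch(uid, limit=LEXICAL_FALLBACK_CANDIDATE_LIMIT)
    if len(candidates) >= LEXICAL_FALLBACK_CANDIDATE_LIMIT:
        # Hitting the cap means older conversations were never scored.
        logger.warning(
            "lexical fallback reached candidate cap (%d) for uid=%s",
            LEXICAL_FALLBACK_CANDIDATE_LIMIT,
            uid,
        )
    return candidates
```

A warning-level log keeps the ceiling visible without changing what the user sees; a metric counter would serve the same purpose.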
@RatnamOjha please check greptile review comments
kodjima33 left a comment
lexical fallback for exact-entity recall is a sensible addition, thanks
Summary
This PR adds a lightweight vectorless lexical fallback to conversation search.
Today, search_conversations_text primarily depends on semantic vector search through Pinecone. That works well for broad conceptual queries, but it can miss exact-match user queries like names, acronyms, company names, project names, tools, and short phrases.
This change keeps the existing vector search path intact, but when vector search returns no conversation IDs, it falls back to a small in-repo lexical scorer over conversation title, overview, and transcript text.
Thinking Behind The Logic
Omi users often ask memory-style questions that are not purely semantic:
These queries contain exact entities. Vector search is useful for semantic similarity, but exact tokens and acronyms are sometimes better handled by lexical retrieval.
The fallback scores candidate conversations across three fields:
structured.title
structured.overview
transcript_segments[].text
Matches in title and overview are weighted higher than transcript matches because they are more likely to represent the central topic of the conversation. Phrase matches get an additional boost for multi-word entities.
The goal is not to replace semantic retrieval. The goal is to make retrieval more robust when semantic search misses or when Pinecone is unavailable.
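A minimal sketch of a scorer along these lines follows. The field weights (title x6, overview x4, transcript x1.5) and the boost values (+12/+8/+4) come from the review flowchart; assigning the boosts to title/overview/transcript in that order, the tokenizer, and all function names are assumptions rather than the PR's actual code.

```python
import re
from typing import Dict, List

# Values from the review flowchart; the per-field boost mapping is assumed.
FIELD_WEIGHTS = {"title": 6.0, "overview": 4.0, "transcript": 1.5}
PHRASE_BOOSTS = {"title": 12.0, "overview": 8.0, "transcript": 4.0}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_fields(conversation: dict) -> Dict[str, str]:
    """Collect the three searchable text fields from a conversation dict."""
    structured = conversation.get("structured") or {}
    segments = conversation.get("transcript_segments") or []
    return {
        "title": structured.get("title") or "",
        "overview": structured.get("overview") or "",
        "transcript": " ".join(seg.get("text") or "" for seg in segments),
    }


def score_conversation(query: str, conversation: dict) -> float:
    """Weighted count of distinct query tokens per field, plus a phrase boost."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    phrase = " ".join(query_tokens)
    score = 0.0
    for name, text in extract_fields(conversation).items():
        field_tokens = tokenize(text)
        token_set = set(field_tokens)
        matched = sum(1 for token in set(query_tokens) if token in token_set)
        score += FIELD_WEIGHTS[name] * matched
        # Boost only multi-word queries that appear as a contiguous phrase.
        normalized = " ".join(field_tokens)
        if len(query_tokens) > 1 and f" {phrase} " in f" {normalized} ":
            score += PHRASE_BOOSTS[name]
    return score


def rank_conversations_lexically(query: str, conversations: List[dict], limit: int = 5) -> List[str]:
    """Return up to `limit` conversation IDs with a positive score, best first."""
    scored = [(score_conversation(query, c), c["id"]) for c in conversations]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:limit]]
```

With these weights, a query such as "Acme Corp" that matches a title as a phrase scores well above a conversation that only mentions it in the transcript.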
Why This Is Beneficial
1. Better exact recall
Personal memory queries often involve exact names, tools, people, products, and organizations. Lexical fallback improves recall for these cases.
2. More graceful failure mode
If Pinecone returns no matches, the user currently gets “No conversations found.” With this fallback, Omi can still search recent/date-filtered conversation candidates using local structured text.
3. No new dependency
The implementation is intentionally lightweight and dependency-free. It does not add a BM25 package or introduce new infra.
4. Keeps existing behavior safe
The existing vector retrieval path remains unchanged when vector results are available. The fallback only activates on vector misses.
5. Better local/dev/self-hosted experience
Vector search requires configured embedding + Pinecone infra. A vectorless retrieval path makes conversation search more useful in constrained local or self-hosted environments.
How This Is Different From Existing Retrieval
Existing semantic retrieval:
This PR adds fallback retrieval:
So the system now handles two different search modes:
This is especially useful because “memory search” is not only semantic. Users frequently remember exact words, names, tools, and acronyms.
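The fallback-only control flow can be sketched as below. The two search callables are hypothetical placeholders for the vector query and the lexical scorer, and the `"none"` mode label is an assumption; only the "semantic" and "lexical" mode names appear in the review flowchart.

```python
from typing import Callable, List, Tuple

SearchFn = Callable[[str], List[str]]


def retrieve_conversation_ids(
    query: str, vector_search: SearchFn, lexical_search: SearchFn
) -> Tuple[List[str], str]:
    """Try semantic retrieval first; fall back to lexical only on a miss."""
    ids = vector_search(query)
    if ids:
        return ids, "semantic"
    # Vector search returned nothing, so try the lexical scorer.
    ids = lexical_search(query)
    if ids:
        return ids, "lexical"
    return [], "none"
```

Because the lexical branch is never reached when vector search returns results, existing behavior for semantic hits is unchanged.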
Tests
Added unit coverage for:
tested locally:
result:

Future Scope
This is a small first step. Some possible follow-ups:
Hybrid ranking instead of fallback-only
Right now lexical retrieval only runs when vector search misses. A stronger version would run both semantic and lexical retrieval, then merge results with Reciprocal Rank Fusion.
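For reference, Reciprocal Rank Fusion is small enough to sketch here. This is not part of the PR; the function name is illustrative, and k = 60 is the constant commonly used with RRF rather than a value tuned for Omi.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + position)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

An ID that appears in both the semantic and the lexical list accumulates two contributions, so agreement between the two retrievers is rewarded without having to compare their raw scores.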
True BM25
The current scorer is intentionally simple. A future PR could add real BM25 scoring over title, overview, and transcript text.
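As a rough idea of what that follow-up might involve, here is a dependency-free Okapi BM25 sketch over plain strings. It is not the PR's scorer; the tokenizer, the function names, and the parameter defaults (k1 = 1.5, b = 0.75) are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, documents: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Return one Okapi BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            n_t = doc_freq[term]
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Compared with the fixed field weights in the current scorer, BM25 down-weights terms that appear in many conversations and normalizes for transcript length.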
Entity-aware search
Omi already extracts people, topics, entities, and metadata in parts of the pipeline. Future retrieval could boost exact matches on entities and people.
Better candidate source
The fallback currently depends on get_conversations returning enough candidate text. A dedicated lightweight searchable index would make this faster and more complete.