Add lexical fallback for conversation search by RatnamOjha · Pull Request #7421 · BasedHardware/omi

RatnamOjha · 2026-05-21T04:24:29Z

Summary

This PR adds a lightweight vectorless lexical fallback to conversation search.

Today, search_conversations_text primarily depends on semantic vector search through Pinecone. That works well for broad conceptual queries, but it can miss exact-match user queries like names, acronyms, company names, project names, tools, and short phrases.

This change keeps the existing vector search path intact, but when vector search returns no conversation IDs, it falls back to a small in-repo lexical scorer over conversation title, overview, and transcript text.

Thinking Behind The Logic

Omi users often ask memory-style questions that are not purely semantic:

“When did I talk about that internship?”
“What did we discuss about the new project?”
“Find my conversation about the database migration”
“When did I mention the conference?”

These queries contain exact entities. Vector search is useful for semantic similarity, but exact tokens and acronyms are sometimes better handled by lexical retrieval.

The fallback scores candidate conversations across three fields:

structured.title
structured.overview
transcript_segments[].text

Matches in title and overview are weighted higher than transcript matches because they are more likely to represent the central topic of the conversation. Phrase matches get an additional boost for multi-word entities.
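The scoring described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names (`tokenize`, `score_conversation`, `rank_conversations`) are hypothetical, the weights are the ones reported in the review flowchart for this PR (title x6, overview x4, transcript x1.5; phrase boost +12/+8/+4), and the way they are combined (one weight per distinct matched token, plus one phrase boost per field) is an assumption.

```python
import re

# Weights as reported in the review flowchart; how they combine is assumed.
FIELD_WEIGHTS = {"title": 6.0, "overview": 4.0, "transcript": 1.5}
PHRASE_BOOSTS = {"title": 12.0, "overview": 8.0, "transcript": 4.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", (text or "").lower())


def score_conversation(query, conversation):
    """Score one conversation dict against the query (hypothetical helper)."""
    query_list = tokenize(query)
    query_tokens = set(query_list)
    if not query_tokens:
        return 0.0
    phrase = " ".join(query_list)

    structured = conversation.get("structured") or {}
    segments = conversation.get("transcript_segments") or []
    fields = {
        "title": structured.get("title", ""),
        "overview": structured.get("overview", ""),
        "transcript": " ".join(seg.get("text", "") for seg in segments),
    }

    score = 0.0
    for name, text in fields.items():
        tokens = tokenize(text)
        # One weight unit per distinct query token present in the field.
        score += FIELD_WEIGHTS[name] * len(query_tokens & set(tokens))
        # Phrase boost only for multi-word queries, matched on token boundaries.
        if len(query_list) > 1 and f" {phrase} " in f" {' '.join(tokens)} ":
            score += PHRASE_BOOSTS[name]
    return score


def rank_conversations(query, conversations, limit=5):
    """Return IDs of conversations with a positive score, best first."""
    scored = [(score_conversation(query, c), c["id"]) for c in conversations]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [cid for _, cid in scored[:limit]]
```

With these assumed rules, a conversation whose title contains the query phrase outranks one that only mentions it in the transcript, which matches the intent stated above.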

The goal is not to replace semantic retrieval. The goal is to make retrieval more robust when semantic search misses or when Pinecone is unavailable.

Why This Is Beneficial

1. Better exact recall

Personal memory queries often involve exact names, tools, people, products, and organizations. Lexical fallback improves recall for these cases.

2. More graceful failure mode

If Pinecone returns no matches, the user currently gets “No conversations found.” With this fallback, Omi can still search recent/date-filtered conversation candidates using local structured text.

3. No new dependency

The implementation is intentionally lightweight and dependency-free. It does not add a BM25 package or introduce new infra.

4. Keeps existing behavior safe

The existing vector retrieval path remains unchanged when vector results are available. The fallback only activates on vector misses.

5. Better local/dev/self-hosted experience

Vector search requires configured embedding + Pinecone infra. A vectorless retrieval path makes conversation search more useful in constrained local or self-hosted environments.

How This Is Different From Existing Retrieval

Existing semantic retrieval:

query -> embedding -> Pinecone -> conversation IDs -> Firestore -> formatted context

This PR adds fallback retrieval:

query -> tokenize -> score title/overview/transcripts -> ranked conversation IDs -> Firestore -> formatted context

So the system now handles two different search modes:

Semantic search for conceptual similarity
Lexical search for exact entity recall

This is especially useful because “memory search” is not only semantic. Users frequently remember exact words, names, tools, and acronyms.
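The two paths can be pictured as a small dispatch step. The sketch below is an assumption about the control flow, not the PR's implementation: `retrieve_ids` is a hypothetical name, and the two search backends are passed in as plain callables so the example stays self-contained.

```python
from typing import Callable, List, Tuple


def retrieve_ids(
    query: str,
    vector_search: Callable[[str], List[str]],
    lexical_search: Callable[[str], List[str]],
) -> Tuple[List[str], str]:
    """Try semantic search first; use lexical search only on a miss."""
    ids = vector_search(query)
    if ids:
        return ids, "semantic"
    # Vector search returned nothing, so fall back to lexical scoring.
    ids = lexical_search(query)
    if ids:
        return ids, "lexical"
    return [], "none"


# Usage with stand-in backends:
ids, mode = retrieve_ids(
    "database migration",
    vector_search=lambda q: [],          # simulated vector miss
    lexical_search=lambda q: ["conv-1"],  # simulated lexical hit
)
```

Because the lexical callable is only invoked when the vector result is empty, the semantic path behaves exactly as before whenever it returns IDs.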

Tests

Added unit coverage for:

Existing vector search behavior
Lexical ranking prioritizing title/overview matches over transcript-only matches
Lexical fallback when vector search returns no results
Transcript-text lexical matching
Locked conversation filtering in fallback path

Tested locally:

python3 -m pytest tests/unit/test_tools_router.py -q -k SearchConversationsText

result:

Future Scope

This is a small first step. Some possible follow-ups:

Hybrid ranking instead of fallback-only
Right now lexical retrieval only runs when vector search misses. A stronger version would run both semantic and lexical retrieval, then merge results with Reciprocal Rank Fusion.
True BM25
The current scorer is intentionally simple. A future PR could add real BM25 scoring over title, overview, and transcript text.
Entity-aware search
Omi already extracts people, topics, entities, and metadata in parts of the pipeline. Future retrieval could boost exact matches on entities and people.
Better candidate source
The fallback currently depends on get_conversations returning enough candidate text. A dedicated lightweight searchable index would make this faster and more complete.
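For the hybrid-ranking follow-up, Reciprocal Rank Fusion could look roughly like the sketch below. Nothing here is part of this PR: the function name is hypothetical, and `k=60` is a commonly used default for RRF rather than a value taken from the source.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists; each hit adds 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


semantic_ids = ["a", "b", "c"]
lexical_ids = ["b", "d"]
merged = reciprocal_rank_fusion([semantic_ids, lexical_ids])
```

In this example `b` appears in both lists, so it moves to the front of the merged ranking even though it was not first in the semantic list.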

greptile-apps · 2026-05-21T04:27:44Z

Greptile Summary

This PR adds a lightweight lexical fallback to search_conversations_text that activates when Pinecone vector search returns no results, scoring conversations across title, overview, and transcript text using a simple token + phrase-match heuristic.

The existing vector search path is unchanged; lexical scoring only runs on a miss and considers up to 200 most-recent conversations fetched from Firestore.
Three new helper functions (_tokenize_for_lexical_search, _score_conversation_lexically, _rank_conversations_lexically) implement the scoring logic, and four new unit tests cover the main fallback scenarios.

Confidence Score: 4/5

Safe to merge; the existing vector search path is untouched and the new lexical branch only fires on a miss.

The core fallback logic is correct and well-tested. The main friction points are a redundant Firestore re-fetch in the lexical branch, a 200-conversation candidate cap that silently limits recall for users with large histories, and a large commented-out block of the old implementation. None of these affect correctness in normal operation.

backend/utils/retrieval/tool_services/conversations.py — the lexical fallback section (lines ~297–319) has the redundant DB fetch and dead-code comment block worth cleaning up before merge.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/utils/retrieval/tool_services/conversations.py | Adds lexical fallback for conversation search: new helper functions for tokenizing/scoring text, `_rank_conversations_lexically`, and a fallback branch in `search_conversations_text` that fires when vector search returns no IDs. Contains a redundant Firestore re-fetch in the fallback path, a large commented-out dead-code block, and a silent recall cap at 200 candidates. |
| backend/tests/unit/test_tools_router.py | Adds four new unit tests covering: title/overview score priority, fallback activation on empty vector results, transcript-text matching, and locked-conversation filtering. Setup properly resets all mocks including the new `get_conversations` mock. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[search_conversations_text called] --> B[Parse date filters]
    B --> C[Call vector_db.query_vectors]
    C --> D{conversation_ids empty?}
    D -- No --> G[retrieval_mode = semantic]
    D -- Yes --> E[Call get_conversations\nlimit=200, start_dt/end_dt]
    E --> F[_rank_conversations_lexically\nscores title x6 overview x4 transcript x1.5\nphrase boost +12/+8/+4]
    F --> H{ranked IDs empty?}
    H -- Yes --> I[Return: No conversations found]
    H -- No --> J[retrieval_mode = lexical]
    G --> K[Call get_conversations_by_id]
    J --> K
    K --> L[Filter locked conversations]
    L --> M[Sort by retrieval rank]
    M --> N[Deserialize + format]
    N --> O[Return: Found N conversations via MODE retrieval]

Comments Outside Diff (1)

backend/utils/retrieval/tool_services/conversations.py, lines 297-319

Redundant Firestore fetch in lexical fallback path

candidate_conversations already contains the full conversation documents fetched from Firestore (title, overview, transcript segments). After _rank_conversations_lexically selects the top-N IDs from that same slice, the code immediately calls get_conversations_by_id to re-fetch the exact same documents by those IDs. This doubles the Firestore read in the fallback path, adding latency and cost. In the lexical branch, a simple lookup dict from candidate_conversations would skip the second round-trip entirely.
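A minimal sketch of the suggested lookup, assuming each candidate document is a dict with an `"id"` key (the actual document shape is not shown in this review, and `select_ranked_candidates` is a hypothetical name):

```python
def select_ranked_candidates(candidate_conversations, ranked_ids):
    """Pick already-fetched documents in ranked order, skipping unknown IDs."""
    by_id = {conv["id"]: conv for conv in candidate_conversations}
    return [by_id[cid] for cid in ranked_ids if cid in by_id]


candidates = [
    {"id": "c1", "structured": {"title": "Standup"}},
    {"id": "c2", "structured": {"title": "Database migration"}},
    {"id": "c3", "structured": {"title": "Lunch"}},
]
selected = select_ranked_candidates(candidates, ["c2", "c1"])
```

This keeps the ranked order and avoids a second read for documents that are already in memory; the semantic path would still need its by-ID fetch.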

Reviews (1): Last reviewed commit: "Add hybrid semantic + lexical conversati..."

greptile-apps · 2026-05-21T04:27:48Z

+        # Existing vector-only implementation:
+        # conversation_ids = vector_db.query_vectors(query=query, uid=uid, starts_at=starts_at, ends_at=ends_at, k=limit)
+        #
+        # if not conversation_ids:
+        #     date_info = ""
+        #     if starts_at and ends_at:
+        #         date_info = " in the specified date range"
+        #     elif starts_at:
+        #         date_info = " after the specified start date"
+        #     elif ends_at:
+        #         date_info = " before the specified end date"
+        #     return f"No conversations found matching '{query}'{date_info}."
+        #
+        # conversations_data = conversations_db.get_conversations_by_id(uid, conversation_ids)


Dead commented-out code block

The old vector-only implementation is left as a 14-line comment block. This is pure noise, since the git history already preserves the original logic. The comment also risks confusion: a future reader may assume the block is meant to be re-enabled, or may accidentally uncomment it.

greptile-apps · 2026-05-21T04:27:52Z


 logger = logging.getLogger(__name__)

+LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200


Silent recall cap for users with large conversation histories

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200 means the fallback searches only the 200 most recently created conversations. A user whose relevant conversation falls outside that window will silently get "No conversations found" even though lexical scoring would have ranked it. Consider logging when len(candidate_conversations) == LEXICAL_FALLBACK_CANDIDATE_LIMIT to make this ceiling observable in production.
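One way the suggested logging could look, as a sketch only: `note_candidate_cap` is a hypothetical helper, and the message wording and log level are assumptions.

```python
import logging

logger = logging.getLogger(__name__)

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200


def note_candidate_cap(candidate_conversations):
    """Warn when the candidate fetch filled the cap; return whether it did."""
    hit_cap = len(candidate_conversations) >= LEXICAL_FALLBACK_CANDIDATE_LIMIT
    if hit_cap:
        # A full page suggests older conversations may exist but went unsearched.
        logger.warning(
            "Lexical fallback reached candidate cap of %d; "
            "older conversations were not searched",
            LEXICAL_FALLBACK_CANDIDATE_LIMIT,
        )
    return hit_cap
```

A full page does not prove that older conversations exist, only that they might, so a warning or a metric counter is probably more appropriate than an error.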

krushnarout · 2026-05-28T17:48:29Z

@RatnamOjha please check greptile review comments

kodjima33

lexical fallback for exact-entity recall is a sensible addition, thanks

Add hybrid semantic + lexical conversation retrieval

f9927c3

greptile-apps Bot reviewed May 21, 2026

View reviewed changes

Format tools router tests

87fda67

kodjima33 approved these changes May 29, 2026

View reviewed changes

RatnamOjha wants to merge 2 commits into BasedHardware:main from RatnamOjha:hybrid-retrieval-evals

3 participants

