Add lexical fallback for conversation search #7421

Open
RatnamOjha wants to merge 2 commits into BasedHardware:main from RatnamOjha:hybrid-retrieval-evals

Conversation

@RatnamOjha

Summary

This PR adds a lightweight vectorless lexical fallback to conversation search.

Today, search_conversations_text primarily depends on semantic vector search through Pinecone. That works well for broad conceptual queries, but it can miss exact-match user queries like names, acronyms, company names, project names, tools, and short phrases.

This change keeps the existing vector search path intact, but when vector search returns no conversation IDs, it falls back to a small in-repo lexical scorer over conversation title, overview, and transcript text.

Thinking Behind The Logic

Omi users often ask memory-style questions that are not purely semantic:

  • “When did I talk about that internship?”
  • “What did we discuss about the new project?”
  • “Find my conversation about the database migration”
  • “When did I mention the conference?”

These queries hinge on specific terms (internship, database migration, conference). Vector search captures semantic similarity, but exact tokens and acronyms are sometimes better handled by lexical retrieval.

The fallback scores candidate conversations across three fields:

  • structured.title
  • structured.overview
  • transcript_segments[].text

Matches in title and overview are weighted higher than transcript matches because they are more likely to represent the central topic of the conversation. Phrase matches get an additional boost for multi-word entities.
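A minimal sketch of this kind of field-weighted scoring, for illustration only. The weights and phrase boosts are the ones reported in the review flowchart (title x6, overview x4, transcript x1.5; boosts +12/+8/+4); the function names, the tokenizer regex, and the exact way the pieces combine are assumptions, not the PR's actual helpers.

```python
import re

# Weights and boosts as reported in the review; how they combine is assumed.
FIELD_WEIGHTS = {"title": 6.0, "overview": 4.0, "transcript": 1.5}
PHRASE_BOOSTS = {"title": 12.0, "overview": 8.0, "transcript": 4.0}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+", (text or "").lower())


def score_conversation(query, conversation):
    """Score one conversation dict against the query; 0.0 means no match."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    phrase = " ".join(query_tokens)
    structured = conversation.get("structured") or {}
    segments = conversation.get("transcript_segments") or []
    fields = {
        "title": structured.get("title"),
        "overview": structured.get("overview"),
        "transcript": " ".join(seg.get("text") or "" for seg in segments),
    }
    score = 0.0
    for name, text in fields.items():
        field_tokens = tokenize(text)
        token_set = set(field_tokens)
        # Count each distinct query token at most once per field.
        matched = sum(1 for tok in set(query_tokens) if tok in token_set)
        score += FIELD_WEIGHTS[name] * matched
        # Phrase boost: multi-word queries that appear as a contiguous run.
        # Padding with spaces avoids matching inside a longer token.
        if len(query_tokens) > 1 and f" {phrase} " in f" {' '.join(field_tokens)} ":
            score += PHRASE_BOOSTS[name]
    return score
```

Under these assumptions, a two-word query that appears verbatim in a title outscores the same phrase appearing only in a transcript, which matches the ranking behavior described above.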

The goal is not to replace semantic retrieval. The goal is to make retrieval more robust when semantic search misses or when Pinecone is unavailable.

Why This Is Beneficial

1. Better exact recall

Personal memory queries often involve exact names, tools, people, products, and organizations. Lexical fallback improves recall for these cases.

2. More graceful failure mode

If Pinecone returns no matches, the user currently gets “No conversations found.” With this fallback, Omi can still search recent/date-filtered conversation candidates using local structured text.

3. No new dependency

The implementation is intentionally lightweight and dependency-free. It does not add a BM25 package or introduce new infra.

4. Keeps existing behavior safe

The existing vector retrieval path remains unchanged when vector results are available. The fallback only activates on vector misses.

5. Better local/dev/self-hosted experience

Vector search requires configured embedding + Pinecone infra. A vectorless retrieval path makes conversation search more useful in constrained local or self-hosted environments.

How This Is Different From Existing Retrieval

Existing semantic retrieval:

query -> embedding -> Pinecone -> conversation IDs -> Firestore -> formatted context

This PR adds fallback retrieval:

query -> tokenize -> score title/overview/transcripts -> ranked conversation IDs -> Firestore -> formatted context

So the system now handles two different search modes:

  • Semantic search for conceptual similarity
  • Lexical search for exact entity recall

This is especially useful because “memory search” is not only semantic. Users frequently remember exact words, names, tools, and acronyms.
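The two paths can be sketched as a single control flow. This is a simplified illustration, not the code in `search_conversations_text`: the three callables passed in (`vector_search`, `fetch_candidates`, `rank_lexically`) are hypothetical stand-ins for the Pinecone query, the Firestore candidate fetch, and the lexical ranker.

```python
def search_with_fallback(query, vector_search, fetch_candidates, rank_lexically, limit=5):
    """Return (conversation_ids, retrieval_mode).

    All three callables are hypothetical stand-ins injected by the caller.
    """
    # Semantic path: unchanged whenever vector search returns anything.
    ids = vector_search(query, limit)
    if ids:
        return ids, "semantic"

    # Lexical path: only reached on a vector miss.
    candidates = fetch_candidates()
    ids = rank_lexically(query, candidates, limit)
    if ids:
        return ids, "lexical"

    # Both paths came back empty.
    return [], "none"
```

Injecting the dependencies keeps the sketch testable without Pinecone or Firestore, which is also roughly how the unit tests described below can exercise the fallback with mocks.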

Tests

Added unit coverage for:

  • Existing vector search behavior
  • Lexical ranking prioritizing title/overview matches over transcript-only matches
  • Lexical fallback when vector search returns no results
  • Transcript-text lexical matching
  • Locked conversation filtering in fallback path

Tested locally:

python3 -m pytest tests/unit/test_tools_router.py -q -k SearchConversationsText

Result: [screenshot: PHOTO-2026-05-21-09-49-10]

Future Scope

This is a small first step. Some possible follow-ups:

  1. Hybrid ranking instead of fallback-only
    Right now lexical retrieval only runs when vector search misses. A stronger version would run both semantic and lexical retrieval, then merge results with Reciprocal Rank Fusion.

  2. True BM25
    The current scorer is intentionally simple. A future PR could add real BM25 scoring over title, overview, and transcript text.

  3. Entity-aware search
    Omi already extracts people, topics, entities, and metadata in parts of the pipeline. Future retrieval could boost exact matches on entities and people.

  4. Better candidate source
    The fallback currently depends on get_conversations returning enough candidate text. A dedicated lightweight searchable index would make this faster and more complete.
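For the first follow-up, Reciprocal Rank Fusion is small enough to sketch here. This is not part of the PR; the function name is hypothetical, and `k=60` is the commonly used default constant rather than a value tuned for Omi.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists into one list, best first.

    Each ID receives 1 / (k + rank) from every list it appears in,
    with ranks starting at 1. Ties are broken by ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_ids = ["a", "b", "c"]  # illustrative vector-search ranking
lexical_ids = ["b", "d"]        # illustrative lexical ranking
fused = reciprocal_rank_fusion([semantic_ids, lexical_ids])
```

Because RRF uses only rank positions, it would not require the vector similarity scores and the lexical scores to be on comparable scales.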

@greptile-apps
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR adds a lightweight lexical fallback to search_conversations_text that activates when Pinecone vector search returns no results, scoring conversations across title, overview, and transcript text using a simple token + phrase-match heuristic.

  • The existing vector search path is unchanged; lexical scoring only runs on a miss and considers up to 200 most-recent conversations fetched from Firestore.
  • Three new helper functions (_tokenize_for_lexical_search, _score_conversation_lexically, _rank_conversations_lexically) implement the scoring logic, and four new unit tests cover the main fallback scenarios.

Confidence Score: 4/5

Safe to merge; the existing vector search path is untouched and the new lexical branch only fires on a miss.

The core fallback logic is correct and well-tested. The main friction points are a redundant Firestore re-fetch in the lexical branch, a 200-conversation candidate cap that silently limits recall for users with large histories, and a large commented-out block of the old implementation. None of these affect correctness in normal operation.

backend/utils/retrieval/tool_services/conversations.py — the lexical fallback section (lines ~297–319) has the redundant DB fetch and dead-code comment block worth cleaning up before merge.

Important Files Changed

| Filename | Overview |
|---|---|
| backend/utils/retrieval/tool_services/conversations.py | Adds lexical fallback for conversation search: new helper functions for tokenizing/scoring text, _rank_conversations_lexically, and a fallback branch in search_conversations_text that fires when vector search returns no IDs. Contains a redundant Firestore re-fetch in the fallback path, a large commented-out dead-code block, and a silent recall cap at 200 candidates. |
| backend/tests/unit/test_tools_router.py | Adds four new unit tests covering: title/overview score priority, fallback activation on empty vector results, transcript-text matching, and locked-conversation filtering. Setup properly resets all mocks including the new get_conversations mock. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[search_conversations_text called] --> B[Parse date filters]
    B --> C[Call vector_db.query_vectors]
    C --> D{conversation_ids empty?}
    D -- No --> G[retrieval_mode = semantic]
    D -- Yes --> E[Call get_conversations\nlimit=200, start_dt/end_dt]
    E --> F[_rank_conversations_lexically\nscores title x6 overview x4 transcript x1.5\nphrase boost +12/+8/+4]
    F --> H{ranked IDs empty?}
    H -- Yes --> I[Return: No conversations found]
    H -- No --> J[retrieval_mode = lexical]
    G --> K[Call get_conversations_by_id]
    J --> K
    K --> L[Filter locked conversations]
    L --> M[Sort by retrieval rank]
    M --> N[Deserialize + format]
    N --> O[Return: Found N conversations via MODE retrieval]

Comments Outside Diff (1)

  1. backend/utils/retrieval/tool_services/conversations.py, line 297-319 (link)

    P2 Redundant Firestore fetch in lexical fallback path

    candidate_conversations already contains the full conversation documents fetched from Firestore (title, overview, transcript segments). After _rank_conversations_lexically selects the top-N IDs from that same slice, the code immediately calls get_conversations_by_id to re-fetch the exact same documents by those IDs. This doubles the Firestore read in the fallback path, adding latency and cost. In the lexical branch, a simple lookup dict from candidate_conversations would skip the second round-trip entirely.
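A sketch of the suggested lookup, assuming each candidate is a dict with an `id` key (an assumption about the document shape; the helper name is hypothetical as well):

```python
def select_ranked_conversations(candidate_conversations, ranked_ids):
    """Return the already-fetched documents in ranked order, with no second read.

    Assumes every candidate dict carries its conversation ID under "id".
    """
    by_id = {conv["id"]: conv for conv in candidate_conversations}
    # Preserve the ranking order; skip IDs that are not among the candidates.
    return [by_id[cid] for cid in ranked_ids if cid in by_id]
```

This would only apply to the lexical branch. The semantic branch starts from bare IDs returned by the vector store, so it would still need get_conversations_by_id.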

Reviews (1): Last reviewed commit: "Add hybrid semantic + lexical conversati..."

Comment on lines +276 to +289
# Existing vector-only implementation:
# conversation_ids = vector_db.query_vectors(query=query, uid=uid, starts_at=starts_at, ends_at=ends_at, k=limit)
#
# if not conversation_ids:
#     date_info = ""
#     if starts_at and ends_at:
#         date_info = " in the specified date range"
#     elif starts_at:
#         date_info = " after the specified start date"
#     elif ends_at:
#         date_info = " before the specified end date"
#     return f"No conversations found matching '{query}'{date_info}."
#
# conversations_data = conversations_db.get_conversations_by_id(uid, conversation_ids)

P2 Dead commented-out code block

The old vector-only implementation is left as a 14-line comment block. This is pure noise — the git history already preserves the original logic. The comment also risks confusion: a future reader may assume the block is meant to be re-enabled or may accidentally un-comment it.


logger = logging.getLogger(__name__)

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200

P2 Silent recall cap for users with large conversation histories

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200 means the fallback searches only the 200 most-recently-created conversations. A user who had a relevant conversation #201 or older will silently get "No conversations found" even though lexical scoring would have ranked it. Consider logging when len(candidate_conversations) == LEXICAL_FALLBACK_CANDIDATE_LIMIT to make this ceiling observable in production.
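One way the suggested logging could look, as a sketch. The helper name, log level, and message wording are assumptions; only the constant and the `len(...)` comparison come from the review comment.

```python
import logging

logger = logging.getLogger(__name__)

LEXICAL_FALLBACK_CANDIDATE_LIMIT = 200


def candidate_limit_reached(candidate_conversations, uid):
    """Log and return True when the fallback fetched a full page of candidates.

    A full page means older conversations may exist that were never scored.
    """
    hit = len(candidate_conversations) >= LEXICAL_FALLBACK_CANDIDATE_LIMIT
    if hit:
        logger.warning(
            "Lexical fallback reached the candidate cap of %d for uid=%s; "
            "older conversations were not searched",
            LEXICAL_FALLBACK_CANDIDATE_LIMIT,
            uid,
        )
    return hit
```

Returning the boolean lets the caller also surface the truncation in the tool response if that turns out to be useful, though that is a design choice beyond what the review asks for.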

@krushnarout
Member

@RatnamOjha please check greptile review comments

@kodjima33
Collaborator

lexical fallback for exact-entity recall is a sensible addition, thanks
