feat(store): mem_search returns 0 when no observation contains all query tokens #352

@Basparin

📋 Pre-flight Checks

  • I have searched existing issues and this is not a duplicate
  • I understand this issue needs status:approved before a PR can be opened

🔍 Problem Description

mem_search returns 0 results for a multi-token query when no single observation contains every token, even if each token matches some observation on its own. The failure is silent: callers see "no results", conclude the data is missing, and sometimes save duplicates of observations that already exist.

💡 Proposed Solution

Add a match_mode option ("all" | "any") to Store.Search() and propagate it to the mem_search MCP tool. The "any" branch can reuse the join-with-OR logic already present in sanitizeFTSCandidates (internal/store/relations.go), which FindCandidates uses internally for the same kind of relaxed match.

Example usage:

mem_search(query: "auth compliance session", match_mode: "any")
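
To make the proposal concrete, here is a minimal Python sketch of how the two modes could differ when the FTS5 MATCH string is built. The helper name build_fts_query and the quote-stripping step are assumptions for illustration only; this is not the engram code, which is Go and lives in sanitizeFTS / sanitizeFTSCandidates.

```python
def build_fts_query(query: str, match_mode: str = "all") -> str:
    """Hypothetical helper: turn a free-text query into an FTS5 MATCH string."""
    if match_mode not in ("all", "any"):
        raise ValueError(f"unknown match_mode: {match_mode!r}")
    # Assumption: drop embedded double quotes, then wrap each token in quotes
    # so FTS5 treats it as a literal phrase rather than query syntax.
    tokens = [t.replace('"', "") for t in query.split()]
    quoted = ['"' + t + '"' for t in tokens if t]
    # FTS5 reads whitespace between phrases as implicit AND; OR must be explicit.
    separator = " OR " if match_mode == "any" else " "
    return separator.join(quoted)


print(build_fts_query("auth compliance session"))
print(build_fts_query("auth compliance session", match_mode="any"))
```

With "all" the tokens are joined by spaces (implicit AND, the current behavior); with "any" they are joined by an explicit OR.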

📦 Affected Area

Store (database, queries)

🔄 Alternatives Considered

📎 Additional Context

Synthetic repro (fresh database, observations with partial overlap):

mem_save(title: "Auth session middleware", content: "")
mem_save(title: "Compliance audit notes",   content: "session policy")
mem_save(title: "OAuth tokens",              content: "auth and compliance")

mem_search("auth")        -> 2 hits
mem_search("compliance")  -> 2 hits
mem_search("session")     -> 2 hits
mem_search("auth compliance session") -> 0 hits

Each observation contains 2 of the 3 query tokens; the AND query still drops everything because no single observation contains all three. Root cause at internal/store/store.go:5874-5884: sanitizeFTS() wraps each token in quotes joined by spaces, and FTS5 reads spaces between quoted tokens as implicit AND.
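
The hit counts above can be reproduced without a database. The sketch below is a pure-Python model of the AND/OR semantics, not FTS5 itself: tokenize and count_hits are hypothetical names, and the tokenizer (lowercased alphanumeric runs) is only an approximation of what the store's FTS5 tokenizer does.

```python
import re

# The three observations from the repro, as (title, content) pairs.
OBSERVATIONS = [
    ("Auth session middleware", ""),
    ("Compliance audit notes", "session policy"),
    ("OAuth tokens", "auth and compliance"),
]


def tokenize(text):
    """Approximate tokenizer: lowercased runs of letters and digits."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def count_hits(query, match_mode="all"):
    """Count observations matching every query token ("all") or at least one ("any")."""
    wanted = tokenize(query)
    combine = all if match_mode == "all" else any
    hits = 0
    for title, content in OBSERVATIONS:
        have = tokenize(title + " " + content)
        if combine(token in have for token in wanted):
            hits += 1
    return hits


for q in ("auth", "compliance", "session", "auth compliance session"):
    print(q, "->", count_hits(q), "hits")
print("any ->", count_hits("auth compliance session", match_mode="any"), "hits")
```

Under this model each single-token query finds 2 observations, the three-token AND query finds none, and the same query in "any" mode finds all 3.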

Stopgap mitigation (until match_mode lands)

A minimal client-side pre-tool advisory catches the silent collapse before it hits the store. Wired as a Claude Code PreToolUse hook on mcp__plugin_engram_engram__mem_search:

import json
import re
import sys

# The hook reads the pending tool call as JSON from stdin.
data = json.load(sys.stdin)
if data.get("tool_name", "").endswith("__mem_search"):
    q = (data.get("tool_input") or {}).get("query") or ""
    # Rough token count: runs of 2+ letters, digits, underscores, or hyphens.
    tokens = re.findall(r"[A-Za-z0-9_-]{2,}", q)
    if len(tokens) >= 6:  # placeholder threshold; tune locally
        sys.stderr.write(
            f"engram-query: {len(tokens)} tokens; FTS5 AND likely returns 0. "
            "Consider splitting into shorter queries.\n"
        )
sys.exit(0)  # advisory only, never blocks

The hook writes a warning to stderr when a query has enough tokens (6 or more with the placeholder threshold) that a match under AND is unlikely. It never blocks the call. It is useful while the design is being decided, so that silent zero-result searches stop being silent.
