
fix(search): empty text/subject terms must not match every message #407

Merged: mariusvniekerk merged 1 commit into kenn-io:main from njt:fix/search-empty-terms on Jun 22, 2026


njt (Contributor) commented Jun 22, 2026

An empty search term produced LIKE '%%', which matched every message instead of nothing. Two cases:

  • Free-text terms on the LIKE fallback path (used when FTS errors at runtime, or when the binary is built without the fts5 tag): every term became LIKE '%term%', even a tokenless one (empty or punctuation-only), so "" produced LIKE '%%' and matched everything. Tokenless terms are now skipped (using the same predicate as the FTS path), and FALSE is substituted when none remain, so the fallback mirrors the FTS path.
  • subject: operator: `subject:` or `subject:""` yielded an empty subject filter that matched every message. The parser now drops empty or whitespace-only subject: values (mirroring the existing label: handlers); `subject:"!!!"` remains a valid literal substring search.
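
For readers without the diff at hand, the fallback behavior described in the first bullet might look roughly like the sketch below. This is not the project's code: `hasToken` is a hypothetical stand-in for the real predicate, `likeFallback` and the `body` column are made-up names, and LIKE wildcard escaping is left out for brevity.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasToken is a hypothetical stand-in for the predicate shared with the FTS
// path: it reports whether s contains at least one letter or digit.
func hasToken(s string) bool {
	for _, r := range s {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return true
		}
	}
	return false
}

// likeFallback builds a WHERE fragment and bind args for free-text terms.
// The column name "body" is an assumption made for illustration.
func likeFallback(terms []string) (string, []any) {
	if len(terms) == 0 {
		return "", nil // no text terms at all: no text filter
	}
	var conds []string
	var args []any
	for _, t := range terms {
		if !hasToken(t) {
			continue // skip empty or punctuation-only terms
		}
		conds = append(conds, "body LIKE ?")
		args = append(args, "%"+t+"%")
	}
	if len(conds) == 0 {
		// Terms were given but none had a token: match nothing, not everything.
		return "FALSE", nil
	}
	return strings.Join(conds, " AND "), args
}

func main() {
	for _, terms := range [][]string{{""}, {"!!!"}, {"invoice", ""}} {
		clause, args := likeFallback(terms)
		fmt.Printf("%q -> %s %v\n", terms, clause, args)
	}
}
```

The point of the `FALSE` branch is that a query made up only of tokenless terms returns zero rows, which is what the FTS path is described as doing.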

🤖 Generated with Claude Code

Two related cases where an empty term produced LIKE '%%' (matches all) instead
of the intended zero/no-op:

- LIKE fallback path (ftsAvailable=false, reached when FTS errors at runtime or
  the binary is built without the fts5 tag): a tokenless TextTerm (empty or
  punctuation-only) was turned into LIKE '%term%'. For "" this matched every
  message. Skip tokenless terms (hasFTSToken, the same predicate the FTS path
  uses) and substitute FALSE when none remain, mirroring the FTS path.

- subject: operator: the parser appended the raw value, so `subject:` or
  `subject:""` yielded SubjectTerms=[""] -> LIKE '%%' -> every message. Drop
  empty/whitespace values in the parser (mirroring the label handlers) and
  guard empty subject terms in the store. Punctuation-only values like
  subject:"!!!" remain valid literal substring searches.
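
The parser-side rule for `subject:` can be illustrated with a minimal sketch. `parseSubject` is a hypothetical helper written for this example, not the project's parser; it assumes the raw value arrives as the text following `subject:`, optionally wrapped in one pair of double quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSubject is a hypothetical helper. It takes the raw text after
// "subject:" and returns the term plus whether the term should be kept.
func parseSubject(raw string) (string, bool) {
	v := strings.TrimSpace(raw)
	// Strip one pair of surrounding double quotes, if present.
	if len(v) >= 2 && strings.HasPrefix(v, `"`) && strings.HasSuffix(v, `"`) {
		v = v[1 : len(v)-1]
	}
	if strings.TrimSpace(v) == "" {
		return "", false // empty or whitespace-only: drop the filter
	}
	// Punctuation-only values are kept as literal substring searches.
	return v, true
}

func main() {
	for _, raw := range []string{``, `""`, `"   "`, `"!!!"`, `invoice`} {
		term, ok := parseSubject(raw)
		fmt.Printf("subject:%s -> term=%q keep=%v\n", raw, term, ok)
	}
}
```

Dropping the value in the parser means no empty entry reaches SubjectTerms at all; the store-side guard mentioned above would then act as a second line of defense.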

Adds an internal store test that forces the no-FTS branch (runs under the fts5
build too) and parser cases for empty/punctuation subject values.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012Ri7QdvXXUMQPke9wrLjSS

roborev-ci Bot commented Jun 22, 2026


roborev: Combined Review (6eff015)

No issues found.


Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 3m30s), codex_security (codex/security, done, 8s) | Total: 3m38s


njt (Contributor, Author) commented Jun 22, 2026

Split out of #398 (Teams ingestion) to keep that PR scoped. Found via a pre-existing failing store test (TestSearchMessagesQuery_TokenlessTextTerms) while validating that work. Independent of #398 — branches off current main.

mariusvniekerk merged commit f9b7ee9 into kenn-io:main on Jun 22, 2026
10 checks passed