Search Claude Code session history with regex or semantic (vector) search.
| Problem | Solution |
|---|---|
| Claude Code sessions are buried in JSONL files | claude-grep searches them like grep |
| Can't find "that conversation about X" | -s does meaning-based search via embeddings |
| Python startup adds ~200ms latency | Go binary starts in <10ms |
| Search results flood the context window | BM25 compression shows only query-relevant text |
| No structured output for piping | --json outputs clean JSON |
```bash
# From source
go install github.com/evoleinik/claude-grep@latest

# Or build manually
git clone https://github.com/evoleinik/claude-grep.git
cd claude-grep
go build -o claude-grep .
```

Requires ollama with nomic-embed-text:

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull nomic-embed-text
```

```bash
# Regex search (default)
claude-grep "worktree"              # current project, last 7 days
claude-grep -a -d 30 "deploy"       # all projects, last 30 days
claude-grep -p "database"           # your prompts only
claude-grep -r "error"              # AI responses only
claude-grep -C 2 "migration"        # 2 messages context

# Semantic search
claude-grep --index                 # build vector index (run once)
claude-grep -s "that database fix"  # search by meaning
claude-grep -s -C 1 "notification"  # with context

# JSON output
claude-grep --json "test" | jq .    # pipe to jq
claude-grep -s --json "deploy" | jq '.[0].similarity'

# Session list
claude-grep -l "error"              # list sessions, not content

# Index management
claude-grep --index                 # index new/changed files
claude-grep --index --all           # reindex everything
claude-grep --index --status        # show index stats

# Usage telemetry
claude-grep --usage                 # see how agents use the tool
```

| Flag | Description | Default |
|---|---|---|
| `-p` | Search only user prompts | both |
| `-r` | Search only AI responses | both |
| `-a` | Search all projects | current dir |
| `-l` | List sessions only | off |
| `-n N` | Max results | 100 |
| `-d N` | Max age in days | 7 |
| `-H N` | Max age in hours (overrides `-d`) | - |
| `-C N` | Context messages (before + after) | 0 |
| `-B N` | Context messages before | 0 |
| `-A N` | Context messages after | 0 |
| `-s` | Semantic search mode | regex |
| `--json` | JSON output | terminal |
| `--index` | Build/update vector index | - |
| `--status` | Show index stats | - |
| `--all` | Reindex everything | incremental |
| `--usage` | Show usage stats (agent telemetry) | - |
| Code | Meaning |
|---|---|
| 0 | Matches found |
| 1 | No matches |
| 2 | Error |
Regex mode: Walks ~/.claude/projects/, parses JSONL session files, matches text with Go regexp. Pre-filters files with literal substring matching for speed — alternation patterns like (a|b|c) are decomposed into individual literals and checked with OR semantics. Concurrent file processing (8 goroutines).
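The literal pre-filter for a flat alternation might be sketched roughly as follows. This is illustrative only: `prefilterLiterals` and `pass` are hypothetical names, and the real decomposition has to handle nesting and metacharacters inside branches, which this sketch ignores.

```go
package main

import (
	"fmt"
	"strings"
)

// prefilterLiterals decomposes a flat alternation like "(a|b|c)" into its
// branch literals. A file is only regex-searched if it contains at least one.
func prefilterLiterals(pattern string) []string {
	return strings.Split(strings.Trim(pattern, "()"), "|")
}

// pass reports whether content contains any of the literals (OR semantics).
func pass(content string, lits []string) bool {
	for _, l := range lits {
		if strings.Contains(content, l) {
			return true
		}
	}
	return false
}

func main() {
	lits := prefilterLiterals("(deploy|rollback)")
	fmt.Println(pass("we ran a rollback", lits)) // true: regex-search this file
	fmt.Println(pass("lunch notes", lits))       // false: skip without regexing
}
```

Cheap substring checks reject most files before the comparatively expensive regex engine ever runs, which is why the telemetry below tracks `pf_skip` and `pf_pass` counts.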
Semantic mode: Embeds query via ollama (nomic-embed-text, 768 dims), computes cosine similarity against pre-built index (threshold: 0.55). Skips file re-reads when no context is requested (~60x faster). Index stored as gob files in ~/.claude/search-index/.
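The cosine-similarity comparison can be sketched minimally like this (the function name and vectors are illustrative, not the tool's actual code):

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors, as used
// to compare a query embedding against each indexed message embedding.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	q := []float32{1, 0, 1}
	fmt.Printf("%.2f %.2f\n", cosine(q, []float32{1, 0, 1}), cosine(q, []float32{0, 1, 0}))
	// Only messages scoring above the 0.55 threshold are reported as matches.
}
```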
BM25 compression: Terminal output uses Okapi BM25 to extract the most query-relevant chunks from each matched message, instead of blind head truncation. The pipeline:
- Split message into sentences (paragraphs > 200 chars get sentence-split)
- Tokenize with bigrams, stop word filtering, and suffix stemming
- Score each chunk against the query with BM25 (k1=1.2, b=0.75)
- Select top-scoring chunks within an adaptive per-match budget
- Deduplicate identical compressed text across matches
This means "deploy" also matches "deployed", "deploying", "deployment", and multi-word queries like "pip install" boost chunks where the words appear adjacent. Budget adapts: 3 matches get 2000 chars each, 100 matches get 300 chars each (30K total target). JSON output (--json) always preserves full uncompressed text.
Near-miss hints: When a regex search returns zero results, claude-grep extracts the longest literal substring from the pattern and runs a relaxed case-insensitive search. If files contain that literal, it prints a suggestion like near: 3 files contain "deploy" — try: claude-grep "deploy". This helps when a complex pattern (e.g. deploy.*rollback) fails but simpler terms would match.
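The relaxed-fallback literal extraction could be approximated as below. This is a naive sketch: it treats every metacharacter, including `\`, as a separator, so escape sequences like `\d` are not handled correctly, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// longestLiteral returns the longest run of plain (non-metacharacter) text in
// a regex pattern, used as a relaxed search term when the full pattern
// matches nothing.
func longestLiteral(pattern string) string {
	best, cur := "", ""
	for _, r := range pattern {
		if strings.ContainsRune(`.*+?()[]{}|^$\`, r) {
			if len(cur) > len(best) {
				best = cur
			}
			cur = ""
		} else {
			cur += string(r)
		}
	}
	if len(cur) > len(best) {
		best = cur
	}
	return best
}

func main() {
	fmt.Println(longestLiteral("deploy.*roll"))   // deploy
	fmt.Println(longestLiteral("(a|b)migration")) // migration
}
```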
Auto-escalation: When the current project has ≤5 session files, automatically widens to all projects (avoids the common retry pattern of project→all).
Auto-fallback: When regex finds 0 results and Ollama is running, automatically retries with semantic search. Eliminates the manual -s retry loop.
Self-exclusion: Skips the most recently modified session file (within 60s) from results to prevent self-referential matches — the agent searching for X doesn't find itself asking about X.
Short-pattern warning: Patterns with longest literal ≤3 chars emit a stderr hint suggesting semantic search instead.
Regex syntax: Uses Go regexp (ERE-style), not grep BRE. Use `|` not `\|`, `(` not `\(`. BRE escapes are auto-normalized but should be avoided.
Every search emits a structured event to ~/.claude/search-index/usage.jsonl:
{"ts":"...","pattern":"(a|b)","mode":"regex","results":5,"files":500,"ms":1200,"pf_skip":393,"pf_pass":107}| Field | Description |
|---|---|
pf_skip |
Files rejected by pre-filter (didn't contain any literal) |
pf_pass |
Files that passed pre-filter and were regex-searched |
results |
Final match count |
files |
Total files in scope |
Run claude-grep --usage for a 30-day summary including:
- Hit rate and latency
- Empty patterns (improvement candidates)
- Prefilter diagnostics — flags searches where the pre-filter rejected ALL files
- Retry chains — consecutive searches <90s apart (any type), with worst chain, wasted time, scope escalations
- Duplicate searches — same pattern+scope repeated across sessions
- BRE misuse and extra arg warnings
Add to your CLAUDE.md:
```
SESSION HISTORY:
- `claude-grep "pattern"` — regex search session history
- `claude-grep -s "query"` — semantic search by meaning
- `claude-grep --json "pattern" | jq .` — structured output
- `claude-grep --usage` — check search health and hit rate
```

The initial index builds embeddings for all session history. This is slow on CPU (~0.5-1s per message via ollama). A session with 2000 messages takes ~30 minutes on CPU. After the first run, incremental updates only process new/changed files.
```bash
claude-grep --index           # incremental (skips unchanged files)
claude-grep --index --all     # full reindex
claude-grep --index --status  # check progress and stats
```

Set up cron to keep the index fresh. A lockfile prevents concurrent runs — if the previous indexing is still going, the new cron invocation exits immediately.
```bash
(crontab -l; echo '*/30 * * * * $HOME/go/bin/claude-grep --index 2>&1 | logger -t claude-grep') | crontab -
```

- CPU-only: No GPU required, but initial indexing is slow. Budget 1-2 hours for a large history. Subsequent runs are fast (seconds).
- Active sessions: A session's JSONL file is modified on every message, so active sessions get re-indexed on each cron run. This re-embeds the entire file, not just the new messages.
- Disk usage: ~4.5 KB per message (768 float32 dims). 4000 vectors ≈ 17 MB.
- ollama must be running: Indexing and semantic search both call ollama's HTTP API. If ollama is stopped, indexing exits with a clear error.
MIT