Summary
The basic-memory CLI loads the embedding stack (onnxruntime / fastembed) at import time on every invocation — including structured, filter-only search-notes queries that never run a vector search. This makes the Claude Code plugin's SessionStart brief slow enough to blow its timeout on a cold machine, where it then prints "Couldn't read from <project> — it may be misnamed or unreachable."
Measured (plugin 0.3.13 / package 0.21.x, WSL2, fastembed bge-small-en-v1.5, sqlite)
- Warm basic-memory tool search-notes --type task --status active --project X --page-size 5: ~5s
- Same query with --search-type text: ~2.5s
- Same query with BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=false: ~4.8s (barely changes; the cost is a module-level import, not gated by the flag)
On a cold box (model not resident) the first invocation is slower still and exceeds the SessionStart hook's per-query (10s) / total (20s) budget, so all three brief queries return nothing and the hook emits the "couldn't read" line.
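For anyone who wants to reproduce the numbers above, a small timing harness along these lines should be enough. It is only a sketch: the `time_command` helper is a name made up for this issue, and the commented invocation assumes `basic-memory` is on PATH with a project called X.

```python
import os
import subprocess
import sys
import time


def time_command(argv, extra_env=None, timeout=60.0):
    """Run argv once and return the wall-clock time in seconds.

    Hypothetical helper for this issue, not part of basic-memory.
    Output is discarded; only the elapsed time matters here.
    """
    env = dict(os.environ)
    if extra_env:
        env.update(extra_env)
    start = time.perf_counter()
    subprocess.run(
        argv,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
        check=False,  # a non-zero exit still gives a usable timing
    )
    return time.perf_counter() - start


# Baseline: cost of starting a bare interpreter, for comparison.
baseline = time_command([sys.executable, "-c", "pass"])
print(f"bare interpreter: {baseline:.2f}s")

# Example invocations (uncomment on a machine with basic-memory installed):
# query = ["basic-memory", "tool", "search-notes", "--type", "task",
#          "--status", "active", "--project", "X", "--page-size", "5"]
# print(time_command(query))
# print(time_command(query + ["--search-type", "text"]))
# print(time_command(query, {"BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED": "false"}))
```

Comparing each timing against the bare-interpreter baseline separates import overhead from general process startup.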
Relationship to #740 / #828
#740 ("Slow startup time") was closed by #828, which deferred the FastAPI ASGI app import (--help ~3.0s → ~1.7s). #828 explicitly did not defer onnxruntime/fastembed, so search-notes still pays the embedding-init cost. The SessionStart brief queries use only structured filters (--type / --status / --after_date) — they never need embeddings.
Ask
Lazy-import the embedding stack inside the vector/semantic code path (only when a vector or semantic query actually runs), so structured filter-only search-notes skips onnxruntime/fastembed init entirely. Benefits:
- SessionStart brief becomes fast and reliable on cold machines.
- BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=false would actually become fast (currently it isn't, because the import isn't gated by the flag).
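A minimal sketch of the deferred-import pattern being asked for is below. It is not basic-memory's actual code: `search_notes`, `get_embedding_backend`, and the search-type values are hypothetical names, and a stdlib module stands in for fastembed/onnxruntime so the snippet runs anywhere.

```python
import importlib
from functools import lru_cache

# Assumption: in the real package this would name the embedding stack
# (e.g. "fastembed"). A stdlib module stands in so the sketch is runnable.
HEAVY_MODULE = "wave"

# Hypothetical set of search types that actually need embeddings.
EMBEDDING_SEARCH_TYPES = {"vector", "semantic"}


@lru_cache(maxsize=1)
def get_embedding_backend():
    """Import the heavy module on first use and cache the result."""
    return importlib.import_module(HEAVY_MODULE)


def search_notes(filters, query=None, search_type="text"):
    """Hypothetical entry point: only the vector path touches the backend."""
    if query is not None and search_type in EMBEDDING_SEARCH_TYPES:
        backend = get_embedding_backend()  # import cost is paid here, once
        return {"path": "vector", "backend": backend.__name__, "filters": filters}
    # Structured, filter-only queries never trigger the heavy import.
    return {"path": "structured", "backend": None, "filters": filters}


structured = search_notes({"type": "task", "status": "active"})
print(structured["path"], get_embedding_backend.cache_info().currsize)

semantic = search_notes({"type": "task"}, query="cold start", search_type="semantic")
print(semantic["path"], get_embedding_backend.cache_info().currsize)
```

The point of the design is that the import sits behind the function that needs it, so the filter-only path and the disabled-flag path never reach it.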
Environment
- basic-memory package 0.21.x, claude-code plugin 0.3.13
- WSL2 Ubuntu, Python 3.x, fastembed bge-small-en-v1.5, sqlite backend