CLI eagerly imports embedding stack (onnxruntime/fastembed) for filter-only `search-notes` → slow SessionStart brief #886

## Summary
The `basic-memory` CLI loads the embedding stack (onnxruntime / fastembed) at import time on **every** invocation, including structured, filter-only `search-notes` queries that never run a vector search. On a cold machine this makes the Claude Code plugin's SessionStart brief slow enough to exceed its timeout, at which point it prints *"Couldn't read from `<project>` — it may be misnamed or unreachable."*

## Measured (plugin 0.3.13 / package 0.21.x, WSL2, fastembed bge-small-en-v1.5, sqlite)
- Warm `basic-memory tool search-notes --type task --status active --project X --page-size 5`: **~5s**
- Same query with `--search-type text`: **~2.5s**
- Same query with `BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=false`: **~4.8s** (barely changes: the cost is a module-level import that the flag does not gate)

On a cold box (model not resident) the first invocation is slower still and exceeds the SessionStart hook's per-query (10s) / total (20s) budget, so all three brief queries return nothing and the hook emits the "couldn't read" line.
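One way to confirm that the cost comes from module-level imports is to import the CLI entry module in a fresh interpreter and list what ends up in `sys.modules`. The sketch below does that; the helper names are hypothetical, and the module path `basic_memory.cli.main` in the usage note is an assumption about the package layout, not something verified here.

```python
import subprocess
import sys
from typing import Iterable, List

_SENTINEL = "---MODULES---"

# Runs in a child interpreter so the parent's already-imported modules
# do not pollute the result.
_PROBE = f"""
import importlib, sys
before = set(sys.modules)
importlib.import_module(sys.argv[1])
print({_SENTINEL!r})
print("\\n".join(sorted(set(sys.modules) - before)))
"""


def modules_loaded_by(target: str) -> List[str]:
    """Return the modules newly present in sys.modules after importing `target`."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, target],
        capture_output=True,
        text=True,
        check=True,
    )
    # Ignore anything the target printed while importing.
    _, _, listing = result.stdout.partition(_SENTINEL)
    return listing.split()


def heavy_imports(
    target: str, heavy: Iterable[str] = ("onnxruntime", "fastembed")
) -> List[str]:
    """Return which top-level packages in `heavy` were pulled in by importing `target`."""
    wanted = set(heavy)
    loaded = {name.split(".")[0] for name in modules_loaded_by(target)}
    return sorted(loaded & wanted)
```

If the entry module really is `basic_memory.cli.main` (an assumption), `heavy_imports("basic_memory.cli.main")` returning a non-empty list would show the eager import directly. CPython's `-X importtime` option gives a per-module timing breakdown for the same import.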

## Relationship to #740 / #828
#740 ("Slow startup time") was closed by #828, which deferred the FastAPI ASGI app import (`--help` ~3.0s → ~1.7s). #828 explicitly did **not** defer onnxruntime/fastembed, so `search-notes` still pays the embedding-init cost. The SessionStart brief queries use only structured filters (`--type` / `--status` / `--after_date`) — they never need embeddings.

## Ask
Lazy-import the embedding stack inside the vector/semantic code path (only when a vector or semantic query actually runs), so structured filter-only `search-notes` skips onnxruntime/fastembed init entirely. Benefits:
- SessionStart brief becomes fast and reliable on cold machines.
- `BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=false` would actually become fast (currently it isn't, because the import isn't gated by the flag).
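The request can be sketched as a deferred import that runs only when a query needs embeddings. This is a minimal illustration, not the project's actual code: `SearchQuery`, `needs_embeddings`, `run_search`, and the `"vector"` / `"hybrid"` search-type values are hypothetical names, and the backend module name is a parameter so the pattern can be tried without fastembed installed.

```python
import importlib
import sys
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SearchQuery:
    """Hypothetical stand-in for a parsed `search-notes` request."""
    text: Optional[str] = None
    search_type: str = "text"  # assumed values: "text", "vector", "hybrid"
    filters: Dict[str, str] = field(default_factory=dict)


def needs_embeddings(query: SearchQuery, semantic_enabled: bool = True) -> bool:
    """Only vector-style queries need the embedding stack, and only if enabled."""
    return semantic_enabled and query.search_type in {"vector", "hybrid"}


def load_embedding_backend(module_name: str = "fastembed"):
    """Import the backend on first use; later calls hit the sys.modules cache."""
    return importlib.import_module(module_name)


def backend_is_loaded(module_name: str = "fastembed") -> bool:
    """Report whether the backend module has been imported in this process."""
    return module_name in sys.modules


def run_search(
    query: SearchQuery,
    *,
    semantic_enabled: bool = True,
    backend_module: str = "fastembed",
) -> dict:
    """Run a search, importing the embedding backend only when it is needed."""
    backend = None
    if needs_embeddings(query, semantic_enabled):
        backend = load_embedding_backend(backend_module)
    # A real implementation would execute the structured or vector query here.
    return {"filters": dict(query.filters), "used_embeddings": backend is not None}
```

With this shape, a filter-only query such as `SearchQuery(filters={"type": "task", "status": "active"})` never touches the backend module, and passing `semantic_enabled=False` skips the import even for vector queries, which is the behavior the flag would be expected to give.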

## Environment
- basic-memory package 0.21.x, claude-code plugin 0.3.13
- WSL2 Ubuntu, Python 3.x, fastembed bge-small-en-v1.5, sqlite backend

