Releases: Atrv-Shrn/MemStack
v1.4.4
What's New in v1.4.4
Memory Synthesis
New memstack.intelligence.synthesis module — the LLM now synthesises auto-captured exchanges into concise memories instead of storing raw conversation turns. This reduces noise and keeps the vault focused on what matters.
Background Consolidation
New memstack.intelligence.consolidation module — a background task periodically reviews stored memories, merging duplicates and pruning stale entries. Prevents vault drift over time. Includes fixes for event loop blocking and split data loss.
Version & Docs
Version bumped to 1.4.4 across pyproject.toml and __init__.py. README rewritten in condensed reference format. Dev/reference files excluded from release tracking.
Full Changelog
See CHANGELOG.md for complete details.
v1.4.3
What's New in v1.4.3
Merge Pipeline
The LLM can now choose to merge incoming content into an existing memory, in addition to add, update, or ignore. A merge appends the new content to the existing memory after a \n\n separator and preserves the original ID, importance, and created timestamp. The new VaultStore.merge() method implements the append logic.
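As a rough illustration of the append behavior, the sketch below models a memory as a small dataclass and merges new content into it. The `Memory` class and `merge_memory` function are hypothetical stand-ins, not the actual `VaultStore.merge()` signature; only the `\n\n` separator and the preserved fields come from the release notes.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class Memory:
    """Hypothetical in-memory view of a stored memory."""
    id: str
    content: str
    importance: float
    created: datetime


def merge_memory(existing: Memory, incoming: str) -> Memory:
    """Append incoming content after a blank line.

    Only the content field changes; id, importance, and created
    are carried over from the existing memory.
    """
    return replace(existing, content=existing.content + "\n\n" + incoming)


original = Memory(
    id="likes-tea",
    content="User likes tea.",
    importance=0.7,
    created=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
merged = merge_memory(original, "Prefers green tea in the morning.")
```

Using `dataclasses.replace` on a frozen dataclass makes it hard to change the preserved fields by accident, which is one way to keep the invariant explicit.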
Similarity Thresholds
Lowered similarity_add_threshold (0.3 → 0.25) and similarity_ignore_threshold (0.92 → 0.85) to reduce false deduplication. High-similarity memories now go to the LLM for a decision instead of being auto-ignored by default.
Auto-Ignore Toggle
New similarity_ignore_enabled config field (default: false). When true, restores the old auto-ignore behavior for scores at or above the ignore threshold. When false (default), all scores at or above the add threshold go to the LLM for a decision.
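The routing described above can be sketched as a small decision function. The field names and defaults mirror the config fields named in these notes. The `route_write` function, its return labels, and the rule that scores below the add threshold are added without consulting the LLM are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SimilarityConfig:
    similarity_add_threshold: float = 0.25
    similarity_ignore_threshold: float = 0.85
    similarity_ignore_enabled: bool = False


def route_write(score: float, cfg: SimilarityConfig) -> str:
    """Return 'add', 'ignore', or 'ask_llm' for a similarity score."""
    # Assumption: anything below the add threshold is clearly new.
    if score < cfg.similarity_add_threshold:
        return "add"
    # Auto-ignore applies only when the toggle is switched on.
    if cfg.similarity_ignore_enabled and score >= cfg.similarity_ignore_threshold:
        return "ignore"
    # Everything else at or above the add threshold goes to the LLM.
    return "ask_llm"


default_cfg = SimilarityConfig()
legacy_cfg = SimilarityConfig(similarity_ignore_enabled=True)
```

With the default config, a near-duplicate scoring 0.95 is routed to the LLM; with `similarity_ignore_enabled=True` the same score is ignored outright.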
Auto-Capture Improvements
OpenClaw bridge now captures all messages instead of only the last 4. Tool role messages are included alongside user and assistant. Structured content blocks (tool_use, tool_result, text arrays) are serialized as readable text instead of [object Object]. System messages are correctly excluded.
Release Cleanup
Updated .gitignore to exclude internal dev and continuity files (CONTEXT.md, STATUS.md, progress.md, reference/, .claude/).
Full Changelog
See CHANGELOG.md for complete details.
v1.4.2
What's New in v1.4.2
Caching Layer
Ollama client singleton, LRU embedding cache, FTS rebuild interval, TTL search result cache, and LRU vault read/list cache — 4 new config fields (embedding_cache_size, fts_rebuild_interval, search_cache_ttl, vault_cache_size).
Async Handlers
All REST handlers and MCP tools converted to async def with run_in_executor for blocking sync calls. Importance updates deferred via asyncio.create_task() — responses return immediately.
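The pattern can be sketched with the standard library alone. The handler and helper names below (`search_handler`, `blocking_search`, `bump_importance`) are hypothetical and do not reflect MemStack's actual functions. The sketch only shows a blocking call moved to the default executor and a follow-up update deferred with `asyncio.create_task()`.

```python
import asyncio
import time


def blocking_search(query: str) -> list:
    """Stand-in for a synchronous search call."""
    time.sleep(0.05)
    return [f"memory matching {query!r}"]


async def bump_importance(ids: list, log: list) -> None:
    """Stand-in for the deferred importance update."""
    await asyncio.sleep(0.01)
    log.extend(ids)


async def search_handler(query: str, log: list, background: set) -> list:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop stays free.
    results = await loop.run_in_executor(None, blocking_search, query)
    # Schedule the update without awaiting it; keep a reference so the
    # task is not garbage-collected before it finishes.
    task = asyncio.create_task(bump_importance(results, log))
    background.add(task)
    task.add_done_callback(background.discard)
    return results


async def main():
    log, background = [], set()
    results = await search_handler("tea", log, background)
    deferred = log == []  # the update has not run yet when the handler returns
    await asyncio.gather(*background)
    return results, deferred, log


results, deferred, log = asyncio.run(main())
```

Holding task references in a set follows the guidance in the asyncio documentation, since the event loop itself keeps only weak references to tasks.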
System Prompt Fix
Sections 0, 1, and 6 rewritten with factual tone. Section 6 now states memories are injected automatically instead of instructing models to call memory_inject before every response.
OpenClaw Bridge Cache
TTL cache (30s) on before_prompt_build hook keyed by agent ID + first 500 chars of prompt — identical prompts within the window skip the MemStack HTTP call.
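The bridge itself is a TypeScript plugin, so the following Python sketch illustrates only the caching idea and is not the bridge code. The class name `PromptTTLCache`, the `fetch` callback, and the injectable clock are assumptions. The 30-second window and the key (agent ID plus the first 500 characters of the prompt) come from the notes above.

```python
import time
from typing import Callable, Dict, Tuple


class PromptTTLCache:
    """Cache results per (agent_id, prompt prefix) for a fixed time window."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get_or_fetch(self, agent_id: str, prompt: str,
                     fetch: Callable[[str, str], str]) -> str:
        key = (agent_id, prompt[:500])
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # fresh hit: skip the HTTP call
        value = fetch(agent_id, prompt)
        self._entries[key] = (now, value)
        return value


calls = []
fake_now = [0.0]


def fake_fetch(agent_id: str, prompt: str) -> str:
    calls.append((agent_id, prompt))
    return "injected memories"


cache = PromptTTLCache(ttl_seconds=30.0, clock=lambda: fake_now[0])
cache.get_or_fetch("agent-a", "hello", fake_fetch)
cache.get_or_fetch("agent-a", "hello", fake_fetch)   # served from cache
fake_now[0] = 31.0
cache.get_or_fetch("agent-a", "hello", fake_fetch)   # expired, fetched again
```

Keying on a prompt prefix means two prompts that differ only after character 500 share a cache entry, a trade-off the notes accept in exchange for a short, cheap key.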
Bug Fixes
- Shared memory importance updates now skip `vault.update_importance` for reserved `"shared"` agent_id
- Vault cache thread safety via `threading.Lock`
Full Changelog
See CHANGELOG.md for complete details.
v1.4.1
What's New in v1.4.1
MCP Standalone Server
MCP now runs as a separate process on port 7778 (MEMSTACK_MCP_PORT) instead of being mounted inside the REST API. The CLI manages both processes automatically.
System Prompt Endpoint
GET /agents/{agent_id}/system-prompt returns the behavioral mandate and 7-tool reference. A matching memory_get_system_prompt MCP tool provides identical output.
OpenClaw Bridge
TypeScript plugin with auto-recall, auto-capture, and native memory blocking hooks. Includes a step-by-step connection guide (docs/connect-openclaw.md).
Bug Fixes
- MCP subprocess now receives `--host` and `--port` flags from CLI daemon launcher
- Undefined `agentId` guards in bridge hooks prevent "undefined" namespace pollution
Full Changelog
See CHANGELOG.md for complete details.
v1.4.0
What's New in v1.4.0
Shared Mode
Private-first write routing with shared copies. When MEMSTACK_SHARED_MODE=true, private writes are also copied to a shared directory accessible to all agents. Reserved shared agent_id is rejected in validation. File-level threading locks for concurrent write safety.
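A minimal sketch of private-first routing follows, under several assumptions: the `vault/<agent_id>/<slug>.md` layout, the `write_memory` helper, and the lock registry are all hypothetical and chosen only to show the order of operations (validate, write private, then copy to shared under a per-file lock).

```python
import tempfile
import threading
from pathlib import Path

RESERVED_AGENT_ID = "shared"

_registry_lock = threading.Lock()
_file_locks = {}  # maps a file path to its own threading.Lock


def _lock_for(path: Path) -> threading.Lock:
    """Return the lock for a path, creating it once under a registry lock."""
    with _registry_lock:
        return _file_locks.setdefault(path, threading.Lock())


def _write_locked(path: Path, content: str) -> None:
    with _lock_for(path):
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")


def write_memory(vault: Path, agent_id: str, slug: str,
                 content: str, shared_mode: bool) -> list:
    """Write the private copy first, then the shared copy if enabled."""
    if agent_id == RESERVED_AGENT_ID:
        raise ValueError("'shared' is a reserved agent_id")
    private_path = vault / agent_id / f"{slug}.md"
    _write_locked(private_path, content)
    written = [private_path]
    if shared_mode:
        shared_path = vault / RESERVED_AGENT_ID / f"{slug}.md"
        _write_locked(shared_path, content)
        written.append(shared_path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    paths = write_memory(Path(tmp), "agent-a", "likes-tea",
                         "User likes tea.", shared_mode=True)
    relative = [p.relative_to(tmp).as_posix() for p in paths]
```

Rejecting the reserved ID before any write keeps an agent from writing directly into the shared namespace.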
Inject Endpoint
GET /agents/{agent_id}/inject?q=&limit= and GET /shared/inject?q=&limit= — auto-injection that searches for relevant memories and returns top-N results above injection_min_score. MCP memory_inject tool with the same logic. Merges agent and shared namespace results when shared mode is enabled.
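The selection step might look roughly like the sketch below. The `Hit` type and `select_for_injection` function are hypothetical. Treating the minimum score as inclusive and de-duplicating by memory ID (keeping the higher score) are assumptions, since the notes do not spell out either detail. The defaults match `MEMSTACK_INJECTION_MIN_SCORE` and `MEMSTACK_INJECTION_TOP_N`.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    memory_id: str
    score: float


def select_for_injection(agent_hits: List[Hit], shared_hits: List[Hit],
                         min_score: float = 0.3, top_n: int = 5,
                         shared_mode: bool = False) -> List[Hit]:
    """Merge namespaces, drop low scores, and return the top-N hits."""
    candidates = list(agent_hits)
    if shared_mode:
        candidates.extend(shared_hits)
    best: Dict[str, Hit] = {}
    for hit in candidates:
        if hit.score < min_score:
            continue
        # Assumption: the same memory may appear in both namespaces.
        current = best.get(hit.memory_id)
        if current is None or hit.score > current.score:
            best[hit.memory_id] = hit
    ranked = sorted(best.values(), key=lambda h: (-h.score, h.memory_id))
    return ranked[:top_n]


agent = [Hit("a", 0.9), Hit("b", 0.2)]
shared = [Hit("a", 0.8), Hit("c", 0.5)]
merged_hits = select_for_injection(agent, shared, shared_mode=True)
private_hits = select_for_injection(agent, shared, shared_mode=False)
```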
Bug Fixes
- Slug truncation to 80 chars prevents `OSError` on long content
- MCP path doubling fix (`/mcp/mcp` → `/mcp`)
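To illustrate the slug fix: most filesystems cap a single file name at around 255 bytes, so a slug built from long content can fail at write time. The `slugify` function below is a hypothetical sketch, not MemStack's slug algorithm; only the 80-character limit comes from the notes.

```python
import re

MAX_SLUG_LEN = 80


def slugify(content: str, max_len: int = MAX_SLUG_LEN) -> str:
    """Lowercase, replace runs of non-alphanumerics with '-', then truncate."""
    slug = re.sub(r"[^a-z0-9]+", "-", content.lower()).strip("-")
    # Truncate first, then trim a trailing hyphen left by the cut.
    return slug[:max_len].rstrip("-")


short_slug = slugify("Hello, World!")
long_slug = slugify("word " * 200)
```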
Evals
v1.4 retrieval evaluation — P@1: 0.870, MRR@10: 0.924, R@5: 1.0
New Config
- `MEMSTACK_INJECTION_MIN_SCORE` (default 0.3)
- `MEMSTACK_INJECTION_TOP_N` (default 5)
- `MEMSTACK_SHARED_MODE` (default false)
Full Changelog: v1.3.0...v1.4.0
v1.3.0
MCP server, file watcher, search fixes, Windows daemon
What's new
- FastMCP server — 5 tools (`memory_write`, `memory_search`, `memory_read`, `memory_delete`, `memory_list`) mounted as sub-app at `/mcp` within the FastAPI process. Input validation for path traversal and invalid identifiers. Search results capped at 100.
- File watcher — real-time vault change detection via `watchfiles`. Incremental `scan_vault()` on startup with mtime-based scanning. Self-write suppression so vault writes don't trigger re-indexing loops.
- Smart write pipeline — LLM consultation for ambiguous similarity zones via local Ollama.
- Importance decay & hit increment — lazy importance updates at retrieval time.
- `system_prompt_kit.md` — agent-facing system prompt with behavioral rules and MCP tool reference.
- 4 new config fields — `WATCHER_ENABLED`, `WATCHER_DEBOUNCE_MS`, `MCP_ENABLED`, `MCP_PATH`
Evaluation results
| Metric | v1.2 | v1.3 | Change |
|---|---|---|---|
| P@1 | 0.6957 | 0.6522 | -6% |
| R@5 | 1.0 | 1.0 | = |
| MRR@10 | 0.8152 | 0.7877 | -3% |
| Latency@10 | 18561ms | 18843ms | +2% |
Precision and MRR dipped slightly due to MCP endpoint routing changes and watcher startup overhead. Latency essentially unchanged.
Bug fixes
- CLI crash when state file has `last_indexed` but no `pid` key
- `SearchIndex.add()` silently returned `None` on failure — now returns `bool`
- `scan_vault()` skipped failed files permanently — now retries on next startup
- Windows daemon PID liveness checks used `os.kill` (broken with PID reuse) — replaced with Win32 API via `ctypes`
- Windows daemon missing `CREATE_NO_WINDOW` flag and `stdin=DEVNULL`
- `.env.example` merge conflict residue
Full Changelog: v1.2.0...v1.3.0
v1.2.0
Smart write, importance scoring, evals
What's new
- Smart write pipeline — every memory write passes through embed → find_similar → threshold check → LLM consultation → execute decision. Three outcomes: add (new memory), update (replace similar memory preserving importance), ignore (duplicate).
- Importance scoring — memories have importance scores that decay over time (half-life default 7 days) and get bumped on retrieval (default +0.05 per hit). Decay is lazy, computed only at retrieval time.
- LLM consultation — when similarity is ambiguous (between add and ignore thresholds), a local Ollama model decides whether to add, update, or ignore. Falls back to "add" if Ollama is unreachable.
- 6 new config fields — `SIMILARITY_ADD_THRESHOLD`, `SIMILARITY_IGNORE_THRESHOLD`, `IMPORTANCE_DECAY_HALFLIFE`, `IMPORTANCE_HIT_INCREMENT`, `LLM_MODEL`, `LLM_HOST`
- MemoryWriteResponse — write endpoint returns `{decision, id, similarity_score}` with status 201 (added) or 200 (updated/ignored)
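The importance scoring described in the list above can be sketched numerically. A half-life implies exponential decay, so the sketch assumes the form importance × 0.5^(age / half-life); the exact formula, the function names, and measuring age in days since the last update are assumptions. The 7-day half-life and +0.05 hit increment are the stated defaults.

```python
def decayed_importance(importance: float, age_days: float,
                       halflife_days: float = 7.0) -> float:
    """Halve the importance for every half-life that has elapsed."""
    return importance * 0.5 ** (age_days / halflife_days)


def on_retrieval(importance: float, age_days: float,
                 halflife_days: float = 7.0,
                 hit_increment: float = 0.05) -> float:
    """Lazy update: apply decay only now, at retrieval time, then add the hit bump."""
    return decayed_importance(importance, age_days, halflife_days) + hit_increment


after_one_halflife = decayed_importance(0.8, age_days=7.0)
after_hit = on_retrieval(0.8, age_days=7.0)
```

Because the decay is computed from the elapsed time on demand, no background job has to touch memories that are never retrieved.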
Evaluation results
| Metric | v1.1 | v1.2 | Change |
|---|---|---|---|
| P@1 | 0.3478 | 0.6957 | +100% |
| R@5 | 1.0 | 1.0 | = |
| MRR@10 | 0.5841 | 0.8152 | +39% |
| Latency@10 | 5324ms | 18561ms | +249% |
Precision and MRR improved significantly with smart write deduplication and importance-weighted reranking. Latency increased due to importance update on retrieval.
Bug fixes
- Pipeline data-loss (index delete after vault write)
- LLM host not configurable
- Empty importance_updated crash
- Inconsistent agent_id validation
- Health version mismatch
- Data loss when update produces same slug as original
Full Changelog: v1.1.0...v1.2.0
v1.1.0
What's New in v1.1.0
Search & Embeddings
- Hybrid search — vector + full-text via LanceDB with importance-weighted RRF reranking
- Semantic chunking — paragraph-first split, sentence split, hard token ceiling with tiktoken
- Embedding providers — Ollama and fastembed with automatic fallback; graceful degradation when unavailable
- Search endpoint — `GET /agents/{agent_id}/memories/search?q=...&limit=...`
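The hybrid reranking idea in the list above can be sketched as plain reciprocal rank fusion (RRF) over a vector ranking and a full-text ranking, followed by an importance adjustment. The parameter names follow the `rrf_k` and `importance_rerank_weight` settings. The multiplicative weighting `score * (1 + weight * importance)` and the default `rrf_k=60` are assumptions, not the documented MemStack formula.

```python
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], importance: Dict[str, float],
             rrf_k: int = 60, importance_rerank_weight: float = 0.0) -> List[str]:
    """Fuse ranked ID lists with RRF, then scale by importance."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    weighted = {
        doc_id: score * (1.0 + importance_rerank_weight * importance.get(doc_id, 0.0))
        for doc_id, score in scores.items()
    }
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(weighted, key=lambda d: (-weighted[d], d))


vector_ranking = ["a", "b", "c"]
fulltext_ranking = ["b", "a", "d"]
plain = rrf_fuse([vector_ranking, fulltext_ranking], importance={})
boosted = rrf_fuse([vector_ranking, fulltext_ranking],
                   importance={"b": 1.0}, importance_rerank_weight=0.5)
```

In this example "a" and "b" tie on the fused score, so a nonzero importance weight is what moves "b" ahead.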
CLI
- Daemon mode — `memstack start --daemon` with cross-platform process management
- Stop command — `memstack stop` with proper Windows (taskkill) and Unix (SIGTERM) support
Configuration
- 7 new settings: `embedding_provider`, `embedding_model`, `embedding_autofallback`, `chunk_max_tokens`, `chunk_overlap_tokens`, `rrf_k`, `importance_rerank_weight`
Evaluation Framework
- Golden-set evaluation with precision/recall/MRR metrics
- Parameter sweep runner for chunking config optimization
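For readers unfamiliar with the metrics reported in these notes (P@1, R@5, MRR@10), the sketch below computes them over a tiny golden set. The function names and the golden-set shape (a ranked result list paired with a set of relevant IDs) are assumptions and do not describe the project's evaluation harness.

```python
from typing import Dict, List, Set, Tuple


def precision_at_1(ranked: List[str], relevant: Set[str]) -> float:
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str], k: int) -> float:
    """1 / rank of the first relevant result within the top k, else 0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(golden: List[Tuple[List[str], Set[str]]]) -> Dict[str, float]:
    """Average each metric over all queries in the golden set."""
    n = len(golden)
    return {
        "P@1": sum(precision_at_1(r, rel) for r, rel in golden) / n,
        "R@5": sum(recall_at_k(r, rel, 5) for r, rel in golden) / n,
        "MRR@10": sum(reciprocal_rank(r, rel, 10) for r, rel in golden) / n,
    }


golden_set = [
    (["a", "b"], {"a"}),        # relevant result ranked first
    (["x", "b", "c"], {"b"}),   # relevant result ranked second
]
metrics = evaluate(golden_set)
```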
Bug Fixes
- Daemon timeout 5s → 30s (first-start crash)
- Windows onnxruntime DLL failure: graceful fallback instead of crash
- Windows `memstack stop` uses taskkill directly
- Test isolation from .env file
Full Changelog: v1.0.0...v1.1.0
v1.0.0
MemStack is a local-first memory server for AI agents. Memories are stored as Markdown files with YAML frontmatter in a vault directory you own. No cloud, no database, no external services. You can read and edit every memory with any text editor or Obsidian. The vault is ground truth.
What's included (v1.0.0)
- Vault storage — per-agent directory namespacing, atomic writes, path traversal protection
- REST API — 5 endpoints on FastAPI/Uvicorn: health check, memory CRUD under `/agents/{agent_id}/memories`
- CLI — `memstack start` (foreground + daemon) and `memstack stop` with cross-platform process management
- Configuration — 12 environment variables via Pydantic Settings, `.env` file support
- Slug-based IDs — deterministic from content + agent + date
- 102 tests, 95.5% coverage