mcp-chest-memory

English | 日本語

These daily frustrations end today:

Giving the same instructions over and over
Answering the same questions again and again
Watching your LLM stumble in the same place every time
Burning through tokens so fast you keep hitting your limits

mcp-chest-memory makes all of these a thing of the past — automatically.

Add this MCP server — then there is nothing left for you to do.
It automatically remembers what was worked on, why things failed, and what research concluded — across all your projects.

With this MCP installed, your LLM grows together with you: mistakes and repeated questions keep decreasing, and the LLM increasingly behaves like an extension of yourself.

As a welcome side effect, it also cuts your LLM token usage substantially.

Local-first persistent memory for coding agents, served over MCP. Your agent forgets everything when a session ends; chest gives it a durable, searchable "past self" — failures it must not repeat, decisions and their reasons, per-file edit history — stored in a single SQLite file on your machine.

One memory store spans all your projects and all your LLM agents: knowledge is recalled and recorded automatically by the LLM itself, without you having to think about it — so you stop giving the same instructions over and over.

Optimized for Claude Code (bundled skill + hooks), works with any MCP client.

This MCP server is built to be easy to adopt. It scales from personal use to multiple machines and on to a whole project team. Start with personal use and feel the difference for yourself — getting started solo is very easy.

Features

6-layer structured memory — goal / context / emotion / implementation / realize (failures & pitfalls, protected from forgetting) / learning (insights & decisions)
Hybrid recall — SQLite FTS5 unicode61 full-text search (over a Sudachi-tokenized column for CJK) fused with vector similarity via Reciprocal Rank Fusion, then weighted by recency heat, entity momentum, and importance; optional cross-encoder reranker for better result ordering
Multilingual by construction — Japanese/Chinese/Korean text is morpheme-tokenized (Sudachi-WASM) at write time into a dedicated FTS column; whitespace-delimited languages work directly via unicode61 word boundaries
Offline-first embeddings — Xenova/bge-m3 (1024-dim, ONNX, ~560 MB) runs locally via transformers.js; no API key, no network after the one-time model download
Memory lifecycle — ACT-R style activation decay, TTL expiry, archive-first deletion, supersession detection, sleep-mode consolidation
Token-saving file reads — chest_read_smart caches file chunk hashes and returns only what changed since the last read; works in every profile (the file is read client-side, only the diff-cache snapshot is persisted)
Session continuity — work-state snapshots survive context compaction (Claude Code PreCompact/SessionStart hooks)
Docker deployment — same tools, same semantics whether Docker runs on the same machine or on a remote server; all setup via Ansible

Installation

Requirements: Node.js ≥ 24, Ansible ≥ 2.14, Docker CE. The backend runs as a Docker container and all setup is managed by Ansible.

Quick start

Two roles: (A) the client (each user) registers hooks, the MCP server, and the skill on their own machine; (B) the server admin runs one Docker backend (optionally behind nginx). For a single-PC / offline setup, one person plays both roles and runs the backend on 127.0.0.1.

(A) Client setup (Claude Code plugin)

Each user installs the Claude Code plugin on their own machine. The plugin bundles the MCP server, the lifecycle hooks (SessionStart / Stop / PreCompact / UserPromptSubmit), and the chest-memory skill. It works identically on Windows, macOS, and Linux. The executables run on-demand via npx, so Node / npm must be available on the client.

Step 1 — provide the connection settings (CHEST_REMOTE_URL, CHEST_API_TOKEN). Pick one of:

A config file (recommended — works everywhere, incl. the VS Code extension). Create ~/.chest-memory.json:
```
{
  "CHEST_REMOTE_URL": "http://<host-ip>:8765",
  "CHEST_API_TOKEN": "<token>"
}
```
The MCP server reads it directly, so it does not depend on how the host passes environment variables.
Environment variables (CLI). Export them in the shell that launches Claude Code (export CHEST_REMOTE_URL=… CHEST_API_TOKEN=…; Windows: setx … then reopen the terminal). Process env wins over the config file.
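
The precedence rule above (process env wins over the config file) can be sketched as a pure function. This is a minimal illustration, not the project's code: `resolveConnection` and both type names are hypothetical, and treating an empty environment value as unset is an assumption. Only the two keys and the "env wins" rule come from this README.

```typescript
// Hypothetical sketch: function and type names are illustrative, not the project's API.
type ConnectionSettings = { CHEST_REMOTE_URL: string; CHEST_API_TOKEN: string };
type PartialSettings = { CHEST_REMOTE_URL?: string; CHEST_API_TOKEN?: string };

function resolveConnection(env: PartialSettings, file: PartialSettings): ConnectionSettings {
  const pick = (key: keyof ConnectionSettings): string => {
    // A non-empty environment value takes precedence over the config file value.
    const value = env[key] || file[key];
    if (!value) {
      // Fail fast when a setting is missing from both sources.
      throw new Error(`${key} is not set (checked environment, then config file)`);
    }
    return value;
  };
  return {
    CHEST_REMOTE_URL: pick("CHEST_REMOTE_URL"),
    CHEST_API_TOKEN: pick("CHEST_API_TOKEN"),
  };
}

// The URL comes from the environment, the token from the file.
const resolved = resolveConnection(
  { CHEST_REMOTE_URL: "http://10.0.0.5:8765" },
  { CHEST_REMOTE_URL: "http://127.0.0.1:8765", CHEST_API_TOKEN: "token-from-file" },
);
```

Passing both sources in as plain objects keeps the rule testable without touching the real environment or `~/.chest-memory.json`.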

VS Code extension / Remote-SSH note: the VS Code extension does not reliably pass environment variables (from ~/.claude/settings.json env or your shell rc) to the MCP server process. Use the config file above, or put the two exports in ~/.vscode-server/server-env-setup (Remote-SSH) and restart the VS Code Server.

# 2. Add this repo as a plugin marketplace (run inside Claude Code)
/plugin marketplace add https://github.com/siosig/mcp-chest-memory
#    A local path works too for local use.

# 3. Install the plugin
/plugin install chest-memory@chest-memory

# 4. Restart Claude Code if prompted

(B) Server admin setup

Deploy the Docker backend (optionally behind an nginx reverse proxy). For a team, one person does this once and all clients share the same backend.

cd ansible

# 1. Configure the server inventory
cp inventory/host_vars/chest_server/vault.yml.example \
   inventory/host_vars/chest_server/vault.yml
# Edit inventory/host_vars/chest_server/vars.yml:
#   ansible_host, ansible_user, chest_repo_dir, chest_repo_url

# 2. Create and encrypt the vault (API token)
echo "vault_chest_api_token: $(openssl rand -hex 32)" \
  > inventory/host_vars/chest_server/vault.yml
ansible-vault encrypt inventory/host_vars/chest_server/vault.yml

# 3. Deploy Docker backend (use localhost for single-PC)
ansible-playbook site.yml -i inventory/hosts.yml --tags docker --ask-vault-pass

The playbook:

--tags docker — installs Docker CE, builds the image, starts the container with persistent SQLite storage, and configures CHEST_API_TOKEN

For single-PC setup, point ansible_host in vars.yml to 127.0.0.1. Keep a single backend replica — one writer process owns the database.

Import existing Claude Code history (optional)

ansible-playbook site.yml -i inventory/hosts.yml --tags migration --ask-vault-pass

This seeds the memory store from every past session under ~/.claude/projects/ on the server. Re-running is safe: each session is wiped and re-inserted, so the import is idempotent.

Daily usage

What you have to do: (almost) nothing

After installation, just work with Claude Code as usual. The bundled /chest-memory skill teaches the agent to recall and save memories on its own. Everything below is optional:

Say "remember this: ..." to force a save of something specific
Invoke /chest-memory to save the recent context explicitly, or /chest-memory status to check store health
Ask "did we hit this before?" to force a recall
Hooks are wired automatically by the plugin: session auto-capture on Stop, snapshot save/restore around compaction

What runs automatically even if you do nothing

On every save (chest_remember): the layer is classified by the agent, content is stored in SQLite, the FTS5 index updates via triggers, the vector is embedded in the Docker backend by the local model, and realize-layer memories are auto-protected from forgetting
On every recall (chest_recall): FTS + vector hybrid search with decay-aware ranking; access heat is updated so frequently used memories rank higher over time
During a session (skill-driven): recall at task start and before editing files with history; saves after errors are resolved or decisions are made
After every assistant turn and around compaction (hooks, wired by the plugin): the Stop hook captures the session when each turn ends, and work-state snapshots survive context compaction
In the background after saves (throttled, at most once per CHEST_MAINTENANCE_INTERVAL_SEC, default 600 s / 10 min): activation decay recompute, TTL expiry and archive sweep, supersession detection, consolidation of cold memories, and embedding backfill for any pending rows. No scheduler setup is required; chest-index up remains available for manual runs

MCP tools

Tool	Purpose
`chest_remember`	Save a memory into a layer (with importance, TTL, supersedes)
`chest_recall`	Hybrid search across memories (FTS5 + vector + decay-aware ranking)
`chest_recall_file`	Complete edit history of a file with per-edit intent
`chest_update_memory`	Edit a memory in place (preserves links)
`chest_list_entities`	Entity overview sorted by recent activity
`chest_forget`	Delete by id or run risk-based auto-forgetting (realize/goal/pinned protected)
`chest_consolidate`	Compress cold memories into learning summaries
`chest_read_smart`	Diff-cached file read (returns only changed chunks)

How it works

Architecture

This section shows the data flow for writes (chest_remember) and searches (chest_recall). The stdio client always connects to the Docker REST backend: the client handles Zod parsing and HTTP forwarding, and all processing, including embedding, runs in the Docker backend. The client never embeds.

Team-wide memory sharing — when knowledge is stored vs. retrieved

When used by a team, every member's Claude Code shares the same configuration and all reads and writes converge on a single SQLite file (single writer) on a shared server. Memory that one member "stores" (failures, decisions, research results) can be "retrieved" by any other member — this is the core value of a team deployment.

The diagram below shows, for the components you operate (Claude side = skills / hooks / mcp; server side = nginx / Docker / sqlite), when knowledge is stored (solid arrows) and when it is retrieved (dashed arrows).

flowchart TD
    subgraph CLIENT ["A team member's Claude Code\nper-machine, all share one config"]
        direction LR
        SKILLS["🧩 skills\n/chest-memory\ntells the agent when to recall / save"]
        HOOKS["🪝 hooks\nStop / PreCompact /\nSessionStart / UserPromptSubmit"]
        MCP["🔌 mcp tools\nchest_remember / chest_recall /\nchest_recall_file …"]
    end

    subgraph SERVER ["Shared server (one Docker host = single writer)"]
        direction TB
        NGINX["🌐 nginx\nTLS termination + Bearer verify\nreverse proxy (everyone's entry point)"]
        DOCKER["🐳 Docker\nchest-memory backend\nembed · FTS · ranking\nauto-maintenance: forgets unused/old memories\n(decay→archive · TTL expiry)\noverwrites with newer ones (supersession)"]
        SQLITE[("🗄️ sqlite / chest.db\n6 memory layers · FTS5 · vectors\nshared by the whole team")]
    end

    %% ===== STORE (write, solid) =====
    SKILLS ==>|"STORE: instructs a save after a fix / a decision"| MCP
    HOOKS ==>|"STORE: Stop=summarize conversation / PreCompact=snapshot work state"| NGINX
    MCP ==>|"STORE: chest_remember(content, layer)"| NGINX

    %% ===== RETRIEVE (read, dashed) =====
    SKILLS -.->|"RETRIEVE: instructs a recall at task start / before editing a file"| MCP
    HOOKS -.->|"RETRIEVE: UserPromptSubmit=inject relevant memories / SessionStart=restore work state"| NGINX
    MCP -.->|"RETRIEVE: chest_recall(query) / chest_recall_file"| NGINX

    NGINX <--> DOCKER
    DOCKER <-->|"STORE=INSERT · embedding / RETRIEVE=FTS5 + vector search"| SQLITE

When each component stores vs. retrieves knowledge:

Component	When it STORES (write)	When it RETRIEVES (read)
🧩 skills (`/chest-memory`)	Right after a fix / a decision; "remember this: …"	At task start; before editing a file with history; "did we do this before?"
🪝 hooks	Stop = summarize the conversation every turn; PreCompact = snapshot work state right before compaction	UserPromptSubmit = inject relevant memories on every prompt; SessionStart = restore work state on resume / compact
🔌 mcp tools	`chest_remember` (save with a layer)	`chest_recall` / `chest_recall_file` (hybrid search)
🌐 nginx	(both directions) TLS termination + Bearer verify, funneling every member's reads/writes into one host	same
🐳 Docker	INSERT · embedding. Auto-maintenance forgets unused/old memories (decay→archive · TTL expiry) and overwrites with newer ones (supersession)	FTS5 + vector hybrid search · decay-aware ranking
🗄️ sqlite	Persists memory (single writer for consistency)	Shares accumulated memory with the whole team

Storing = automatic: skills and hooks tell the agent when to save, so memory accumulates at natural breakpoints without explicit human action. Retrieving = automatic: likewise, skills and the UserPromptSubmit hook pull relevant memories into context at task start and on every prompt. Both work by the mcp tools reaching nginx → Docker → sqlite over HTTP.

Security — protecting token values and personal data

Sharing one store across a team means one shared Bearer token holds full access to everyone's memory. And because conversations, pastes, and file fragments flow into memory, credentials or personal data can be stored unintentionally. chest addresses these two risks as follows.

Design principles

Fail closed: deny when the safe scope is unknown (a file read with no declared root returns nothing rather than falling back to "read anything").
Defense in depth: TLS (terminated at nginx) and Bearer verification (at the backend) are layered; the content-length cap is enforced in both the schema and the handler.
Redact once at the taint boundary: credentials are masked exactly once where external input enters the DB (chest_remember / chest_update_memory / Stop-hook import). Derivative writes (consolidate) are excluded because their input is already redacted.
Untrusted-data principle: memory content surfaced by recall or consolidate is treated as data, not instructions (defense against prompt injection via stored memory).

The diagram below shows the security checkpoints input text passes through on its way to storage and back out via recall.

flowchart LR
    IN["📝 input text\nconversation · paste · file fragment"]

    subgraph CLIENTSEC ["Client (Claude Code)"]
        TOK["🔑 CHEST_API_TOKEN\ninjected via Ansible vault / env\nkept out of CLI args & shell history"]
    end

    subgraph SERVERSEC ["Shared server"]
        TLS["🌐 nginx\nTLS termination (no plaintext HTTP)\nlisten address limitable via CHEST_BIND_HOST"]
        AUTH["🛡️ Bearer verify\n≥ 32 chars · constant-time compare\nmismatch rejected immediately"]
        REDACT["✂️ redactCredentials (once at taint boundary)\nreplaces values of token / password / API key /\nAuthorization·Cookie / PEM private key\nwith [REDACTED] before storing"]
        STORE[("🗄️ sqlite\npersists redacted text")]
    end

    OUT["📤 recall output\nmarked as untrusted data, not instructions"]

    IN --> TOK
    TOK -->|"HTTPS with Bearer"| TLS
    TLS --> AUTH
    AUTH -->|"STORE (write)"| REDACT
    REDACT --> STORE
    STORE -->|"RETRIEVE (read)"| OUT

Protecting the token value (CHEST_API_TOKEN)

Mechanism	Detail
Length enforcement	The backend refuses to start if the token is under 32 chars. `openssl rand -hex 32` (64 chars) satisfies it
Constant-time compare	Token comparison uses a constant-time compare to avoid timing attacks
No plaintext	Terminate TLS at nginx so tokens/memory never traverse the network in cleartext. Without a fronting proxy, restrict to a trusted network + `CHEST_BIND_HOST=127.0.0.1`
Avoid exposure	Inject via env / Ansible vault. Do not pass it inline to `claude mcp add` etc. (readable from `/proc/<pid>/cmdline` and shell history on shared machines)
Shared-token nature	Grants full access (no per-client scoping). Since the team shares one, treat it as a high-value secret

Protecting personal/sensitive data (PII / secrets)

Mechanism	Detail
Auto-redaction of credentials	On write, `redactCredentials` replaces the values of token / password / API key / `Authorization`·`Cookie` headers / PEM private keys with `[REDACTED]`
Where applied	`chest_remember` / `chest_update_memory` / Stop-hook session import (once at the taint boundary, idempotent)
Paste detection	Detects oversized pastes and controls their handling (write flow ①)
File confinement	`chest_read_smart` reads only under the declared root. The backend cannot read arbitrary client files (denied when no root)

Honest limitation (no overstatement): auto-redaction targets credential values only; general personal data such as names, email addresses, and phone numbers is NOT auto-masked. Keep sensitive PII out of memory, or mask it on the write side beforehand. Plaintext HTTP when no TLS proxy is used, the shared token's full access, and the limits of the untrusted-data marker are by-design residual risks (see Security notes for details).
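
To make the "credential values only" scope concrete, here is a rough sketch of value-level redaction. The patterns and the name `redactCredentialsSketch` are assumptions for illustration; the real `redactCredentials` rules are not documented here and may differ. The sketch is idempotent, matching the "once at the taint boundary" description.

```typescript
// Hypothetical sketch: these regular expressions are assumptions, not the project's rules.
const REDACTED = "[REDACTED]";

function redactCredentialsSketch(text: string): string {
  return text
    // PEM private key blocks, including the BEGIN/END armor lines.
    .replace(/-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g, REDACTED)
    // Authorization / Cookie headers: keep the header name, mask the rest of the line.
    .replace(/\b(Authorization|Cookie)(\s*:\s*)[^\r\n]+/gi, `$1$2${REDACTED}`)
    // key=value or key: value pairs whose key ends in token / password / api key.
    .replace(/(token|password|api[_-]?key)(\s*[:=]\s*)("[^"]*"|'[^']*'|[^\s,;]+)/gi, `$1$2${REDACTED}`);
}

const masked = redactCredentialsSketch("password=hunter2 and email: alice@example.com");
// masked keeps the email address: general personal data is not a credential value.
```

Running the function on its own output leaves the text unchanged, because `[REDACTED]` is simply replaced by itself.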

The diagrams below detail the "store" and "retrieve" flows down into the server.

Write flow (chest_remember)

Steps ①②③ execute synchronously before the HTTP response is returned. Only ④ is a post-response fire-and-forget. Embedding always runs in the Docker backend.

flowchart TD
    CC([Claude Code])

    subgraph CLIENT ["Client PC — stdio MCP server"]
        MCP_W["RemoteExecutor\nZod parse → HTTP forward"]
    end

    subgraph DOCKER ["Docker backend — LocalExecutor\n①②③ sync before response  ④ async only"]
        V["① Validation & pre-processing\nresolveLayer / length / paste detection\nredactText · Sudachi tokenize"]
        W["② Write\nentity upsert\nINSERT memories (status=pending)\nINSERT events · supersedes archive\nrefreshMomentum"]
        E["③ Embedding (server-side)\nbge-m3 → status=done\nevaluateSupersessionFor\ncosine ≥ 0.97 → archive old memory"]
        M[["④ Background maintenance\nvoid fire-and-forget · 10-min throttle / flock\nrunActivationPhase — ACT-R decay\nrunDecayPhase — cold compress + TTL expiry\nrunSupersessPhase — duplicate detection archive\nrunLocalPendingSweep — pending embed backfill"]]
        DB_W[("chest.db\nmemories · entities · events\nmemories_fts FTS5 · access_log")]
    end

    CC -->|"chest_remember(content, layer, importance …)"| MCP_W
    MCP_W -->|"POST /api/tools  Bearer"| V
    V --> W
    W -->|"INSERT memories · events"| DB_W
    W --> E
    E -->|"UPDATE embedding · archived_at"| DB_W
    E -.->|"void"| M
    M -->|"UPDATE activation · archived_at · consolidation"| DB_W
    E -->|"ok + memory_id"| MCP_W
    MCP_W -->|"ok + memory_id"| CC

Search flow (chest_recall)

All steps are synchronous. The FTS and vector paths run concurrently inside handleChestRecall.

flowchart TD
    CC([Claude Code])

    subgraph CLIENT ["Client PC — stdio MCP server"]
        MCP_R["RemoteExecutor\nZod parse → HTTP forward only"]
    end

    subgraph DOCKER ["Docker backend — LocalExecutor"]
        HANDLER["handleChestRecall\nFTS + vector paths run concurrently"]
        FTS["① FTS path\nformatFtsQuery\nSudachi tokenize query\nFTS5 unicode61 bm25"]
        VEC["② Vector path\nembed query with bge-m3\ncosine similarity top-k\n(same model + dim only)"]
        SCR["③ Scoring & return\nRRF fusion\ncomposite score\n  0.45·relevance + 0.25·heat\n  + 0.15·momentum + 0.15·importance\n  × activation × ttl_penalty × supersession\nbge-reranker (when CHEST_RERANK_ENABLED)\nINSERT memory_access_log"]
        DB_R[("chest.db\nmemories · entities\nmemories_fts FTS5 · access_log")]
    end

    CC -->|"chest_recall(query, layer, limit …)"| MCP_R
    MCP_R -->|"POST /api/tools  Bearer"| HANDLER
    HANDLER --> FTS & VEC
    FTS <-->|"MATCH / bm25"| DB_R
    VEC <-->|"cosine search"| DB_R
    FTS --> SCR
    VEC --> SCR
    SCR -->|"INSERT access_log"| DB_R
    SCR -->|"ranked memories (composite)"| MCP_R
    MCP_R -->|"JSON result"| CC

Setup	Transport	Database location
Docker (single PC)	stdio → REST (Bearer) → local Docker	host bind mount
Docker (multi-PC)	stdio → REST (Bearer) → remote Docker	host bind mount on server

The MCP tool surface is identical in every setup: the stdio server forwards the same JSON payload to the backend, which runs the very same executor code.

chest_read_smart is special by necessity — it is the one tool that reads a client-side file. Its file I/O (confine to roots → stat → read → chunk → hash) therefore always runs in the stdio server, where the file and the client's declared roots actually exist; only its diff-cache snapshot is persisted, and that persistence flows through the same executor port (the Docker backend). So the token-saving read works in every setup, the backend never reads a client file, and there is still no if (remote) branch inside the tool.
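
The chunk, hash, and diff steps described above can be sketched like this. Everything here is illustrative: the chunking unit (lines), the tiny default chunk size, the FNV-1a hash (a stand-in for whatever hash the real tool uses), and all function names are assumptions.

```typescript
// Hypothetical sketch of a diff-cached read; not the project's implementation.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

type Chunk = { index: number; text: string; hash: string };

function chunkLines(content: string, linesPerChunk: number): Chunk[] {
  const lines = content.split("\n");
  const chunks: Chunk[] = [];
  for (let start = 0; start < lines.length; start += linesPerChunk) {
    const text = lines.slice(start, start + linesPerChunk).join("\n");
    chunks.push({ index: chunks.length, text, hash: fnv1a(text) });
  }
  return chunks;
}

// Returns only the chunks whose hash differs from the cached snapshot,
// plus the new snapshot (hashes only) to persist for the next read.
function readSmartSketch(content: string, cachedHashes: string[], linesPerChunk = 2) {
  const chunks = chunkLines(content, linesPerChunk);
  const changed = chunks.filter((c) => cachedHashes[c.index] !== c.hash);
  return { changed, snapshot: chunks.map((c) => c.hash) };
}

const firstRead = readSmartSketch("a\nb\nc\nd", []);                    // no cache yet: every chunk is returned
const secondRead = readSmartSketch("a\nb\nc\nX", firstRead.snapshot);  // only the edited chunk is returned
```

Only `snapshot` would need to be persisted, which fits the description that the backend stores the diff cache but never opens a client file.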

API negotiation and endpoints (v1.5+)

The stdio client negotiates with the backend via GET /capabilities:

api_version / min_required_client_version — compatibility gate (older clients are rejected).
server_has_embedder — reports that the backend embeds new memories itself. Embedding always runs in the Docker backend; the client never embeds.
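
A compatibility gate of this kind might look like the sketch below. The field names come from the list above; the dotted numeric version format, the function names, and the error message are assumptions, since the actual response format of `GET /capabilities` is not shown here.

```typescript
// Hypothetical sketch: version strings are assumed to be dotted numbers such as "1.5.0".
type Capabilities = {
  api_version: string;
  min_required_client_version: string;
  server_has_embedder: boolean;
};

// Returns -1, 0, or 1; missing components count as 0, so "1.5" equals "1.5.0".
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Rejects clients older than the version the backend requires.
function assertClientSupported(clientVersion: string, caps: Capabilities): void {
  if (compareVersions(clientVersion, caps.min_required_client_version) < 0) {
    throw new Error(
      `client ${clientVersion} is older than the required ${caps.min_required_client_version}`,
    );
  }
}
```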

Endpoints exposed by the REST backend:

Method · path	Purpose
`POST /api/tools/:tool`	Execute the 8 MCP tools (the backend's LocalExecutor)
`GET /healthz`	Health check (unauthenticated)
`GET /capabilities`	Version / features / `server_has_embedder` negotiation
`GET /diagnostics/db`	Server-side Prisma DB health

Memory layers

Six layers define how memories are stored and decay:

Layer	Meaning	Default TTL	Auto-protected
`goal`	Project objectives and targets	none	—
`context`	Background, timing, situational facts	30 days	—
`emotion`	Tone, mood, and emotional state	14 days	—
`implementation`	Code/config that worked or didn't; how things were tried	90 days	—
`realize`	Failures, pitfalls, and traps that must not be repeated	none	yes
`learning`	Insights, decisions, and belief updates	365 days	—

realize-layer memories are created with protected=1 and survive all automatic forgetting sweeps. goal has no TTL and is exempt from forgetting. importance >= 0.9 pins any memory regardless of layer.

context, emotion, and implementation are subject to sleep-mode consolidation: once cold (heat < 30) and older than 7 days, clusters of ≥ 2 per (entity, layer) are compressed into a single protected learning summary.

Accepted layer aliases: decisions/insights/learned → learning; warnings/pitfalls/rule → realize; why/goals → goal; how/tried → implementation.
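
The alias mapping, default TTLs, and protection rules above translate into a small lookup. This is a sketch under assumptions: the function names are hypothetical, and the case-insensitive matching and the error on unknown names are choices made for the example rather than documented behavior.

```typescript
// Hypothetical sketch; the values are taken from the layer table and alias list above.
type Layer = "goal" | "context" | "emotion" | "implementation" | "realize" | "learning";

const LAYERS: Layer[] = ["goal", "context", "emotion", "implementation", "realize", "learning"];

const LAYER_ALIASES: Record<string, Layer> = {
  decisions: "learning", insights: "learning", learned: "learning",
  warnings: "realize", pitfalls: "realize", rule: "realize",
  why: "goal", goals: "goal",
  how: "implementation", tried: "implementation",
};

// Default TTL in days; null means the layer has no TTL.
const DEFAULT_TTL_DAYS: Record<Layer, number | null> = {
  goal: null, context: 30, emotion: 14, implementation: 90, realize: null, learning: 365,
};

function resolveLayerSketch(input: string): Layer {
  const name = input.trim().toLowerCase();
  const canonical = LAYERS.find((layer) => layer === name);
  if (canonical) return canonical;
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (Object.prototype.hasOwnProperty.call(LAYER_ALIASES, name)) return LAYER_ALIASES[name];
  throw new Error(`unknown layer: ${input}`);
}

// realize and goal are exempt from forgetting; importance >= 0.9 pins any layer.
function isExemptFromForgetting(layer: Layer, importance: number): boolean {
  return layer === "realize" || layer === "goal" || importance >= 0.9;
}
```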

Forgetting

Forgetting risk is computed per memory with an Ebbinghaus-inspired formula:

risk = heatFactor × importanceFactor × timeFactor

heatFactor       = 1 - (heatScore / 100)
importanceFactor = 1 - importance
timeFactor       = daysSinceLastAccess × (1 + daysSinceLastAccess / 30)

risk	action
< 50	keep
50 – 199	compress — archived and summarised into a `learning` entry
≥ 200	drop — fully deleted

The heat score (0–100) is computed from access frequency and recency: 30-day access count (×3, cap 30) + 90-day count (cap 20) + recency bonus (+20 if ≤ 7 days, −10 if > 90 days) + tenure bonus (cap 15) + importance boost (up to 15). Bands: hot ≥ 70 / warm ≥ 40 / cold ≥ 20 / frozen < 20.
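
The risk formula, the action thresholds, and the heat bands can be written out directly. The function names are hypothetical, and treating the "50 – 199" band as the half-open range [50, 200) for fractional values is an assumption.

```typescript
// Sketch of the formulas above; not the project's code.
type ForgetAction = "keep" | "compress" | "drop";
type HeatBand = "hot" | "warm" | "cold" | "frozen";

function forgettingRisk(heatScore: number, importance: number, daysSinceLastAccess: number): number {
  const heatFactor = 1 - heatScore / 100;
  const importanceFactor = 1 - importance;
  const timeFactor = daysSinceLastAccess * (1 + daysSinceLastAccess / 30);
  return heatFactor * importanceFactor * timeFactor;
}

function forgetAction(risk: number): ForgetAction {
  if (risk < 50) return "keep";
  if (risk < 200) return "compress";
  return "drop";
}

function heatBand(heatScore: number): HeatBand {
  if (heatScore >= 70) return "hot";
  if (heatScore >= 40) return "warm";
  if (heatScore >= 20) return "cold";
  return "frozen";
}

// heat 50, importance 0.5, 30 days idle: 0.5 * 0.5 * (30 * 2) = 15, so the memory is kept.
const exampleRisk = forgettingRisk(50, 0.5, 30);
const exampleAction = forgetAction(exampleRisk);
```

Because the time factor grows quadratically, a memory with heat 0 and importance 0 reaches the drop threshold after roughly two months without access, while higher heat or importance stretches that window.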

Supersession (overwrite detection)

When a new memory is saved, it is compared against recent memories of the same entity and layer using cosine similarity: once right after it is embedded (step ③ of the write flow) and again in the supersession phase of the next maintenance pass. If a near-duplicate is found (cosine ≥ 0.97), the older memory is archived and linked to the new one, so the store does not accumulate stale near-copies.

Guards that reduce false positives:

Same entity + same layer required
90-day time window and 200-peer row cap per entity (prevents O(n²) scans)
JSON memories with identical top-level key shapes are not superseded (periodic snapshots / file-edit logs that look structurally similar but hold distinct facts)

The chest_remember tool also accepts a supersedes list for manual supersession without waiting for the batch sweep.
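
The guards listed above can be combined into one predicate. This is a sketch only: the row shape, the function names, and the way the JSON key shape is computed are assumptions, and the 200-peer cap is left out because it limits the candidate query rather than the pairwise check.

```typescript
// Hypothetical sketch of the supersession decision for one (newer, older) pair.
type MemoryRow = {
  entity: string;
  layer: string;
  content: string;
  createdAtMs: number;
  embedding: number[];
};

const DAY_MS = 24 * 60 * 60 * 1000;

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Sorted top-level keys of a JSON object, or null when the content is not a JSON object.
function topLevelKeyShape(content: string): string | null {
  try {
    const parsed: unknown = JSON.parse(content);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return Object.keys(parsed as object).sort().join(",");
    }
  } catch (_err) {
    // Not JSON: the key-shape guard does not apply.
  }
  return null;
}

function shouldSupersede(newer: MemoryRow, older: MemoryRow): boolean {
  if (newer.entity !== older.entity || newer.layer !== older.layer) return false;
  const ageMs = newer.createdAtMs - older.createdAtMs;
  if (ageMs < 0 || ageMs > 90 * DAY_MS) return false;
  const shape = topLevelKeyShape(newer.content);
  if (shape !== null && shape === topLevelKeyShape(older.content)) return false;
  return cosine(newer.embedding, older.embedding) >= 0.97;
}
```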

Storage

One SQLite database (WAL mode) holds entities, memories, edges, events, file snapshots, sessions, and consolidation audit rows. Schema is managed by Prisma migrations; the FTS5 virtual table and its sync triggers are plain SQL inside the same migration.

Full-text search: FTS5 unicode61 + tokenized

memories_fts is a content-table FTS5 virtual table over the content_tokenized column (tokenize='unicode61 remove_diacritics 1'). At write time, Japanese/Chinese/Korean text is morpheme-segmented by Sudachi-WASM and stored as space-separated tokens in content_tokenized; European languages are handled by unicode61's built-in word-boundary splitting. Queries shorter than 3 characters fall back to a LIKE path. Scores come from SQLite's built-in bm25().

Set CHEST_FTS_TOKENIZE=false to skip Sudachi tokenization. CJK recall then degrades, because only unicode61 word-boundary splitting remains.

Hybrid ranking

For a recall query both paths run:

FTS path — unicode61 match over content_tokenized, ranked by bm25
Vector path — query embedded by the local model, cosine similarity against stored vectors (only rows whose (model, dim) match the current model), top-k

The two rankings are fused with Reciprocal Rank Fusion, 1/(k + rank_fts) + 1/(k + rank_vec), and the fused score is min-max normalized into a relevance score. The final composite is:

composite = (0.45·relevance + 0.25·heat + 0.15·momentum + 0.15·importance)
            × activation × ttl_penalty × supersession_penalty

heat — access frequency/recency of the memory (hot/warm/cold/frozen)
momentum — recent activity of the owning entity
activation — ACT-R inspired decay computed offline by chest-index from the access log
ttl / supersession penalties — soft demotion before hard expiry
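
The fusion and the composite formula can be sketched end to end. Several details are assumptions: `k = 60` (a common RRF default; this README does not state the value), 1-based ranks, all signals scaled to the range 0 to 1, and every name in the code.

```typescript
// Hypothetical sketch of RRF fusion plus the composite score; not the project's code.
function rrfFuse(ftsIds: string[], vecIds: string[], k = 60): Map<string, number> {
  const raw = new Map<string, number>();
  const addRanking = (ids: string[]): void => {
    // Rank is 1-based, so the top hit contributes 1 / (k + 1).
    ids.forEach((id, i) => raw.set(id, (raw.get(id) || 0) + 1 / (k + i + 1)));
  };
  addRanking(ftsIds);
  addRanking(vecIds);

  // Min-max normalize the fused scores into a 0..1 relevance value.
  let min = Infinity;
  let max = -Infinity;
  raw.forEach((score) => {
    min = Math.min(min, score);
    max = Math.max(max, score);
  });
  const relevance = new Map<string, number>();
  raw.forEach((score, id) => relevance.set(id, max === min ? 1 : (score - min) / (max - min)));
  return relevance;
}

type Signals = {
  relevance: number; heat: number; momentum: number; importance: number;
  activation: number; ttlPenalty: number; supersessionPenalty: number;
};

function compositeScore(s: Signals): number {
  const base = 0.45 * s.relevance + 0.25 * s.heat + 0.15 * s.momentum + 0.15 * s.importance;
  return base * s.activation * s.ttlPenalty * s.supersessionPenalty;
}

// "b" appears in both rankings, so it ends up with the highest relevance.
const fused = rrfFuse(["a", "b"], ["b", "c"]);
```

The additive terms reward a memory for being relevant, hot, or important, while the multiplicative factors can only demote it, which is how decay and supersession act as soft penalties.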

Memory lifecycle

Archive-first: nothing is physically deleted on decay; rows get archived_at and drop out of default recall
Supersession: a newer, near-duplicate memory (cosine ≥ 0.97, same entity/layer, 90-day window) archives its predecessor and records the link
Consolidation: cold low-importance memories are clustered per (entity, layer) and compressed into one protected learning summary
Protection: realize-layer and pinned (importance ≥ 0.9) memories are never auto-forgotten
Snapshots: a per-session work-state snapshot survives context compaction; the SessionStart hook restores it

Maintenance

Maintenance is self-driving: after a save, the server runs (in the background, without delaying the response) activation recompute → decay/archive sweep → supersession sweep → embedding backfill of pending rows. Passes are throttled to once per CHEST_MAINTENANCE_INTERVAL_SEC (default 600 s / 10 min) and guarded by a file lock, so they never overlap a manual chest-index up run. Set CHEST_AUTO_MAINTENANCE=0 to disable the automatic passes and drive everything via chest-index yourself.
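
The throttling rule can be sketched as a small gate that a save handler would consult before starting a background pass. The class and method names are hypothetical, the clock is injected to keep the example deterministic, and the file lock mentioned above is not modeled.

```typescript
// Hypothetical sketch of "at most one pass per interval"; not the project's code.
class MaintenanceThrottle {
  private lastRunMs: number | null = null;
  private readonly intervalMs: number;
  private readonly enabled: boolean;

  // intervalSec mirrors CHEST_MAINTENANCE_INTERVAL_SEC, enabled mirrors CHEST_AUTO_MAINTENANCE.
  constructor(intervalSec: number = 600, enabled: boolean = true) {
    this.intervalMs = intervalSec * 1000;
    this.enabled = enabled;
  }

  // Returns true when a pass may start now, and records the start time.
  tryAcquire(nowMs: number): boolean {
    if (!this.enabled) return false;
    if (this.lastRunMs !== null && nowMs - this.lastRunMs < this.intervalMs) return false;
    this.lastRunMs = nowMs;
    return true;
  }
}

const throttle = new MaintenanceThrottle();
const firstPass = throttle.tryAcquire(0);            // allowed: nothing has run yet
const tooSoon = throttle.tryAcquire(599 * 1000);     // refused: inside the 600 s window
const nextPass = throttle.tryAcquire(600 * 1000);    // allowed: the window has elapsed
```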

All embedding inference (write-time embed, the pending sweep, and recall query embedding) flows through one process-wide concurrency gate. By default it runs serial (CHEST_EMBED_CONCURRENCY=1): on a resource-poor host each inference owns the CPU and finishes quickly, so the write-time embed completes in time instead of timing out and leaving rows pending. The gate acquires per inference batch (CHEST_EMBED_BATCH_SIZE, default 16), and recall query embeds take priority over writes and the sweep — so an interactive recall slips in at the next batch boundary rather than waiting for a long backfill to finish.

Configuration reference

Variable	Default	Meaning
`CHEST_DATA_DIR`	`~/.chest-memory`	Data root (database, model cache) — backend only
`CHEST_DB_PATH`	`<data dir>/chest.db`	SQLite file — backend only
`CHEST_REMOTE_URL`	—	Backend base URL. Required for the MCP stdio client and hook bins (they fail-fast without it; no local DB is created)
`CHEST_API_TOKEN`	—	Shared Bearer token. Required for the MCP stdio client and hook bins; the backend also refuses to start without it (minimum 32 characters)
`CHEST_PORT`	`8765`	REST backend listen port
`CHEST_BIND_HOST`	`0.0.0.0`	REST backend listen host. Set to `127.0.0.1` to bind loopback only when a reverse proxy fronts the backend
`CHEST_MAX_CONTENT_CHARS`	`8000`	Max memory content length (clamped to ≥ 1; 0/negative are ignored)
`CHEST_FORGET_SWEEP_CAP`	`200`	Max memories archived per argument-less `chest_forget` sweep
`CHEST_SWEEP_LIMIT`	`500`	Max rows backfilled per embedding sweep
`CHEST_MAINTENANCE_INTERVAL_SEC`	`600`	Min seconds between background maintenance passes
`CHEST_AUTO_MAINTENANCE`	`1`	Set `0` to disable write-triggered maintenance
`CHEST_EMBED_MODEL`	`Xenova/bge-m3`	Embedding model ID. Set `Xenova/multilingual-e5-small` to keep the pre-1.5 model
`CHEST_EMBED_CONCURRENCY`	`1`	Max embedding inference calls running concurrently process-wide (clamped 1–64). `1` = serial; raise only if the host has CPU to spare
`CHEST_EMBED_BATCH_SIZE`	`16`	Texts per ONNX inference call (clamped 1–256). Lower to reduce the memory/CPU peak on weak machines
`CHEST_FTS_TOKENIZE`	`true`	Sudachi morpheme tokenization on write (`false` or `0` to disable; CJK recall will degrade)
`CHEST_RERANK_ENABLED`	`false`	Enable cross-encoder reranking after RRF fusion (`true` or `1` to enable)
`CHEST_RERANK_MODEL`	`onnx-community/bge-reranker-v2-m3-ONNX`	Reranker model ID (only used when `CHEST_RERANK_ENABLED=true`)
`CHEST_RERANK_TOP_N`	`20`	Number of candidates passed to the reranker (1–200)
`CHEST_RERANK_TIMEOUT_MS`	`5000`	Hard timeout in ms for reranker inference (100–30000); pre-rerank order used on timeout
`CHEST_METRICS_ENABLED`	`true`	Record local usage-efficiency metrics (recall hit / read-smart cache events). Set `0` or `false` to disable — backend only
`CHEST_HIT_SCORE_THRESHOLD`	`0.5`	Composite score at/above which a recall counts as a "confident hit" in `chest-memory-stats` (reporting only; never affects recall)
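
Several variables in the table are documented as clamped to a range. A loader along these lines would apply the defaults and ranges listed above; the function names and the fallback to the default on unparsable input are assumptions for illustration.

```typescript
// Hypothetical sketch of reading clamped numeric settings; not the project's code.
type Env = Record<string, string | undefined>;

function clampedInt(raw: string | undefined, fallback: number, min: number, max: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(parsed)));
}

function loadTuning(env: Env) {
  return {
    embedConcurrency: clampedInt(env["CHEST_EMBED_CONCURRENCY"], 1, 1, 64),
    embedBatchSize: clampedInt(env["CHEST_EMBED_BATCH_SIZE"], 16, 1, 256),
    rerankTopN: clampedInt(env["CHEST_RERANK_TOP_N"], 20, 1, 200),
    rerankTimeoutMs: clampedInt(env["CHEST_RERANK_TIMEOUT_MS"], 5000, 100, 30000),
  };
}

// Out-of-range values are pulled back into range; unset values use the defaults.
const tuning = loadTuning({ CHEST_EMBED_CONCURRENCY: "999", CHEST_RERANK_TIMEOUT_MS: "5" });
```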

Usage metrics

The backend records privacy-safe, append-only usage events in chest.db to help tune the system — recall hit rate and chest_read_smart cache rate. Recording is local-only (no network egress), single-writer, and fail-soft: a metric write can never alter or block a chest_recall / chest_read_smart result. Events store only non-sensitive aggregates — query length, a path hash, scores, counts, and byte sizes — never raw query text or file paths.

Report with:

chest-memory-stats            # human-readable, includes a "Usage metrics" block
chest-memory-stats --json     # machine-readable (usage_metrics object)
chest-memory-stats --since 7  # restrict usage metrics to the last 7 days

The two event tables carry an id + ts so a future, separate feature can ship a Loki/Prometheus exporter (for Grafana) without schema or call-site changes; those exporters are not built here. Disable recording with CHEST_METRICS_ENABLED=0.

Security notes

Token length: the REST backend requires CHEST_API_TOKEN to be at least 32 characters and refuses to start otherwise. openssl rand -hex 32 (64 chars) satisfies this.
Token on the command line: passing the token inline to claude mcp add (or any shell command) leaves it visible in /proc/<pid>/cmdline and your shell history on a shared machine. Prefer setting it via your shell's secret manager or an env file, and clear the relevant history entry afterward.
Network exposure: the Docker backend publishes on all interfaces by default and protects connections with the Bearer token only, in cleartext HTTP. Run it only on a trusted network, or restrict it with CHEST_BIND_HOST=127.0.0.1 and front it with a TLS-terminating reverse proxy of your choice.
File reads: chest_read_smart only reads files inside the MCP client's declared roots, and the file is always read in the MCP-server process where those roots exist — never on the backend. A chest_read_smart call POSTed directly to the REST backend is refused (no client roots), so a token holder cannot read arbitrary files on the backend host; only the diff-cache snapshot rows are forwarded to the backend, never a file path the backend would open.

Diagnostics & reliability

chest-index subcommands keep the embedding pipeline healthy:

chest-index up           # activation + decay + supersession + embed-cycle
chest-index status       # embedding status report
chest-index reembed      # re-index vectors after model change
chest-index migrate      # backfill content_tokenized for existing memories

All subcommands follow the same exit-code convention: 0 for success, 1 for warnings, 2 for at least one failure.

To inspect embed-pending rows directly, query the SQLite database inside the container:

docker exec chest-memory sqlite3 /data/chest.db \
  "SELECT COUNT(*) FROM memories WHERE embedding_status='pending';"

Claude Code integration

This section is for reference only. If you ran /plugin install chest-memory@chest-memory, the skill, hooks, and rules are already configured. Use this section to troubleshoot or understand the individual components.

The chest-memory plugin bundles the following components:

Skill: /chest-memory auto-classifies the recent conversation into realize vs learning and saves it with the rationale shown; /chest-memory status reports store health
Hooks: four hooks are registered when the plugin is installed
Rules: bundled rules tell the agent when to call chest_recall / chest_remember, applied automatically to every project

Stop — `chest-memory-sync`

Fires after every assistant turn ends. Skips when stop_hook_active is set to prevent recursive Stop chains.


stdin	`{ session_id, transcript_path, cwd, stop_hook_active, … }`
stdout	silent
action	Validates that `transcript_path` resolves (via `realpathSync`) inside `~/.claude/projects/`. The JSONL content is POSTed to the backend for server-side import.
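
The recursion guard can be sketched as a pure decision function over the raw stdin string. The field names follow the payload listed above; decideStop and StopDecision are hypothetical names, and the realpath check against ~/.claude/projects/ and the POST to the backend are deliberately left out.

```typescript
// Field names follow the stdin payload listed above; everything else is illustrative.
interface StopPayload {
  session_id: string;
  transcript_path: string;
  cwd: string;
  stop_hook_active?: boolean;
}

type StopDecision =
  | { action: "skip"; reason: string }
  | { action: "sync"; transcriptPath: string };

// Decides what the hook should do with one raw stdin string.
function decideStop(rawStdin: string): StopDecision {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawStdin);
  } catch {
    return { action: "skip", reason: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { action: "skip", reason: "payload is not an object" };
  }
  const payload = parsed as Partial<StopPayload>;
  // A Stop hook that triggers another Stop would loop, so bail out early.
  if (payload.stop_hook_active === true) {
    return { action: "skip", reason: "stop_hook_active" };
  }
  if (typeof payload.transcript_path !== "string" || payload.transcript_path === "") {
    return { action: "skip", reason: "missing transcript_path" };
  }
  return { action: "sync", transcriptPath: payload.transcript_path };
}
```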

PreCompact — `chest-memory-precompact`

Fires immediately before Claude Code compacts the context window (manual or automatic trigger). Never blocks compaction — exits 0 on any error.


stdin	`{ session_id, transcript_path, trigger: "manual"|"auto" }`
stdout	silent
action	Calls `saveSnapshot(session_id)` — UPSERTs a work-state summary (≤ 2 KB) into the `session_snapshots` table. The save is delegated to the backend via the REST API.
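
One way to keep a summary within a 2 KB budget is to truncate on code-point boundaries. The sketch below assumes the limit is counted in UTF-8 bytes, which this README does not state; SNAPSHOT_MAX_BYTES and clampSummary are hypothetical names and not part of saveSnapshot.

```typescript
// Assumption: the 2 KB budget is counted in UTF-8 bytes. The README does not say
// whether the real limit is bytes or characters.
const SNAPSHOT_MAX_BYTES = 2048; // hypothetical constant name

// Number of bytes one Unicode code point occupies in UTF-8.
function utf8Length(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Truncates on a code-point boundary so multi-byte characters are never split.
function clampSummary(summary: string, maxBytes: number = SNAPSHOT_MAX_BYTES): string {
  let used = 0;
  let out = "";
  for (const ch of summary) {
    const size = utf8Length(ch.codePointAt(0) ?? 0);
    if (used + size > maxBytes) break;
    used += size;
    out += ch;
  }
  return out;
}
```

Counting bytes instead of characters matters for CJK text, where each character typically takes three bytes.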

SessionStart — `chest-memory-session-start`

Fires at the very beginning of each session.


stdin	`{ session_id, source: "startup"|"resume"|"clear"|"compact" }`
stdout	`<session_knowledge>…</session_knowledge>` injected as `additionalContext`, or nothing
action	Outputs the saved snapshot only when `source` is `compact` or `resume`. Fresh startups (`startup`) and conversation clears (`clear`) produce no output so they start with a clean slate. The snapshot is fetched from the backend.
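
The source gating above reduces to a short function. This is a sketch under assumptions: sessionStartContext is a hypothetical name, and snapshot stands in for whatever the backend returned (null when nothing was saved).

```typescript
type SessionSource = "startup" | "resume" | "clear" | "compact";

// Returns the text to inject, or null when the session should start clean.
// `snapshot` is a stand-in for the backend response (null if none was saved).
function sessionStartContext(source: SessionSource, snapshot: string | null): string | null {
  // Only compaction and resume restore prior work state.
  const restores = source === "compact" || source === "resume";
  if (!restores || snapshot === null || snapshot.trim() === "") return null;
  return `<session_knowledge>${snapshot}</session_knowledge>`;
}
```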

UserPromptSubmit — `chest-memory-user-prompt-submit`

Fires whenever Claude Code receives a user prompt. Meaningful prompts are sent to the backend hook recall endpoint for bounded realize and learning memory summaries. Short acknowledgements such as ok, continue, はい, and 続けて are skipped.


stdin	`{ session_id, prompt, cwd, … }`
stdout	`<chest-recall>…</chest-recall>` injected as `additionalContext`, or nothing
action	Calls `POST /api/hooks/recall` with the same Bearer token as the other hooks. Recall runs with access tracking disabled, excludes archived/superseded memories, and frames all emitted content as untrusted data rather than instructions.
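
The acknowledgement filter might look like the sketch below. Only the four example phrases come from the text above; the full skip list, the normalization rules, and the name isMeaningfulPrompt are assumptions made for illustration.

```typescript
// The four examples come from the README; the real skip list is not documented,
// so treat this set and the normalization below as assumptions.
const ACKNOWLEDGEMENTS = new Set(["ok", "continue", "はい", "続けて"]);

function isMeaningfulPrompt(prompt: string): boolean {
  // Normalize case, surrounding whitespace, and trailing punctuation.
  const normalized = prompt.trim().toLowerCase().replace(/[.!。！]+$/, "");
  if (normalized === "") return false;
  return !ACKNOWLEDGEMENTS.has(normalized);
}
```

Skipping these prompts avoids a backend round trip for turns that carry no searchable content.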

All four hooks exit 0 on any error (fail-silent) and write to ~/.chest-memory/hook.log (rotated at 1 MB, owner-only 0600). Each hook command embeds CHEST_REMOTE_URL and CHEST_API_TOKEN (both required) so session data is forwarded to the Docker backend.
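
The fail-silent behavior can be illustrated with a synchronous wrapper. This is a simplified sketch, not the hooks' actual code: runFailSilent is a hypothetical name, log stands in for the writer behind ~/.chest-memory/hook.log, and the real hooks presumably perform asynchronous network calls.

```typescript
// Hypothetical wrapper; `log` stands in for the writer behind ~/.chest-memory/hook.log.
function runFailSilent(task: () => void, log: (line: string) => void): number {
  try {
    task();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    try {
      log(`hook error: ${message}`);
    } catch {
      // Even a failing logger must not change the exit code.
    }
  }
  return 0; // always report success so the host session is never blocked
}
```

A hook entry point would pass the returned value to process.exit, so a backend outage shows up in the log rather than in the user's session.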

Rules — `chest-memory-rules`

The plugin bundles rules that Claude Code loads at session start to tell the agent when to call chest_recall / chest_remember, how to classify layers, and what actions are prohibited. They apply to every project automatically.

Update (to pick up the latest version):

/plugin install chest-memory@chest-memory

Remove (to disable):

/plugin uninstall chest-memory@chest-memory

The /chest-memory skill provides equivalent guidance, so both can coexist without conflict.

Development

pnpm install
pnpm typecheck
pnpm test          # node:test against a throwaway SQLite db
pnpm build

From source (for development or self-hosted backend)

git clone https://github.com/siosig/mcp-chest-memory.git
cd mcp-chest-memory
pnpm install
pnpm build

# Deploy the Docker backend with Ansible:
cd ansible
ansible-playbook site.yml -i inventory/hosts.yml --tags docker --ask-vault-pass
# Install the client (skill, hooks, MCP server) via the plugin:
#   /plugin install chest-memory@chest-memory

Migration Guide (v1.5.0)

v1.5.0 changes the default embedding provider to Xenova/bge-m3 (1024-dim, multilingual) and adds a tokenized FTS column for better Japanese/CJK recall. Existing installations need two one-time migration steps.

1. Re-embed existing memories (new default: bge-m3)

The default provider changed from Xenova/multilingual-e5-small (384-dim) to Xenova/bge-m3 (1024-dim). Vectors from the old model are not searchable with the new one. Run:

chest-index reembed        # reset old vectors to pending and re-embed with bge-m3
chest-index up             # trigger embedding backfill

To keep the old provider, set CHEST_EMBED_MODEL=Xenova/multilingual-e5-small.

2. Backfill tokenized FTS column

chest-index migrate        # backup DB, add content_tokenized column, tokenize all rows

This creates a .bak.<timestamp> backup before touching the database. Pass --check for a dry-run (no writes). Pass --force to skip the backup step.

After migration, short Japanese terms (1–2 characters) that previously missed the trigram minimum are matched by morpheme tokens.

Optional: cross-encoder reranking

Set CHEST_RERANK_ENABLED=true to enable post-recall reranking with onnx-community/bge-reranker-v2-m3-ONNX. The reranker improves result ordering for multilingual queries. On timeout or model failure it degrades gracefully to the pre-rerank order.

Relevant env vars: CHEST_RERANK_MODEL, CHEST_RERANK_TOP_N (default 20), CHEST_RERANK_TIMEOUT_MS (default 5000).
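
The graceful degradation can be sketched with Promise.race. Only the 5000 ms default and the fallback to pre-rerank order come from the text above; the Reranker type and rerankOrFallback function are hypothetical and do not reflect the project's actual reranker interface.

```typescript
// Hypothetical types and names; only the 5000 ms default and the
// "fall back to pre-rerank order" behavior come from the README.
type Reranker<T> = (query: string, candidates: readonly T[]) => Promise<T[]>;

async function rerankOrFallback<T>(
  query: string,
  candidates: readonly T[],
  rerank: Reranker<T>,
  timeoutMs = 5000,
): Promise<T[]> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("rerank timeout")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the reranker result or the timeout.
    return await Promise.race([rerank(query, candidates), timeout]);
  } catch {
    // Timeout or model failure: keep the original (pre-rerank) order.
    return [...candidates];
  } finally {
    // Clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Note that this sketch does not cancel a slow reranker call; it only stops waiting for it.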

See specs/013-multilingual-recall-quality/quickstart.md for a full walkthrough including eval harness usage.

Security

chest-memory stores a durable, cross-project record of how you and your agents work. That store is valuable, so it is also worth protecting. This section describes the threat model the project designs against and the concrete measures in the code.

Threat model

Two principals can reach the tools, and neither is fully trusted:

The LLM agent itself. An agent reads third-party content (repositories, web pages, issues) and can be steered by prompt injection hidden in that content. So a tool call is not automatically a trustworthy request — it may be an attacker's request laundered through the model.
Any holder of the shared Bearer token. The REST backend authenticates with one shared token; anyone who has it can POST arbitrary tool payloads to the backend host.

The design goal is that neither a prompt-injected agent nor a token holder can read arbitrary host files, dump the whole memory store, or silently destroy or rewrite memories.

Principles

Fail closed. When the safe scope is unknown, deny. File reads with no declared roots return nothing rather than falling back to "read anything".
No deployment branches in tool logic. Profile differences flow only through the executor port (src/core/executor.ts); a tool behaves identically in every profile. Where a tool must refuse on the backend, that falls out of the input (no client roots) rather than an if (remote) branch.
Protect the irreplaceable. realize (pain lessons), pinned (importance >= 0.9), and goal memories are exempt from every automated and caller-driven removal path.
Root-cause over symptomatic fixes. Cross-cutting concerns live in one audited helper (LIKE-escaping, path confinement, atomic writes) instead of being re-implemented per call site.
Defense in depth. TLS terminates at an external proxy and the backend still verifies the token; content caps are enforced in both the schema and the handler.
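
As an illustration of the single audited helper idea, a LIKE-escaping helper could look like the sketch below. The names escapeLike and likeContains and the column names are hypothetical; this README only states that %, _ and \ are escaped and that queries carry an explicit ESCAPE clause.

```typescript
// Hypothetical helper; not the contents of the project's sql-escape module.
// One pass over the string prefixes each of \, % and _ with a backslash.
function escapeLike(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Builds a parameterized "contains" filter. The value is bound as a parameter,
// never concatenated into the SQL text. Column names here are illustrative.
function likeContains(
  column: "content" | "file_path",
  value: string,
): { sql: string; params: string[] } {
  return {
    sql: `${column} LIKE ? ESCAPE '\\'`,
    params: [`%${escapeLike(value)}%`],
  };
}
```

With this in place, a query of "%" becomes the bound pattern %\%%, which matches only rows that contain a literal percent sign.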

What the code does

Risk	Measure	Where
Arbitrary host file read via `chest_read_smart`	Reads are confined to the MCP client's declared roots, symlinks are resolved (`realpath`) before the check, and the same canonical path is used for `stat` and `read` (no check/use gap). The read always runs in the stdio server (where the roots exist); only the diff-cache snapshot rows reach the backend. Empty roots deny everything, so a `chest_read_smart` POSTed directly to the REST backend (no client roots) is still refused.	`src/mcp/roots.ts` (`confinePath`), `src/mcp/read-smart.ts`, `src/mcp/snapshot-store.ts`
Full-store disclosure via wildcard input	All user values interpolated into SQL `LIKE` are escaped (`%`, `_`, `\`) with an explicit `ESCAPE` clause, so `query: "%"` matches literally instead of every row.	`src/lib/db/sql-escape.ts`, `chest_recall`, `chest_recall_file`
Silent destruction of protected memory via `supersedes`	`supersedes` skips protected/pinned/goal targets and reports them; the low-level supersede guards the manual path too.	`chest_remember`, `src/lib/supersession.ts`
Mass-archival via an argument-less `chest_forget`	The sweep archives at most `CHEST_FORGET_SWEEP_CAP` (default 200) per call and reports `affected`/`remaining`; protected layers stay exempt.	`chest_forget`
Content-cap bypass via `chest_update_memory`	The same `MAX_CONTENT_CHARS` limit is enforced in the schema and the handler.	`chest_update_memory`
SQL injection	Every query binds values as parameters — no user string is concatenated into SQL. Simple CRUD uses the Prisma ORM (typed columns, no string-built clauses); the remaining raw SQL is reserved for SQLite-specific features (FTS5/`bm25`, vector ranking, claim-style updates) and still parameter-bound. The previous dynamic `SET`-clause builder was replaced by a typed ORM update.	repo-wide
Stored-memory prompt injection	Recall responses carry a notice that memory `content` is untrusted data, not instructions; the consolidation prompt wraps each memory in `<memory_data>` tags with a treat-as-data preamble.	`chest_recall`, `src/mcp/sampling.ts`
Settings corruption / secret leakage	`~/.claude/settings.json` is written atomically (temp file + rename) and owner-only (`0600`); hook logs are `0600`; the Stop-hook importer only accepts transcripts under `~/.claude/projects`.	`src/lib/fs-atomic.ts`, `src/bin/sync-session.ts`
Container/host compromise	The Docker image runs as the non-root `node` user over the bind mount; the maintenance lock lives in the user-owned data dir (not world-writable `/tmp`).	`docker/Dockerfile`, `src/cli/chest-index-flock.ts`
Weak auth / network exposure	The backend requires a Bearer token of at least 32 characters, compares it in constant time, binds the host configured by `CHEST_BIND_HOST`, and limits request bodies to 1 MB (50 MB for the session-ingestion endpoint).	`src/http/`
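
The token checks in the last row can be sketched with Node's built-in crypto module. This is one common approach and not necessarily what src/http/ does: assertTokenLength and tokensMatch are hypothetical names, and hashing both sides before the comparison is a choice made for this sketch.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

const MIN_TOKEN_LENGTH = 32; // matches the documented minimum

// Startup check: refuse to run with a missing or short token.
function assertTokenLength(token: string | undefined): string {
  if (token === undefined || token.length < MIN_TOKEN_LENGTH) {
    throw new Error(`CHEST_API_TOKEN must be at least ${MIN_TOKEN_LENGTH} characters`);
  }
  return token;
}

// Constant-time comparison. Hashing both sides first yields equal-length buffers,
// which timingSafeEqual requires, and avoids leaking the token length.
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```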

Residual risks (by design)

Cleartext HTTP: the token and memory content cross the network in the clear unless a TLS-terminating reverse proxy fronts the backend. Run it only on a trusted network, or front it with nginx / Caddy / Traefik with TLS.
The shared token grants full access: there is no per-client scoping. Treat it as a high-value secret.
Data markers reduce but do not eliminate stored prompt-injection risk; they are one layer, not a guarantee.

Found a vulnerability? Please open a private report rather than a public issue.

License

MIT
