AXME Code is an MCP server plugin for Claude Code. Its purpose is to accumulate and serve project knowledge across sessions. Without it, every new Claude Code session starts from zero - the agent knows nothing about the stack, decisions, or rules. With it, the agent receives full context at session start: "this project uses Python + FastAPI, here are 60 architectural decisions, here are the safety rules, here is what happened in the last session".
+---------------------------------------------+
| MCP Server (persistent, stdio) | <- lives as long as VS Code window is open
| 19 tools: axme_context, axme_decisions, |
| axme_save_memory, axme_oracle, ... |
+---------------------------------------------+
| Hooks (pre-tool-use, post-tool-use, | <- fire on EVERY Claude Code tool call,
| session-end) | fresh process each time
+---------------------------------------------+
| Detached Auditor (post-session) | <- LLM analysis of transcript after
| axme-code audit-session | window close
+---------------------------------------------+
Launches as a stdio process when VS Code opens. Lives for the entire window lifetime. Provides Claude Code with tools:
| Tool | Purpose |
|---|---|
axme_context |
Compact meta (safety, handoff, worklog) + instructions to load full KB in parallel |
axme_oracle |
Stack + structure + patterns + glossary |
axme_decisions |
List active decisions with enforce levels |
axme_memories |
All memories (feedback + patterns) |
axme_save_decision |
Agent saves a new architectural decision (with slug-based dedup) |
axme_save_memory |
Agent saves feedback/pattern (mistakes, approaches) |
axme_safety |
Current safety rules (git, bash, filesystem) |
axme_update_safety |
Add a new safety rule |
axme_backlog |
List or read backlog items (persistent cross-session task tracking) |
axme_backlog_add |
Add a new backlog item with priority and tags |
axme_backlog_update |
Update backlog item status, priority, or append notes |
axme_status |
Project status (sessions, decisions count, etc.) |
axme_worklog |
Event log (session starts, audit results) |
axme_workspace |
List all repos in workspace |
axme_begin_close |
Start session close: returns extraction checklist for the agent |
axme_finalize_close |
Finalize close: writes handoff, worklog, extractions, sets agentClosed flag |
axme_ask_question |
Record a question only the user can answer |
axme_list_open_questions |
List open questions from previous sessions |
axme_answer_question |
Record the user's answer to an open question |
Key principle: the MCP server is purely deterministic - no LLM inside. It acts as a database with a tool API. The agent asks, the server answers from .axme-code/ storage. The agent learns something new, the server writes it to storage.
Storage consistency: all writes to .axme-code/ go through MCP server code (atomicWrite, appendFileSync), never directly by the LLM. This guarantees correct file formats, proper append-to-end (not beginning), valid JSON in meta.json, and consistent YAML in rules.yaml. The agent provides data, the MCP server writes it correctly.
Fired by Claude Code automatically on every tool call (all tools, including MCP tools - no matcher restriction). Each invocation spawns a fresh node dist/cli.mjs hook <name> process.
- Hard safety enforcement: checks
checkGit(push to main? force push?),checkBash(rm -rf /? npm publish?),checkFilePath(/etc/passwd? .env?) - If violation detected: returns
"deny"and Claude Code blocks the tool call before execution - This is not prompt-based - the hook intercepts the tool call at the Claude Code harness level, before any command runs. The LLM cannot bypass, ignore, or override this block. Even if the agent's prompt is jailbroken or the LLM hallucinates a reason to run a denied command, the hook physically prevents execution. This gives 100% enforcement reliability for safety rules.
- If OK: silently passes
- Also calls
ensureAxmeSessionForClaude- lazily creates an AXME session on the first hook call of the window
- Only fires for
Edit | Write | NotebookEdit - Records the changed file path into
session.filesChanged[] - This is the only real-time mechanism for tracking what the agent modified (Bash mutations are supplemented later during audit via transcript parsing)
- Sets
closedAton the session meta - Runs
runSessionCleanupwhich spawns a detached audit worker
A separate process (axme-code audit-session --workspace X --session Y), spawned via child_process.spawn({ detached: true }) + child.unref(). Has its own process group (setsid), so it survives VS Code close, MCP server kill, Claude Code kill.
The auditor operates in one of two modes based on the agentClosed flag in session meta:
Full extraction mode (agentClosed=false - crash/orphan sessions):
The agent didn't complete the close checklist. The auditor does full extraction using AUDIT_PROMPT.
Verify-only mode (agentClosed=true - agent closed the session):
The agent already extracted memories/decisions/safety via axme_begin_close + axme_finalize_close. The auditor uses VERIFY_ONLY_AUDIT_PROMPT - a lighter prompt that only catches items the agent missed. Most sections will be empty.
- Reads the Claude Code session transcript (
.jsonlfile, can be 10-20MB) - Uses resume offset - if part was already audited (after /compact), reads only new bytes
- Renders transcript into XML format (not chat markers - otherwise the LLM confuses it with conversation continuation)
- Sends to LLM with extraction prompt (full or verify-only):
- Memories (feedback, patterns) - what the agent learned, what went wrong
- Decisions - architectural decisions that were made
- Safety rules - new restrictions (if discussed)
- Handoff - where work stopped, what's next, blockers (skipped in verify-only if agent wrote it)
- Oracle changes - whether stack/structure rescan is needed
- Session summary - narrative for worklog.md (always generated)
- For each extraction, performs Grep dedup - checks if it already exists in storage
- Saves new/updated items to
.axme-code/via scope routing (workspace-level or per-repo) - Does NOT overwrite agent handoff (if
source=agentexists, auditor skips handoff write) - Writes audit log with full telemetry (cost, tokens, resume offsets, extractions)
- Sets
auditStatus=doneon session meta
Cost: ~$0.05-0.15 per session (full mode), significantly less in verify-only mode.
There are two writers, at different times:
When the agent makes a decision or learns something important during work, it calls an MCP tool directly:
axme_save_decision- "we decided to use Valkey instead of Redis"axme_save_memory- "this bug was caused by sync httpx in async handler"axme_update_safety- "add npm publish to denied"
The MCP server accepts the call and writes a file to disk. No LLM involved.
Reads the full transcript and extracts what the agent forgot to save. The agent may have discussed an important decision with the user but never called axme_save_decision. The auditor catches this.
Three levels of instruction reach the agent:
-
CLAUDE.md (in workspace root, auto-loaded by Claude Code):
### During Work - Save memories/decisions/safety rules immediately when discovered -
MCP server
instructions(injected into system prompt on connect):Save memories, decisions, and safety rules immediately when discovered during work. -
Tool descriptions (visible in the agent's tool list):
axme_save_decision- "Save a new architectural decision. Use enforce='required' for rules that must be followed..."axme_save_memory- "Save a feedback or pattern memory. Use 'feedback' for learned mistakes..."
When the user asks to close the session:
1. Agent calls axme_begin_close
+-- MCP returns extraction checklist (what to extract, scope rules, dedup instructions)
2. Agent reviews the session
+-- Compares candidates against axme_context data (already loaded)
+-- For each item: checks for duplicates, contradictions, outdated entries
+-- Saves memories/decisions/safety via existing tools (axme_save_*)
+-- Prepares handoff, worklog entry, startup text
3. Agent calls axme_finalize_close (single call with everything)
+-- MCP writes: handoff.md (atomicWrite), worklog.md (append), meta.json (agentClosed=true)
+-- MCP executes: add/remove/supersede for memories, decisions, safety rules
+-- MCP returns: storage summary (what was saved where)
4. Agent outputs to user: storage summary, then startup text
During session: At session close: After close (background):
Agent works Agent calls begin_close Auditor (LLM) reads transcript
| | |
Learns something Gets checklist + dedup Verify-only mode (agentClosed=true)
| | or full extraction (crash/orphan)
v v |
Calls axme_save_* Reviews session, extracts v
| Calls finalize_close Catches what agent MISSED
v | |
MCP server writes v v
to .axme-code/ MCP writes all files .axme-code/ supplemented
atomically
The agent has full conversation context and produces higher-quality extractions than the auditor (which only sees the transcript, possibly truncated). The auditor is a safety net for:
- Crash/orphan sessions where the agent didn't complete the close checklist
- Items the agent missed even during explicit close
1. VS Code window opens
+-- MCP server starts (axme-code serve)
+-- Orphan scan: checks old sessions with dead pid -> spawns audit workers
+-- Stale mapping adoption: if old mapping has dead ownerPpid, adopt it
2. Agent starts working
+-- First tool call -> pre-tool-use hook -> ensureAxmeSessionForClaude
-> creates AXME session in .axme-code/sessions/<id>/meta.json
-> refreshes ownerPpid on reuse (handles VS Code reload)
-> writes session_start to worklog
3. Agent calls axme_context
+-- MCP server returns oracle + decisions + safety + memory + plans + handoff
4. Every tool call
+-- pre-tool-use: safety check (block/allow) - fires on ALL tools (no matcher)
+-- post-tool-use: track filesChanged (Edit/Write only)
5. Agent saves a decision/memory during work
+-- axme_save_decision / axme_save_memory -> written to .axme-code/
6. User asks to close session
+-- Agent calls axme_begin_close -> gets extraction checklist
+-- Agent extracts memories/decisions/safety (with dedup against axme_context data)
+-- Agent calls axme_finalize_close with all data
+-- MCP writes: handoff.md, worklog.md, extractions, agentClosed=true
+-- Agent outputs: storage summary + startup text to user
7. VS Code window closes (or stdin EOF)
+-- cleanupAndExit -> spawn detached audit worker per owned session
+-- Worker PID lives independently, VS Code is already dead
8. Detached auditor (20-60 sec)
+-- agentClosed=true: verify-only mode (catch missed items only)
+-- agentClosed=false: full extraction (crash/orphan)
+-- Does NOT overwrite agent handoff (source=agent is authoritative)
+-- Writes audit log, saves offset
+-- auditStatus = done
9. Next session
+-- axme_context picks up everything accumulated
Every git commit and git push command must end with a metadata suffix:
git commit -m "fix bug" #!axme pr=42 repo=AxmeAI/axme-code
git push origin feat/foo #!axme pr=none repo=AxmeAI/axme-code
The pre-tool-use hook:
- Checks for the
#!axmesuffix — blocks the command if missing (returns format instruction) - Parses
pr=<number>and verifies viagh pr viewthat the PR is not already merged - If the PR is merged, blocks the command (prevents committing to stale branches)
This replaces all prior cwd/branch detection. The agent explicitly provides repo and PR number, eliminating cwd bugs, network inference, and fail-open errors.
Each repo has its own backlog in .axme-code/backlog/. Items persist across sessions (unlike TODOs which are session-scoped).
Tools: axme_backlog (list/read), axme_backlog_add (create), axme_backlog_update (status/priority/notes).
Items have: ID (B-001..B-NNN), title, status (open/in-progress/done/blocked), priority (high/medium/low), tags, notes, timestamps.
The auditor may find ambiguities during transcript analysis. Instead of guessing, it records a question in .axme-code/open-questions.md. The next session's axme_context surfaces these questions to the agent, which presents them to the user.
Lifecycle: [open] → [answered] (user responds via axme_answer_question) → [applied] (action taken) → [archived].
Tools: axme_ask_question, axme_list_open_questions, axme_answer_question.
axme_context returns a compact overview (~10-15K chars): storage root header, safety rules, handoff, backlog summary, open questions, and instructions to call three tools in parallel:
axme_oracle— stack, structure, patterns, glossaryaxme_decisions— all decisions with enforce levelsaxme_memories— feedback and patterns
Each sub-call returns ~15-25K chars (fits tool output limits). The server tracks which context paths were already delivered in the session and avoids duplicating workspace-level data when repo-level calls follow.
413 tests across 88 suites. Run with Node.js built-in test runner:
npm test # node --test --experimental-strip-types test/*.test.tsCoverage includes: storage engine, sessions, safety hooks, decisions, memory, oracle detection, transcript parser, workspace merge, plans, worklog, backlog, questions, config, presets.
.axme-code/
|-- oracle/ # stack.md, structure.md, patterns.md, glossary.md
|-- decisions/ # D-001-*.md ... D-NNN-*.md (YAML frontmatter)
|-- memory/
| |-- feedback/ # Mistakes and corrections
| +-- patterns/ # Successful approaches
|-- safety/
| +-- rules.yaml # git + bash + filesystem rules
|-- backlog/ # B-001-*.md ... persistent cross-session task tracking
|-- sessions/ # Per-session meta.json
|-- active-sessions/ # Claude session -> AXME session mapping
|-- audited-offsets/ # Resume byte offsets per transcript
|-- audit-logs/ # Per-audit telemetry JSON
|-- audit-worker-logs/ # Worker stderr output
|-- plans/
| |-- handoff-<id>.md # Per-session handoff (last 5 kept, Source: agent or auditor)
| +-- <id>-<slug>.md # Active plans with steps
|-- open-questions.md # Inter-session questions (auditor -> user -> agent)
|-- worklog.jsonl # Append-only structured event log
|-- worklog.md # Narrative session summaries (written by finalize_close + auditor)
+-- config.yaml # Model settings, presets, review config
Each repo in a workspace has its own .axme-code/ with separate decisions, oracle, and safety rules. The workspace-level .axme-code/ stores cross-repo items and sessions.
Repo mode vs workspace mode: When the MCP server cwd has .git/ (is a git repo), it operates in repo mode — no parent workspace auto-detection, all storage goes to repo .axme-code/. Workspace mode only activates when cwd IS the workspace root (no .git/, has workspace markers).
A workspace can contain 50+ repos. Some knowledge is universal ("never push to main", "all SDKs release together"), some is repo-specific ("this repo uses Go 1.24", "httpx.AsyncClient is mandatory in gateway handlers"). Storing everything in one place either pollutes repo-specific context with irrelevant rules or loses cross-repo knowledge.
axme-workspace/ <- WORKSPACE ROOT
|-- .axme-code/ <- WORKSPACE-LEVEL storage
| |-- decisions/ (75 decisions) cross-repo rules
| |-- memory/ (feedback + patterns) universal lessons
| |-- safety/ (rules.yaml) workspace-wide safety
| |-- oracle/ (stack.md, ...) workspace overview
| |-- sessions/ (session tracking) ALL sessions live here
| +-- worklog.jsonl ALL events live here
|
|-- axme-control-plane/
| +-- .axme-code/ <- REPO-LEVEL storage
| |-- decisions/ (60 decisions) repo-specific rules
| |-- memory/ repo-specific lessons
| |-- safety/ (rules.yaml) repo-specific safety
| +-- oracle/ (stack.md, ...) repo tech stack
|
|-- axme-cli/
| +-- .axme-code/ <- REPO-LEVEL storage
| |-- decisions/ (46 decisions)
| |-- ...
|
|-- axme-sdk-python/
| +-- .axme-code/ <- REPO-LEVEL storage
| +-- ...
|
+-- ... (56 repos total, each with .axme-code/)
| Level | What gets stored | Example |
|---|---|---|
| Workspace | Cross-repo rules, universal conventions, session tracking, worklog, audit logs, plans, handoff | "All SDKs release together on same version", "Never merge PRs as agent", "Protected branches: main" |
| Repo | Repo-specific tech stack, coding patterns, architecture decisions, repo-specific safety rules | "Python SDK uses httpx (sync only)", "Go CLI uses Cobra", "axme-control-plane: AsyncClient mandatory in async handlers" |
Every item (decision, memory, safety rule) has an optional scope field that controls where it gets stored. Routing happens identically for saves during a session (via MCP tools) and saves after a session (via the auditor).
scope: undefined / [] / ["all"]
-> writes to workspace root .axme-code/
(discoverable by all repos via merged context)
scope: ["axme-control-plane"]
-> writes to axme-control-plane/.axme-code/
(only visible when working in that repo)
scope: ["axme-cli", "axme-sdk-go"]
-> writes to BOTH repos' .axme-code/
(visible in either repo, not in others)
The routing logic in code (saveScopedDecisions, saveScopedMemories, saveScopedSafetyRule):
if (isAllScope) {
// scope=all -> write to session origin (workspace root in workspace sessions)
save(projectPath, item);
} else {
// scope=["repo-a", "repo-b"] -> write to each listed repo
for (const repoName of scope) {
const repoPath = join(workspacePath, repoName);
save(repoPath, item);
crossProject++;
}
}D-NNN IDs are generated independently per storage location. Workspace root and each repo have their own sequence: workspace D-075, axme-control-plane D-060, axme-cli D-046 - these are all independent counters.
When the agent calls axme_context, both levels are read and merged into a single unified view. The merge strategy differs by data type:
Decisions - concatenate, project wins on ID conflict:
workspace decisions: D-001..D-075 (universal rules)
+
repo decisions: D-001..D-060 (repo-specific)
=
agent sees: 135 decisions total
If workspace has D-042 and repo has D-042 (same ID, different content), the repo version wins. In practice this doesn't happen because IDs are generated independently and slugs differ.
Safety rules - union merge, strictest wins:
workspace rules.yaml: repo rules.yaml:
protectedBranches: [main, master] protectedBranches: [main, develop]
deniedPrefixes: [rm -rf /, ...] deniedPrefixes: [docker push, ...]
allowForcePush: false allowForcePush: false
merged result:
protectedBranches: [main, master, develop] <- union
deniedPrefixes: [rm -rf /, ..., docker push] <- union
allowForcePush: false <- AND (both must allow)
requirePrForMain: true <- OR (either can require)
Principle: deny lists union (a deny at any level wins), boolean allow flags AND (both levels must allow), boolean require flags OR (either level can require).
Memories - concatenate, deduplicate by slug (repo wins):
workspace memories: universal feedback + patterns
+
repo memories: repo-specific feedback
=
agent sees: combined set, repo version wins on slug collision
Oracle - both levels returned, labeled separately:
Workspace oracle: overall workspace structure, project list
Project oracle: repo-specific stack, patterns, glossary
The agent sees both sections in axme_context output, clearly labeled as "Workspace Context" and "Project Context".
Before working in any specific repo, the agent must call axme_context with that repo's path. This loads the repo-level context on top of the workspace context. Without this call, the agent only sees workspace-level rules and misses repo-specific decisions and patterns.
This is enforced by instruction in CLAUDE.md:
### Per-Repo Gate (MANDATORY)
Every repo has its own .axme-code/ storage created during setup.
BEFORE reading code, making changes, or running tests in any repo:
call axme_context with that repo's path to load repo-specific context.
The post-session auditor decides scope for each extraction based on the transcript content. If the agent discussed a Python-specific pattern while working in axme-control-plane, the auditor routes it to that repo. If the agent discussed a universal rule ("never merge PRs"), the auditor routes it to workspace level.
The auditor also uses filesChanged from the session to infer which repos were touched, and routes extracted items accordingly. A memory about a bug in axme-cli/cmd/axme/tasks.go goes to axme-cli/.axme-code/, not to the workspace root.
With 56 repos and 2444+ decisions, a flat storage would be unusable:
- The agent would see all 2444 decisions on every
axme_contextcall, most irrelevant - Safety rules for a Go CLI repo would include Python-specific rules from the gateway
- Oracle would mix TypeScript patterns with Java conventions
Two levels give the agent focused context: only the universal rules plus the repo it's actually working in.