DeusData · k6bdptd77n-arch · Jun 18, 2026 · Jun 19, 2026 · Jun 19, 2026
diff --git a/.claudeignore b/.claudeignore
@@ -0,0 +1,37 @@
+# autoclaude .claudeignore — keep bulky / low-signal paths out of context.
+# Copy to a project root as .claudeignore. Tune per project.
+
+# deps & builds
+node_modules/
+dist/
+build/
+.next/
+target/
+venv/
+.venv/
+__pycache__/
+*.pyc
+
+# lockfiles & generated
+package-lock.json
+yarn.lock
+pnpm-lock.yaml
+poetry.lock
+*.min.js
+*.map
+
+# logs, data, media (huge, low signal)
+*.log
+logs/
+*.csv
+*.parquet
+*.sqlite
+*.db
+coverage/
+.cache/
+
+# secrets (also: never read these)
+.env
+.env.*
+*.pem
+*.key
diff --git a/.github/workflows/fablize-ci.yml b/.github/workflows/fablize-ci.yml
@@ -0,0 +1,27 @@
+name: fablize-ci
+# Procedure-layer CI. Independent of the upstream C workflows (separate filename, scoped
+# paths) so it never clashes on `git pull upstream`.
+on:
+  push:
+    paths:
+      - 'fablize/**'
+      - '.github/workflows/fablize-ci.yml'
+  pull_request:
+    paths:
+      - 'fablize/**'
+      - '.github/workflows/fablize-ci.yml'
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.11", "3.13"]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: fablize test suite (stdlib only, no deps)
+        working-directory: fablize
+        run: python -m unittest discover -s tests -p 'test_*.py' -v
diff --git a/INTEGRATION.md b/INTEGRATION.md
@@ -0,0 +1,37 @@
+# How the two layers compose
+
+This project is one product made of two complementary layers:
+
+| Layer | Folder | Answers | Form |
+|-------|--------|---------|------|
+| **Memory** | `src/`, `internal/`, … (the C core) | *What is the code?* — definitions, callers, data flow, architecture | MCP server, 14 tools, SQLite graph |
+| **Procedure** | `fablize/` | *How do I work on it?* — clarify, complete, investigate, verify, escalate | stdlib Python + plain-text packs |
+
+The memory layer gives the agent a **map**; the procedure layer gives it a **method**. Neither
+replaces the other — a map without a method wanders, a method without a map crawls file by file.
+
+## Where the procedure calls the memory
+
+The fablize disciplines invoke the MCP tools at the exact points they help most:
+
+| Discipline (`fablize/packs/…`) | Calls these memory tools | Why |
+|---|---|---|
+| **orient-pack** | `index_repository`, `get_architecture`, `search_graph`, `get_code_snippet`, `trace_path` | Build the map before editing — know the seams and the blast radius. |
+| **clarify-pack** (step 0) | `get_architecture`, `search_graph`, `search_code` | Answer unknowns from the code before asking the user — cheaper than a question. |
+| **investigation-protocol** (steps 3–4) | `search_graph`, `trace_path` (data_flow), `get_code_snippet`, `query_graph`, `ingest_traces` | `trace_path` *is* the causal chain; `query_graph` exposes hot-path signals. |
+| **verification-grounding** | `detect_changes`, `trace_path` (inbound) | Confirm the structural effect of a change and catch a forgotten caller. |
+| **spec-lock decisions** (`spec.py`) | `manage_adr` (optional) | A locked architectural decision can be recorded as an ADR in the graph. |
+
+All of this is **prompt-level wiring** — plain text and tool calls. No C was modified; the C
+core stays byte-for-byte upstream, so `git pull upstream` merges cleanly. The procedure layer
+also degrades gracefully: if the memory tools are absent, every discipline still applies by
+reading files directly.
+
+## Design boundary (deliberate)
+
+fablize is **not** reimplemented as MCP tools inside the C server. Its engines stay as
+dependency-free Python the agent drives from a shell — the same shell every agent that
+codebase-memory-mcp configures already has. This keeps the procedure layer portable, testable
+in isolation (`fablize/tests/`), and independent of the C build.
+
+See `fablize/AGENTS.md` for the operating block and `fablize/README.md` for the layer's contents.
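
Review note on INTEGRATION.md: the wiring table and the graceful-degradation rule can be read as a lookup with a fallback. The sketch below is illustrative only. The tool names come from the table, but `WIRING`, `FALLBACK`, and `plan` are hypothetical names, and returning the usable subset when only some tools are present is an assumption, not documented behavior.

```python
# Discipline -> memory tools, transcribed from the table in INTEGRATION.md.
WIRING = {
    "orient-pack": ["index_repository", "get_architecture", "search_graph",
                    "get_code_snippet", "trace_path"],
    "clarify-pack": ["get_architecture", "search_graph", "search_code"],
    "investigation-protocol": ["search_graph", "trace_path", "get_code_snippet",
                               "query_graph", "ingest_traces"],
    "verification-grounding": ["detect_changes", "trace_path"],
    "spec-lock decisions": ["manage_adr"],
}

# Hypothetical label for the degraded path described in the text.
FALLBACK = "read files directly"


def plan(discipline, available_tools):
    """Return the memory tools a discipline can call, or the fallback.

    Assumption: a partially available tool set yields the usable subset.
    """
    usable = [tool for tool in WIRING[discipline] if tool in available_tools]
    return usable if usable else [FALLBACK]


if __name__ == "__main__":
    print(plan("verification-grounding", {"detect_changes", "trace_path"}))
    print(plan("clarify-pack", set()))  # no MCP server registered
```

Because the real wiring is prompt-level text rather than code, this only models the decision an agent is asked to make.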
diff --git a/NOTICE b/NOTICE
@@ -0,0 +1,19 @@
+codebase-memory-mcp + fablize
+==============================
+
+This distribution combines two independently MIT-licensed components.
+
+1. codebase-memory-mcp (the memory layer — C core, src/, internal/, vendored/, …)
+   Copyright (c) 2025 DeusData
+   Upstream: https://github.com/DeusData/codebase-memory-mcp
+   Licensed under the MIT License (see LICENSE).
+   The C core in this fork is unmodified upstream.
+
+2. fablize (the procedure layer — the fablize/ directory, plus INTEGRATION.md and
+   install-combined.sh at the repo root)
+   Copyright (c) 2025 fivetaku
+   Upstream: https://github.com/fivetaku/fablize
+   Licensed under the MIT License.
+
+Both components are distributed under the MIT License. See LICENSE for the full text.
+Each component retains its own copyright; this NOTICE documents their combination.
diff --git a/README.md b/README.md
@@ -18,6 +18,22 @@
 
 High-quality parsing through [tree-sitter](https://tree-sitter.github.io/tree-sitter/) AST analysis across all 158 languages, enhanced with [**Hybrid LSP** semantic type resolution](#hybrid-lsp) for Python, TypeScript / JavaScript / JSX / TSX, PHP, C#, Go, C, C++, Java, Kotlin, and Rust — producing a persistent knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links. 14 MCP tools. Zero dependencies. Plug and play across 11 coding agents.
 
+> ### 🧭 This distribution: codebase-memory-mcp **+ fablize**
+>
+> This fork pairs the memory engine with **[fablize](fablize/)** — a procedure layer that
+> makes an agent *work* well, not just *see* well. The memory layer answers **what the code
+> is**; fablize answers **how to work on it**: clarify before building, complete with
+> evidence, investigate systematically (using `trace_path` as the literal causal chain),
+> verify the structural effect of a change, and escalate honestly at the model's ceiling.
+> Two complementary layers, one install — see **[INTEGRATION.md](INTEGRATION.md)**.
+>
+> ```bash
+> bash install-combined.sh        # builds the engine, registers MCP, applies the disciplines
+> ```
+>
+> The C core below is **unmodified upstream** — fablize lives entirely in `fablize/` (pure
+> stdlib Python + plain-text packs), so updates from [DeusData/codebase-memory-mcp](https://github.com/DeusData/codebase-memory-mcp) merge cleanly.
+
 > **Research** — The design and benchmarks behind this project are described in the preprint [*Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP*](https://arxiv.org/abs/2603.27277) (arXiv:2603.27277). Evaluated across 31 real-world repositories: 83% answer quality, 10× fewer tokens, 2.1× fewer tool calls vs. file-by-file exploration.
 
 > **Security & Trust** — This tool reads your codebase and writes to your agent configuration files. That is what it is designed to do. If you prefer to audit before running, the [full source is here](https://github.com/DeusData/codebase-memory-mcp) — every release binary is signed, checksummed, and scanned by 70+ antivirus engines. All processing happens 100% locally; your code never leaves your machine. Found a security issue? We want to know — see [SECURITY.md](SECURITY.md). Security is Priority #1 for us.

diff --git a/fablize/.gitignore b/fablize/.gitignore
@@ -0,0 +1,4 @@
+__pycache__/
+*.pyc
+.fablize/
+dist/
diff --git a/fablize/AGENTS.md b/fablize/AGENTS.md
@@ -0,0 +1,123 @@
+# fablize — operating disciplines for any AI coding agent
+
+> This is the tool-agnostic version of fablize. `AGENTS.md` is read by Cursor, GitHub
+> Copilot, Gemini CLI, Aider, Codex, and other agents the same way `CLAUDE.md` is read by
+> Claude Code. Drop this file (and the `packs/` + `scripts/` it references) into a project
+> and any agent gains the same completion / verification / investigation discipline.
+>
+> Principle: a harness cannot raise a model's ceiling. It makes the model reach its *own*
+> ceiling by enforcing verification, completion, and investigation as procedure. When the
+> ceiling itself is the blocker (open-ended creative detail, self-driven discovery),
+> escalate — don't pretend.
+
+Apply only what the task signals — the smallest matching discipline. Overlap only when the
+task is genuinely multi-category. With no signal, just follow the baseline.
+
+## The two layers of this project
+
+This project pairs a **memory layer** with this **procedure layer**:
+
+- **Memory (codebase-memory-mcp)** — a structural knowledge graph of the code, exposed as MCP
+  tools: `get_architecture`, `search_graph`, `search_code`, `trace_path`, `query_graph`,
+  `get_code_snippet`, `detect_changes`, `ingest_traces`, `manage_adr`, `index_repository`, …
+  It answers *what the code is* — definitions, callers, data flow, architecture — in
+  sub-millisecond queries. Prefer `search_graph` / `search_code` / `trace_path` **instead of
+  grep/glob** for finding code, callers, dependencies, and impact.
+- **Procedure (fablize, below)** — answers *how to work*: clarify, complete with evidence,
+  investigate, verify, escalate.
+
+The disciplines below call the memory tools at the points where they help most (see
+`INTEGRATION.md`). When the memory tools are not present, the disciplines still apply — they
+degrade gracefully to reading files directly.
+
+## [always] Baseline
+
+- Lead with the outcome. Stay within the requested scope — no incidental refactors.
+- Ground every "done" claim in a command you actually ran this session (paste the result).
+- Confirm before destructive or hard-to-reverse actions.
+
+## [unfamiliar / multi-file change] Orient first
+
+Before editing code you have not read this session, build the map: follow
+`packs/orient-pack.txt` — `get_architecture` for the seams → `search_graph` to locate the
+symbols → `trace_path` (inbound) for the blast radius before changing a shared symbol.
+Skip for a self-contained edit in a file already in front of you.
+
+## [ambiguous / expensive build] Clarify first
+
+Before building something underspecified (open-ended, multi-file, design/UI, unstated
+scope), follow `packs/clarify-pack.txt`: surface the genuine unknowns → ask ONE batched
+round of 1–4 targeted questions → lock the agreed spec → then build against it. Persist it:
+
+```bash
+python3 scripts/spec.py lock --brief "<summary>" --req "<requirement>" \
+  --constraint "<constraint>" --decision "question::answer"
+python3 scripts/spec.py show     # run first when resuming an ambiguous build
+```
+
+Skip entirely if the request is already specific — asking on a clear task is its own waste.
+First resolve what you can from the code (`get_architecture` / `search_graph`) — a question
+the graph already answers is not a question for the user.
+
+## [2+ sequential stories] Multi-story loop with a verification gate
+
+Decompose into sequential stories, complete one at a time, produce evidence as you go.
+State persists in `./.fablize/` (resume across sessions with `status`).
+
+```bash
+python3 scripts/goals.py create --brief "<summary>" \
+  --goal "title::verifiable objective" --goal "..."   # the LAST goal must be a verification story
+python3 scripts/goals.py next                          # activate the next story + handoff
+# ...work that story only...
+python3 scripts/goals.py checkpoint --id G001 --status complete --evidence "<concrete evidence>"
+# final story is a gate: --verify-cmd "<command>" --verify-evidence "<result>" are required
+python3 scripts/goals.py retry --id G001               # reopen a blocked story for another attempt
+python3 scripts/goals.py status                        # run first when resuming
+```
+
+Rules: `complete` requires non-empty evidence; the final goal cannot complete without a
+verify command + its result. A story that is `blocked` twice trips the escalation gate
+(see below) — bounded self-correction, never an infinite retry loop.
+
+## [debugging / test failure / unknown cause / review] Investigation protocol
+
+Follow `packs/investigation-protocol.txt`: reproduce first → form 3+ competing hypotheses →
+gather evidence per hypothesis → trace the full causal chain (removing the symptom is not
+removing the defect) → verify before and after → report the hypotheses you rejected.
+The memory tools make this concrete: `trace_path` (mode:"data_flow") *is* the causal chain;
+`query_graph` exposes hot-path signals for performance defects; `ingest_traces` folds a
+reproduction back into the graph.
+
+## [render / executable artifact: HTML, SVG, game, UI, chart] Verification grounding
+
+Follow `packs/verification-grounding-pack.txt`: run it in the real renderer → observe the
+actual output → fix what the observation reveals → re-run. A static parse confirms
+well-formed, not correct. For a code change, the analogue is `detect_changes` + `trace_path`
+(inbound) to confirm the structural effect and catch a caller you forgot.
+
+## [at the capability ceiling] Escalate
+
+Signals: stuck on the same problem 2+ times (the goals engine trips this automatically),
+open-ended creation where detail itself is the value, deep review needing out-of-spec
+discovery. These are capability, not procedure. In order: (1) raise the model's thinking
+budget / reasoning effort to its maximum; (2) hand off to a stronger model in a fresh
+session with an evidence package (symptoms, attempts, failure point, repro); (3) otherwise
+report the limit honestly and name where a human must step in.
+
+## Observability
+
+The engines log every event to `~/.fablize/events.jsonl`. Summarize real usage with:
+
+```bash
+python3 scripts/metrics.py            # completion rate, escalations, specs locked
+```
+
+---
+
+The `scripts/` are pure-Python stdlib (no dependencies) — any agent with a shell can run
+them. The `packs/` are plain text — any agent can read them. That is what makes these
+disciplines portable across tools.
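
Review note on the multi-story rules in AGENTS.md: the three gate rules (non-empty evidence for `complete`, a verify command plus result for the final story, escalation after a second `blocked`) can be sketched as below. This is a minimal model under stated assumptions, not the code in `scripts/goals.py`. `Story`, `checkpoint`, and the outcome strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Story:
    """Hypothetical in-memory story record; not the schema goals.py persists."""
    id: str
    status: str = "pending"
    evidence: str = ""
    blocked_count: int = 0


def checkpoint(stories: List[Story], story_id: str, status: str,
               evidence: str = "", verify_cmd: str = "",
               verify_evidence: str = "") -> Tuple[str, str]:
    """Apply one checkpoint and return (outcome, reason)."""
    story = next((s for s in stories if s.id == story_id), None)
    if story is None:
        return ("rejected", f"unknown story id: {story_id}")
    if status == "complete":
        # Rule 1: completion needs non-empty evidence.
        if not evidence.strip():
            return ("rejected", "complete requires non-empty evidence")
        # Rule 2: the last story is the verification gate.
        is_final = story is stories[-1]
        if is_final and not (verify_cmd.strip() and verify_evidence.strip()):
            return ("rejected", "final story needs a verify command and its result")
        story.status, story.evidence = "complete", evidence
        return ("complete", "")
    if status == "blocked":
        # Rule 3: a second block trips the escalation gate.
        story.blocked_count += 1
        story.status = "blocked"
        if story.blocked_count >= 2:
            return ("escalate", "blocked twice: escalation gate tripped")
        return ("blocked", "one retry remains")
    return ("rejected", f"unsupported status: {status}")


if __name__ == "__main__":
    stories = [Story("G001"), Story("G002")]  # G002 plays the verification story
    print(checkpoint(stories, "G001", "complete"))
    print(checkpoint(stories, "G002", "complete", evidence="suite passed"))
```

Returning an outcome tuple instead of raising keeps the sketch easy to assert against; the real engine may report violations differently.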
diff --git a/fablize/README.md b/fablize/README.md
@@ -0,0 +1,36 @@
+# fablize — the discipline layer
+
+This folder is the **procedure layer** of this project. While the C core
+(`codebase-memory-mcp`) gives an agent a *map* of the code, fablize gives it a
+*method* of working: clarify before building, complete with evidence, investigate
+systematically, verify what was rendered, and escalate honestly at the capability ceiling.
+
+It is self-contained and dependency-free (pure-Python stdlib + plain-text packs), so it
+works with **any** agent that has a shell — exactly the agents `codebase-memory-mcp`
+already configures.
+
+## Contents
+
+| Path | What |
+|------|------|
+| `AGENTS.md` | the operating block, wired to this project's MCP tools |
+| `packs/` | the verified discipline packs (orient, clarify, investigation, verification grounding) |
+| `scripts/goals.py` | multi-story loop with an evidence/verification gate + bounded self-correction |
+| `scripts/spec.py` | locked-spec store so a clarified spec survives compaction/restart |
+| `scripts/metrics.py` | observability over `~/.fablize/events.jsonl` |
+| `scripts/bundle.py` | build a portable, tool-agnostic bundle of the disciplines |
+| `hooks/destructive_guard.py` | PreToolUse guard that asks before hard-to-reverse commands |
+| `tests/` | stdlib unittest suite (no deps) |
+
+## Run the tests
+
+```bash
+python3 -m unittest discover -s tests -v
+```
+
+## How it composes with the memory layer
+
+See [`../INTEGRATION.md`](../INTEGRATION.md) for how the disciplines call the MCP tools
+(`get_architecture`, `search_graph`, `trace_path`, `detect_changes`, …).
+
+MIT licensed.
diff --git a/fablize/hooks/destructive_guard.py b/fablize/hooks/destructive_guard.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python3
+"""fablize destructive-action guard — a deterministic PreToolUse hook.
+
+The "confirm before destructive or hard-to-reverse actions" rule lives in the operating
+block as text, which a model can skip. This hook makes it a *preventive control*: it
+inspects Bash commands before they run and forces a human approval prompt for the
+genuinely dangerous, hard-to-reverse ones (recursive force-delete, force-push, history
+rewrite, disk wipe, destructive SQL, etc.).
+
+Protocol: reads the PreToolUse payload on stdin, emits a permission decision on stdout.
+  - "ask"  → Claude Code prompts the user to approve before running (default for matches).
+  - silent → exit 0 with no output lets the command proceed normally.
+It never hard-blocks (deny) — the user stays in control; it only inserts a checkpoint.
+"""
+import json
+import re
+import sys
+
+# (pattern, human reason) pairs, compiled below. Order only decides which reason is reported: the first match wins.
+RULES = [
+    (r"\brm\s+(?:-[a-zA-Z]+\s+)*-[a-zA-Z]*f", "recursive/forced file deletion (rm -rf)"),
+    (r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*\s+(/|~|\$HOME|\.)\s*$", "recursive delete of a top-level path"),
+    (r"\bgit\s+push\b.*\s(--force\b|-[a-zA-Z]*f[a-zA-Z]*\b)", "git force-push (rewrites remote history)"),
+    (r"\bgit\s+(reset\s+--hard|clean\s+-[a-zA-Z]*f[a-zA-Z]*|filter-branch|filter-repo)\b", "git history/working-tree destruction"),
+    (r"\bgit\s+branch\s+(?-i:-D)\b", "force-delete of a git branch"),
+    (r"\b(drop|truncate)\s+(table|database|schema)\b", "destructive SQL (DROP/TRUNCATE)"),
+    (r"\b(mkfs|shred|wipefs)\b|\bdd\s+if=", "disk/partition wipe"),
+    (r"\b(kubectl|helm)\s+delete\b", "Kubernetes resource deletion"),
+    (r"\b(terraform|tofu)\s+destroy\b", "infrastructure teardown (terraform destroy)"),
+    (r":\(\)\s*\{\s*:\|:&\s*\}", "fork bomb"),
+    (r"\bchmod\s+-R\b|\bchown\s+-R\b", "recursive permission/ownership change"),
+    (r">\s*/dev/sd[a-z]", "raw write to a block device"),
+]
+COMPILED = [(re.compile(p, re.I), why) for p, why in RULES]
+
+
+def match(command):
+    for rx, why in COMPILED:
+        if rx.search(command):
+            return why
+    return None
+
+
+def main():
+    try:
+        payload = json.load(sys.stdin)
+    except (ValueError, OSError):
+        sys.exit(0)  # malformed input — do not interfere
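+    if not isinstance(payload, dict) or payload.get("tool_name") != "Bash":
+        sys.exit(0)  # not a Bash call, or not a JSON object: do not interfere
+    command = (payload.get("tool_input") or {}).get("command", "")
+    if not isinstance(command, str) or not command: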
+        sys.exit(0)
+    why = match(command)
+    if not why:
+        sys.exit(0)
+    out = {
+        "hookSpecificOutput": {
+            "hookEventName": "PreToolUse",
+            "permissionDecision": "ask",
+            "permissionDecisionReason": f"fablize guard: {why}. Confirm this hard-to-reverse action before running.",
+        }
+    }
+    print(json.dumps(out))
+    sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
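
Review note on the hook protocol: the stdin/stdout contract described in the docstring can be exercised without a shell by factoring the decision into a pure function. The sketch below is illustrative and does not import the hook. `DEMO_RULES` and `decide` are hypothetical, and the rule list is deliberately reduced; the payload and output field names are the ones used in `destructive_guard.py`.

```python
import json
import re

# Hypothetical, reduced rule list for illustration; the hook's RULES is longer.
DEMO_RULES = [
    (re.compile(r"\bgit\s+reset\s+--hard\b", re.I), "git hard reset"),
    (re.compile(r"\b(drop|truncate)\s+(table|database|schema)\b", re.I),
     "destructive SQL (DROP/TRUNCATE)"),
]


def decide(payload):
    """Return the JSON string the hook would print, or None to stay silent."""
    if not isinstance(payload, dict) or payload.get("tool_name") != "Bash":
        return None  # other tools pass through untouched
    command = (payload.get("tool_input") or {}).get("command", "")
    if not isinstance(command, str) or not command:
        return None
    for rx, why in DEMO_RULES:
        if rx.search(command):
            return json.dumps({
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "ask",
                    "permissionDecisionReason": f"demo guard: {why}.",
                }
            })
    return None  # no rule matched: the command proceeds normally


if __name__ == "__main__":
    risky = {"tool_name": "Bash", "tool_input": {"command": "git reset --hard HEAD~3"}}
    safe = {"tool_name": "Bash", "tool_input": {"command": "git status"}}
    print(decide(risky))
    print(decide(safe))
```

A pure `decide` function like this would also let the `tests/` suite cover each pattern with positive and negative cases.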