jarmen423 · Apr 8, 2026 · Apr 3, 2026 · Copilot
diff --git a/README.md b/README.md
@@ -75,8 +75,8 @@ codememory serve
 codememory search "where is the auth logic?"
 
 # Git graph (rollout build)
-codememory git-init --repo /absolute/path/to/repo --mode local --full-history
-codememory git-sync --repo /absolute/path/to/repo --incremental
+codememory git-init --repo /absolute/path/to/repo
+codememory git-sync --repo /absolute/path/to/repo --full
 codememory git-status --repo /absolute/path/to/repo --json
 ```
 
@@ -136,6 +136,14 @@ Full workflow and options: [docs/TOOL_USE_ANNOTATION.md](docs/TOOL_USE_ANNOTATIO
 | `get_file_dependencies(file_path, domain="code")` | Returns imports and dependents for a file |
 | `identify_impact(file_path, max_depth=3, domain="code")` | Blast radius analysis for changes |
 | `get_file_info(file_path, domain="code")` | File structure overview (classes, functions) |
+| `create_memory_entities(entities)` | Create or update agent-authored memory nodes in Neo4j |
+| `create_memory_relations(relations)` | Create typed relationships between memory nodes |
+| `add_memory_observations(observations)` | Append observation strings to existing memory nodes |
+| `delete_memory_entities(entity_names)` | Delete memory nodes by name |
+| `delete_memory_relations(relations)` | Delete typed relationships between memory nodes |
+| `delete_memory_observations(observations)` | Remove observation strings from memory nodes |
+| `search_memory_nodes(query, limit=5)` | Search memory nodes by name, type, and observations |
+| `read_memory_graph()` | Read a summary of the current memory graph |
 | `get_git_file_history(file_path, limit=20, domain="git")` | File-level commit history and ownership signals (git rollout) |
 | `get_commit_context(sha, include_diff_stats=true)` | Commit metadata and change statistics (git rollout) |
 | `find_recent_risky_changes(path_or_symbol, window_days, domain="hybrid")` | Recent high-risk changes using hybrid signals (git rollout) |

diff --git a/docs/API.md b/docs/API.md
@@ -253,7 +253,7 @@ $ codememory serve
 
 **Server behavior:**
 - Runs until interrupted (Ctrl+C)
-- Exposes 4 MCP tools (see [MCP Tools](#mcp-tools))
+- Exposes MCP tools for code graph queries, git graph queries, and agent-authored memory writes (see [MCP Tools](#mcp-tools))
 - Uses local config or environment variables
 - Graceful shutdown on SIGTERM/SIGINT
 
@@ -846,30 +846,137 @@ print(f"Cost: ${metrics['cost_usd']:.4f}")
 
 ##### `semantic_search()`
 
-Perform vector similarity search.
+Perform vector similarity search with optional multi-repo filtering.
 
 ```python
-def semantic_search(self, query: str, limit: int = 5) -> List[Dict]
+def semantic_search(
+    self,
+    query: str,
+    limit: int = 5,
+    repo_id: Optional[str] = None
+) -> List[Dict]
 ```
 
+**Parameters:**
+| Parameter | Type | Required | Default | Description |
+|-----------|------|----------|---------|-------------|
+| `query` | str | Yes | - | Natural language search query |
+| `limit` | int | No | 5 | Maximum results to return |
+| `repo_id` | Optional[str] | No | None | Restrict results to a specific repo. When omitted, falls back to `self.repo_id` if that is set. |
+
+**Behavior when `repo_id` is active:**
+- Over-fetches `limit × 3` candidates from the vector index
+- Adds a `WHERE entity.repo_id = $repo_id` filter after the DESCRIBE hop
+- Calls `_rerank_results()` to score and trim to `limit`
+
 **Returns:**
 ```python
 [
     {
         "name": "authenticate",
         "sig": "src/auth.py:authenticate",
-        "score": 0.92,
+        "score": 0.92,        # raw vector similarity (0–1)
+        "final_score": 0.93,  # 0.9×score + structural_bonus, e.g. 0.828 + 0.10
         "text": "def authenticate(username, password):..."
     },
     ...
 ]
 ```
 
+- `final_score` is always present when `repo_id` filtering is active (via `_rerank_results()`).
+
 **Example:**
 ```python
-results = builder.semantic_search("JWT validation", limit=3)
+results = builder.semantic_search("JWT validation", limit=3, repo_id="my-service")
 for r in results:
-    print(f"{r['name']} - Score: {r['score']:.2f}")
+    print(f"{r['name']} - Score: {r['score']:.2f}  Final: {r['final_score']:.2f}")
+```
+
+---
+
+##### `_rerank_results()`
+
+Private method. Re-scores a candidate list by combining vector similarity with graph connectivity bonuses, then trims to `limit`.
+
+```python
+def _rerank_results(self, results: List[Dict], limit: int) -> List[Dict]
+```
+
+**Parameters:**
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `results` | List[Dict] | Candidate results (over-fetched, each with a `score` field) |
+| `limit` | int | Final number of results to return |
+
+**Scoring formula:**
+```
+final_score = 0.9 × vector_score + structural_bonus
+```
+
+**Connectivity bonuses (structural_bonus):**
+| Relation | Bonus |
+|----------|-------|
+| `calls_out` | +0.05 |
+| `called_by` | +0.05 |
+| `methods` | +0.03 |
+
+**Behavior:**
+- Adds a `final_score` key to each result dict
+- Sorts descending by `final_score`
+- Trims the list to `limit`
+
+**GDS upgrade path:** Replace heuristic bonuses with `entity.pagerank` from `gds.pageRank.write()` once GDS is available.
+
+**Note:** This is a private method — call `semantic_search()` directly; it invokes `_rerank_results()` internally.
+
+---
+
+##### `search_memory_nodes()`
+
+Search the graph for memory nodes (agent-authored notes and observations) with optional repo filtering. Returns both outgoing and incoming relations for each result.
+
+```python
+def search_memory_nodes(
+    self,
+    query: str,
+    limit: int = 5,
+    repo_id: Optional[str] = None
+) -> List[Dict]
+```
+
+**Parameters:**
+| Parameter | Type | Required | Default | Description |
+|-----------|------|----------|---------|-------------|
+| `query` | str | Yes | - | Natural language search query |
+| `limit` | int | No | 5 | Maximum results to return |
+| `repo_id` | Optional[str] | No | None | Restrict results to a specific repo. When omitted, falls back to `self.repo_id` if that is set. |
+
+**Returns:**
+```python
+[
+    {
+        "name": "note_about_auth",
+        "sig": "memory:note_about_auth",
+        "score": 0.88,
+        "text": "Authentication flow requires...",
+        "outgoing_relations": [
+            {"target": "src/auth.py:authenticate", "relation_type": "REFERENCES"}
+        ],
+        "incoming_relations": [
+            {"source": "src/api/routes/auth.py", "relation_type": "DOCUMENTED_BY"}
+        ]
+    },
+    ...
+]
+```
+
+**`incoming_relations` format:** `[{"source": str, "relation_type": str}, ...]`
+
+**Example:**
+```python
+nodes = builder.search_memory_nodes("auth flow notes", limit=5, repo_id="my-service")
+for n in nodes:
+    print(f"{n['name']} ({len(n['incoming_relations'])} incoming)")
 ```
 
 ---
@@ -1129,7 +1236,8 @@ def get_indexing_config(self) -> Dict[str, Any]
 {
     "name": str,              # Entity name
     "sig": str,               # Entity signature
-    "score": float,           # Similarity (0-1)
+    "score": float,           # Raw vector similarity (0–1)
+    "final_score": float,     # Reranked score: 0.9×score + structural_bonus (present when repo_id filtering is active)
     "text": str               # Code snippet
 }
 ```
@@ -1146,5 +1254,5 @@ def get_indexing_config(self) -> Dict[str, Any]
 
 ---
 
-**API Version:** 1.0.0
-**Last Updated:** 2025-02-09
+**API Version:** 1.1.0
+**Last Updated:** 2026-04-05
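Review note on the `_rerank_results()` section in the docs/API.md hunks above: the scoring formula and bonus table can be sketched in a few lines of Python. This is a minimal illustration, not the code in `graph.py`. The function name `rerank_sketch` is hypothetical, and two things are assumed that the docs do not state: each candidate dict carries `calls_out`, `called_by`, and `methods` fields, and each bonus is applied at most once when its field is non-empty.

```python
from typing import Dict, List

# Bonus weights copied from the "Connectivity bonuses" table in docs/API.md.
BONUSES = {"calls_out": 0.05, "called_by": 0.05, "methods": 0.03}


def rerank_sketch(results: List[Dict], limit: int) -> List[Dict]:
    """Hypothetical stand-in for _rerank_results(); the real method may differ."""
    reranked = []
    for r in results:
        # Assumption: a bonus applies once if the relation field is non-empty.
        bonus = sum(weight for key, weight in BONUSES.items() if r.get(key))
        reranked.append({**r, "final_score": 0.9 * r["score"] + bonus})
    # Highest final_score first, then trim to the requested size.
    reranked.sort(key=lambda r: r["final_score"], reverse=True)
    return reranked[:limit]


# Illustrative candidates; names and field values are made up.
candidates = [
    {"name": "authenticate", "score": 0.92,
     "calls_out": ["hash_pw"], "called_by": ["login"]},
    {"name": "auth_helper", "score": 0.95},
    {"name": "AuthService", "score": 0.80, "methods": ["login"]},
]
top = rerank_sketch(candidates, limit=2)
```

Under these assumptions a well-connected result can overtake one with a higher raw similarity: `authenticate` (0.92 raw) ranks above `auth_helper` (0.95 raw) because of its two connectivity bonuses.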
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
@@ -235,6 +235,35 @@ FOR (n:Function|Class|File) ON EACH [n.name, n.docstring, n.path]
 
 ---
 
+## Multi-Repo Partitioning (repo_id)
+
+CodeMemory supports multiple repositories in a single Neo4j database using `repo_id` partitioning.
+
+### Identity Model
+
+| Node | Old identity | New identity |
+|------|-------------|--------------|
+| File | `path` (global) | `(repo_id, path)` (composite) |
+| Function | `signature` (global) | `(repo_id, signature)` (composite) |
+| Class | `qualified_name` (global) | `(repo_id, qualified_name)` (composite) |
+| Memory | `name` (global) | `(repo_id, name)` (composite) |
+
+A `Repository` anchor node (`{repo_id, root_path}`) is also created per repo.
+
+### Backward Compatibility
+
+When `CODEMEMORY_REPO` is not set, `repo_id` is `None` and all queries omit the repo filter — identical to the pre-partitioning behavior.
+
+### Retrieval Model
+
+When `repo_id` is active, `semantic_search()` over-fetches by 3x, filters by `entity.repo_id`, then applies structural reranking (`_rerank_results()`) before returning the final result set. This prevents worktree pollution (multiple indexed copies of the same function appearing in results).
+
+### GDS Upgrade Path
+
+When Aura API credentials are available (`gds.aura.api.credentials(clientId, clientSecret)`), replace the heuristic structural bonus in `_rerank_results()` with GDS-computed `entity.pagerank`. See comments in `graph.py` near `_rerank_results()`.
+
+---
+
 ## 4-Pass Ingestion Pipeline
 
 The ingestion pipeline processes code in 4 sequential passes to build the complete graph.
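
Review note on the Retrieval Model and Backward Compatibility text in the docs/ARCHITECTURE.md hunk above: the over-fetch and filter steps can be illustrated on an in-memory list. This is a rough sketch only. The helper `filter_by_repo` and the sample hits are hypothetical, the candidates are assumed to arrive sorted by descending score, and the structural reranking step is left out to keep the example short.

```python
from typing import Dict, List, Optional


def filter_by_repo(
    candidates: List[Dict], limit: int, repo_id: Optional[str]
) -> List[Dict]:
    """Hypothetical sketch of over-fetch then repo filter (no reranking)."""
    if repo_id is None:
        # Backward-compatible path: no repo filter is applied.
        return candidates[:limit]
    # Look at limit * 3 candidates, then keep only the requested repo.
    window = candidates[: limit * 3]
    return [c for c in window if c["repo_id"] == repo_id][:limit]


# Two indexed copies of the same functions (made-up repo ids).
hits = [
    {"repo_id": "worktree-a", "sig": "src/auth.py:authenticate", "score": 0.92},
    {"repo_id": "my-service", "sig": "src/auth.py:authenticate", "score": 0.92},
    {"repo_id": "worktree-a", "sig": "src/auth.py:login", "score": 0.85},
    {"repo_id": "my-service", "sig": "src/auth.py:login", "score": 0.85},
]
scoped = filter_by_repo(hits, limit=2, repo_id="my-service")
unscoped = filter_by_repo(hits, limit=2, repo_id=None)
```

With `repo_id="my-service"` the two results are distinct functions from one repo. Without a filter, both slots are taken by copies of `authenticate`, which is the worktree pollution the section describes.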

diff --git a/docs/FIELD_TEST_TEMPLATE.md b/docs/FIELD_TEST_TEMPLATE.md
@@ -28,8 +28,8 @@ codememory index
 codememory status --json
 
 # 2) Git graph setup + sync
-codememory git-init --repo /absolute/path/to/repo --mode local --full-history
-codememory git-sync --repo /absolute/path/to/repo --incremental
+codememory git-init --repo /absolute/path/to/repo
+codememory git-sync --repo /absolute/path/to/repo --full
 codememory git-status --repo /absolute/path/to/repo --json
 
 # 3) Optional MCP checks (domain routing)
@@ -62,7 +62,7 @@ Record exact values from command output.
 ### Performance
 
 - `codememory index` elapsed time:
-- `codememory git-sync --incremental` elapsed time:
+- `codememory git-sync` elapsed time:
 - Embedding calls:
 - Token usage:
 - Estimated cost:
@@ -71,7 +71,7 @@ Record exact values from command output.
 
 - [ ] PASS / FAIL: `git-init` succeeds with expected repo metadata.
 - [ ] PASS / FAIL: first `git-sync` ingests history and sets checkpoint.
-- [ ] PASS / FAIL: second `git-sync --incremental` with no new commits reports zero new commits.
+- [ ] PASS / FAIL: second `git-sync` with no new commits reports zero new commits.
 - [ ] PASS / FAIL: `git-status --json` returns stable envelope (`ok`, `error`, `data`, `metrics`).
 - [ ] PASS / FAIL: code graph queries still work with git graph enabled.
 - [ ] PASS / FAIL: `domain="code"` queries return expected code entities.
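
Review note on the `git-status --json` checklist item in the docs/FIELD_TEST_TEMPLATE.md hunk above: the envelope check could be scripted rather than eyeballed. The sketch below only tests for the four top-level keys the template names. The helper `has_stable_envelope` and the sample payload are hypothetical, and the values inside the sample are placeholders rather than real command output.

```python
import json

# Top-level keys named in the field test template.
EXPECTED_KEYS = {"ok", "error", "data", "metrics"}


def has_stable_envelope(raw: str) -> bool:
    """Return True if raw parses to a JSON object containing all expected keys."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and EXPECTED_KEYS.issubset(payload)


# Placeholder payload; real output would come from `codememory git-status --json`.
sample = '{"ok": true, "error": null, "data": {}, "metrics": {}}'
result = has_stable_envelope(sample)
```

The check is deliberately lenient about extra keys, so it would still pass if later builds add fields to the envelope.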

diff --git a/docs/GIT_GRAPH.md b/docs/GIT_GRAPH.md
@@ -52,17 +52,12 @@ Use explicit domain routing in MCP tool calls:
 Initialize git graph metadata and checkpoint state for a repository.
 
 ```bash
-codememory git-init \
-  --repo /absolute/path/to/repo \
-  --mode local \
-  --full-history
+codememory git-init --repo /absolute/path/to/repo
 ```
 
 Common options:
 - `--repo PATH`
-- `--mode local|local+github`
-- `--full-history`
-- `--since <rev>`
+- `--json`
 
 Expected output (human-readable):
 
@@ -78,14 +73,17 @@ Checkpoint: <HEAD_SHA>
 Sync commits from git history into the git graph.
 
 ```bash
-codememory git-sync --repo /absolute/path/to/repo --incremental
+# Initial full backfill
+codememory git-sync --repo /absolute/path/to/repo --full
+
+# Later incremental updates
+codememory git-sync --repo /absolute/path/to/repo
 ```
 
 Common options:
 - `--repo PATH`
-- `--incremental`
 - `--full`
-- `--from-ref <ref>`
+- `--json`
 
 Expected output (human-readable):
 
@@ -153,8 +151,8 @@ Expected JSON envelope:
 Quick validation sequence:
 
 ```bash
-codememory git-init --repo /absolute/path/to/repo --mode local --full-history
-codememory git-sync --repo /absolute/path/to/repo --incremental
+codememory git-init --repo /absolute/path/to/repo
+codememory git-sync --repo /absolute/path/to/repo --full
 codememory git-status --repo /absolute/path/to/repo --json
 ```
 

diff --git a/docs/MCP_INTEGRATION.md b/docs/MCP_INTEGRATION.md
@@ -776,7 +776,7 @@ Before refactoring:
 codememory search "function_name"
 codememory impact path/to/file.py
 # Optional git graph sync (git-enabled builds)
-codememory git-sync --repo /absolute/path/to/repo --incremental
+codememory git-sync --repo /absolute/path/to/repo
 ```
 
 ### 5. Keep Index Updated