Commit 5607511

maplenk and claude committed
Fix lint, cppcheck, and smoke test issues for release
- Auto-format source files to pass clang-format (store.c, mcp.c, main.c, store.h)
- Fix memory leak in build_search_terms() on malloc failure (cppcheck memleak)
- Narrow variable scope for di and full_items (cppcheck variableScope)
- Update smoke test to check new tool names (index, context, impact, read_symbol, query)
- Add release notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9300b16 commit 5607511

6 files changed

Lines changed: 302 additions & 106 deletions

File tree

RELEASE_NOTES.md

Lines changed: 153 additions & 0 deletions
# Ranking v2 — From PageRank to Multi-Signal Composite Search
## The Problem

codebase-memory-mcp v1 used a single signal for ranking: **PageRank**. You'd search for "payment processing" and get back whatever had the highest PageRank score among text matches. This worked for well-connected hub nodes but failed badly for:

- **Concept queries** ("authentication and session management") — PageRank doesn't know that "authentication" means `OauthMiddleware`
- **Cross-file exploration** ("complete order creation flow") — PageRank ranks individual nodes, not flows
- **Vocabulary gaps** — code is named `postOrd`, users search for "create order"

The result: on a 15-case benchmark, the old system scored **30 out of a possible ~200**. Most concept and cross-file queries returned irrelevant results.

## The New Architecture

### Multi-Signal Composite Ranking

Instead of PageRank alone, ranking is now a weighted combination of 5 independent signals:

```
score = W_PPR(0.35)         × Personalized PageRank
      + W_BM25(0.30)        × FTS5 BM25 text relevance
      + W_COCHANGE(0.20)    × Co-change frequency
      + W_BETWEENNESS(0.15) × Betweenness centrality
      + W_AUTHORITY(0.10)   × In-degree authority (HITS)
```

Each signal captures a different aspect of relevance:

| Signal | What it measures | Helps with |
|--------|-----------------|------------|
| **Personalized PageRank** | Graph proximity to query-relevant seed nodes | Finding related code through call/import edges |
| **BM25** | Text match quality (name, qualified_name, file_path, search_terms) | Direct name matches, prefix matching |
| **Co-change** | Files that change together in git history | Finding coupled code across modules |
| **Betweenness centrality** | Nodes that sit on many shortest paths in the graph | Identifying integration points, middleware, shared utilities |
| **In-degree authority** | Number of incoming edges (callers/importers) | Ranking genuinely important code over stubs and auto-generated files |
### FTS5 Search Pipeline

The text search layer was completely rebuilt:

1. **Prefix matching** — Query `"payment"` becomes `payment*` in FTS5, matching CamelCase-concatenated tokens like `paymentmappingservice` (from `PaymentMappingService`). This was the single biggest improvement.

2. **CamelCase splitting** — A new `search_terms` column stores split forms: `OauthMiddleware` is indexed as `"OauthMiddleware Oauth Middleware"`. Now `middleware*` finds it. BM25 weight 0.25 (low enough not to dilute primary name matches).

3. **Stop word filtering** — English stop words plus common code verbs (`checks`, `creates`, `handles`, `gets`, `finds`) are stripped before the FTS5 query is built. Without this, `checks*` matched hundreds of `checkXxx` functions.

4. **Per-file result cap** — An FNV-1a hash tracks file paths; any single file is limited to 3 FTS results. This prevents large files (like `_ide_helper.php` with 5000+ stubs) from flooding the candidate set. It's a general algorithm with no hardcoded exclusions.
48+
49+
### Personalized PageRank (PPR)
50+
51+
PPR replaced global PageRank. Instead of a static, query-independent rank, PPR is seeded from the top 10 FTS hits and propagates through call/import/inheritance edges with per-type weights:
52+
53+
```
54+
CALLS=1.0, INHERITS=0.9, HTTP_CALLS=0.8, IMPORTS=0.7, ...
55+
```
56+
57+
15 iterations, damping factor 0.85. This means the graph signal is **query-dependent** — searching for "payment" propagates from payment-related nodes, not from globally popular nodes.
### Betweenness Centrality

Brandes' algorithm computes betweenness centrality across the entire call graph. Nodes that sit on many shortest paths (middleware, shared services, base controllers) score higher. This is precomputed at index time and stored in `node_scores.betweenness`.

### In-Degree Authority (Simplified HITS)

Inspired by Kleinberg's HITS algorithm. Instead of full hub/authority iteration, we use a simplified version: count incoming edges per node, normalize to [0, 1]. Nodes called by many others are authoritative; auto-generated stubs with 0 callers are penalized.
### Explore Mode FTS Fallback

The `explore` mode (for broad area queries like "order creation flow") previously used only regex matching. When regex finds 0 results, it now falls back to `cbm_store_ranked_search` with 20 results. This took every C-tier cross-file query from zero results to scoring.

### Compact Output

Removed debug fields (`ppr`, `bm25`, `betweenness`, `composite_score`) from the locate JSON response. The LLM only needs `name`, `file`, `type`, and `line`. Results are sorted by rank — position conveys importance. This saved ~800 bytes per query.

## Development Process

25 bounded iterations using the autoresearch methodology. Each iteration: modify one thing → build → run 2683 unit tests (guard) → score against 15 benchmark cases → keep or discard.
### Score Progression

```
Iter   Score  Delta  Status   What
0      30     —      base     PageRank-only ranking
1-7    —      —      discard  Weight tuning, LIKE fallback — no improvement
8      46     +16    keep     FTS5 prefix queries (word*)
9      59     +13    keep     Stop word filtering
10     60     +1     keep     Per-file cap (FNV-1a hash)
11     -1     -61    discard  Combined changes — catastrophic
12     72     +12    keep     Context tool + explore FTS fallback
13     73     +1     keep     In-degree authority (HITS)
14     93     +20    keep     Locate results 20→10
15     111    +18    keep     CamelCase splitting (search_terms)
16     112    +1     keep     Per-file cap 5→3
17     152    +40    discard  Synonym table — hardcodes project knowledge
18-22  —      —      discard  Weight tuning, neighbors, PPR iterations
23     120    +8     keep     Remove debug score fields
25     123    +3     keep     Remove composite score field
```

Key lessons:

- **7 failed iterations** before the first improvement. Pure weight tuning doesn't work when the right files aren't in the candidate set.
- **Always test changes in isolation.** Iteration 11 combined two +2 changes and got -61.
- **Don't hardcode.** Synonym tables and file exclusions scored well but were project-specific. Per-file caps and prefix queries are general.
- **Output efficiency matters.** 31 of 123 points came from reducing output bytes, not improving ranking.
## LLM End-to-End Validation

The same 15 cases were run through Claude Code with and without codebase-memory-mcp:

| | No MCP (grep/glob) | With MCP |
|--|---------------------|----------|
| **PASS** | 10 | **11** |
| **PARTIAL** | 4 | **3** |
| **FAIL** | 1 | 1 |
| **Cost** | **$4.56** | $5.03 |
| **Turns** | **88** | 131 |

MCP's advantage is modest because Claude Code is already good at grep/glob searching. The real value: MCP gives **direction on the first call** — the LLM then spends turns reading code deeply rather than searching blindly. On concept queries (B-tier), MCP consistently surfaces files the LLM wouldn't find via grep alone.
## Parameters Reference

```c
// BM25 column weights (src/store/store.c)
bm25(node_fts, 10.0, 5.0, 1.0, 0.25)  // name, qualified_name, file_path, search_terms

// Composite weights
W_PPR         = 0.35
W_BM25        = 0.30
W_COCHANGE    = 0.20
W_BETWEENNESS = 0.15
W_AUTHORITY   = 0.10

// FTS pipeline
PER_FILE_CAP        = 3    // max results per file in FTS candidate set
FILE_TRACK_CAP      = 128  // hash table size for file tracking
FTS_CANDIDATE_LIMIT = 500  // SQL LIMIT on FTS5 query

// PPR
seed_count = 10   // top FTS hits used as PPR seeds
iterations = 15
damping    = 0.85

// Output
locate_results   = 10
explore_fallback = 20
```
## Files Changed

- **`src/store/store.c`** — CamelCase splitting (`camel_case_split`, `build_search_terms`), FTS5 schema migration with backfill, per-file cap, stop word filtering, prefix query builder, in-degree authority, betweenness centrality, composite scoring
- **`src/mcp/mcp.c`** — Locate output compaction, explore FTS fallback, result count tuning
- **`src/store/store.h`** — `cbm_ranked_result_t` typedef cleanup
- **`benchmarks/`** — 15 A/B/C test cases, `score_ranking.sh` scoring script, `run_llm_bench.sh` LLM harness, `autoresearch_cases.json`, result archives, `viewer.html`

scripts/smoke-test.sh

Lines changed: 1 addition & 1 deletion
```diff
@@ -271,7 +271,7 @@ fi
 echo "OK: tools/list response received (id:2)"

 # 5c: Verify expected tools are present
-for TOOL in index_repository search_graph trace_call_path get_code_snippet search_code; do
+for TOOL in index context impact read_symbol query; do
   if ! grep -q "\"$TOOL\"" "$MCP_OUTPUT"; then
     echo "FAIL: tool '$TOOL' not found in tools/list response"
     rm -f "$MCP_INPUT" "$MCP_OUTPUT"
```

src/main.c

Lines changed: 6 additions & 3 deletions
```diff
@@ -150,11 +150,14 @@ static void print_help(void) {
   printf("\nSupported agents (auto-detected):\n");
   printf("  Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode\n");
   printf("\nTools (v2):\n");
-  printf("  context — Find relevant code (modes: locate, explore, architecture, symbols, session, summary)\n");
+  printf("  context — Find relevant code (modes: locate, explore, architecture, symbols, "
+         "session, summary)\n");
   printf("  impact — Blast radius + tracing (modes: blast, trace, prepare)\n");
-  printf("  read_symbol — Read source for a function/class (with: none, callers, callees, both)\n");
+  printf(
+      "  read_symbol — Read source for a function/class (with: none, callers, callees, both)\n");
   printf("  query — Cypher queries or graph schema\n");
-  printf("  index — Index/status/list/delete projects (actions: index, status, list, delete, changes)\n");
+  printf("  index — Index/status/list/delete projects (actions: index, status, list, "
+         "delete, changes)\n");
   printf("\n  Legacy tool names (v1) are supported as aliases.\n");
 }
```

src/mcp/mcp.c

Lines changed: 49 additions & 25 deletions
```diff
@@ -737,40 +737,49 @@ typedef struct {
 static const tool_def_t TOOLS[] = {
     {"context",
      "Find relevant code for a task or query. Returns ranked files with function signatures. "
-     "Use INSTEAD OF grep/glob for code discovery. Modes: locate (default, BM25+PPR ranked search), "
-     "explore (area exploration), architecture (project overview), symbols (key symbols by PageRank), "
+     "Use INSTEAD OF grep/glob for code discovery. Modes: locate (default, BM25+PPR ranked "
+     "search), "
+     "explore (area exploration), architecture (project overview), symbols (key symbols by "
+     "PageRank), "
      "session (files/symbols touched this session), summary (compact session recap).",
      "{\"type\":\"object\",\"properties\":{"
-     "\"query\":{\"type\":\"string\",\"description\":\"What you're looking for — task, keyword, or symbol name\"},"
+     "\"query\":{\"type\":\"string\",\"description\":\"What you're looking for — task, keyword, or "
+     "symbol name\"},"
      "\"project\":{\"type\":\"string\"},"
      "\"mode\":{\"type\":\"string\",\"enum\":[\"locate\",\"explore\",\"architecture\",\"symbols\","
      "\"session\",\"summary\"],\"default\":\"locate\"},"
      "\"max_tokens\":{\"type\":\"integer\",\"default\":2000}},"
      "\"required\":[\"query\",\"project\"]}"},

     {"impact",
-     "Analyze blast radius of changing a symbol, trace call paths, or prepare a change review scope. "
-     "Modes: blast (default, impact analysis), trace (call path tracing), prepare (review scope + tests).",
+     "Analyze blast radius of changing a symbol, trace call paths, or prepare a change review "
+     "scope. "
+     "Modes: blast (default, impact analysis), trace (call path tracing), prepare (review scope + "
+     "tests).",
      "{\"type\":\"object\",\"properties\":{"
      "\"symbol\":{\"type\":\"string\",\"description\":\"Function, method, or class name\"},"
      "\"project\":{\"type\":\"string\"},"
-     "\"mode\":{\"type\":\"string\",\"enum\":[\"blast\",\"trace\",\"prepare\"],\"default\":\"blast\"},"
+     "\"mode\":{\"type\":\"string\",\"enum\":[\"blast\",\"trace\",\"prepare\"],\"default\":"
+     "\"blast\"},"
      "\"to\":{\"type\":\"string\",\"description\":\"Target symbol (trace mode only)\"},"
      "\"depth\":{\"type\":\"integer\",\"default\":3},"
      "\"include_tests\":{\"type\":\"boolean\",\"default\":true},"
      "\"max_tokens\":{\"type\":\"integer\",\"default\":2000}},"
      "\"required\":[\"symbol\",\"project\"]}"},

     {"read_symbol",
-     "Read source code for a specific function/class. Returns exact source with optional caller/callee signatures.",
+     "Read source code for a specific function/class. Returns exact source with optional "
+     "caller/callee signatures.",
      "{\"type\":\"object\",\"properties\":{"
      "\"symbol\":{\"type\":\"string\",\"description\":\"Qualified name or short function name\"},"
      "\"project\":{\"type\":\"string\"},"
-     "\"with\":{\"type\":\"string\",\"enum\":[\"none\",\"callers\",\"callees\",\"both\"],\"default\":\"none\"}},"
+     "\"with\":{\"type\":\"string\",\"enum\":[\"none\",\"callers\",\"callees\",\"both\"],"
+     "\"default\":\"none\"}},"
      "\"required\":[\"symbol\",\"project\"]}"},

     {"query",
-     "Execute a Cypher query or get the graph schema. Power-user escape hatch for complex graph analysis. "
+     "Execute a Cypher query or get the graph schema. Power-user escape hatch for complex graph "
+     "analysis. "
      "Omit cypher to get the schema.",
      "{\"type\":\"object\",\"properties\":{"
      "\"cypher\":{\"type\":\"string\",\"description\":\"Cypher query (omit for schema)\"},"
@@ -782,9 +791,12 @@ static const tool_def_t TOOLS[] = {
      "Index a repository, check status, list projects, or detect changes. "
      "Actions: index (default), status, list, delete, changes.",
      "{\"type\":\"object\",\"properties\":{"
-     "\"repo_path\":{\"type\":\"string\",\"description\":\"Path to the repository (for index action)\"},"
-     "\"project\":{\"type\":\"string\",\"description\":\"Project name (for status/delete/changes)\"},"
-     "\"action\":{\"type\":\"string\",\"enum\":[\"index\",\"status\",\"list\",\"delete\",\"changes\"],"
+     "\"repo_path\":{\"type\":\"string\",\"description\":\"Path to the repository (for index "
+     "action)\"},"
+     "\"project\":{\"type\":\"string\",\"description\":\"Project name (for "
+     "status/delete/changes)\"},"
+     "\"action\":{\"type\":\"string\",\"enum\":[\"index\",\"status\",\"list\",\"delete\","
+     "\"changes\"],"
      "\"default\":\"index\"},"
      "\"mode\":{\"type\":\"string\",\"enum\":[\"full\",\"fast\"],\"default\":\"full\"}},"
      "\"required\":[]}"},
@@ -4238,11 +4250,15 @@ static char *handle_explore(cbm_mcp_server_t *srv, const char *args) {
   if (ranked_count > 0 && match_count == 0) {
     for (int i = 0; i < ranked_count; i++) {
       yyjson_mut_val *item = yyjson_mut_obj(doc);
-      if (ranked[i].name) yyjson_mut_obj_add_str(doc, item, "name", ranked[i].name);
-      if (ranked[i].file_path) yyjson_mut_obj_add_str(doc, item, "file", ranked[i].file_path);
-      if (ranked[i].label) yyjson_mut_obj_add_str(doc, item, "type", ranked[i].label);
+      if (ranked[i].name)
+        yyjson_mut_obj_add_str(doc, item, "name", ranked[i].name);
+      if (ranked[i].file_path)
+        yyjson_mut_obj_add_str(doc, item, "file", ranked[i].file_path);
+      if (ranked[i].label)
+        yyjson_mut_obj_add_str(doc, item, "type", ranked[i].label);
       yyjson_mut_obj_add_real(doc, item, "score", ranked[i].composite_score);
-      if (ranked[i].start_line > 0) yyjson_mut_obj_add_int(doc, item, "line", ranked[i].start_line);
+      if (ranked[i].start_line > 0)
+        yyjson_mut_obj_add_int(doc, item, "line", ranked[i].start_line);
       yyjson_mut_arr_append(match_arr, item);
     }
   } else {
@@ -4298,35 +4314,43 @@ static char *handle_explore(cbm_mcp_server_t *srv, const char *args) {
   yyjson_mut_obj_add_bool(doc, root, "truncated", true);
   int effective_match_count = match_count > 0 ? match_count : ranked_count;
   yyjson_mut_obj_add_int(doc, root, "total_results",
-                         effective_match_count + dep_count + filtered_hotspot_count + entry_count);
+                         effective_match_count + dep_count + filtered_hotspot_count +
+                             entry_count);

   size_t used = 64 + strlen(area);
   int shown = 0;
-  int full_items = 0;
   bool stop = false;

   match_arr = yyjson_mut_arr(doc);
   if (ranked_count > 0 && match_count == 0) {
     for (int i = 0; i < ranked_count; i++) {
       size_t estimate = 96;
-      if (ranked[i].name) estimate += strlen(ranked[i].name);
-      if (ranked[i].file_path) estimate += strlen(ranked[i].file_path);
-      if (ranked[i].label) estimate += strlen(ranked[i].label);
+      if (ranked[i].name)
+        estimate += strlen(ranked[i].name);
+      if (ranked[i].file_path)
+        estimate += strlen(ranked[i].file_path);
+      if (ranked[i].label)
+        estimate += strlen(ranked[i].label);
       if (used + estimate > char_budget && shown > 0) {
         stop = true;
         break;
       }
       yyjson_mut_val *item = yyjson_mut_obj(doc);
-      if (ranked[i].name) yyjson_mut_obj_add_str(doc, item, "name", ranked[i].name);
-      if (ranked[i].file_path) yyjson_mut_obj_add_str(doc, item, "file", ranked[i].file_path);
-      if (ranked[i].label) yyjson_mut_obj_add_str(doc, item, "type", ranked[i].label);
+      if (ranked[i].name)
+        yyjson_mut_obj_add_str(doc, item, "name", ranked[i].name);
+      if (ranked[i].file_path)
+        yyjson_mut_obj_add_str(doc, item, "file", ranked[i].file_path);
+      if (ranked[i].label)
+        yyjson_mut_obj_add_str(doc, item, "type", ranked[i].label);
       yyjson_mut_obj_add_real(doc, item, "score", ranked[i].composite_score);
-      if (ranked[i].start_line > 0) yyjson_mut_obj_add_int(doc, item, "line", ranked[i].start_line);
+      if (ranked[i].start_line > 0)
+        yyjson_mut_obj_add_int(doc, item, "line", ranked[i].start_line);
       yyjson_mut_arr_append(match_arr, item);
     }
   } else {
+    int full_items = 0;
     for (int i = 0; i < match_count; i++) {
       bool compact = full_items >= MAX_FULL_BUDGET_ITEMS;
       size_t estimate = estimate_search_result_chars(matches[i], compact);
```
