Commit bf629da

feat(explorer, analyzer): add exclude param + instance_index disambiguation (v2.0.1)
cortex_code_explorer — exclude dirs at scan time
- Add `exclude: Vec<String>` to map_overview and deep_slice actions
- filter_entry predicate applied to both the primary walker (file discovery) and the unfiltered reference walker (scan-count diagnostics)
- Matched against directory base name so pruning applies at every depth
- deep_slice merges per-call exclude list into cfg.scan.exclude_dir_names so exclusion flows through build_scan_options -> scan_workspace
- Fixes QA finding #1: repos with node_modules / large asset dirs triggered "Massive Directory" mode despite the actual source tree being small

cortex_symbol_analyzer — instance_index for duplicate symbol disambiguation
- Add `instance_index: Option<usize>` param to read_source action
- Collects ALL matching candidates instead of stopping at first match
- When N > 1 instances exist, prepends a disambiguation header showing "Found N instances, showing instance X of N, use instance_index to select"
- Clamps out-of-range index to last valid instance (no panic)
- Fixes QA finding #2: agents silently received only the first instance with no indication other definitions existed in the same file

MCP schema + docs
- tool_list schemas updated: exclude array and instance_index integer fields added to the respective action descriptions
- .github/copilot-instructions.md quick-reference table expanded with "Key Optional Params" column surfacing exclude and instance_index
- CHANGELOG.md updated; version bumped 2.0.0 -> 2.0.1
1 parent 3061e6f commit bf629da

6 files changed

Lines changed: 145 additions & 35 deletions

.github/copilot-instructions.md

Lines changed: 25 additions & 14 deletions
@@ -9,20 +9,20 @@
 
 **Megatool Quick‑Reference**
 
-| Task | Megatool | Action Enum | Required Params |
-|---|---|---|---|
-| Repo overview (files + public symbols) | `cortex_code_explorer` | `map_overview` | `target_dir` (use `.` for whole repo) |
-| Token-budgeted context slice (XML) | `cortex_code_explorer` | `deep_slice` | `target` |
-| Extract exact symbol source | `cortex_symbol_analyzer` | `read_source` | `path` + `symbol_name` *(or `path` + `symbol_names` for batch)* |
-| Find all usages before signature change | `cortex_symbol_analyzer` | `find_usages` | `symbol_name` + `target_dir` |
-| Find trait/interface implementors | `cortex_symbol_analyzer` | `find_implementations` | `symbol_name` + `target_dir` |
-| Blast radius before rename/move/delete | `cortex_symbol_analyzer` | `blast_radius` | `symbol_name` + `target_dir` |
-| Cross-boundary update checklist | `cortex_symbol_analyzer` | `propagation_checklist` | `symbol_name` *(or legacy `changed_path`)* |
-| Save pre-change snapshot | `cortex_chronos` | `save_checkpoint` | `path` + `symbol_name` + `semantic_tag` |
-| List snapshots | `cortex_chronos` | `list_checkpoints` | *(none)* |
-| Compare snapshots (AST diff) | `cortex_chronos` | `compare_checkpoint` | `symbol_name` + `tag_a` + `tag_b` *(use `tag_b="__live__"` + `path` to diff against current state)* |
-| Delete old snapshots (housekeeping) | `cortex_chronos` | `delete_checkpoint` | `symbol_name` and/or `semantic_tag` *(optional: `path`, `namespace`)* — Automatically searches legacy flat `checkpoints/` if no matches in namespace. |
-| Compile/lint diagnostics | `run_diagnostics` | *(none)* | `repoPath` |
+| Task | Megatool | Action Enum | Required Params | Key Optional Params |
+|---|---|---|---|---|
+| Repo overview (files + public symbols) | `cortex_code_explorer` | `map_overview` | `target_dir` (use `.` for whole repo) | `exclude` — array of dir names to skip; `search_filter`; `max_chars`; `ignore_gitignore` |
+| Token-budgeted context slice (XML) | `cortex_code_explorer` | `deep_slice` | `target` | `exclude` — array of dir names to skip; `budget_tokens`; `skeleton_only`; `query`; `query_limit` |
+| Extract exact symbol source | `cortex_symbol_analyzer` | `read_source` | `path` + `symbol_name` *(or `path` + `symbol_names` for batch)* | `instance_index` — 0-based, selects specific instance when duplicates exist; `skeleton_only` |
+| Find all usages before signature change | `cortex_symbol_analyzer` | `find_usages` | `symbol_name` + `target_dir` | |
+| Find trait/interface implementors | `cortex_symbol_analyzer` | `find_implementations` | `symbol_name` + `target_dir` | |
+| Blast radius before rename/move/delete | `cortex_symbol_analyzer` | `blast_radius` | `symbol_name` + `target_dir` | |
+| Cross-boundary update checklist | `cortex_symbol_analyzer` | `propagation_checklist` | `symbol_name` *(or legacy `changed_path`)* | `aliases` — cross-language name variants |
+| Save pre-change snapshot | `cortex_chronos` | `save_checkpoint` | `path` + `symbol_name` + `semantic_tag` | `namespace` |
+| List snapshots | `cortex_chronos` | `list_checkpoints` | *(none)* | `namespace` |
+| Compare snapshots (AST diff) | `cortex_chronos` | `compare_checkpoint` | `symbol_name` + `tag_a` + `tag_b` *(use `tag_b="__live__"` + `path` to diff against current state)* | `namespace`; `path` |
+| Delete old snapshots (housekeeping) | `cortex_chronos` | `delete_checkpoint` | `symbol_name` and/or `semantic_tag` *(optional: `path`, `namespace`)* — Automatically searches legacy flat `checkpoints/` if no matches in namespace. | |
+| Compile/lint diagnostics | `run_diagnostics` | *(none)* | `repoPath` | |
 
 ## The Ultimate CortexAST Refactoring SOP
 
@@ -74,6 +74,17 @@ Follow this sequence for any non-trivial refactor (especially renames, signature
 5. Find-up heuristic on the tool's own `path` / `target_dir` / `target` argument — walks ancestors looking for `.git`, `Cargo.toml`, `package.json`
 6. `cwd` **refused if it equals `$HOME` or OS root** (CRITICAL error)
 
+**`exclude` best practice (map_overview + deep_slice):**
+- If `map_overview` returns "Massive Directory" or the file count is inflated by generated/dependency folders, pass `exclude: ["node_modules", "vendor", "__pycache__", "build"]` to skip them at scan time.
+- The `exclude` array matches against each directory's **base name** (not full path), so it prunes at every depth.
+- Prefer using `exclude` over widening `target_dir` — it keeps the scan scoped while removing noise.
+- Example: `cortex_code_explorer(action="map_overview", target_dir=".", exclude=["node_modules", ".next", "dist"])`
+
+**`instance_index` best practice (read_source):**
+- When `read_source` returns a disambiguation header like `⚠️ Found N instances of 'symbol_name'`, the file contains multiple definitions with the same name (e.g. overloaded methods, duplicate arrow functions).
+- The response **always shows instance 1 of N by default** (0-based index 0). To read a different one, pass `instance_index: 1` (second), `instance_index: 2` (third), etc.
+- If you need to understand all instances, use `find_usages` to locate each one across the codebase, then call `read_source` multiple times with different `instance_index` values.
+
 **Propagation best practice (Hybrid Omni‑Match):**
 - `propagation_checklist` automatically matches common casing variants of `symbol_name` (PascalCase / camelCase / snake_case).
 - When a symbol is renamed across boundaries (e.g. Rust `TrainingEngineCapabilities` → TS `trainingCaps`), pass `aliases: ["trainingCaps"]` to catch cross-language usage without heavy import tracing.
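
The base-name rule for `exclude` can be illustrated with a small standalone sketch. This is not the project's code: `normalize_excludes` and `should_descend` are hypothetical helper names, and only the standard library is used. The normalization follows the same trim-and-strip-slashes idea as the commit, so `"dist/"` is treated like `"dist"`.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Hypothetical helper: turn caller-supplied names ("dist/", " vendor ")
/// into a clean set, dropping entries that are empty after trimming.
fn normalize_excludes(names: &[&str]) -> HashSet<String> {
    names
        .iter()
        .map(|s| s.trim().trim_matches('/').to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Hypothetical helper: returns false when the directory's base name is excluded.
/// Only the last path component is compared, so the rule applies at any depth.
fn should_descend(dir: &Path, excluded: &HashSet<String>) -> bool {
    match dir.file_name().and_then(|n| n.to_str()) {
        Some(name) => !excluded.contains(name),
        // "." and non-UTF-8 names have no comparable base name; this sketch keeps them.
        None => true,
    }
}
```

Because only the final component is compared, `exclude: ["build"]` would also skip a source directory that happens to be named `build` deeper in the tree, while a name such as `node_modules_backup` is not affected by excluding `node_modules`.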

CHANGELOG.md

Lines changed: 21 additions & 1 deletion
@@ -9,7 +9,27 @@ Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 
 - _No unreleased changes yet._
 
-## [2.0.0] — Megatool API
+## [2.0.1] — 2026-02-21
+
+### Added
+- **`exclude` parameter for `cortex_code_explorer`** (`map_overview` + `deep_slice`)
+  Agents can now pass `exclude: ["node_modules", "vendor", "__pycache__", "build"]` to prune directories at scan time.
+  - Applied as a `filter_entry` predicate on **both** the filtered walker (file discovery) and the unfiltered walker (scan-count diagnostics), so dropped counts remain accurate.
+  - Matched against each directory's **base name** — pruning applies at every depth without requiring full path patterns.
+  - `deep_slice` merges per-call `exclude` into `cfg.scan.exclude_dir_names` so the same exclusion list flows through `build_scan_options → scan_workspace`.
+  - Fixes the QA finding where repos with `node_modules` or large asset folders triggered "Massive Directory" mode despite the source tree being small.
+- **`instance_index` parameter for `cortex_symbol_analyzer` `read_source`**
+  Resolves the QA finding where files with multiple same-named symbols (overloaded methods, duplicate arrow functions) silently returned only the first instance with no indication that others existed.
+  - When `N > 1` matches are found, the response prepends a disambiguation header:
+    `// ⚠️ Disambiguation: Found N instances of 'name'. Showing instance 1 of N (1-based). Use instance_index param (0-based, 0..N-1) to select a specific one.`
+  - Pass `instance_index: 1` for the second, `instance_index: 2` for the third, etc. Clamps silently to the last valid index.
+  - Batch mode (`symbol_names`) defaults to instance 0 for each symbol.
+
+### Changed
+- **`cortex_code_explorer` MCP schema** updated: `exclude` array field documented under both `map_overview` and `deep_slice` action descriptions.
+- **`cortex_symbol_analyzer` MCP schema** updated: `instance_index` integer field added to `read_source` action description.
+- **`.github/copilot-instructions.md`** quick-reference table expanded with a "Key Optional Params" column surfacing `exclude` and `instance_index` for faster agent discovery.
+## [2.0.0] — Megatool API
 
 ### Breaking Changes (with shims)
 - **10 standalone MCP tools consolidated into 4 Megatools** using `action` enum routing.
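
The clamping and header behavior described in the 2.0.1 notes can be sketched in isolation. The helper names `select_index` and `disambiguation_header` are hypothetical, and the header wording here is shortened relative to the commit's actual message; the sketch only shows the index arithmetic (0-based input, 1-based display).

```rust
/// Hypothetical helper: clamp an optional 0-based index to the last valid instance.
/// `saturating_sub` keeps the expression safe even if `total` were 0.
fn select_index(instance_index: Option<usize>, total: usize) -> usize {
    instance_index.unwrap_or(0).min(total.saturating_sub(1))
}

/// Hypothetical helper: build a preamble only when more than one instance exists.
/// The displayed instance number is 1-based; the selectable range is 0-based.
fn disambiguation_header(name: &str, idx: usize, total: usize) -> String {
    if total > 1 {
        format!(
            "// Found {total} instances of `{name}`. Showing instance {} of {total} (1-based). \
             Use `instance_index` (0-based, 0..{}) to select one.\n",
            idx + 1,
            total - 1,
        )
    } else {
        String::new()
    }
}
```

Note that the `0..N-1` range in the header is meant inclusively (the last valid index is `N-1`), which differs from how a Rust range expression `0..n` reads. An out-of-range request such as `instance_index: 99` on three instances resolves to index 2 under this rule rather than producing an error.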

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "cortexast"
-version = "2.0.0"
+version = "2.0.1"
 edition = "2021"
 description = "God-tier AST intelligence for LLM agents. Pure Rust MCP server with semantic code navigation, hybrid vector search, and Chronos AST time machine."
 authors = ["Thanon Aphithanawat <thanon@aphithanawat.me>"]

src/inspector.rs

Lines changed: 67 additions & 13 deletions
@@ -2180,13 +2180,14 @@ pub fn extract_symbols_from_source(path: &Path, source_text: &str) -> Vec<Symbol
 /// }
 /// ```
 pub fn read_symbol(path: &Path, symbol_name: &str) -> Result<String> {
-    read_symbol_with_options(path, symbol_name, false)
+    read_symbol_with_options(path, symbol_name, false, None)
 }
 
 pub fn read_symbol_with_options(
     path: &Path,
     symbol_name: &str,
     skeleton_only: bool,
+    instance_index: Option<usize>,
 ) -> Result<String> {
     let abs: PathBuf = if path.is_absolute() {
         path.to_path_buf()
@@ -2243,17 +2244,22 @@ pub fn read_symbol_with_options(
         candidates.extend(impl_blocks);
     }
 
-    // ── Step 2: find best match (exact → case-insensitive) ───────────────
-    let found = candidates
+    // ── Step 2: find best match (exact → case-insensitive), collect ALL instances ──
+    let mut all_matches: Vec<&(String, String, usize, usize)> = candidates
         .iter()
-        .find(|(name, _, _, _)| name == symbol_name)
-        .or_else(|| {
-            candidates
-                .iter()
-                .find(|(name, _, _, _)| name.eq_ignore_ascii_case(symbol_name))
-        });
+        .filter(|(name, _, _, _)| name == symbol_name)
+        .collect();
+
+    if all_matches.is_empty() {
+        all_matches = candidates
+            .iter()
+            .filter(|(name, _, _, _)| name.eq_ignore_ascii_case(symbol_name))
+            .collect();
+    }
 
-    let Some((name, kind, start_byte, end_byte)) = found else {
+    let total_matches = all_matches.len();
+
+    if total_matches == 0 {
         let mut available: Vec<String> = candidates
             .iter()
             .map(|(n, k, _, _)| format!(" {k} {n}"))
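
The two-tier lookup in Step 2 (all exact matches first, case-insensitive matches only as a fallback) can be reduced to a minimal sketch. The function `collect_matches` and the simplified `(name, start_byte)` tuple are assumptions made for illustration; the real candidates carry four fields.

```rust
/// Hypothetical helper: return every exact-name match; if there are none,
/// fall back to every ASCII-case-insensitive match.
fn collect_matches<'a>(
    candidates: &'a [(String, usize)],
    symbol_name: &str,
) -> Vec<&'a (String, usize)> {
    let exact: Vec<_> = candidates
        .iter()
        .filter(|(name, _)| name == symbol_name)
        .collect();
    if !exact.is_empty() {
        return exact;
    }
    candidates
        .iter()
        .filter(|(name, _)| name.eq_ignore_ascii_case(symbol_name))
        .collect()
}
```

One consequence of this ordering is that the two tiers are never mixed: if a file defines both `render` and `Render`, a query for `render` counts only the exact matches, so `instance_index` ranges over those alone.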
@@ -2278,7 +2284,11 @@ pub fn read_symbol_with_options(
             rendered.join("\n"),
             symbol_name
         ));
-    };
+    }
+
+    // Select the requested instance (default: first).
+    let idx = instance_index.unwrap_or(0).min(total_matches.saturating_sub(1));
+    let (name, kind, start_byte, end_byte) = all_matches[idx];
 
     // ── Step 3: format and return ─────────────────────────────────────────
     const MAX_SYMBOL_LINES: usize = 500;
@@ -2296,8 +2306,22 @@ pub fn read_symbol_with_options(
         + 1;
     let symbol_lines = end_line.saturating_sub(start_line) + 1;
 
+    // Build disambiguation preamble when multiple instances exist.
+    let disambiguation = if total_matches > 1 {
+        format!(
+            "// ⚠️ Disambiguation: Found {total_matches} instances of `{name}` in this file. \
+             Showing instance {} of {total_matches} (1-based). \
+             Use `instance_index` param (0-based, 0..{}) to select a specific one. \
+             Consider using find_usages to inspect all occurrences across the codebase.\n",
+            idx + 1,
+            total_matches - 1,
+        )
+    } else {
+        String::new()
+    };
+
     let header = format!(
-        "// {kind} `{name}` — {}:L{start_line}-L{end_line}\n",
+        "{disambiguation}// {kind} `{name}` — {}:L{start_line}-L{end_line}\n",
         abs.display()
     );
 
@@ -3448,14 +3472,15 @@ fn extract_context_lines(lines: &[&str], target_0: usize, ctx: usize) -> String
 /// [struct ] User
 /// ```
 pub fn repo_map(target_dir: &Path) -> Result<String> {
-    repo_map_with_filter(target_dir, None, None, false)
+    repo_map_with_filter(target_dir, None, None, false, &[])
 }
 
 pub fn repo_map_with_filter(
     target_dir: &Path,
     search_filter: Option<&str>,
     max_chars: Option<usize>,
     ignore_gitignore: bool,
+    exclude_dirs: &[String],
 ) -> Result<String> {
     use ignore::WalkBuilder;
     use std::collections::{BTreeMap, BTreeSet, HashSet};
@@ -3484,9 +3509,27 @@ pub fn repo_map_with_filter(
             .join(target_dir)
     };
 
+    // Build exclude set from caller-supplied directory names.
+    let excluded_dir_set: HashSet<String> = exclude_dirs
+        .iter()
+        .map(|s| s.trim().trim_matches('/').to_string())
+        .filter(|s| !s.is_empty())
+        .collect();
+    let excluded_dir_set_clone = excluded_dir_set.clone();
+
     let walker_filtered = WalkBuilder::new(&abs_dir)
         .standard_filters(!ignore_gitignore)
         .hidden(true)
+        .filter_entry(move |dent| {
+            if dent.file_type().map(|ft| ft.is_dir()).unwrap_or(false) {
+                if let Some(name) = dent.path().file_name().and_then(|s| s.to_str()) {
+                    if excluded_dir_set_clone.contains(name) {
+                        return false;
+                    }
+                }
+            }
+            true
+        })
         .build();
 
     let cfg = language_config();
@@ -3611,9 +3654,20 @@ pub fn repo_map_with_filter(
 
     // Compute gitignore/ignore-filter drops by comparing against an unfiltered walk.
     let (scanned_total, dropped_by_gitignore_or_error) = if !ignore_gitignore {
+        let excluded_dir_set_all = excluded_dir_set.clone();
         let walker_all = WalkBuilder::new(&abs_dir)
             .standard_filters(false)
             .hidden(true)
+            .filter_entry(move |dent| {
+                if dent.file_type().map(|ft| ft.is_dir()).unwrap_or(false) {
+                    if let Some(name) = dent.path().file_name().and_then(|s| s.to_str()) {
+                        if excluded_dir_set_all.contains(name) {
+                            return false;
+                        }
+                    }
+                }
+                true
+            })
             .build();
 
         let mut all_file_count: usize = 0;
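
The effect of pruning excluded directories inside the walk, so that they never contribute to a file count, can be shown with a standard-library stand-in. This sketch does not use the `ignore` crate; `count_files` is a hypothetical function that applies the same base-name check during a recursive `read_dir` traversal.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical std-only walker: counts regular files under `root`, never
/// descending into a directory whose base name is in `excluded`.
fn count_files(root: &Path, excluded: &HashSet<String>) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            // Compare only the directory's own name, not its full path.
            let pruned = entry
                .file_name()
                .to_str()
                .map_or(false, |name| excluded.contains(name));
            if !pruned {
                count += count_files(&entry.path(), excluded)?;
            }
        } else if file_type.is_file() {
            count += 1;
        }
    }
    Ok(count)
}
```

Running the same function with and without the exclusion set gives two totals whose difference is exactly the number of files under the pruned directories, which is the reason the commit applies the predicate to both the filtered and the reference walker.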
