feat(symgraph): improve ingestion pipeline for large repositories. #1
Merged
grahambrooks merged 3 commits on May 30, 2026
Contributor
Author
@grahambrooks, is there any chance of getting a review of this PR?
…licts

Brings the coupling-analysis features, new extraction edges, and configurable index storage from main into Zenithar's shadow-DB rebuild PR.

Conflicts were in extraction and the CLI DB layer, where both sides changed the same code:

- src/extraction/mod.rs: main added an edge-extraction walker (field accesses/mutates, imports, enum dispatch, &mut params) on the old recursive traversal; this PR rewrote traversal + call-finding into an iterative work-stack to avoid stack overflows. Resolved by porting the new edge logic onto the iterative structure: a single iterative `find_references` now emits Calls/Accesses/Mutates/References, plus the import + &mut-param hooks, so the recursion-safety fix and the coupling edges both hold.
- src/cli/db_utils.rs: this PR's shadow database hardcoded `.symgraph/`. main made the live DB location configurable (git-dir / cache / .symgraph). Resolved by co-locating the shadow with the *resolved* live DB so the atomic rename in `replace_with_shadow` stays on one filesystem regardless of storage strategy (the old `ensure_database_directory` is replaced by inline creation).
- src/cli/commands.rs: union of imports; index uses `rebuild_project_database` while status/search/context use the new `resolve_db` (the removed `database_path` is dropped).
- src/lib.rs: `build_full_index` now rejects a non-directory root, so a failed rebuild reliably preserves the live DB regardless of where the shadow lives (restores the intent of test_rebuild_project_database_keeps_live_db_on_failure after storage resolution changed its incidental failure path).

All 181 tests pass; clippy -D warnings and rustfmt clean. Dogfooded: a full rebuild via the shadow swap produces the new accesses/mutates/imports/dispatch edges and field/enum_member nodes.
Owner
Sorry about the delay in review and merge. I'll push a new release tomorrow.
Summary
This changes full reindexing from an in-place incremental update into a clean rebuild of a shadow database followed by an atomic swap.
It also hardens indexing and extraction in a few places:
Problem
Before this change, the system treated “reindex everything” too much like “incrementally update the existing DB.”
That caused a few concrete issues:
What changed
1. Full rebuilds now use a shadow database and atomic swap
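A minimal sketch of the shadow-then-swap pattern, using only the Rust standard library. The names `shadow_path_for` and `rebuild_with_shadow` and the `.shadow` suffix are illustrative assumptions, not the functions in this PR, and the `build` closure stands in for the real indexing work. It shows the two properties the change relies on: the shadow file sits next to the live database so the rename stays on one filesystem, and a failed build leaves the live file untouched.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: derive a shadow path in the same directory as the
/// live database, so the final rename never crosses a filesystem boundary.
fn shadow_path_for(live: &Path) -> PathBuf {
    let mut name = live.file_name().unwrap_or_default().to_os_string();
    name.push(".shadow"); // assumed suffix, for illustration only
    live.with_file_name(name)
}

/// Build a fresh database at the shadow path, then rename it over the live
/// one. If `build` fails, the shadow is discarded and the live file is kept.
fn rebuild_with_shadow<F>(live: &Path, build: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let shadow = shadow_path_for(live);
    if let Some(parent) = shadow.parent() {
        fs::create_dir_all(parent)?;
    }

    // Start from a clean slate; a missing shadow file is not an error.
    match fs::remove_file(&shadow) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    if let Err(e) = build(&shadow) {
        // Best-effort cleanup of the partial shadow; the live DB is untouched.
        let _ = fs::remove_file(&shadow);
        return Err(e);
    }

    // On POSIX systems a same-filesystem rename replaces the target atomically.
    fs::rename(&shadow, live)
}
```

A real SQLite-backed implementation would likely also need to handle sidecar files such as a write-ahead log; that is left out of this sketch.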
Why:
2. CLI index now performs a real rebuild
Why:
3. MCP reindex semantics are now explicit
Why:
4. Bulk indexing path is split by mode
Why:
5. Extraction traversal is now iterative
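To illustrate the general technique, here is a small sketch of a pre-order tree walk driven by an explicit work stack instead of recursion. The `Node` type and `collect_kinds` function are hypothetical stand-ins, not the extractor's real syntax-tree types; the point is only that traversal depth is bounded by heap memory rather than by the call stack.

```rust
/// Hypothetical stand-in for a syntax-tree node.
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

/// Visit every node in pre-order using an explicit stack, so deeply nested
/// input cannot overflow the call stack during traversal.
fn collect_kinds(root: &Node) -> Vec<&'static str> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        out.push(node.kind);
        // Push children in reverse so the leftmost child is popped first.
        stack.extend(node.children.iter().rev());
    }
    out
}
```

Pushing children in reverse keeps the visit order identical to the recursive version, which matters if downstream code depends on source order.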
Why:
6. Signature truncation is UTF-8 safe
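One common way to make truncation UTF-8 safe in Rust is to back up from the byte limit to the nearest character boundary before slicing. The sketch below assumes a byte-based limit and a hypothetical function name `truncate_utf8`; the actual limit and helper used in this PR are not shown here.

```rust
/// Truncate `s` to at most `max_bytes` bytes without splitting a UTF-8
/// character. Slicing a `&str` at a non-boundary index would panic.
fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

For example, `truncate_utf8("héllo", 2)` yields `"h"` because the two-byte `é` would otherwise be cut in half.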
Why:
Tests added / updated
Added regression coverage for:
Verification
Ran targeted tests:
All passed.
Notes for reviewers
The main design decision here is a clean cutover: