perf: use precomputed neighborEdges in query-engine loops#26
Conversation
Original prompt from Stanislau
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
📝 WalkthroughWalkthroughAdded Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Refactors relation-graph traversal in QueryEngine to avoid repeated full scans of snapshot.relationGraph by introducing a lazily-built adjacency map keyed by edge.from.
Changes:
- Add lazy
adjacencyByFromcache plusgetAdjacencyMap()builder. - Add
edgesFrom(nodeId)andedgesAmong(nodeIds)helpers for O(1) outgoing-edge lookup per node. - Replace full-array scans in
buildFrontier(),buildRouteAnswer(), andbuildPatternAnswer()with the new helpers.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const result: RelationEdge[] = []; | ||
| for (const nodeId of nodeIds) { | ||
| for (const edge of this.edgesFrom(nodeId)) { | ||
| if (idSet.has(edge.to)) { | ||
| result.push(edge); | ||
| } | ||
| } | ||
| } | ||
| return result; |
There was a problem hiding this comment.
edgesAmong() changes the ordering semantics vs the previous relationGraph.filter(...): results are now grouped by the input nodeIds order (and can also duplicate edges if nodeIds ever contains duplicates). If callers/consumers expect the same stable order as snapshot.relationGraph, consider iterating over idSet (to avoid duplicates) and/or explicitly defining/sorting the returned edge order (e.g., by original edge index) before mapping into QueryResult.relations.
| const result: RelationEdge[] = []; | |
| for (const nodeId of nodeIds) { | |
| for (const edge of this.edgesFrom(nodeId)) { | |
| if (idSet.has(edge.to)) { | |
| result.push(edge); | |
| } | |
| } | |
| } | |
| return result; | |
| return this.snapshot.relationGraph.filter( | |
| (edge) => idSet.has(edge.from) && idSet.has(edge.to), | |
| ); |
There was a problem hiding this comment.
Good catch — fixed in 4c69f65. edgesAmong() now iterates over snapshot.relationGraph directly with idSet membership checks, preserving original ordering and avoiding duplicates.
There was a problem hiding this comment.
Code Review
This pull request introduces an internal adjacency map and helper methods to optimize relation edge lookups within the QueryEngine. The review feedback correctly identifies that these additions are redundant because the required edge information is already indexed in the neighborEdges property of CompiledNode objects. It is recommended to refactor the logic to utilize the existing data structure, which would simplify the implementation and reduce memory overhead.
I am having trouble creating individual review comments. Click here to see my feedback.
src/runtime/query-engine.ts (67)
The adjacencyByFrom map is redundant because the CompiledNode objects in this.snapshot.compiledNodes already store their outgoing edges in the neighborEdges property. You can use this.snapshot.compiledNodes[nodeId]?.neighborEdges directly instead of building and maintaining a separate adjacency map.
src/runtime/query-engine.ts (76-106)
These helper methods and the lazy map initialization are unnecessary as the adjacency information is already indexed in this.snapshot.compiledNodes. Refactoring edgesAmong to use the existing neighborEdges on CompiledNode simplifies the class and reduces memory overhead.
private edgesAmong(nodeIds: string[]): RelationEdge[] {
const idSet = new Set(nodeIds);
const result: RelationEdge[] = [];
for (const nodeId of nodeIds) {
const edges = this.snapshot.compiledNodes[nodeId]?.neighborEdges;
if (!edges) continue;
for (const edge of edges) {
if (idSet.has(edge.to)) {
result.push(edge);
}
}
}
return result;
}
src/runtime/query-engine.ts (945)
Use neighborEdges from the compiled node instead of the redundant edgesFrom helper.
for (const edge of this.snapshot.compiledNodes[nodeId]?.neighborEdges ?? []) {
|
Good point — both you and CodeRabbit flagged this independently. Implemented in the latest commit: removed the redundant |
f98de2f to
8160bc6
Compare
Replace three O(E) full-scan loops over snapshot.relationGraph with a lazy adjacency map (Map<string, RelationEdge[]>) keyed by edge.from. - getAdjacencyMap() builds the map once on first access - edgesFrom(nodeId) returns outgoing edges in O(1) - edgesAmong(nodeIds) returns edges where both endpoints are in the set, using the map instead of filtering the entire array Affects buildFrontier(), buildRouteAnswer(), and buildPatternAnswer(). Closes #5 Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
…cy map Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
8160bc6 to
5645a11
Compare
Code ReviewOverviewSmall, well-scoped perf change. Extracts Strengths
Issues & nitsTitle overstates the changeThe PR says "pre-build adjacency map" but nothing new is pre-built —
|
|
Thanks for the review. Addressed:
|
venikman
left a comment
There was a problem hiding this comment.
One review point on the optimization claim:
edgesFrom() does remove the full relationGraph scan from buildFrontier(), but edgesAmong() still does this.snapshot.relationGraph.filter(...). That means buildRouteAnswer() and buildPatternAnswer() remain O(E) over the full graph, so the PR does not actually eliminate all three scan sites described in the issue/PR text.
If the goal is to close #5 as written, edgesAmong() needs to be expressed in terms of adjacency lookups as well, for example by iterating edgesFrom(id) for each selected node and filtering by the target set. As written, the perf improvement is only partial.
|
Devin is archived and cannot be woken up. Please unarchive Devin if you want to continue using it. |
|
Closing — this PR targets the old monolithic |
|
Closing — targets old monolithic query-engine.ts which was completely split into 6 modules by PR #31. The adjacency list optimization would need re-implementation on the new split architecture if still desired. |
What
Pre-build a lazy adjacency map for the relation graph, replacing three O(E) full-scan loops with O(1) lookups per source node.
Why
Closes #5 —
buildFrontier(),buildRouteAnswer(), andbuildPatternAnswer()all iterated the entirerelationGrapharray to find edges from specific nodes. With a pre-builtMap<string, RelationEdge[]>keyed byedge.from, lookups become constant-time per node.Type
refactor— code restructuringChanges
adjacencyByFromfield toQueryEngine, built once on first access viagetAdjacencyMap()edgesFrom(nodeId)helper returning outgoing edges in O(1)edgesAmong(nodeIds)helper returning edges where both endpoints are in a given set, using the map instead of filtering the full arraythis.snapshot.relationGraphscan sites with the new helpersValidation
bun run lintpasses locally (pre-existing rspress config type error only)bun run checkpasses locally (same pre-existing rspress error)bun run test— pre-existing timeout failures, not caused by this PRBoundary check
FPF-spec.mdonly — no additional corpora addedsrc/mcp/tool-contracts.tsAgent metadata
Requested by: @venikman
Summary by CodeRabbit