refactor: split compiler.ts into distinct owner stages#30
Conversation
Original prompt from Stanislau
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 52 minutes and 0 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe refactoring splits the monolithic Changes
Sequence DiagramsequenceDiagram
participant Caller
participant compileFpfSource as compileFpfSource<br/>(Orchestrator)
participant SourceParser
participant GraphCompiler
participant IndexProjector
participant ValidationRunner
participant SnapshotAssembler
Caller->>compileFpfSource: sourceText
compileFpfSource->>SourceParser: sourceText
SourceParser->>SourceParser: split preface/catalog<br/>parse headings/metadata<br/>extract anchors
SourceParser-->>compileFpfSource: SourceIR
compileFpfSource->>GraphCompiler: SourceIR
GraphCompiler->>GraphCompiler: buildPatternGraph()<br/>buildRouteGraph()<br/>buildOutlineRelations()<br/>buildExplicitReferenceRelations()
GraphCompiler-->>compileFpfSource: pattern/route/relation<br/>graphs
compileFpfSource->>IndexProjector: SourceIR +<br/>graphs
IndexProjector->>IndexProjector: buildIndexMap()<br/>buildLexicon()<br/>buildCompiledNodes()<br/>buildIndexes()
IndexProjector-->>compileFpfSource: lexicon,<br/>indexes,<br/>compiled nodes
compileFpfSource->>ValidationRunner: snapshot<br/>candidate
ValidationRunner->>ValidationRunner: detect duplicates<br/>resolve references<br/>validate fields
ValidationRunner-->>compileFpfSource: BuildValidation
compileFpfSource->>SnapshotAssembler: artifacts +<br/>validation
SnapshotAssembler-->>compileFpfSource: Snapshot
compileFpfSource-->>Caller: CompilerOutput
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the compiler.ts module by decomposing its logic into several focused files: source-parser.ts, graph-compiler.ts, index-projector.ts, validation-runner.ts, and query-helpers.ts. The main compiler.ts now serves as an orchestrator for these stages. Feedback suggests consolidating scattered imports and re-exports in source-parser.ts and standardizing type imports in graph-compiler.ts to improve maintainability and readability.
| import { | ||
| cleanMarkdown, | ||
| isMarkdownSeparatorRow, | ||
| normalizeForLookup, | ||
| normalizeLabel, | ||
| slugify, | ||
| splitMarkdownRow, | ||
| stripMarkdownToText, | ||
| } from './text.js'; |
There was a problem hiding this comment.
Consolidate all imports from ./text.js at the top of the file. extractIds and unique are currently imported at the bottom of the file (lines 508 and 510), which is non-standard and can lead to confusion during maintenance.
| import { | |
| cleanMarkdown, | |
| isMarkdownSeparatorRow, | |
| normalizeForLookup, | |
| normalizeLabel, | |
| slugify, | |
| splitMarkdownRow, | |
| stripMarkdownToText, | |
| } from './text.js'; | |
| import { | |
| cleanMarkdown, | |
| extractIds, | |
| isMarkdownSeparatorRow, | |
| normalizeForLookup, | |
| normalizeLabel, | |
| slugify, | |
| splitMarkdownRow, | |
| stripMarkdownToText, | |
| unique, | |
| } from './text.js'; |
There was a problem hiding this comment.
Fixed in 4a79c54 — consolidated extractIds and unique into the top-level ./text.js import block and removed the scattered bottom-of-file import.
| } | ||
|
|
||
| // Re-export extractIds so downstream modules can use it without importing text.js directly | ||
| export { extractIds } from './text.js'; |
There was a problem hiding this comment.
Fixed in 4a79c54 — removed the standalone extractIds re-export and consolidated it with the other re-exports at the end of the file.
| export { | ||
| cleanMarkdown, | ||
| normalizeForLookup, | ||
| normalizeLabel, | ||
| slugify, | ||
| } from './text.js'; | ||
|
|
||
| import { extractIds, unique } from './text.js'; |
There was a problem hiding this comment.
The import statement at the end of the file is misplaced and duplicated (see line 510). Consolidate the re-exports here and remove the misplaced imports to adhere to standard module structure.
| export { | |
| cleanMarkdown, | |
| normalizeForLookup, | |
| normalizeLabel, | |
| slugify, | |
| } from './text.js'; | |
| import { extractIds, unique } from './text.js'; | |
| export { | |
| cleanMarkdown, | |
| extractIds, | |
| normalizeForLookup, | |
| normalizeLabel, | |
| slugify, | |
| } from './text.js'; |
| import type { | ||
| AnchorRef, | ||
| PatternRecord, | ||
| RelationEdge, | ||
| RouteRecord, | ||
| Snapshot, | ||
| } from './types.js'; |
There was a problem hiding this comment.
Include LexiconEntry in the top-level type imports from ./types.js instead of using an inline import() type in function signatures. This improves readability and maintains consistency with how other types are handled in this file.
| import type { | |
| AnchorRef, | |
| PatternRecord, | |
| RelationEdge, | |
| RouteRecord, | |
| Snapshot, | |
| } from './types.js'; | |
| import type { | |
| AnchorRef, | |
| LexiconEntry, | |
| PatternRecord, | |
| RelationEdge, | |
| RouteRecord, | |
| Snapshot, | |
| } from './types.js'; |
There was a problem hiding this comment.
Fixed in 4a79c54 — added LexiconEntry to the top-level type imports from ./types.js.
| } | ||
|
|
||
| export function buildLexiconRelations( | ||
| lexicon: Record<string, import('./types.js').LexiconEntry>, |
There was a problem hiding this comment.
Fixed in 4a79c54 — replaced import('./types.js').LexiconEntry with the top-level LexiconEntry type import.
There was a problem hiding this comment.
Pull request overview
Refactors the runtime compiler by splitting the previously monolithic src/runtime/compiler.ts into focused stage modules while keeping the public compileFpfSource() API stable, and moves query-scoring utilities into a dedicated helper module (re-exported for compatibility).
Changes:
- Introduces
source-parser.ts,graph-compiler.ts,index-projector.ts, andvalidation-runner.tsas discrete compilation stages. - Extracts query scoring/selection utilities into
query-helpers.tsand re-exports them fromcompiler.ts. - Updates
compiler.tsinto a slimmer orchestrator that wires the stages and assembles the finalSnapshot.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/runtime/compiler.ts | Orchestrates stage modules, assembles Snapshot, re-exports query helpers for backward compatibility. |
| src/runtime/source-parser.ts | Parses markdown source into SourceIR (sections, catalog metadata, anchor map) plus parsing helpers. |
| src/runtime/graph-compiler.ts | Builds pattern/route graphs and relation edges from SourceIR. |
| src/runtime/index-projector.ts | Projects graphs into index map, lexicon, compiled nodes, and inverted indexes. |
| src/runtime/validation-runner.ts | Computes validation findings for the snapshot candidate. |
| src/runtime/query-helpers.ts | Query scoring/matching helpers consumed by QueryEngine (via compiler re-export). |
Comments suppressed due to low confidence (1)
src/runtime/source-parser.ts:508
- There’s an import statement at the bottom of the module. This is easy to miss and results in multiple imports from the same module; move
extractIds/uniqueinto the existing top-level import list (and keep re-exports separate) to match the repo’s import-at-top convention and reduce confusion.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * compiler.ts — SnapshotAssembler orchestrator. | ||
| * | ||
| * Delegates to four focused stage modules: | ||
| * 1. source-parser.ts (SourceParser) | ||
| * 2. graph-compiler.ts (GraphCompiler) | ||
| * 3. index-projector.ts (IndexProjector) | ||
| * 4. validation-runner.ts (ValidationRunner) | ||
| * | ||
| * This file wires the stages together and assembles the final Snapshot. | ||
| * The public `compileFpfSource()` API is unchanged. | ||
| */ |
There was a problem hiding this comment.
PR description/issue #1 acceptance criteria mention each stage should publish stage-local evidence (hashes, counts, timing) into build-audit.json, but the refactor keeps BuildAudit as a single summary (see src/runtime/runtime.ts buildCompilerSummary) and none of the new stage modules emit per-stage audit sections. Either adjust the PR/issue linkage (don’t claim closing #1) or extend BuildAudit + refresh() to collect and write per-stage evidence.
There was a problem hiding this comment.
Fair point. The acceptance criteria in issue #1 focus on the structural split ("each stage is a separate module, QueryEngine is an orchestrator only, existing tests pass"). Per-stage audit evidence (hashes, counts, timing) is a natural follow-up that aligns better with issue #6 (stage-local contract tests), which will add the infrastructure to collect and validate per-stage outputs. This PR intentionally keeps the scope to the structural decomposition to avoid scope creep — per-stage audit sections will be added as part of #6.
| childIds: [], | ||
| }; | ||
| if (parentId) { | ||
| const parent = sections.find((candidate) => candidate.id === parentId); |
There was a problem hiding this comment.
This parent-child wiring does an O(n) sections.find(...) for every heading, making parseHeadingSections O(h^2) in number of headings. Since the parent is already available via the stack (or could be tracked in a Map by id), push the child id directly to the parent section to keep this linear.
There was a problem hiding this comment.
Good catch — fixed in 38c2542. The parent is already on the stack, so I now push the child ID directly to stack.at(-1) instead of doing an O(n) sections.find() scan.
| cluster: pattern?.cluster, | ||
| role: section.role, | ||
| routeBearing: | ||
| section.role === 'route_surface' || |
There was a problem hiding this comment.
section.role === 'route_surface' is currently unreachable for heading sections: inferSectionRole() in source-parser never returns 'route_surface', and the only place that role is assigned is for the synthetic preface anchor (which isn’t part of ir.sections). Either teach inferSectionRole (or parseHeadingSections) to assign 'route_surface' for the intended headings, or drop this branch to avoid dead code.
| section.role === 'route_surface' || |
There was a problem hiding this comment.
Fixed in 38c2542 — removed the section.role === 'route_surface' branch since inferSectionRole() never returns that value for heading sections. The remaining conditions (PREFACE_ROUTE_CITATION, ROUTE_INDEX_CITATION, and routeBearingPatternIds) already cover route-bearing detection correctly.
Split the 1,601-line monolithic compiler.ts into 5 focused stage modules: - source-parser.ts (SourceParser): markdown text → SourceIR (headings, raw metadata, raw route blocks, anchors) - graph-compiler.ts (GraphCompiler): SourceIR → pattern/route/relation graphs - index-projector.ts (IndexProjector): SourceIR + graphs → lexicon, index map, compiled nodes, indexes - validation-runner.ts (ValidationRunner): snapshot candidate → validation findings - query-helpers.ts: query scoring/matching functions consumed by QueryEngine compiler.ts remains as the SnapshotAssembler orchestrator (144 lines), wiring stages together and re-exporting query helpers for backwards compatibility. Public compileFpfSource() API is unchanged. All 39 existing tests pass without modification. Closes #1 Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
- source-parser.ts: move extractIds and unique to top-level import, remove scattered bottom-of-file imports - graph-compiler.ts: add LexiconEntry to top-level type imports, replace inline import() type in buildLexiconRelations signature Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
- source-parser.ts: use stack reference directly instead of O(n) sections.find() for parent-child wiring - index-projector.ts: remove unreachable section.role === 'route_surface' check (inferSectionRole never returns this value) Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
38c2542 to
b0f7020
Compare
HeadingSection.role was computed with fullId=undefined while AnchorRef.role used the actual fullId, causing inconsistent role classification between index-projector and query-engine consumers. Now both use fullId consistently, and buildAnchorMap reuses the already-computed section.role instead of recomputing. Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
What
Split the 1,601-line monolithic
compiler.tsinto 5 focused stage modules undersrc/runtime/, keeping the publiccompileFpfSource()API unchanged.Why
Closes #1 — The single module conflated six responsibilities (source parsing, graph construction, route construction, lexicon/index projection, validation, and snapshot assembly), causing "category collapse" when a compiled artifact was wrong. The split maps each stage to a distinct Role with its own typed input/output contract.
Type
refactor— code restructuringChanges
source-parser.ts(SourceParser, 508 lines): markdown text →SourceIR(headings, raw metadata, raw route blocks, anchors). Exports theSourceIRinterface and all parsing helpers.graph-compiler.ts(GraphCompiler, 519 lines):SourceIR→ pattern/route/relation graphs. Builds pattern graph, route graph, outline relations, explicit reference relations, lexicon relations.index-projector.ts(IndexProjector, 300 lines):SourceIR+ graphs → lexicon, index map, compiled nodes, indexes. Handles inverted index construction and lexeme filtering.validation-runner.ts(ValidationRunner, 71 lines): snapshot candidate → validation findings (duplicate IDs, unresolved references, missing fields, broken routes).query-helpers.ts(190 lines): query scoring/matching functions (scorePatternQuery,scoreRouteQuery,selectBestAnchors, etc.) consumed byQueryEngine.compiler.ts(SnapshotAssembler, 144 lines): slim orchestrator that wires stages together and re-exports query helpers for backwards compatibility.Validation
bun run lintpasses locallybun run checkpasses locallybun run testpasses locally — all 39 tests pass without modificationbun run buildsucceeds (if runtime/server code touched)bun run docs:buildsucceeds (if docs touched)Boundary check
FPF-spec.mdonly — no additional corpora addedsrc/mcp/tool-contracts.tsAgent metadata
Requested by: @venikman
Summary by CodeRabbit