refactor: split compiler.ts into distinct owner stages by venikman · Pull Request #30 · venikman/fpf-memory

venikman · 2026-04-13T00:49:47Z

What

Split the 1,601-line monolithic compiler.ts into 5 focused stage modules under src/runtime/, keeping the public compileFpfSource() API unchanged.

Why

Closes #1 — The single module conflated six responsibilities (source parsing, graph construction, route construction, lexicon/index projection, validation, and snapshot assembly), causing "category collapse" when a compiled artifact was wrong. The split maps each stage to a distinct Role with its own typed input/output contract.

Type

refactor — code restructuring

Changes

source-parser.ts (SourceParser, 508 lines): markdown text → SourceIR (headings, raw metadata, raw route blocks, anchors). Exports the SourceIR interface and all parsing helpers.
graph-compiler.ts (GraphCompiler, 519 lines): SourceIR → pattern/route/relation graphs. Builds pattern graph, route graph, outline relations, explicit reference relations, lexicon relations.
index-projector.ts (IndexProjector, 300 lines): SourceIR + graphs → lexicon, index map, compiled nodes, indexes. Handles inverted index construction and lexeme filtering.
validation-runner.ts (ValidationRunner, 71 lines): snapshot candidate → validation findings (duplicate IDs, unresolved references, missing fields, broken routes).
query-helpers.ts (190 lines): query scoring/matching functions (scorePatternQuery, scoreRouteQuery, selectBestAnchors, etc.) consumed by QueryEngine.
compiler.ts (SnapshotAssembler, 144 lines): slim orchestrator that wires stages together and re-exports query helpers for backwards compatibility.

Validation

bun run lint passes locally
bun run check passes locally
bun run test passes locally — all 39 tests pass without modification
No new warnings introduced
bun run build succeeds (if runtime/server code touched)
bun run docs:build succeeds (if docs touched)
Relevant docs updated (README, docs/, inline JSDoc if applicable)

Boundary check

Runtime source set is FPF-spec.md only — no additional corpora added
No vector database or remote indexing introduced
No Python code added
MCP tool contracts stay in src/mcp/tool-contracts.ts

Agent metadata

Field	Value
Agent	Devin
Session	https://app.devin.ai/sessions/f9b77396c94f4a03971ac0c8cbe5f8b7
Prompt	Split compiler.ts into distinct owner stages (Issue #1)

Requested by: @venikman

Summary by CodeRabbit

Refactor
- Reorganized compiler architecture into focused modules for improved maintainability and code clarity. Public API and functionality remain unchanged.

devin-ai-integration · 2026-04-13T00:49:49Z

Original prompt from Stanislau

how our issues doing?

devin-ai-integration · 2026-04-13T00:49:50Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

coderabbitai · 2026-04-13T00:49:54Z

Warning

Rate limit exceeded

@devin-ai-integration[bot] has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 52 minutes and 0 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 52 minutes and 0 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2702e8ec-436b-4f70-9af3-549979d5996a

📥 Commits

Reviewing files that changed from the base of the PR and between b0f7020 and c33422a.

📒 Files selected for processing (1)

src/runtime/source-parser.ts

📝 Walkthrough

Walkthrough

The refactoring splits the monolithic compiler.ts (1,601 lines) into five focused modules—source-parser.ts, graph-compiler.ts, index-projector.ts, query-helpers.ts, and validation-runner.ts—each handling a distinct compilation stage. The public compileFpfSource() API remains unchanged, now serving as an orchestrator delegating to these new modules. Query helpers previously inline are re-exported from their new home for backwards compatibility.

Changes

Cohort / File(s)	Summary
Compiler Orchestrator Refactor `src/runtime/compiler.ts`	Refactored into staged orchestrator: removed 1,509 lines of inline implementation (local parsing, graph/lexicon/index builders, validators); now delegates to new modules and re-exports query helpers from `./query-helpers.js`. Public API and snapshot structure unchanged.
Source Parsing Stage `src/runtime/source-parser.ts`	New module parsing markdown text into `SourceIR` (headings, metadata, routes, anchors). Exports `parseSource()` and domain helpers for alias derivation, role inference, relation extraction, and string normalization; re-exports text utilities.
Graph Compilation Stage `src/runtime/graph-compiler.ts`	New module building pattern/route graphs and relation edges. Exports `buildPatternGraph()`, `buildRouteGraph()`, `buildOutlineRelations()`, `buildExplicitReferenceRelations()`, `buildLexiconRelations()`, and `uniqueRelations()` for deduplicating edges.
Index Projection Stage `src/runtime/index-projector.ts`	New module projecting IR into runtime indexes and lexicon. Exports `buildIndexMap()`, `buildLexicon()`, `buildCompiledNodes()`, and `buildIndexes()` for constructing inverted indexes, lexicon entries, and compiled node structures with neighbor edges.
Query Helper Utilities `src/runtime/query-helpers.ts`	New module providing query classification and matching. Exports `isPartCDraftQuery()`, `scorePatternQuery()`, `scoreRouteQuery()`, `selectBestAnchors()`, `formatAnchorSentences()`, `getPartCDraftsByCluster()`, and `findLexemeMatches()` previously inline in `compiler.ts`.
Validation Runner Stage `src/runtime/validation-runner.ts`	New module validating snapshot invariants. Exports `buildValidation()` computing duplicate IDs, unresolved references, missing required fields, and broken routes; includes internal duplicate-detection helper.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant compileFpfSource as compileFpfSource<br/>(Orchestrator)
    participant SourceParser
    participant GraphCompiler
    participant IndexProjector
    participant ValidationRunner
    participant SnapshotAssembler

    Caller->>compileFpfSource: sourceText
    
    compileFpfSource->>SourceParser: sourceText
    SourceParser->>SourceParser: split preface/catalog<br/>parse headings/metadata<br/>extract anchors
    SourceParser-->>compileFpfSource: SourceIR
    
    compileFpfSource->>GraphCompiler: SourceIR
    GraphCompiler->>GraphCompiler: buildPatternGraph()<br/>buildRouteGraph()<br/>buildOutlineRelations()<br/>buildExplicitReferenceRelations()
    GraphCompiler-->>compileFpfSource: pattern/route/relation<br/>graphs
    
    compileFpfSource->>IndexProjector: SourceIR +<br/>graphs
    IndexProjector->>IndexProjector: buildIndexMap()<br/>buildLexicon()<br/>buildCompiledNodes()<br/>buildIndexes()
    IndexProjector-->>compileFpfSource: lexicon,<br/>indexes,<br/>compiled nodes
    
    compileFpfSource->>ValidationRunner: snapshot<br/>candidate
    ValidationRunner->>ValidationRunner: detect duplicates<br/>resolve references<br/>validate fields
    ValidationRunner-->>compileFpfSource: BuildValidation
    
    compileFpfSource->>SnapshotAssembler: artifacts +<br/>validation
    SnapshotAssembler-->>compileFpfSource: Snapshot
    
    compileFpfSource-->>Caller: CompilerOutput

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: add FPF wiki reader — static site generator for browsable spec #10: Wiki generator consumes the refactored compileFpfSource() output and depends directly on snapshot fields (indexRoots, patternGraph.nodes, routeGraph.nodes), so the orchestrator changes affect downstream consumers.
Rewrite MCP server and add ask_fpf wrapper #11: Touches the compiler surface and adjacent query helper modules, with overlapping runtime/query code modifications.

Poem

🐰 Hops of joy! The monolith now hops in stages clear—
Parse, graph, project, validate, assemble here!
No tangled web of 1,601 lines so long,
Just focused modules singing each their song. ✨
Five rabbits better than one sleepy one!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'refactor: split compiler.ts into distinct owner stages' clearly and concisely describes the main objective of the PR—restructuring a monolithic module into separate focused modules.
Description check	✅ Passed	The PR description is comprehensive and follows the template structure, including What, Why, Type, Changes, Validation, Boundary check, and Agent metadata sections. However, build/docs validation steps are marked incomplete.
Linked Issues check	✅ Passed	The PR successfully implements all core acceptance criteria from issue `#1`: split into 5 focused modules (source-parser, graph-compiler, index-projector, validation-runner, query-helpers, plus refactored compiler), each with typed input/output contracts, public API unchanged, and all tests passing.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#1` objectives: modules implement parsing, graph construction, index projection, validation, and query helpers as specified. No unrelated refactoring, new features, or external dependencies introduced.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch devin/1776040845-split-compiler

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request refactors the compiler.ts module by decomposing its logic into several focused files: source-parser.ts, graph-compiler.ts, index-projector.ts, validation-runner.ts, and query-helpers.ts. The main compiler.ts now serves as an orchestrator for these stages. Feedback suggests consolidating scattered imports and re-exports in source-parser.ts and standardizing type imports in graph-compiler.ts to improve maintainability and readability.

gemini-code-assist · 2026-04-13T00:51:30Z

+import {
+  cleanMarkdown,
+  isMarkdownSeparatorRow,
+  normalizeForLookup,
+  normalizeLabel,
+  slugify,
+  splitMarkdownRow,
+  stripMarkdownToText,
+} from './text.js';


Consolidate all imports from ./text.js at the top of the file. extractIds and unique are currently imported at the bottom of the file (lines 508 and 510), which is non-standard and can lead to confusion during maintenance.

Suggested change

import {

cleanMarkdown,

isMarkdownSeparatorRow,

normalizeForLookup,

normalizeLabel,

slugify,

splitMarkdownRow,

stripMarkdownToText,

} from './text.js';

import {

cleanMarkdown,

extractIds,

isMarkdownSeparatorRow,

normalizeForLookup,

normalizeLabel,

slugify,

splitMarkdownRow,

stripMarkdownToText,

unique,

} from './text.js';

Fixed in 4a79c54 — consolidated extractIds and unique into the top-level ./text.js import block and removed the scattered bottom-of-file import.

gemini-code-assist · 2026-04-13T00:51:30Z

+}
+
+// Re-export extractIds so downstream modules can use it without importing text.js directly
+export { extractIds } from './text.js';


This re-export is redundant if consolidated with other re-exports from ./text.js at the end of the file. Removing scattered re-exports improves module clarity.

Fixed in 4a79c54 — removed the standalone extractIds re-export and consolidated it with the other re-exports at the end of the file.

gemini-code-assist · 2026-04-13T00:51:30Z

+export {
+  cleanMarkdown,
+  normalizeForLookup,
+  normalizeLabel,
+  slugify,
+} from './text.js';
+
+import { extractIds, unique } from './text.js';


The import statement at the end of the file is misplaced and duplicated (see line 510). Consolidate the re-exports here and remove the misplaced imports to adhere to standard module structure.

Suggested change

export {

cleanMarkdown,

normalizeForLookup,

normalizeLabel,

slugify,

} from './text.js';

import { extractIds, unique } from './text.js';

export {

cleanMarkdown,

extractIds,

normalizeForLookup,

normalizeLabel,

slugify,

} from './text.js';

gemini-code-assist · 2026-04-13T00:51:30Z

+import type {
+  AnchorRef,
+  PatternRecord,
+  RelationEdge,
+  RouteRecord,
+  Snapshot,
+} from './types.js';


Include LexiconEntry in the top-level type imports from ./types.js instead of using an inline import() type in function signatures. This improves readability and maintains consistency with how other types are handled in this file.

Suggested change

import type {

AnchorRef,

PatternRecord,

RelationEdge,

RouteRecord,

Snapshot,

} from './types.js';

import type {

AnchorRef,

LexiconEntry,

PatternRecord,

RelationEdge,

RouteRecord,

Snapshot,

} from './types.js';

Fixed in 4a79c54 — added LexiconEntry to the top-level type imports from ./types.js.

gemini-code-assist · 2026-04-13T00:51:30Z

+}
+
+export function buildLexiconRelations(
+  lexicon: Record<string, import('./types.js').LexiconEntry>,


Use the LexiconEntry type imported at the top of the file for better clarity.

Suggested change

lexicon: Record<string, import('./types.js').LexiconEntry>,

lexicon: Record<string, LexiconEntry>,

Fixed in 4a79c54 — replaced import('./types.js').LexiconEntry with the top-level LexiconEntry type import.

Copilot

Pull request overview

Refactors the runtime compiler by splitting the previously monolithic src/runtime/compiler.ts into focused stage modules while keeping the public compileFpfSource() API stable, and moves query-scoring utilities into a dedicated helper module (re-exported for compatibility).

Changes:

Introduces source-parser.ts, graph-compiler.ts, index-projector.ts, and validation-runner.ts as discrete compilation stages.
Extracts query scoring/selection utilities into query-helpers.ts and re-exports them from compiler.ts.
Updates compiler.ts into a slimmer orchestrator that wires the stages and assembles the final Snapshot.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
src/runtime/compiler.ts	Orchestrates stage modules, assembles `Snapshot`, re-exports query helpers for backward compatibility.
src/runtime/source-parser.ts	Parses markdown source into `SourceIR` (sections, catalog metadata, anchor map) plus parsing helpers.
src/runtime/graph-compiler.ts	Builds pattern/route graphs and relation edges from `SourceIR`.
src/runtime/index-projector.ts	Projects graphs into index map, lexicon, compiled nodes, and inverted indexes.
src/runtime/validation-runner.ts	Computes validation findings for the snapshot candidate.
src/runtime/query-helpers.ts	Query scoring/matching helpers consumed by `QueryEngine` (via compiler re-export).

Comments suppressed due to low confidence (1)

src/runtime/source-parser.ts:508

There’s an import statement at the bottom of the module. This is easy to miss and results in multiple imports from the same module; move extractIds/unique into the existing top-level import list (and keep re-exports separate) to match the repo’s import-at-top convention and reduce confusion.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-13T00:54:15Z

+/**
+ * compiler.ts — SnapshotAssembler orchestrator.
+ *
+ * Delegates to four focused stage modules:
+ *   1. source-parser.ts   (SourceParser)
+ *   2. graph-compiler.ts  (GraphCompiler)
+ *   3. index-projector.ts (IndexProjector)
+ *   4. validation-runner.ts (ValidationRunner)
+ *
+ * This file wires the stages together and assembles the final Snapshot.
+ * The public `compileFpfSource()` API is unchanged.
+ */


PR description/issue #1 acceptance criteria mention each stage should publish stage-local evidence (hashes, counts, timing) into build-audit.json, but the refactor keeps BuildAudit as a single summary (see src/runtime/runtime.ts buildCompilerSummary) and none of the new stage modules emit per-stage audit sections. Either adjust the PR/issue linkage (don’t claim closing #1) or extend BuildAudit + refresh() to collect and write per-stage evidence.

Fair point. The acceptance criteria in issue #1 focus on the structural split ("each stage is a separate module, QueryEngine is an orchestrator only, existing tests pass"). Per-stage audit evidence (hashes, counts, timing) is a natural follow-up that aligns better with issue #6 (stage-local contract tests), which will add the infrastructure to collect and validate per-stage outputs. This PR intentionally keeps the scope to the structural decomposition to avoid scope creep — per-stage audit sections will be added as part of #6.

Copilot · 2026-04-13T00:54:15Z

+      childIds: [],
+    };
+    if (parentId) {
+      const parent = sections.find((candidate) => candidate.id === parentId);


This parent-child wiring does an O(n) sections.find(...) for every heading, making parseHeadingSections O(h^2) in number of headings. Since the parent is already available via the stack (or could be tracked in a Map by id), push the child id directly to the parent section to keep this linear.

Good catch — fixed in 38c2542. The parent is already on the stack, so I now push the child ID directly to stack.at(-1) instead of doing an O(n) sections.find() scan.

Copilot · 2026-04-13T00:54:15Z

+        cluster: pattern?.cluster,
+        role: section.role,
+        routeBearing:
+          section.role === 'route_surface' ||


section.role === 'route_surface' is currently unreachable for heading sections: inferSectionRole() in source-parser never returns 'route_surface', and the only place that role is assigned is for the synthetic preface anchor (which isn’t part of ir.sections). Either teach inferSectionRole (or parseHeadingSections) to assign 'route_surface' for the intended headings, or drop this branch to avoid dead code.

Suggested change

section.role === 'route_surface' ||

Fixed in 38c2542 — removed the section.role === 'route_surface' branch since inferSectionRole() never returns that value for heading sections. The remaining conditions (PREFACE_ROUTE_CITATION, ROUTE_INDEX_CITATION, and routeBearingPatternIds) already cover route-bearing detection correctly.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Split the 1,601-line monolithic compiler.ts into 5 focused stage modules: - source-parser.ts (SourceParser): markdown text → SourceIR (headings, raw metadata, raw route blocks, anchors) - graph-compiler.ts (GraphCompiler): SourceIR → pattern/route/relation graphs - index-projector.ts (IndexProjector): SourceIR + graphs → lexicon, index map, compiled nodes, indexes - validation-runner.ts (ValidationRunner): snapshot candidate → validation findings - query-helpers.ts: query scoring/matching functions consumed by QueryEngine compiler.ts remains as the SnapshotAssembler orchestrator (144 lines), wiring stages together and re-exporting query helpers for backwards compatibility. Public compileFpfSource() API is unchanged. All 39 existing tests pass without modification. Closes #1 Co-Authored-By: Stanislau <nedbailov375426@gmail.com>

- source-parser.ts: move extractIds and unique to top-level import, remove scattered bottom-of-file imports - graph-compiler.ts: add LexiconEntry to top-level type imports, replace inline import() type in buildLexiconRelations signature Co-Authored-By: Stanislau <nedbailov375426@gmail.com>

- source-parser.ts: use stack reference directly instead of O(n) sections.find() for parent-child wiring - index-projector.ts: remove unreachable section.role === 'route_surface' check (inferSectionRole never returns this value) Co-Authored-By: Stanislau <nedbailov375426@gmail.com>

HeadingSection.role was computed with fullId=undefined while AnchorRef.role used the actual fullId, causing inconsistent role classification between index-projector and query-engine consumers. Now both use fullId consistently, and buildAnchorMap reuses the already-computed section.role instead of recomputing. Co-Authored-By: Stanislau <nedbailov375426@gmail.com>

devin-ai-integration Bot assigned venikman Apr 13, 2026

Copilot AI review requested due to automatic review settings April 13, 2026 00:49

Copilot started reviewing on behalf of venikman April 13, 2026 00:50 View session

gemini-code-assist Bot reviewed Apr 13, 2026

View reviewed changes

Copilot AI reviewed Apr 13, 2026

View reviewed changes

devin-ai-integration Bot reviewed Apr 13, 2026

View reviewed changes

devin-ai-integration Bot and others added 3 commits April 13, 2026 16:50

devin-ai-integration Bot force-pushed the devin/1776040845-split-compiler branch from 38c2542 to b0f7020 Compare April 13, 2026 16:51

This comment was marked as resolved.

Sign in to view

venikman merged commit d27f8de into main Apr 13, 2026
2 checks passed

This was referenced Apr 14, 2026

refactor: extract inline heuristic seeds and route constraints to snapshot-attached rules #28

Merged

feat: persist session cache across process restarts #27

Merged

feat: add stable indexing view and semantic refresh classification #29

Merged

venikman deleted the devin/1776040845-split-compiler branch May 9, 2026 10:59

	lexicon: Record<string, import('./types.js').LexiconEntry>,
	lexicon: Record<string, LexiconEntry>,

Conversation

venikman commented Apr 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Type

Changes

Validation

Boundary check

Agent metadata

Summary by CodeRabbit

Uh oh!

devin-ai-integration Bot commented Apr 13, 2026

Uh oh!

devin-ai-integration Bot commented Apr 13, 2026

🤖 Devin AI Engineer

Uh oh!

coderabbitai Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 13, 2026

venikman commented Apr 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 13, 2026 •

edited

Loading