Add stage-local contract tests for compiler and query pipelines #6

@devin-ai-integration

Objective

Add stage-local contract tests for the compiler and query pipelines. The current test suite (6 files, 1,530 lines) covers the full lifecycle end to end, but it cannot pinpoint which internal stage broke when an answer is wrong.

Compiler pipeline tests needed:

| Test category | What it validates |
| --- | --- |
| Parser resilience | Same semantic IR under whitespace, reorder, and prose-only rewrites |
| Graph closure | No dangling references, route closure intact, relation typing intact |
| Index round-trip | Aliases, anchors, IDs, and route labels remain resolvable |
| Validation coverage | Known-bad inputs produce specific validation findings |
| Snapshot determinism | Same source → byte-identical snapshot |
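
As an illustration of the snapshot determinism contract, the sketch below serializes a value with object keys sorted at every level, so that two builds of the same source cannot differ merely because of key insertion order. The `Json` type and the `canonicalize` helper are hypothetical; the issue does not show the real snapshot format or serializer, so treat this as one possible shape for such a test rather than the project's implementation.

```typescript
// Hypothetical stand-in for the compiler's snapshot payload; the real
// snapshot types are not shown in this issue.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serialize with object keys sorted at every nesting level so that key
// insertion order cannot leak into the output. Array order is preserved,
// because element order is assumed to be meaningful.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const fields = keys.map(
    (k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json),
  );
  return "{" + fields.join(",") + "}";
}

// Contract check: two independent builds of the same source must serialize
// to exactly the same string.
function sameSnapshot(first: Json, second: Json): boolean {
  return canonicalize(first) === canonicalize(second);
}
```

A contract test along these lines would compile the same source twice and assert `sameSnapshot(a, b)`; comparing the serialized strings (or a hash of them) is what makes the check byte-level rather than structural.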

Query pipeline tests needed:

| Test category | What it validates |
| --- | --- |
| Trace determinism | Same snapshot + same query → same selected support set |
| Seed coverage | Known question patterns activate expected candidate categories |
| Frontier bounds | Expansion respects hop budget and anchor limits |
| Projection stability | Answer surface may rephrase; support set must not silently drift |
| Synthesis isolation | Synthesis failure does not alter deterministic answer content |
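
To make the frontier bounds contract concrete, here is a minimal sketch of a bounded breadth-first expansion together with the property a contract test could assert on its output. Everything here is an assumption for illustration: `Graph`, `Visit`, `expandFrontier`, and `respectsBounds` are hypothetical names, and "anchor limit" is modelled simply as a cap on the number of collected nodes, which may not match the query engine's actual definition.

```typescript
// Hypothetical adjacency-list graph: node id -> ids of neighbouring nodes.
type Graph = Map<string, string[]>;

interface Visit {
  id: string;
  hops: number; // distance from the nearest seed
}

// Breadth-first expansion that stops at `hopBudget` hops from the seeds and
// never collects more than `maxAnchors` nodes in total.
function expandFrontier(
  graph: Graph,
  seeds: string[],
  hopBudget: number,
  maxAnchors: number,
): Visit[] {
  const seen = new Set<string>();
  const visits: Visit[] = [];
  let frontier: string[] = [];
  for (const seed of seeds) {
    if (visits.length >= maxAnchors) return visits;
    if (seen.has(seed)) continue;
    seen.add(seed);
    visits.push({ id: seed, hops: 0 });
    frontier.push(seed);
  }
  for (let hops = 1; hops <= hopBudget && frontier.length > 0; hops++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbour of graph.get(id) || []) {
        if (visits.length >= maxAnchors) return visits;
        if (seen.has(neighbour)) continue;
        seen.add(neighbour);
        visits.push({ id: neighbour, hops });
        next.push(neighbour);
      }
    }
    frontier = next;
  }
  return visits;
}

// The stage promise, stated as a predicate over the expansion result.
function respectsBounds(visits: Visit[], hopBudget: number, maxAnchors: number): boolean {
  return visits.length <= maxAnchors && visits.every((v) => v.hops <= hopBudget);
}
```

Keeping the promise as a separate predicate means the same check can be pointed at the real frontier stage once it exists, with a small fixture graph where the node just beyond the hop budget is known.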

"Done" looks like: Each compiler/query stage has at least one focused contract test that fails precisely when that stage's promise breaks.

Kind

test coverage

Affected area

src/runtime/ (compiler & query engine)

Acceptance criteria

  • At least one contract test per compiler stage (parser, graph, index, validation, snapshot)
  • At least one contract test per query stage (normalizer, seeder, ranker, frontier, projector, synthesis)
  • Tests pinpoint the failing stage — not just "end-to-end answer is wrong"
  • Existing end-to-end tests remain untouched
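
The "pinpoint the failing stage" criterion can be sketched as a set of independent per-stage checks whose failures are reported by stage name. The `StageContract` shape and the `brokenStages` helper below are hypothetical and only illustrate the idea; in practice a test runner's per-stage `describe` blocks would play this role.

```typescript
// Hypothetical contract record: one named stage plus a check that returns
// true when the stage's promise holds.
interface StageContract {
  stage: string;
  check: () => boolean;
}

// Run every contract independently and return the names of the stages whose
// check returned false or threw. A throwing stage does not prevent later
// stages from being checked, so the report is attributed per stage.
function brokenStages(contracts: StageContract[]): string[] {
  const broken: string[] = [];
  for (const contract of contracts) {
    let ok = false;
    try {
      ok = contract.check();
    } catch {
      ok = false;
    }
    if (!ok) broken.push(contract.stage);
  }
  return broken;
}
```

The point of the design is that each check receives a fixture for its own stage instead of the output of the previous stage, so a regression in one stage does not cascade into failures reported against the others.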

Agent-delegable?

partially — needs human review

Additional context

  • Current tests: tests/fpf-spec-runtime.test.ts (464 lines), tests/lm-studio-synthesizer.test.ts (488 lines), tests/mcp-server.test.ts (297 lines), tests/docs-projection.test.ts (189 lines), tests/runtime-path-resolution.test.ts (69 lines), tests/server-config.test.ts (23 lines)
  • Best tackled after #1 (Split compiler.ts into distinct owner stages) and #2 (Split query-engine.ts into retrieval stages), since stage boundaries will be clearer once those splits land
  • Some tests (graph closure, trace determinism) can be written against the current monolithic modules
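
A graph closure check of the kind mentioned in the last bullet might look like the sketch below. The `CompiledGraph` and `Edge` shapes and the `findClosureViolations` helper are assumptions made for illustration; the real compiler output in src/runtime/ is not shown here, and route closure is not modelled.

```typescript
// Hypothetical shapes for the compiled graph; the real types may differ.
interface Edge {
  from: string;
  to: string;
  relation: string;
}

interface CompiledGraph {
  nodeIds: string[];
  edges: Edge[];
}

// Return one finding per violation: an edge endpoint that is not a known
// node (dangling reference) or a relation outside the allowed set.
function findClosureViolations(graph: CompiledGraph, allowedRelations: string[]): string[] {
  const ids = new Set(graph.nodeIds);
  const relations = new Set(allowedRelations);
  const findings: string[] = [];
  for (const edge of graph.edges) {
    if (!ids.has(edge.from)) findings.push(`dangling source: ${edge.from}`);
    if (!ids.has(edge.to)) findings.push(`dangling target: ${edge.to}`);
    if (!relations.has(edge.relation)) findings.push(`unknown relation: ${edge.relation}`);
  }
  return findings;
}
```

A contract test would assert an empty findings list for the compiled spec and a specific, non-empty list for a deliberately broken fixture, so a failure message names the offending edge rather than a wrong final answer.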

FPF grounding:

  • B.3 (Trust & Assurance Calculus, Stable): Each stage needs its own evidence of correctness anchored at the stage level. End-to-end tests create a single assurance layer where trust is opaque — when a test fails, you cannot attribute the failure to a specific stage.
  • A.15 (Role–Method–Work Alignment, Stable): Each stage's Work must be independently auditable. The current end-to-end-only tests violate A.15's principle that work evidence should be traceable to the specific role/method that produced it.
  • B.3.4 (Evidence Decay & Epistemic Debt, Stable): Tests are evidence artifacts with freshness constraints. When stage code changes, stage-local tests provide targeted evidence refresh rather than requiring full end-to-end reruns.

Note: Previous version cited E.19 (Pattern Quality Gates) and F.15 (SCR/RSCR Harness). E.19 governs admission/refresh of FPF spec patterns to the canonical corpus, not software testing. F.15 is the harness for the FPF unification process (Part F), not software test suites. Both were misapplied and have been replaced with B.3, A.15, and B.3.4.

Measurable impact

| Metric | Before | After (target) |
| --- | --- | --- |
| Test files | 6 | ≥8 (new contract test files) |
| Contract tests per compiler stage | 0 | ≥5 (parser, graph, index, validation, snapshot) |
| Contract tests per query stage | 0 | ≥5 (normalizer, seeder, ranker, frontier, projector) |
| Existing end-to-end tests modified | 0 | 0 (untouched) |

How to measure:

  • Test files: ls tests/*.test.ts | wc -l
  • Contract tests per compiler stage (total matching describe lines across all test files): grep -Eh 'describe.*(parser|graph|index|validation|snapshot)' tests/*.test.ts | wc -l
  • Contract tests per query stage (total matching describe lines across all test files): grep -Eh 'describe.*(normalizer|seeder|ranker|frontier|projector)' tests/*.test.ts | wc -l
  • Existing end-to-end tests modified: git diff HEAD -- tests/fpf-spec-runtime.test.ts tests/mcp-server.test.ts | wc -l (should be 0)
