Hey everyone 👋
I've been thinking about where reqstool is heading architecturally and have drafted a set of issues that together form a significant evolution of the codebase. Before diving into implementation, I'd love your input on the direction.
The big picture
Three interconnected initiatives:
1. In-memory SQLite as single source of truth (#313)
Today our pipeline copies and reshapes data through multiple stages. Each stage has its own data structures, hand-built join indexes (`svcs_from_req`, `mvrs_from_svc`), and manual cleanup code. Every new feature that touches cross-references — grouping requirements, new statistics, per-URN breakdowns — means more hand-coded dict traversals and more places for bugs.

The proposal: replace the intermediate structures with an in-memory SQLite database. Parse YAML/JUnit → validate with Pydantic → INSERT into SQLite. Filters become DELETE with CASCADE. Statistics become GROUP BY queries. A new grouping dimension is a one-line query, not a new class + fields + tests.
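As a rough sketch of that flow, with plain dataclasses standing in for the real Pydantic models (the table names, field names, and statistics here are invented for illustration, not reqstool's actual schema):

```python
import sqlite3
from dataclasses import dataclass

# Stand-ins for the validated Pydantic models (hypothetical shapes).
@dataclass
class Requirement:
    id: str
    title: str

@dataclass
class SVC:
    id: str
    requirement_id: str
    verified: bool

def build_db(reqs, svcs):
    """Parsing/validation happens upstream; here we only INSERT."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE requirements (id TEXT PRIMARY KEY, title TEXT)")
    db.execute(
        "CREATE TABLE svcs (id TEXT PRIMARY KEY, "
        "requirement_id TEXT REFERENCES requirements(id), verified INTEGER)"
    )
    db.executemany("INSERT INTO requirements VALUES (?, ?)",
                   [(r.id, r.title) for r in reqs])
    db.executemany("INSERT INTO svcs VALUES (?, ?, ?)",
                   [(s.id, s.requirement_id, int(s.verified)) for s in svcs])
    return db

def verified_counts(db):
    """A statistic that previously needed a hand-built join index."""
    return db.execute(
        "SELECT r.id, COUNT(s.id), COALESCE(SUM(s.verified), 0) "
        "FROM requirements r LEFT JOIN svcs s ON s.requirement_id = r.id "
        "GROUP BY r.id ORDER BY r.id"
    ).fetchall()
```

A per-URN breakdown or a new grouping dimension would then only change the GROUP BY clause, not the data model.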
Key design decisions documented in #313:
- `--no-filter` skips the DELETE step
- The DB can be written to a `.db` file and queried with any SQLite tool

See the full pros/cons and impact analysis on existing issues in #313.
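For the filter semantics, cascading foreign keys do the cleanup that is manual today. One SQLite gotcha worth noting: foreign-key enforcement must be switched on per connection. A minimal sketch (URN values invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # cascades are off by default in SQLite
db.execute("CREATE TABLE requirements (id TEXT PRIMARY KEY, urn TEXT)")
db.execute("CREATE TABLE svcs (id TEXT PRIMARY KEY, requirement_id TEXT "
           "REFERENCES requirements(id) ON DELETE CASCADE)")
db.execute("INSERT INTO requirements VALUES ('REQ_001', 'urn:keep'), "
           "('REQ_002', 'urn:drop')")
db.execute("INSERT INTO svcs VALUES ('SVC_1', 'REQ_001'), ('SVC_2', 'REQ_002')")

# A filter is a single DELETE; SVC_2 disappears via the cascade.
# (--no-filter would simply skip this statement.)
db.execute("DELETE FROM requirements WHERE urn = 'urn:drop'")
remaining = [row[0] for row in db.execute("SELECT id FROM svcs ORDER BY id")]
```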
2. Separation of internal data layer from user-facing contracts (#318)
Today, `model_dump_json()` on internal Pydantic models IS the JSON output. Internal field renames break external consumers. We want a clear boundary:

- Mapping layer: transforms internal → external (only place that changes when either side evolves)

Related: #315 (define JSON Schema for export/status output), #317 (JSON export strategy with SQLite).
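A minimal sketch of that boundary (the model and field names below are invented for illustration):

```python
from dataclasses import dataclass

# Internal row shape: free to be renamed or restructured without notice.
@dataclass
class RequirementRow:
    req_id: str        # hypothetical internal field name
    title: str
    is_verified: bool

# Mapping layer: the only code that knows both shapes. The external keys
# ("id", "title", "verified") are the published contract and stay stable
# even if the internal fields above are renamed.
def to_external(row: RequirementRow) -> dict:
    return {"id": row.req_id, "title": row.title, "verified": row.is_verified}
```

The external dict is what a JSON Schema (#315) would then pin down, independently of the internal models.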
3. Language Server Protocol for editor integration (#314)
Longer-term: an LSP server backed by the SQLite DB. Diagnostics on `@Requirements()` annotations, hover for requirement status, go-to-definition between reqs/SVCs/tests, completions for requirement IDs. Works in any editor (VS Code, IntelliJ, Neovim, etc.).
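Hover, for instance, reduces to a lookup in the same DB. This sketch shows only the query a `textDocument/hover` handler would run once it has the requirement ID under the cursor — the LSP plumbing itself is omitted, and the schema is illustrative:

```python
import sqlite3
from typing import Optional

def hover_text(db: sqlite3.Connection, requirement_id: str) -> Optional[str]:
    # What a textDocument/hover handler would return for the requirement
    # ID found under the cursor (schema invented for this sketch).
    row = db.execute(
        "SELECT title, status FROM requirements WHERE id = ?",
        (requirement_id,),
    ).fetchone()
    if row is None:
        return None
    title, status = row
    return f"{requirement_id}: {title} [{status}]"
```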
Tracking

All of this is tracked in the epic: #316
A few open questions
Migration strategy: Big-bang rewrite or incremental? We could introduce SQLite behind the existing interfaces first (populate DB alongside dicts, verify parity), then swap consumers one by one.
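The incremental route could be fairly mechanical: keep building the old index, shadow-write the DB, and assert parity in tests before swapping any consumer. A sketch (the `svcs_from_req` name echoes the existing index; everything else is invented):

```python
import sqlite3

def build_legacy_index(pairs):
    # The existing hand-built join index (svcs_from_req style).
    svcs_from_req = {}
    for svc_id, req_id in pairs:
        svcs_from_req.setdefault(req_id, []).append(svc_id)
    return svcs_from_req

def build_shadow_db(pairs):
    # Same data, written to the new SQLite side in parallel.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE svcs (id TEXT, requirement_id TEXT)")
    db.executemany("INSERT INTO svcs VALUES (?, ?)", pairs)
    return db

def parity_ok(svcs_from_req, db):
    # Both representations must answer the same question identically.
    for req_id, svc_ids in svcs_from_req.items():
        rows = db.execute(
            "SELECT id FROM svcs WHERE requirement_id = ? ORDER BY id",
            (req_id,),
        ).fetchall()
        if sorted(svc_ids) != [r[0] for r in rows]:
            return False
    return True
```

Once a consumer reads only from the DB, its slice of the legacy dicts (and the parity check for it) can be deleted.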
Test data strategy: With SQLite we'll have three possible layers for test data:
- YAML fixture files → integration-level, tests the full parse pipeline
- Pydantic model construction → unit-level, tests validation and business logic
- `.db` fixture files → unit-level, tests queries and output rendering in isolation

When do we use which? Probably: YAML for integration tests, `.db` fixtures for query/rendering tests, Pydantic only during the parse-and-insert step.

Dogfooding: We don't use reqstool for reqstool itself today. The codebase has `@Requirements("REQ_xxx")` decorators, but we're not running reqstool against its own requirements in CI. Should we? This would be a great way to validate changes.

What I'd like your input on

- Does the SQLite direction make sense? Any concerns with the approach?
- Which issues should we prioritize first?
- Thoughts on the migration strategy (incremental vs big-bang)?
cc @Jonas-Werne @lfvmarcus @lfvdavid @lfvsimon