ConfiguredThings/RDP.js

RDP.js

Write parsers, not boilerplate. Drop in a grammar, get a fully typed TypeScript parser — zero dependencies, dual ESM/CJS, batteries included.

Hi, I'm recursquirrel, your friendly recursively descending parsing squirrel

Stability

Warning

This library is pre-1.0. Minor versions may introduce breaking changes to the public API, CLI flags, and generated output. Pin to an exact version in production.

What is @configuredthings/rdp.js?

A minimal, typed base class that handles buffer management and position tracking so subclasses can focus purely on grammar rules. TypeScript, dual ESM/CJS, zero runtime dependencies.

Key components:

  • ScannerlessRDParser — concrete base class for character-by-character parsers; subclass and implement each production rule as a method
  • TokenRDParser — concrete base class for token-stream parsers; used by the span-lexer scaffold
  • RDParser — abstract base shared by both; provides position tracking, backtracking, and error reporting
  • rdp-gen — CLI; reads an ISO 14977 EBNF or RFC 5234 ABNF grammar file and emits a strictly-typed TypeScript parser class and exported discriminated-union parse-tree types
    • rdp-gen --traversal / --transformer / --facade / --pipeline — generates a one-time typed starter file for the chosen pattern
    • rdp-gen --lexer span — generates a span-tokeniser + classifier + {Base}TokenParser scaffold
    • rdp-gen init — scaffolds a complete new project with package.json, tsconfig.json, and a starter parser class
  • GrammarInterpreter — runtime interpreter; execute grammars without a code-generation step
  • ObservableRDParser — opt-in parse tracing via an attached ParseObserver; use withObservable() to apply the same to TokenRDParser subclasses
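The division of labour described above, where a base class owns the buffer and position and a subclass implements one method per production rule, can be sketched without the library. This is a minimal illustration only: `MiniParser`, `SumParser`, and every method name below are hypothetical stand-ins, not the RDP.js API.

```typescript
// Hypothetical stand-in for a parser base class; NOT the RDP.js API.
abstract class MiniParser {
  protected pos = 0;

  constructor(protected readonly input: string) {}

  // Look at the current character without consuming it ("" at end of input).
  protected peek(): string {
    return this.input.charAt(this.pos);
  }

  // Consume and return the current character.
  protected advance(): string {
    const c = this.peek();
    this.pos += 1;
    return c;
  }

  protected atEnd(): boolean {
    return this.pos >= this.input.length;
  }

  protected fail(expected: string): never {
    throw new Error(`Expected ${expected} at offset ${this.pos}`);
  }
}

function isDigit(c: string): boolean {
  return c >= "0" && c <= "9";
}

// Grammar (EBNF):
//   sum    = number, { "+", number } ;
//   number = digit, { digit } ;
class SumParser extends MiniParser {
  // One method per production rule.
  parseSum(): number {
    let total = this.parseNumber();
    while (this.peek() === "+") {
      this.advance();
      total += this.parseNumber();
    }
    if (!this.atEnd()) this.fail("end of input");
    return total;
  }

  private parseNumber(): number {
    if (!isDigit(this.peek())) this.fail("digit");
    let value = 0;
    while (isDigit(this.peek())) {
      value = value * 10 + (this.advance().charCodeAt(0) - 48);
    }
    return value;
  }
}
```

With this sketch, `new SumParser("12+30+0").parseSum()` evaluates the sum while parsing, and malformed input such as `"1+"` throws an error that reports the offset.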

LL(1) grammars and backtracking

Important

rdp-gen generates parsers that assume LL(1) grammars. Feeding it a non-LL(1) grammar will produce a parser that silently returns incorrect results, not a helpful error — with one exception: left recursion is detected at generation time and rejected.

What LL(1) means: the parser scans left-to-right (first L), produces the leftmost derivation (second L), and needs only one symbol of lookahead (the 1) to decide which production to apply at each step. For a scannerless parser that symbol is one byte. Grammars where two alternatives of the same rule can begin with the same symbol (a common prefix), or where a rule is ambiguous, are not LL(1).
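As a rough illustration of single-symbol lookahead, the sketch below picks between two alternatives by inspecting only the next character. The grammar, the `Value` type, and `parseValue` are invented for this example and do not come from rdp-gen.

```typescript
// Grammar (EBNF), LL(1) because each alternative of `value` starts differently:
//   value  = number | list ;
//   number = digit, { digit } ;
//   list   = "[", [ value, { ",", value } ], "]" ;
type Value =
  | { kind: "number"; value: number }
  | { kind: "list"; items: Value[] };

function parseValue(input: string): Value {
  let pos = 0;
  const peek = (): string => input.charAt(pos);
  const isDigit = (c: string): boolean => c >= "0" && c <= "9";
  const expect = (c: string): void => {
    if (peek() !== c) throw new Error(`Expected "${c}" at offset ${pos}`);
    pos += 1;
  };

  function value(): Value {
    // The whole decision rests on one character of lookahead.
    const c = peek();
    if (c === "[") return list();
    if (isDigit(c)) return num();
    throw new Error(`Unexpected input at offset ${pos}`);
  }

  function num(): Value {
    let n = 0;
    while (isDigit(peek())) {
      n = n * 10 + (peek().charCodeAt(0) - 48);
      pos += 1;
    }
    return { kind: "number", value: n };
  }

  function list(): Value {
    expect("[");
    const items: Value[] = [];
    if (peek() !== "]") {
      items.push(value());
      while (peek() === ",") {
        pos += 1;
        items.push(value());
      }
    }
    expect("]");
    return { kind: "list", items };
  }

  const result = value();
  if (pos !== input.length) {
    throw new Error(`Unexpected trailing input at offset ${pos}`);
  }
  return result;
}
```

A grammar such as `length = digits, "px" | digits, "%"` would not fit this scheme, because both alternatives begin with a digit and one character cannot tell them apart.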

The base class is more general. ScannerlessRDParser exposes restorePosition, which allows hand-written subclasses to implement backtracking and parse grammars beyond LL(1). rdp-gen does not emit backtracking code — that is a hand-crafting concern.
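The general save-and-restore technique behind backtracking can be sketched as follows. The class is self-contained and hypothetical: its `restorePosition` only borrows the name mentioned above, and its signature, along with `savePosition`, is an assumption rather than the library's actual API.

```typescript
// Grammar (EBNF), not LL(1) because both alternatives start with digits:
//   length = digits, "px" | digits, "%" ;
type Length = { unit: "px" | "%"; amount: number };

class LengthParser {
  private pos = 0;

  constructor(private readonly input: string) {}

  // Assumed helpers for this sketch; the real base class may differ.
  private savePosition(): number {
    return this.pos;
  }

  private restorePosition(saved: number): void {
    this.pos = saved;
  }

  private digits(): number | null {
    const start = this.pos;
    while (this.input.charAt(this.pos) >= "0" && this.input.charAt(this.pos) <= "9") {
      this.pos += 1;
    }
    return this.pos > start ? Number(this.input.slice(start, this.pos)) : null;
  }

  private literal(text: string): boolean {
    if (this.input.slice(this.pos, this.pos + text.length) !== text) return false;
    this.pos += text.length;
    return true;
  }

  // Try one alternative; on failure rewind so the next alternative starts clean.
  private withUnit(unit: "px" | "%"): Length | null {
    const saved = this.savePosition();
    const amount = this.digits();
    if (amount !== null && this.literal(unit)) return { unit, amount };
    this.restorePosition(saved);
    return null;
  }

  parse(): Length {
    const result = this.withUnit("px") || this.withUnit("%");
    if (result === null || this.pos !== this.input.length) {
      throw new Error(`Cannot parse length at offset ${this.pos}`);
    }
    return result;
  }
}
```

For the input `"42%"`, the first alternative consumes `42`, fails to match `px`, and rewinds to offset 0 before the second alternative succeeds. The cost is that the digits are scanned twice, which is why backtracking is left to hand-written parsers.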

Left recursion can always be eliminated by rewriting the grammar to use iteration: the EBNF repetition form {...}, as in A, {A}. This is the form that LL(1) grammars require.
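A small worked example of that rewrite, under an invented subtraction grammar: the left-recursive rule would call itself before consuming any input and never terminate, while the iterative form becomes a plain loop that still groups operands from the left. `evalExpr` is a hypothetical name for this sketch.

```typescript
// Left-recursive form (rejected, since expr would call expr before consuming anything):
//   expr = expr, "-", digit | digit ;
// Equivalent iterative form:
//   expr = digit, { "-", digit } ;
function evalExpr(input: string): number {
  let pos = 0;

  function digit(): number {
    const c = input.charAt(pos);
    if (c < "0" || c > "9") throw new Error(`Expected digit at offset ${pos}`);
    pos += 1;
    return c.charCodeAt(0) - 48;
  }

  // The { ... } repetition maps directly onto a while loop.
  let acc = digit();
  while (input.charAt(pos) === "-") {
    pos += 1;
    acc -= digit(); // folding into acc keeps subtraction left-associative
  }

  if (pos !== input.length) {
    throw new Error(`Unexpected trailing input at offset ${pos}`);
  }
  return acc;
}
```

Evaluating `evalExpr("9-3-2")` computes (9 - 3) - 2, the same grouping the left-recursive grammar would have expressed.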

Quick start — scaffold a new project

npm install -g @configuredthings/rdp.js
mkdir my-parser && cd my-parser
rdp-gen init --name my-parser
npm install
npm run build   # compiles src/MyParser.ts → dist/

rdp-gen init writes a package.json, tsconfig.json, and a starter src/MyParser.ts with the private-constructor / static-parse boilerplate pre-filled. Add --observable to extend ObservableRDParser instead.
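The private-constructor / static-parse shape can be illustrated with a hand-written sketch. This is not the file that rdp-gen init emits: it extends no base class, and the `GreetingParser` name and its toy grammar are assumptions chosen only to show the pattern.

```typescript
// Illustrative only: a hand-written sketch of the pattern, not rdp-gen output.
function isLetter(c: string): boolean {
  return (c >= "a" && c <= "z") || (c >= "A" && c <= "Z");
}

class GreetingParser {
  private pos = 0;

  // Private constructor: callers can never hold a half-used parser instance.
  private constructor(private readonly input: string) {}

  // Single public entry point: build the parser, run it, and discard it.
  static parse(input: string): string {
    const parser = new GreetingParser(input);
    const who = parser.greeting();
    if (parser.pos !== input.length) {
      throw new Error(`Unexpected input at offset ${parser.pos}`);
    }
    return who;
  }

  // greeting = "hello ", word ;
  private greeting(): string {
    const prefix = "hello ";
    if (this.input.slice(0, prefix.length) !== prefix) {
      throw new Error(`Expected "${prefix}" at offset 0`);
    }
    this.pos = prefix.length;
    return this.word();
  }

  // word = letter, { letter } ;
  private word(): string {
    const start = this.pos;
    while (isLetter(this.input.charAt(this.pos))) this.pos += 1;
    if (this.pos === start) {
      throw new Error(`Expected a letter at offset ${this.pos}`);
    }
    return this.input.slice(start, this.pos);
  }
}
```

Callers write `GreetingParser.parse("hello world")` and never touch parser state directly, so each call starts from a fresh position.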

Quick start — generate a parser from a grammar

# Generate a strictly-typed parser class from an EBNF grammar
rdp-gen date.ebnf --parser-name DateParser --output src/DateParser.ts

# Generate a usage scaffold for the pattern that fits your use case
rdp-gen date.ebnf --parser-name DateParser --traversal interpreter --facade --output src/date.ts

Scaffold flags emit a one-time typed starter file — imports, entry points, stubs, and error handling in place — ready to fill in. Pass --traversal interpreter or --traversal tree-walker to choose the traversal strategy; add --facade or --pipeline to wrap it. Use --transformer [json] for a Transformer-based scaffold, or --lexer span for a span-tokeniser + token-parser scaffold. See the CLI reference for details.
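To give a feel for what traversing a discriminated-union parse tree involves, here is a minimal tree-walking sketch. The `DateTree` type and its node kinds are invented for illustration; the types rdp-gen actually generates for a given grammar will differ.

```typescript
// Hypothetical parse-tree shape; real generated types depend on the grammar.
type DateTree =
  | { kind: "date"; parts: DateTree[] }
  | { kind: "year" | "month" | "day"; digits: string }
  | { kind: "separator"; text: string };

// Compile-time exhaustiveness guard: a new node kind makes the switch fail to type-check.
function assertNever(node: never): never {
  throw new Error(`Unhandled node: ${JSON.stringify(node)}`);
}

// Walk the tree and rebuild the date with "-" as the only separator.
function toIso(node: DateTree): string {
  switch (node.kind) {
    case "date":
      return node.parts.map(toIso).join("");
    case "separator":
      return "-";
    case "year":
    case "month":
    case "day":
      return node.digits;
    default:
      return assertNever(node);
  }
}

// A tree as it might look after parsing "2024/01/15".
const sampleTree: DateTree = {
  kind: "date",
  parts: [
    { kind: "year", digits: "2024" },
    { kind: "separator", text: "/" },
    { kind: "month", digits: "01" },
    { kind: "separator", text: "/" },
    { kind: "day", digits: "15" },
  ],
};
```

Calling `toIso(sampleTree)` walks every node once. Because the switch narrows on `kind`, each branch sees only the fields that exist for that node type.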

Manual setup

Install: npm install @configuredthings/rdp.js

Required tsconfig.json options:

{
  "compilerOptions": {
    "target": "ES2022",
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "moduleResolution": "node16"
  }
}
  • target: ES2022 — required for native # private fields
  • strict: true — the generated parser and its parse-tree types are verified to compile cleanly under this setting
  • noUncheckedIndexedAccess: true — all array accesses in the generated code are null-aware
  • moduleResolution: node16 or bundler — required for the package exports map

Documentation

Full documentation is at configuredthings.github.io/RDP.js.

Live playground

configuredthings.github.io/RDP.js
