From fc5f4f858a9e55f7791b73ce94c08ad72155cae4 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Thu, 26 Feb 2026 09:40:09 +0000
Subject: [PATCH 1/6] docs: add v1 action plan and MkDocs documentation site

- ACTION_PLAN.md: comprehensive roadmap covering 6 critical bug fixes,
  code-quality improvements, testing gaps, packaging fixes, and a
  prioritised delivery order for the v1.0.0 release
- mkdocs.yml: Material-theme configuration with tabbed navigation,
  syntax highlighting, and GitHub Pages publishing settings
- docs/index.md: project overview, feature table, and quick-start example
- docs/getting-started.md: step-by-step guide for first-time users
- docs/defining-grammar.md: complete grammar construction reference
- docs/lexer.md: terminal ordering, keyword handling, and Lexer API
- docs/parser.md: SLR/LR1/LALR1 comparison, conflicts, semantic actions
- docs/error-handling.md: lexical and syntactic error handling patterns
- docs/serialization.md: pre-building and caching parsing tables
- docs/api-reference.md: full public API documentation
- docs/changelog.md: version history and planned v1 changes

https://claude.ai/code/session_01Vouz5MejqT7sFvTEy8MXz1
---
 ACTION_PLAN.md           | 322 +++++++++++++++++++++++++++++
 docs/api-reference.md    | 424 +++++++++++++++++++++++++++++++++++++++
 docs/changelog.md        |  92 +++++++++
 docs/defining-grammar.md | 227 +++++++++++++++++++++
 docs/error-handling.md   | 177 ++++++++++++++++
 docs/getting-started.md  | 182 +++++++++++++++++
 docs/index.md            |  95 +++++++++
 docs/lexer.md            | 177 ++++++++++++++++
 docs/parser.md           | 144 +++++++++++++
 docs/serialization.md    | 170 ++++++++++++++++
 mkdocs.yml               |  78 +++++++
 11 files changed, 2088 insertions(+)
 create mode 100644 ACTION_PLAN.md
 create mode 100644 docs/api-reference.md
 create mode 100644 docs/changelog.md
 create mode 100644 docs/defining-grammar.md
 create mode 100644 docs/error-handling.md
 create mode 100644 docs/getting-started.md
 create mode 100644 docs/index.md
 create mode 100644 docs/lexer.md
 create mode 100644 docs/parser.md
 create mode 100644 docs/serialization.md
 create mode 100644 mkdocs.yml

diff --git a/ACTION_PLAN.md b/ACTION_PLAN.md
new file mode 100644
index 0000000..cd9e0f4
--- /dev/null
+++ b/ACTION_PLAN.md
@@ -0,0 +1,322 @@
+# PyJapt v1.0 — Action Plan
+
+> **Current version:** 0.4.1
+> **Target version:** 1.0.0
+> **Status:** Planning phase
+
+This document is the authoritative roadmap for bringing PyJapt to a stable v1.0 release. Items are grouped by category and ordered by priority within each category.
+
+---
+
+## 1. Critical Bug Fixes
+
+These bugs affect correctness and must be resolved before v1.
+
+### 1.1 Lexer state not reset on repeated calls
+
+**File:** `pyjapt/lexing.py` — `Lexer.__call__`
+
+`Lexer.__call__` resets `lineno`, `column`, `position`, `text`, and `token`, but it does **not** reset:
+- `self._errors` — errors from a previous run accumulate
+- `self.contain_errors` — flag is stale after the first run with errors
+
+**Fix:** Reset `_errors = []` and `contain_errors = False` at the start of `__call__`.
+
+---
+
+### 1.2 `errors` property signature is broken
+
+**Files:** `pyjapt/lexing.py:102`, `pyjapt/parsing.py:1072`
+
+Both `Lexer.errors` and `ShiftReduceParser.errors` are decorated with `@property` but carry a `clean: bool = True` parameter:
+
+```python
+@property
+def errors(self, clean: bool = True):   # ← wrong: properties don't accept arguments
+```
+
+Python silently ignores the `clean` parameter and always calls it as a property. The `clean` branch (returning tuples with row/column) is therefore unreachable.
+
+**Fix:** Replace with two separate accessors:
+```python
+@property
+def errors(self) -> List[str]:
+    return [m for _, _, m in sorted(self._errors)]
+
+@property
+def errors_with_location(self) -> List[Tuple[int, int, str]]:
+    return sorted(self._errors)
+```
+
+---
+
+### 1.3 `ShiftReduceParser` class-level mutable state
+
+**File:** `pyjapt/parsing.py:1032-1033`
+
+```python
+class ShiftReduceParser:
+    contains_errors: bool = False      # ← shared across all instances
+    current_token: Optional[Token] = None  # ← shared across all instances
+```
+
+Class-level attributes are shared across all instances of a class. Two parsers used in the same program would corrupt each other's state.
+
+**Fix:** Move both to `__init__`.
+
+---
+
+### 1.4 LALR(1) lookahead algorithm mixes strings and Symbol objects
+
+**File:** `pyjapt/parsing.py` — `determining_lookaheads` and `build_lalr1_automaton`
+
+The propagation sentinel `"#"` is a plain string that is mixed into lookahead sets normally occupied by `Symbol` objects. This works coincidentally because `"#"` doesn't collide with any symbol name, but it is fragile and will silently break if a grammar ever names a symbol `#`.
+
+**Fix:** Use a dedicated `PropagationTerminal` singleton (already defined in the file as `PropagationTerminal`) instead of the magic string, or use a `None` sentinel that is excluded from lookahead propagation explicitly.
+
+---
+
+### 1.5 `Grammar.augmented_grammar` semantic rule is wrong
+
+**File:** `pyjapt/parsing.py:462`
+
+```python
+new_start_symbol %= start_symbol + grammar.EPSILON, lambda x: x
+```
+
+The lambda receives a `RuleList`, not the symbol value directly. The production `S' -> start_symbol` (with epsilon swallowed) should return `s[1]`, not the `RuleList` itself:
+
+```python
+new_start_symbol %= start_symbol + grammar.EPSILON, lambda s: s[1]
+```
+
+---
+
+### 1.6 `Grammar.__getitem__` returns `None` for missing keys
+
+**File:** `pyjapt/parsing.py:592-599`
+
+When a production string references a symbol that doesn't exist, `Grammar.__getitem__` silently returns `None`. This produces a confusing `AttributeError` deep in the call chain rather than a clear `GrammarError`.
+
+**Fix:** Raise `GrammarError` with the unknown symbol name.
+
+---
+
+## 2. Code Quality & Modernisation
+
+### 2.1 Move `flake8` to dev dependencies
+
+**File:** `pyproject.toml`
+
+`flake8` is listed under `[tool.poetry.dependencies]` (runtime). It is a linter and must move to `[tool.poetry.dev-dependencies]`.
+
+---
+
+### 2.2 Update deprecated Poetry build backend
+
+**File:** `pyproject.toml`
+
+```toml
+# current (deprecated)
+build-backend = "poetry.masonry.api"
+
+# correct
+build-backend = "poetry.core.masonry.api"
+```
+
+---
+
+### 2.3 Add `mkdocs` and `mkdocs-material` as dev dependencies
+
+**File:** `pyproject.toml`
+
+Documentation builds are part of the development workflow.
+
+---
+
+### 2.4 Rename `pyjapt/typing.py`
+
+The module name `typing` shadows Python's standard-library `typing` module inside the package. Rename it to `pyjapt/types.py` or `pyjapt/_types.py` and update the import in `tests/test_arithmetic_grammar.py`.
+
+---
+
+### 2.5 Export `RuleList` and parsers from the top-level `__init__.py`
+
+**File:** `pyjapt/__init__.py`
+
+`RuleList` and individual parser classes (`SLRParser`, `LR1Parser`, `LALR1Parser`) are not exported from the package root. Users must import from internal submodules. Add them to `__init__.py`:
+
+```python
+from pyjapt.parsing import (
+    ShiftReduceParser, SLRParser, LR1Parser, LALR1Parser, Grammar, RuleList
+)
+```
+
+---
+
+### 2.6 Add type annotations to public API
+
+Currently many method signatures lack return type annotations. Add full annotations to:
+- `Grammar.get_lexer`, `Grammar.get_parser`, `Grammar.serialize_*`
+- `Lexer.__call__`, `Lexer.tokenize`
+- `ShiftReduceParser.__call__`
+- All `add_*` methods on `Grammar`
+
+---
+
+### 2.7 Replace bare `assert` with proper exceptions
+
+Bare `assert` statements are silently disabled when Python runs with the `-O` (optimise) flag:
+
+- `pyjapt/parsing.py:835` — `assert len(grammar.start_symbol.productions) == 1`
+- `pyjapt/parsing.py:906` — `assert not lookaheads.contains_epsilon`
+- `pyjapt/lexing.py` — several in `Grammar.add_terminal`
+
+Replace with `if not ...: raise GrammarError(...)`.
+
+---
+
+### 2.8 Serialised parser resets `augmented_grammar` fields
+
+**File:** `pyjapt/serialization.py`
+
+The serialised parser template does not call `_build_automaton` or compute `firsts`/`follows`, which is correct. But it also doesn't set `augmented_grammar`, `firsts`, or `follows`, which means the serialised parser cannot be safely extended. Document this limitation and add a guard.
+
+---
+
+### 2.9 CI: test against Python 3.10, 3.11, and 3.12
+
+**File:** `.github/workflows/python-test-app.yml`
+
+Add a matrix strategy to test against all supported Python versions.
+
+---
+
+## 3. Testing Improvements
+
+### 3.1 Add tests for LR(1) and LALR(1) parsers
+
+`tests/test_arithmetic_grammar.py` only tests the SLR parser. Add `test_lr1` and `test_lalr1` parameterised over the same set of inputs.
+
+### 3.2 Add tests for lexer error handling
+
+- Unknown character → default error handler → `errors` list populated
+- Custom `lexical_error` decorator
+- `contain_errors` flag is `True` after a failed tokenisation
+- Errors reset correctly across multiple calls
+
+### 3.3 Add tests for parser error handling
+
+- Syntactic error → `errors` list populated
+- `contains_errors` flag is `True`
+- Error recovery (`error` terminal / panic mode)
+- Custom `parsing_error` decorator
+
+### 3.4 Add tests for serialisation
+
+- Round-trip: build grammar → serialise lexer and parser → import → parse identical inputs → same result
+
+### 3.5 Add edge-case tests
+
+- Empty grammar (no productions) raises `GrammarError`
+- Duplicate terminal/non-terminal name raises immediately
+- Production referencing undefined symbol raises `GrammarError`
+- Epsilon productions
+- Grammars with conflicts produce correct conflict counts
+
+### 3.6 Measure and enforce coverage
+
+Add `pytest-cov` and set a minimum coverage threshold (target ≥ 85 %) in CI.
+
+---
+
+## 4. Documentation
+
+The documentation website is built with **MkDocs + Material theme** and lives under `docs/`. See `mkdocs.yml` for the full configuration.
+
+| Page | Status |
+|------|--------|
+| `docs/index.md` | Done |
+| `docs/getting-started.md` | Done |
+| `docs/defining-grammar.md` | Done |
+| `docs/lexer.md` | Done |
+| `docs/parser.md` | Done |
+| `docs/error-handling.md` | Done |
+| `docs/serialization.md` | Done |
+| `docs/api-reference.md` | Done |
+| `docs/changelog.md` | Done |
+
+### 4.1 Add `CHANGELOG.md`
+
+Track every version with date and changes, following [Keep a Changelog](https://keepachangelog.com) format.
+
+### 4.2 Add `CONTRIBUTING.md`
+
+Describe:
+- How to clone and set up the dev environment
+- How to run tests and linting
+- Branching and PR conventions
+- Code of conduct pointer
+
+---
+
+## 5. Packaging & Release
+
+### 5.1 Update version to `1.0.0`
+
+**Files:** `pyjapt/__init__.py` and `pyproject.toml`
+
+### 5.2 Populate package metadata
+
+**File:** `pyproject.toml`
+
+Add:
+```toml
+license = "MIT"
+keywords = ["lexer", "parser", "LR", "LALR", "compiler", "grammar"]
+classifiers = [
+    "Programming Language :: Python :: 3",
+    "License :: OSI Approved :: MIT License",
+    "Topic :: Software Development :: Compilers",
+]
+repository = "https://github.com/alejandroklever/PyJapt"
+documentation = "https://alejandroklever.github.io/PyJapt"
+```
+
+### 5.3 Add a GitHub Actions workflow to build and deploy docs
+
+Publish the MkDocs site to GitHub Pages on every push to `main`.
+
+### 5.4 Tag and publish to PyPI
+
+After all items above are resolved:
+1. Bump version to `1.0.0` in `__init__.py` (the `build.py` script syncs `pyproject.toml` automatically).
+2. Push a `v1.0.0` git tag.
+3. Create a GitHub Release — the existing publish workflow triggers on `release: published`.
+
+---
+
+## 6. Future Work (Post-v1)
+
+The following are explicitly out of scope for v1 but should be tracked:
+
+| Feature | Notes |
+|---------|-------|
+| Operator precedence declarations | Resolves SR conflicts declaratively (like `%left`, `%right` in Yacc) |
+| LL(1) parser support | Mentioned in README as future work |
+| Grammar visualisation | Export automata as DOT / SVG |
+| Incremental parsing | Re-lex only changed regions |
+| Better conflict reporting | Show the conflicting items and lookaheads in a human-readable table |
+| Unicode identifiers in grammars | Non-ASCII symbol names |
+| Async tokenisation | Yield tokens lazily for very large inputs |
+
+---
+
+## Priority Order Summary
+
+| Priority | Item |
+|----------|------|
+| P0 — Must fix before v1 | 1.1, 1.2, 1.3, 1.4, 1.5, 1.6 |
+| P1 — Fix before v1 | 2.1, 2.2, 2.4, 2.5, 3.1, 3.2, 3.3 |
+| P2 — Nice-to-have before v1 | 2.3, 2.6, 2.7, 2.8, 2.9, 3.4, 3.5, 3.6, 5.1 – 5.4 |
+| P3 — Post-v1 | Section 6 |
diff --git a/docs/api-reference.md b/docs/api-reference.md
new file mode 100644
index 0000000..4fa9a3c
--- /dev/null
+++ b/docs/api-reference.md
@@ -0,0 +1,424 @@
+# API Reference
+
+This page lists every public class and method exported by PyJapt.
+
+---
+
+## Top-Level Exports
+
+```python
+from pyjapt import (
+    Grammar,
+    Lexer,
+    Token,
+    ShiftReduceParser,
+    SLRParser,
+    LR1Parser,
+    LALR1Parser,
+)
+```
+
+---
+
+## `Grammar`
+
+The central object for defining a language.
+
+```python
+from pyjapt import Grammar
+g = Grammar()
+```
+
+### Terminals
+
+---
+
+#### `Grammar.add_terminal(name, regex=None, rule=None) -> Terminal`
+
+Create and register a terminal symbol.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `name` | `str` | Unique terminal name. Must be a valid string. |
+| `regex` | `str \| None` | Regular expression. If `None`, the regex is `re.escape(name)` (literal match). |
+| `rule` | `Callable[[Lexer], Optional[Token]] \| None` | Rule function invoked when this token is matched. |
+
+Returns the new `Terminal` object.
+
+Raises `AssertionError` if `name` is already defined.
+
+---
+
+#### `Grammar.add_terminals(names) -> Tuple[Terminal, ...]`
+
+Convenience wrapper. Splits `names` on whitespace and calls `add_terminal` for each.
+
+```python
+plus, minus, star = g.add_terminals('+ - *')
+```
+
+---
+
+#### `Grammar.terminal(name, regex) -> Callable`
+
+Decorator factory. Creates the terminal **and** registers the decorated function as its rule.
+
+```python
+@g.terminal('int', r'\d+')
+def int_rule(lexer):
+    lexer.position += len(lexer.token.lex)
+    lexer.column   += len(lexer.token.lex)
+    return lexer.token
+```
+
+---
+
+#### `Grammar.add_terminal_error()`
+
+Registers the built-in `error` terminal for use in error-recovery productions. Call this before writing any production that contains `error`.
+
+---
+
+### Non-Terminals
+
+---
+
+#### `Grammar.add_non_terminal(name, start_symbol=False) -> NonTerminal`
+
+Create and register a non-terminal symbol.
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `name` | `str` | Unique non-terminal name. |
+| `start_symbol` | `bool` | Mark as the start symbol. Only one allowed per grammar. |
+
+Raises `Exception` if a second `start_symbol=True` is provided.
+
+---
+
+#### `Grammar.add_non_terminals(names) -> Tuple[NonTerminal, ...]`
+
+Splits `names` on whitespace and calls `add_non_terminal` for each.
+
+```python
+stmt, expr, term = g.add_non_terminals('stmt expr term')
+```
+
+---
+
+### Productions
+
+---
+
+#### `Grammar.production(*production_strings) -> Callable`
+
+Decorator factory that registers the decorated function as the semantic action for one or more productions.
+
+The production string format is `'head -> body'` where `body` is a space-separated list of symbol names.
+
+```python
+@g.production('expr -> expr + term', 'expr -> expr - term')
+def additive(s):
+    return s[1] + s[3] if s[2] == '+' else s[1] - s[3]
+```
+
+---
+
+#### `NonTerminal.__imod__(other) -> NonTerminal`
+
+Operator `%=` overload for adding productions to a non-terminal.
+
+```python
+# Unattributed
+expr %= 'expr + term'
+
+# With semantic action
+expr %= 'expr + term', lambda s: s[1] + s[3]
+
+# Epsilon
+expr %= ''
+```
+
+`other` can be:
+- A `str` (space-separated symbol names)
+- A `Symbol` or `Sentence` (built from Symbol objects with `+`)
+- A `tuple` of `(str | Sentence, callable)` for attributed productions
+- A `SentenceList` (built with `|`) for multiple alternatives
+
+---
+
+### Error Handlers
+
+---
+
+#### `Grammar.lexical_error(handler) -> handler`
+
+Decorator. Registers a custom lexical error handler.
+
+```python
+@g.lexical_error
+def lex_error(lexer):
+    lexer.add_error(lexer.lineno, lexer.column,
+                    f'unexpected "{lexer.token.lex}"')
+    lexer.position += 1
+    lexer.column   += 1
+```
+
+---
+
+#### `Grammar.parsing_error(handler) -> handler`
+
+Decorator. Registers a custom syntactic error handler.
+
+```python
+@g.parsing_error
+def parse_error(parser):
+    tok = parser.current_token
+    parser.add_error(tok.line, tok.column, f'unexpected "{tok.lex}"')
+```
+
+---
+
+### Generating the Lexer and Parser
+
+---
+
+#### `Grammar.get_lexer() -> Lexer`
+
+Build and return a `Lexer` for this grammar.
+
+---
+
+#### `Grammar.get_parser(name, verbose=False) -> ShiftReduceParser`
+
+Build and return a parser.
+
+| `name` | Parser type |
+|--------|-------------|
+| `'slr'` | Simple LR |
+| `'lalr1'` | LALR(1) |
+| `'lr1'` | Canonical LR(1) |
+
+Raises `ValueError` for unknown names.
+
+---
+
+### Serialisation
+
+---
+
+#### `Grammar.serialize_lexer(class_name, grammar_module_name, grammar_variable_name='G')`
+
+Generate `lexertab.py` in the current working directory.
+
+---
+
+#### `Grammar.serialize_parser(parser_type, class_name, grammar_module_name, grammar_variable_name='G')`
+
+Generate `parsertab.py` in the current working directory.
+
+---
+
+### Utility
+
+---
+
+#### `Grammar.to_json() -> str`
+
+Serialise the grammar structure (terminals, non-terminals, productions) to a JSON string. Semantic actions and regexes are **not** included.
+
+---
+
+#### `Grammar.from_json(data) -> Grammar`
+
+Class method. Reconstruct a grammar from the JSON string produced by `to_json()`.
+
+---
+
+#### `Grammar.__getitem__(item) -> Symbol | Production | None`
+
+Look up a symbol or production by name/repr-string.
+
+```python
+plus_symbol = g['+']
+production  = g['expr -> expr + term']
+```
+
+---
+
+## `Token`
+
+```python
+class Token:
+    lex:        str   # lexeme string
+    token_type: Any   # terminal name (str) or Terminal object
+    line:       int   # 1-based line number
+    column:     int   # 1-based column number
+```
+
+### Class methods
+
+#### `Token.empty() -> Token`
+
+Return an empty sentinel token `Token('', '', 0, 0)`.
+
+### Properties
+
+#### `Token.is_valid -> bool`
+
+Always `True` for a regular token. (Subclasses may override for error tokens.)
+
+---
+
+## `Lexer`
+
+```python
+class Lexer:
+    lineno:         int   # current line (1-based)
+    column:         int   # current column (1-based)
+    position:       int   # byte offset in input
+    text:           str   # full input string
+    token:          Token # token being processed
+    contain_errors: bool  # True after first error
+```
+
+### `Lexer.__call__(text) -> List[Token]`
+
+Tokenise `text`. Resets all internal state before each call. Appends an EOF token at the end.
+
+### `Lexer.tokenize(text) -> Generator[Token, None, None]`
+
+Low-level generator. Does **not** reset state. Prefer `__call__` for normal use.
+
+### `Lexer.errors -> List[str]`
+
+Sorted list of error message strings accumulated during the last call.
+
+### `Lexer.add_error(line, col, message)`
+
+Append an error entry. Intended for use inside custom terminal rules and error handlers.
+
+---
+
+## `ShiftReduceParser`
+
+Base class for all three parser variants. Do not instantiate directly; use `Grammar.get_parser`.
+
+```python
+class ShiftReduceParser:
+    SHIFT = 'SHIFT'
+    REDUCE = 'REDUCE'
+    OK = 'OK'
+```
+
+### `ShiftReduceParser.__call__(tokens) -> Any`
+
+Parse a list of `Token` objects and return the semantic value of the start symbol, or `None` if parsing failed.
+
+### `ShiftReduceParser.errors -> List[str]`
+
+Sorted list of syntactic error messages.
+
+### `ShiftReduceParser.add_error(line, column, message)`
+
+Append an error entry from inside a semantic action or error handler.
+
+### `ShiftReduceParser.contains_errors -> bool`
+
+`True` if any parsing error has been detected.
+
+### `ShiftReduceParser.current_token -> Token`
+
+The token being processed at the time the most recent error occurred.
+
+### `ShiftReduceParser.conflicts -> List[Tuple]`
+
+List of detected conflicts, each a `('SR' | 'RR', prod_a, prod_b)` tuple.
+
+### `ShiftReduceParser.shift_reduce_count -> int`
+
+Number of shift-reduce conflicts.
+
+### `ShiftReduceParser.reduce_reduce_count -> int`
+
+Number of reduce-reduce conflicts.
+
+---
+
+## `SLRParser`
+
+```python
+class SLRParser(ShiftReduceParser): ...
+```
+
+Uses the LR(0) automaton and Follow sets for lookaheads.
+
+---
+
+## `LR1Parser`
+
+```python
+class LR1Parser(ShiftReduceParser): ...
+```
+
+Uses the canonical LR(1) automaton with per-item lookaheads.
+
+---
+
+## `LALR1Parser`
+
+```python
+class LALR1Parser(LR1Parser): ...
+```
+
+Uses the merged LALR(1) automaton. Same states as SLR, same power as LR(1) for most grammars.
+
+---
+
+## `RuleList`
+
+Passed to every semantic action as `s`. 1-indexed over the production body.
+
+### `RuleList.__getitem__(index) -> Any`
+
+`s[0]` — head value (output).
+`s[1]` … `s[n]` — body symbol values.
+
+### `RuleList.add_error(index, message)`
+
+Report an error at the position of `s[index]` (int) or at an explicit `(line, column)` tuple.
+
+### `RuleList.force_parsing_error()`
+
+Mark the parse as failed without adding an error message.
+
+---
+
+## `NonTerminal`
+
+Represents a grammar non-terminal.
+
+| Attribute | Type | Description |
+|-----------|------|-------------|
+| `name` | `str` | Symbol name |
+| `productions` | `List[Production]` | Productions where this symbol is the head |
+
+---
+
+## `Terminal`
+
+Represents a grammar terminal.
+
+| Attribute | Type | Description |
+|-----------|------|-------------|
+| `name` | `str` | Symbol name |
+
+---
+
+## `Production`
+
+| Attribute | Type | Description |
+|-----------|------|-------------|
+| `left` | `NonTerminal` | Production head |
+| `right` | `Sentence` | Production body |
+| `rule` | `Callable \| None` | Semantic action |
diff --git a/docs/changelog.md b/docs/changelog.md
new file mode 100644
index 0000000..72c0293
--- /dev/null
+++ b/docs/changelog.md
@@ -0,0 +1,92 @@
+# Changelog
+
+All notable changes to PyJapt are documented here.
+This project follows [Semantic Versioning](https://semver.org) and the
+[Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.
+
+---
+
+## [Unreleased] — v1.0.0
+
+### Planned — Bug Fixes
+- Reset `_errors` and `contain_errors` in `Lexer.__call__` so repeated calls don't accumulate stale errors.
+- Fix `errors` property signature on `Lexer` and `ShiftReduceParser` (properties cannot accept arguments).
+- Move `contains_errors` and `current_token` from class-level to instance-level in `ShiftReduceParser`.
+- Fix `Grammar.augmented_grammar` semantic action (`lambda s: s[1]` instead of `lambda x: x`).
+- Fix `s.Name` → `s.name` in `Grammar.to_json()` (case mismatch causes `AttributeError`).
+- Raise `GrammarError` in `Grammar.__getitem__` instead of returning `None` for missing symbols.
+- Replace bare `assert` statements with proper `GrammarError` exceptions.
+
+### Planned — Improvements
+- Move `flake8` from runtime to dev dependencies.
+- Update build backend to `poetry.core.masonry.api` (replaces deprecated `poetry.masonry.api`).
+- Export `RuleList`, `SLRParser`, `LR1Parser`, `LALR1Parser` from `pyjapt.__init__`.
+- Rename `pyjapt/typing.py` to `pyjapt/types.py` to avoid shadowing stdlib `typing`.
+- Add full type annotations to the public API.
+- Expand CI matrix to Python 3.10, 3.11, and 3.12.
+
+### Planned — Testing
+- Add tests for LR(1) and LALR(1) parsers.
+- Add tests for lexer and parser error handling.
+- Add tests for serialisation round-trips.
+- Add edge-case tests (empty grammar, duplicate symbols, epsilon productions).
+- Enforce minimum test coverage threshold.
+
+### Planned — Documentation
+- Full MkDocs site with Material theme.
+- Getting-started guide and user-guide sections.
+- Complete API reference.
+- Changelog (this file).
+
+---
+
+## [0.4.1] — 2024-03-25
+
+### Fixed
+- Updated README with corrected examples and improved prose.
+
+---
+
+## [0.4.0] — 2023-02-17
+
+### Added
+- GitHub Actions workflow for publishing to PyPI on release.
+- `requirements.txt` for legacy `pip install` support.
+
+---
+
+## [0.3.0] — 2021-03-??
+
+### Added
+- Default error report in the shift-reduce parser (panic-mode recovery).
+- Improved `RuleList` error API.
+
+### Fixed
+- Reset lexer parameters when analysing a new string (`Lexer.__call__`).
+
+---
+
+## [0.2.9] — 2021-??-??
+
+### Fixed
+- Minor fix in parsing default error detection.
+
+---
+
+## [0.2.x] — 2020
+
+### Added
+- SLR, LR(1), and LALR(1) parsers.
+- Serialisation of lexer and parser to Python source files.
+- `@g.terminal` decorator for inline rule definition.
+- `@g.production` decorator for inline production rules.
+- `@g.lexical_error` and `@g.parsing_error` decorators.
+- `add_terminal_error()` and error terminal support in productions.
+- `Grammar.to_json()` / `Grammar.from_json()`.
+- JSON grammar import/export.
+
+---
+
+## [0.1.x] — 2020
+
+- Initial release with basic lexer and SLR parser.
diff --git a/docs/defining-grammar.md b/docs/defining-grammar.md
new file mode 100644
index 0000000..68d6459
--- /dev/null
+++ b/docs/defining-grammar.md
@@ -0,0 +1,227 @@
+# Defining a Grammar
+
+A `Grammar` object is the single source of truth for your language. This page covers all the ways to build one.
+
+---
+
+## Creating the Grammar
+
+```python
+from pyjapt import Grammar
+
+g = Grammar()
+```
+
+---
+
+## Non-Terminals
+
+Non-terminals are the syntactic categories of your language (e.g. `expr`, `statement`, `program`).
+
+### `add_non_terminal(name, start_symbol=False)`
+
+```python
+program = g.add_non_terminal('program', start_symbol=True)
+stmt    = g.add_non_terminal('stmt')
+expr    = g.add_non_terminal('expr')
+```
+
+- `name` — must be a unique, non-empty string.
+- `start_symbol=True` — marks this as the grammar's start symbol. Only one non-terminal can carry this flag.
+
+Returns a `NonTerminal` object that you use to write productions.
+
+### `add_non_terminals(names)`
+
+Convenience method: accepts a space-separated string and returns a tuple of `NonTerminal` objects in the same order.
+
+```python
+stmt, expr, term, fact = g.add_non_terminals('stmt expr term fact')
+```
+
+---
+
+## Terminals
+
+Terminals are the atomic tokens produced by the lexer.
+
+### `add_terminal(name, regex=None, rule=None)`
+
+```python
+# Literal terminal — the regex is the escaped name
+plus  = g.add_terminal('+')
+minus = g.add_terminal('-')
+
+# Terminal with a custom regex
+num = g.add_terminal('int', regex=r'\d+')
+
+# Terminal with a custom regex AND a lexer rule
+num = g.add_terminal('int', regex=r'\d+', rule=lambda lexer: ...)
+```
+
+- When `regex` is `None`, the regular expression used is `re.escape(name)`, so `+` matches the literal character `+`.
+- `rule` is a function `(Lexer) -> Optional[Token]`. If it returns `None`, the token is discarded.
+
+### `add_terminals(names)`
+
+Accepts a space-separated string and returns a tuple. All created terminals use their name as the literal regex.
+
+```python
+plus, minus, star, div, lpar, rpar = g.add_terminals('+ - * / ( )')
+```
+
+### `@g.terminal(name, regex)`
+
+A decorator that creates the terminal **and** registers the rule in one step.
+
+```python
+@g.terminal('int', r'\d+')
+def int_terminal(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    lexer.token.lex = int(lexer.token.lex)
+    return lexer.token
+```
+
+The decorated function receives the `Lexer` instance and must either return the `Token` (possibly modified) or return `None`/nothing to discard it.
+
+---
+
+## Productions
+
+Productions define how non-terminals are composed from sequences of terminals and non-terminals.
+
+### Using `%=` with a string (recommended)
+
+```python
+expr %= 'expr + term'    # unattributed
+expr %= 'expr + term', lambda s: s[1] + s[3]  # with semantic action
+```
+
+The string on the right-hand side is a space-separated list of symbol names. Each name must already be declared in the grammar.
+
+Inside the semantic action, `s` is a `RuleList`:
+
+| Index | Meaning |
+|-------|---------|
+| `s[0]` | The head non-terminal's value (set by returning from the action) |
+| `s[1]` | Value of the 1st body symbol |
+| `s[2]` | Value of the 2nd body symbol |
+| `s[n]` | Value of the nth body symbol |
+
+For a terminal, the value is the token's lexeme (`str`).
+For a non-terminal, the value is whatever its production's semantic action returned.
+
+### Using `%=` with `Symbol` objects
+
+```python
+expr %= expr + plus + term
+expr %= expr + plus + term, lambda s: s[1] + s[3]
+```
+
+`Symbol` objects support `+` to build `Sentence` objects, so you can construct productions with the original variable references.
+
+### Epsilon productions
+
+```python
+expr %= ''           # empty string → epsilon production
+expr %= g.EPSILON    # same thing using the EPSILON symbol directly
+```
+
+### `@g.production(*production_strings)`
+
+A decorator alternative to `%=`. It binds the decorated function to one or more production strings.
+
+```python
+@g.production('expr -> expr + term')
+def expr_add(s):
+    return s[1] + s[3]
+```
+
+The string format is `'head -> body'` where `->` separates the head non-terminal from the body symbols.
+
+You can attach the same function to multiple productions:
+
+```python
+@g.production(
+    'expr -> expr + expr',
+    'expr -> expr - expr',
+    'expr -> expr * expr',
+    'expr -> expr / expr',
+)
+def binary_op(s):
+    if s[2] == '+': return s[1] + s[3]
+    if s[2] == '-': return s[1] - s[3]
+    if s[2] == '*': return s[1] * s[3]
+    if s[2] == '/': return s[1] // s[3]
+```
+
+---
+
+## Special Terminals
+
+### `g.EOF`
+
+The end-of-file terminal (`$`). It is added automatically; you should not declare it yourself.
+
+### `g.EPSILON`
+
+Represents the empty word. Use it to write nullable productions.
+
+### `g.ERROR`
+
+A special terminal used for error recovery productions. You must register it explicitly before use:
+
+```python
+g.add_terminal_error()
+```
+
+See [Error Handling](error-handling.md) for full details.
+
+---
+
+## Inspecting the Grammar
+
+```python
+# All non-terminals
+print(g.non_terminals)
+
+# All terminals
+print(g.terminals)
+
+# All productions
+print(g.productions)
+
+# Look up any symbol by name
+sym = g['expr']
+
+# Look up a production by repr-string
+prod = g['expr -> expr + term']
+```
+
+### `Grammar.__str__`
+
+```python
+print(g)
+# Non-Terminals:
+#     expr, term, fact
+# Terminals:
+#     +, -, *, /, (, ), int, whitespace
+# Productions:
+#     [expr -> expr + term, ...]
+```
+
+---
+
+## JSON Import / Export
+
+PyJapt supports a basic JSON representation of the grammar (without semantic actions):
+
+```python
+json_str = g.to_json()
+
+g2 = Grammar.from_json(json_str)
+```
+
+!!! note
+    JSON serialisation does not preserve terminal regexes, terminal rules, or semantic actions. It is useful for inspecting grammar structure, not for production use. Use [Python file serialisation](serialization.md) for production scenarios.
diff --git a/docs/error-handling.md b/docs/error-handling.md
new file mode 100644
index 0000000..43262e9
--- /dev/null
+++ b/docs/error-handling.md
@@ -0,0 +1,177 @@
+# Error Handling
+
+Good error handling is one of PyJapt's core design goals. This page describes how to report and recover from both lexical and syntactic errors.
+
+---
+
+## Lexical Error Handling
+
+### Default behaviour
+
+When the lexer encounters a character that matches no terminal pattern, it calls the *lexical error handler*. By default, this adds an error message to the internal errors list and advances past the bad character.
+
+### Custom handler — `@g.lexical_error`
+
+Decorate a function with `@g.lexical_error` to replace the default handler:
+
+```python
+@g.lexical_error
+def on_lex_error(lexer):
+    line, col = lexer.lineno, lexer.column
+    bad_char  = lexer.token.lex
+
+    lexer.add_error(line, col,
+        f'({line}, {col}) - LexicographicError: unexpected character "{bad_char}"')
+
+    # Always advance to avoid an infinite loop
+    lexer.position += 1
+    lexer.column   += 1
+```
+
+!!! warning "Always advance `lexer.position`"
+    If your handler does not advance `lexer.position`, the lexer will match the same bad character indefinitely.
+
+### Reporting errors from a terminal rule
+
+You can also detect and report errors from inside a terminal rule:
+
+```python
+@g.terminal('comment_error', r'/\*(.|\n)*$')
+def eof_in_comment(lexer):
+    """Match a /* comment that reaches EOF without a closing */"""
+    lexer.contain_errors = True
+    lex = lexer.token.lex
+    for ch in lex:
+        if ch == '\n':
+            lexer.lineno += 1
+            lexer.column  = 1
+        else:
+            lexer.column += 1
+    lexer.position += len(lex)
+    lexer.add_error(
+        lexer.lineno, lexer.column,
+        f'({lexer.lineno}, {lexer.column}) - LexicographicError: EOF in comment'
+    )
+```
+
+### Checking lexical errors
+
+```python
+tokens = lexer(source_code)
+
+if lexer.contain_errors:
+    for message in lexer.errors:
+        print(message)
+```
+
+`lexer.errors` returns a list of error message strings, sorted by position.
+
+---
+
+## Syntactic Error Handling
+
+### Default behaviour
+
+When the parser cannot find an action for the current `(state, token)` pair it enters *panic-mode recovery*: it calls the error handler and then skips input tokens until it finds one that fits the current state.
+
+### Custom handler — `@g.parsing_error`
+
+```python
+@g.parsing_error
+def on_parse_error(parser):
+    tok = parser.current_token
+    parser.add_error(
+        tok.line, tok.column,
+        f'({tok.line}, {tok.column}) - SyntacticError: unexpected "{tok.lex}"'
+    )
+```
+
+The handler receives the `ShiftReduceParser` instance. After it returns, the parser automatically skips tokens until it can continue.
+
+### Error productions
+
+An *error production* lets you match known error patterns and keep parsing with a valid (possibly incomplete) AST node. This is the most precise error-recovery mechanism.
+
+**Setup — register the error terminal:**
+
+```python
+g.add_terminal_error()
+```
+
+**Usage — write productions that include `error`:**
+
+```python
+@g.production('stmt -> let id = expr error')
+def missing_semicolon(s):
+    # s[5] is the Token that triggered the error
+    s.add_error(5, f'({s[5].line}, {s[5].column}) - SyntacticError: '
+                   f"expected ';' instead of '{s[5].lex}'")
+    return LetStatement(s[2], s[4])
+```
+
+`s.add_error(index, message)`:
+
+- If `index` is an `int`, it refers to the position in the rule list — `s[5]` is the token at position 5.
+- If `index` is a `(line, column)` tuple, it is used directly as the location.
+
+When the parser encounters a token that cannot be shifted, and the current state has a transition on the `error` terminal, it replaces the bad token with an `error` token and continues. The `error` token's semantic value is the original `Token` object, so you still have access to `lex`, `line`, and `column`.
+
+### Forcing a parsing error from a semantic action
+
+Sometimes you want to mark an input as invalid from inside a semantic action — for example, to reject an empty expression:
+
+```python
+@g.production('expr -> ')
+def empty_expr(s):
+    s.force_parsing_error()
+    # return nothing or an error sentinel
+```
+
+`force_parsing_error()` sets `parser.contains_errors = True` without adding an error message. Add an explicit message via `s.add_error(...)` if needed.
+
+### Checking syntactic errors
+
+```python
+result = parser(tokens)
+
+if parser.contains_errors:
+    for message in parser.errors:
+        print(message)
+```
+
+---
+
+## Combining Both Error Handlers
+
+A typical setup collects all errors from both the lexer and the parser and prints them sorted by line:
+
+```python
+lexer  = g.get_lexer()
+parser = g.get_parser('lalr1')
+
+tokens = lexer(source_code)
+result = parser(tokens)
+
+all_errors = lexer.errors + parser.errors
+
+if all_errors:
+    for msg in all_errors:
+        print(msg)
+```
+
+---
+
+## Error Message Conventions
+
+PyJapt does not impose a specific error format. A common convention used in compilers is:
+
+```
+(line, column) - ErrorType: description
+```
+
+For example:
+
+```
+(3, 12) - LexicographicError: unexpected character "@"
+(5, 1)  - SyntacticError: expected ';' instead of '}'
+```
diff --git a/docs/getting-started.md b/docs/getting-started.md
new file mode 100644
index 0000000..8cd87d1
--- /dev/null
+++ b/docs/getting-started.md
@@ -0,0 +1,182 @@
+# Getting Started
+
+This guide walks you through installing PyJapt and building your first working lexer and parser.
+
+---
+
+## Prerequisites
+
+- Python **3.10** or later
+- `pip` (any recent version)
+
+---
+
+## Installation
+
+```sh
+pip install pyjapt
+```
+
+Verify the installation:
+
+```python
+import pyjapt
+print(pyjapt.__version__)  # e.g. 0.4.1
+```
+
+---
+
+## Your First Grammar — Arithmetic Expressions
+
+We will build a complete interpreter for arithmetic expressions that supports `+`, `-`, `*`, `/`, integer literals, and parentheses.
+
+### Step 1 — Create the Grammar object
+
+```python
+from pyjapt import Grammar
+
+g = Grammar()
+```
+
+`Grammar` is the central object. Everything — terminals, non-terminals, productions, and the resulting lexer and parser — comes from this one instance.
+
+---
+
+### Step 2 — Declare non-terminals
+
+```python
+expr = g.add_non_terminal('expr', start_symbol=True)
+term, fact = g.add_non_terminals('term fact')
+```
+
+`add_non_terminal` creates a single non-terminal and returns a `NonTerminal` object.
+Pass `start_symbol=True` to mark it as the grammar's start symbol (only one is allowed).
+
+`add_non_terminals` accepts a space-separated string and returns a tuple.
+
+---
+
+### Step 3 — Declare terminals
+
+```python
+g.add_terminals('+ - / * ( )')   # literal terminals
+g.add_terminal('int', regex=r'\d+')  # terminal with a custom regex
+```
+
+Terminals declared with `add_terminals` use their name as the regex literally.
+`add_terminal` lets you provide a custom regular expression.
+
+---
+
+### Step 4 — Handle whitespace
+
+Whitespace is not a meaningful token in this grammar, so we skip it by not returning anything from the rule function.
+
+```python
+@g.terminal('whitespace', r' +')
+def whitespace(lexer):
+    lexer.column += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    # no return → token is discarded
+```
+
+---
+
+### Step 5 — Write productions with semantic actions
+
+Productions are attached to non-terminal objects using the `%=` operator.
+The second element of the tuple is a *semantic action* — a function (or lambda) that receives the `RuleList` for that production and returns the production's semantic value.
+
+```python
+# expr → expr + term | expr - term | term
+expr %= 'expr + term', lambda s: s[1] + s[3]
+expr %= 'expr - term', lambda s: s[1] - s[3]
+expr %= 'term',        lambda s: s[1]
+
+# term → term * fact | term / fact | fact
+term %= 'term * fact', lambda s: s[1] * s[3]
+term %= 'term / fact', lambda s: s[1] // s[3]
+term %= 'fact',        lambda s: s[1]
+
+# fact → ( expr ) | int
+fact %= '( expr )',    lambda s: s[2]
+fact %= 'int',         lambda s: int(s[1])
+```
+
+Inside a semantic action `s` is a `RuleList`.
+`s[0]` is the synthesised value of the production's *head* (i.e. what you return).
+`s[1]`, `s[2]`, … are the values of each symbol in the production's *body* (1-indexed).
+
+---
+
+### Step 6 — Generate the lexer and parser
+
+```python
+lexer  = g.get_lexer()
+parser = g.get_parser('slr')  # 'slr', 'lr1', or 'lalr1'
+```
+
+The lexer is a callable that turns a string into a list of `Token` objects.
+The parser is a callable that takes that list and applies the grammar rules, returning the final semantic value.
+
+---
+
+### Step 7 — Parse an expression
+
+```python
+tokens = lexer('(2 + 2) * 2 + 2')
+result = parser(tokens)
+print(result)  # 10
+```
+
+Or more concisely:
+
+```python
+print(parser(lexer('(2 + 2) * 2 + 2')))  # 10
+```
+
+---
+
+## Full Source
+
+```python
+from pyjapt import Grammar
+
+g = Grammar()
+expr = g.add_non_terminal('expr', start_symbol=True)
+term, fact = g.add_non_terminals('term fact')
+g.add_terminals('+ - / * ( )')
+g.add_terminal('int', regex=r'\d+')
+
+@g.terminal('whitespace', r' +')
+def whitespace(lexer):
+    lexer.column += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+
+expr %= 'expr + term', lambda s: s[1] + s[3]
+expr %= 'expr - term', lambda s: s[1] - s[3]
+expr %= 'term',        lambda s: s[1]
+
+term %= 'term * fact', lambda s: s[1] * s[3]
+term %= 'term / fact', lambda s: s[1] // s[3]
+term %= 'fact',        lambda s: s[1]
+
+fact %= '( expr )',    lambda s: s[2]
+fact %= 'int',         lambda s: int(s[1])
+
+lexer  = g.get_lexer()
+parser = g.get_parser('slr')
+
+print(parser(lexer('(2 + 2) * 2 + 2')))   # 10
+print(parser(lexer('1 + 2 * 5 - 4')))      # 7
+print(parser(lexer('((3 + 4) * 5) - 6 / 2')))  # 32
+```
+
+---
+
+## Next Steps
+
+- [Defining a Grammar](defining-grammar.md) — all grammar construction options in detail.
+- [Configuring the Lexer](lexer.md) — terminal priority, token rules, and ignored tokens.
+- [Building a Parser](parser.md) — SLR vs LR(1) vs LALR(1) and how to pick one.
+- [Error Handling](error-handling.md) — how to report lexical and syntactic errors.
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..6836a2b
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,95 @@
+# PyJapt
+
+**PyJapt** — *Just Another Parsing Tool Written in Python* — is a lexer and LR parser generator that lets you define a language grammar in pure Python and immediately produce a working tokeniser and parser from it.
+
+<p align="center">
+  <img width="800" alt="PyJapt Logo Banner" src="https://github.com/alejandroklever/PyJapt/assets/45394625/ce9fd982-8f08-41ba-aa9e-54c2de24212b">
+</p>
+
+---
+
+## Why PyJapt?
+
+| Feature | Description |
+|---------|-------------|
+| **Pure Python** | No C extensions, no generated files to check in, no build step. |
+| **Three LR parser types** | SLR, LR(1), and LALR(1) — choose the power level you need. |
+| **Custom error handling** | Lexical and syntactic error handlers are first-class citizens. |
+| **Semantic actions** | Attach a lambda or a decorated function to any production rule. |
+| **Serialisation** | Pre-build the parsing tables and serialise them to a Python module for faster startup. |
+| **Decorator-based API** | Define terminals and production rules without leaving Python. |
+
+---
+
+## Quick Example
+
+A complete arithmetic expression parser in under 25 lines:
+
+```python
+from pyjapt import Grammar
+
+g = Grammar()
+expr = g.add_non_terminal('expr', True)
+term, fact = g.add_non_terminals('term fact')
+g.add_terminals('+ - / * ( )')
+g.add_terminal('int', regex=r'\d+')
+
+@g.terminal('whitespace', r' +')
+def whitespace(lexer):
+    lexer.column += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+
+expr %= 'expr + term', lambda s: s[1] + s[3]
+expr %= 'expr - term', lambda s: s[1] - s[3]
+expr %= 'term',        lambda s: s[1]
+
+term %= 'term * fact', lambda s: s[1] * s[3]
+term %= 'term / fact', lambda s: s[1] // s[3]
+term %= 'fact',        lambda s: s[1]
+
+fact %= '( expr )',    lambda s: s[2]
+fact %= 'int',         lambda s: int(s[1])
+
+lexer  = g.get_lexer()
+parser = g.get_parser('slr')
+
+print(parser(lexer('(2 + 2) * 2 + 2')))  # 10
+```
+
+---
+
+## Installation
+
+```sh
+pip install pyjapt
+```
+
+PyJapt requires **Python 3.10** or later and has no runtime dependencies.
+
+---
+
+## How It Works
+
+PyJapt revolves around the `Grammar` class. You describe your language by:
+
+1. **Declaring non-terminals** — the syntactic categories of your language.
+2. **Declaring terminals** — the tokens produced by the lexer.
+3. **Writing productions** — rules that describe how non-terminals are composed, with optional semantic actions.
+4. **Generating the lexer and parser** — call `get_lexer()` and `get_parser(type)`.
+
+```
+Grammar definition
+      │
+      ├─► get_lexer()   → Lexer  (regex-based tokeniser)
+      │
+      └─► get_parser()  → ShiftReduceParser  (SLR / LR1 / LALR1)
+```
+
+---
+
+## Next Steps
+
+- Follow the [Getting Started](getting-started.md) guide to build your first language.
+- Learn how to [define a grammar](defining-grammar.md) in detail.
+- Read about [error handling](error-handling.md) to build robust parsers.
+- Check the [API Reference](api-reference.md) for the complete public API.
diff --git a/docs/lexer.md b/docs/lexer.md
new file mode 100644
index 0000000..0139fd0
--- /dev/null
+++ b/docs/lexer.md
@@ -0,0 +1,177 @@
+# Configuring the Lexer
+
+The lexer produced by `g.get_lexer()` is a regex-based tokeniser. Understanding how it orders and applies patterns is essential for writing grammars with keywords, identifiers, and complex token types.
+
+---
+
+## How the Lexer Works
+
+When called with a string, the lexer scans from left to right trying to match the current position against a single combined regex. The first alternative in that regex that matches wins.
+
+The alternatives are ordered as follows:
+
+1. **Ruled terminals** — terminals declared with `@g.terminal(...)` or via `add_terminal(..., rule=...)`, in the order they were declared.
+2. **Non-literal terminals** — terminals with a custom `regex` argument but no rule, sorted longest-regex-first.
+3. **Literal terminals** — terminals whose regex is their escaped name (declared via `add_terminal(name)` or `add_terminals(...)`), sorted longest-first.
+
+This ordering means that custom rule functions are checked before pattern-only terminals, and longer patterns take priority over shorter ones within each group.
+
+---
+
+## The Token Class
+
+```python
+class Token:
+    lex:        str   # the matched lexeme string
+    token_type: Any   # the terminal's name (str) or Symbol object
+    line:       int   # 1-based line number
+    column:     int   # 1-based column number
+```
+
+---
+
+## The Lexer Object Inside a Rule
+
+When a terminal rule function is called, it receives the `Lexer` instance with the following attributes:
+
+| Attribute | Type | Description |
+|-----------|------|-------------|
+| `lexer.token` | `Token` | The token that was just matched |
+| `lexer.position` | `int` | Current byte offset in the input string |
+| `lexer.lineno` | `int` | Current line number (1-based) |
+| `lexer.column` | `int` | Current column number (1-based) |
+| `lexer.text` | `str` | The full input string |
+| `lexer.contain_errors` | `bool` | Set to `True` if any error has occurred |
+
+**Important:** you are responsible for advancing `lexer.position` and `lexer.column` inside a rule. If you forget, the lexer will match the same input repeatedly.
+
+---
+
+## Common Terminal Patterns
+
+### Discarding whitespace
+
+```python
+@g.terminal('whitespace', r' +')
+def whitespace(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    # return nothing → token is ignored
+```
+
+### Tracking newlines
+
+```python
+@g.terminal('newline', r'\n+')
+def newline(lexer):
+    lexer.lineno   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    lexer.column    = 1
+```
+
+### Discarding tabs
+
+```python
+@g.terminal('tabulation', r'\t+')
+def tab(lexer):
+    lexer.column   += 4 * len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+```
+
+### Modifying the lexeme
+
+```python
+@g.terminal('int', r'\d+')
+def int_terminal(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    lexer.token.lex = int(lexer.token.lex)  # convert to Python int
+    return lexer.token
+```
+
+### Single-line comments
+
+```python
+@g.terminal('comment', r'//[^\n]*')
+def line_comment(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    # discard — no return
+```
+
+### Block comments
+
+```python
+@g.terminal('block_comment', r'/\*(.|\n)*?\*/')
+def block_comment(lexer):
+    lex = lexer.token.lex
+    for ch in lex:
+        if ch == '\n':
+            lexer.lineno += 1
+            lexer.column  = 1
+        else:
+            lexer.column += 1
+    lexer.position += len(lex)
+```
+
+---
+
+## Keywords vs Identifiers
+
+Suppose your language has keywords (`if`, `else`, `while`) and identifiers (`[a-zA-Z_][a-zA-Z0-9_]*`). A naïve approach would match `if` as an identifier because the identifier regex is broader.
+
+The correct solution is to declare keywords as literal terminals and write a single rule for identifiers that checks whether the matched text is a keyword:
+
+```python
+from pyjapt import Grammar
+
+g = Grammar()
+keywords = g.add_terminals('if else while return true false')
+keyword_names = {t.name for t in keywords}
+
+@g.terminal('id', r'[a-zA-Z_][a-zA-Z0-9_]*')
+def id_terminal(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+    if lexer.token.lex in keyword_names:
+        lexer.token.token_type = lexer.token.lex  # reclassify as keyword
+    return lexer.token
+```
+
+Because `id_terminal` is a *ruled* terminal it runs first. If the lexeme is a keyword name, the token type is changed to the keyword name, so the parser sees the keyword terminal instead of an identifier.
+
+---
+
+## Calling the Lexer
+
+```python
+lexer = g.get_lexer()
+
+# tokenise a string
+tokens = lexer('x + 42')
+
+for tok in tokens:
+    print(tok)
+# id: x
+# +: +
+# int: 42
+# $: $   ← EOF token appended automatically
+```
+
+`Lexer.__call__` resets all internal state (position, line number, column, error list) before each run, so the same instance can be reused safely.
+
+---
+
+## Checking for Lexical Errors
+
+After tokenisation, check `lexer.contain_errors` and read `lexer.errors`:
+
+```python
+tokens = lexer(source_code)
+
+if lexer.contain_errors:
+    for msg in lexer.errors:
+        print(msg)
+```
+
+See [Error Handling](error-handling.md) for custom lexical error handlers.
diff --git a/docs/parser.md b/docs/parser.md
new file mode 100644
index 0000000..2944285
--- /dev/null
+++ b/docs/parser.md
@@ -0,0 +1,144 @@
+# Building a Parser
+
+PyJapt provides three LR parser variants. This page explains how they differ, when to use each, and how to work with the parser object.
+
+---
+
+## Choosing a Parser Type
+
+```python
+parser = g.get_parser('slr')    # Simple LR
+parser = g.get_parser('lr1')    # Canonical LR(1)
+parser = g.get_parser('lalr1')  # LALR(1)
+```
+
+| Parser | Power | States | Speed | Best For |
+|--------|-------|--------|-------|----------|
+| `slr`  | Weakest | Fewest | Fastest to build | Simple grammars, prototyping |
+| `lalr1` | Middle | Fewest (same as SLR) | Fast to build | Most real-world grammars (e.g. C, Python) |
+| `lr1`  | Strongest | Most | Slowest to build | Grammars that LALR(1) cannot handle |
+
+**Rule of thumb:** start with `slr`. If you see shift-reduce or reduce-reduce conflicts that your grammar should not have, try `lalr1`. Use `lr1` only when necessary.
+
+---
+
+## How LR Parsing Works
+
+LR parsers are bottom-up. They maintain a *stack* and follow one of three actions at each step:
+
+- **Shift** — push the current input token onto the stack.
+- **Reduce** — pop symbols matching a production's body, run the semantic action, push the head non-terminal.
+- **Accept** — the start symbol covers the entire input; return the top semantic value.
+
+The parsing tables (ACTION and GOTO) encode which action to take for every (state, token) pair.
+
+---
+
+## Conflicts
+
+When two actions are valid for the same (state, lookahead) pair, a conflict arises:
+
+- **Shift-reduce (SR)** — the parser can either shift or reduce. PyJapt resolves SR conflicts in favour of **shift** (same as most tools, because it handles `if-else` correctly).
+- **Reduce-reduce (RR)** — two different reductions are possible. PyJapt keeps whichever was registered first.
+
+Conflicts are printed to `stderr` and stored in `parser.conflicts`:
+
+```python
+parser = g.get_parser('slr')
+# Warning: 1 Shift-Reduce Conflicts
+# Warning: 0 Reduce-Reduce Conflicts
+
+print(parser.shift_reduce_count)   # 1
+print(parser.reduce_reduce_count)  # 0
+print(parser.conflicts)            # [('SR', prod_a, prod_b)]
+```
+
+---
+
+## Semantic Actions
+
+A semantic action is a callable `(RuleList) -> Any` attached to a production.
+
+```python
+fact %= 'int', lambda s: int(s[1])
+```
+
+For longer actions, use `@g.production`:
+
+```python
+@g.production('stmt -> let id = expr ;')
+def let_stmt(s):
+    name  = s[2]    # id lexeme
+    value = s[4]    # expr semantic value
+    return LetStatement(name, value)
+```
+
+The `RuleList` `s` is 1-indexed over the body symbols:
+
+```
+stmt  ->  let  id  =  expr  ;
+s[0]      s[1] s[2] s[3] s[4] s[5]
+(head)
+```
+
+`s[0]` is set to whatever your action returns.
+
+---
+
+## Calling the Parser
+
+The parser is callable:
+
+```python
+result = parser(tokens)  # tokens: List[Token]
+```
+
+`tokens` is the list returned by `lexer(text)`. If you use a different tokeniser, ensure each token has `.token_type` set to the terminal name string.
+
+The return value is the semantic value of the start symbol, or `None` if parsing failed without a recovery path.
+
+---
+
+## Checking for Parsing Errors
+
+```python
+result = parser(tokens)
+
+if parser.contains_errors:
+    for msg in parser.errors:
+        print(msg)
+```
+
+---
+
+## The `verbose` Flag
+
+Pass `verbose=True` to `get_parser` to print every shift and reduce operation during parsing. Useful for debugging grammars.
+
+```python
+parser = g.get_parser('slr', verbose=True)
+parser(lexer('1 + 2'))
+# expr <-> 1 + 2 $
+#
+# Shift: ('1', 3)
+# ...
+```
+
+---
+
+## Parser Internals
+
+You can inspect the generated tables directly:
+
+```python
+# ACTION table: {(state_id, Terminal): ('SHIFT', next_state) | ('REDUCE', Production) | ('OK', None)}
+print(parser.action)
+
+# GOTO table: {(state_id, NonTerminal): next_state}
+print(parser.goto)
+
+# The augmented grammar used internally
+print(parser.augmented_grammar)
+```
+
+These are Python dicts and can be serialised — see [Serialisation](serialization.md).
diff --git a/docs/serialization.md b/docs/serialization.md
new file mode 100644
index 0000000..bf7b3b5
--- /dev/null
+++ b/docs/serialization.md
@@ -0,0 +1,170 @@
+# Serialisation
+
+For large grammars, building the parsing tables from scratch on every run can take seconds. PyJapt lets you *serialise* the pre-computed tables into plain Python modules so that subsequent runs skip the construction step entirely.
+
+---
+
+## How It Works
+
+Calling `serialize_lexer` or `serialize_parser` writes a Python source file (`lexertab.py` / `parsertab.py`) that contains the pre-computed tables as dictionaries. On subsequent runs you import the generated module instead of rebuilding.
+
+The generated classes extend `Lexer` and `ShiftReduceParser` respectively, so they have the full API of their base classes.
+
+---
+
+## Serialising the Lexer
+
+```python
+import inspect
+from pyjapt import Grammar
+
+g = Grammar()
+# ... define grammar ...
+
+if __name__ == '__main__':
+    module_name = inspect.getmodulename(__file__)
+    g.serialize_lexer(
+        class_name='MyLexer',
+        grammar_module_name=module_name,
+        grammar_variable_name='g',
+    )
+```
+
+This writes `lexertab.py` in the current working directory. The generated class looks like:
+
+```python
+# lexertab.py  (generated — do not edit by hand)
+import re
+from pyjapt import Token, Lexer
+from my_grammar import g
+
+class MyLexer(Lexer):
+    def __init__(self):
+        self.pattern     = re.compile(r'...')
+        self.token_rules = {key: rule for ...}
+        self.error_handler = g.lexical_error_handler or self.error
+        ...
+```
+
+---
+
+## Serialising the Parser
+
+```python
+if __name__ == '__main__':
+    module_name = inspect.getmodulename(__file__)
+    g.serialize_parser(
+        parser_type='lalr1',          # 'slr', 'lr1', or 'lalr1'
+        class_name='MyParser',
+        grammar_module_name=module_name,
+        grammar_variable_name='g',
+    )
+```
+
+This writes `parsertab.py`:
+
+```python
+# parsertab.py  (generated — do not edit by hand)
+from abc import ABC
+from pyjapt import ShiftReduceParser
+from my_grammar import g
+
+class MyParser(ShiftReduceParser, ABC):
+    def __init__(self, verbose=False):
+        self.grammar      = g
+        self.action       = self.__action_table()
+        self.goto         = self.__goto_table()
+        self.error_handler = g.parsing_error_handler or self.error
+        ...
+```
+
+---
+
+## Using the Generated Classes
+
+```python
+from lexertab  import MyLexer
+from parsertab import MyParser
+
+lexer  = MyLexer()
+parser = MyParser()
+
+result = parser(lexer(source_code))
+```
+
+---
+
+## Full Example
+
+**`grammar.py`** — define the grammar and conditionally serialise:
+
+```python
+import inspect
+from pyjapt import Grammar
+
+g = Grammar()
+expr = g.add_non_terminal('expr', start_symbol=True)
+term, fact = g.add_non_terminals('term fact')
+g.add_terminals('+ - * / ( )')
+g.add_terminal('int', regex=r'\d+')
+
+@g.terminal('whitespace', r' +')
+def ws(lexer):
+    lexer.column   += len(lexer.token.lex)
+    lexer.position += len(lexer.token.lex)
+
+expr %= 'expr + term', lambda s: s[1] + s[3]
+expr %= 'expr - term', lambda s: s[1] - s[3]
+expr %= 'term',        lambda s: s[1]
+term %= 'term * fact', lambda s: s[1] * s[3]
+term %= 'term / fact', lambda s: s[1] // s[3]
+term %= 'fact',        lambda s: s[1]
+fact %= '( expr )',    lambda s: s[2]
+fact %= 'int',         lambda s: int(s[1])
+
+if __name__ == '__main__':
+    module = inspect.getmodulename(__file__)
+    g.serialize_lexer(class_name='ArithLexer',   grammar_module_name=module, grammar_variable_name='g')
+    g.serialize_parser(parser_type='lalr1',
+                       class_name='ArithParser',  grammar_module_name=module, grammar_variable_name='g')
+```
+
+Run once to generate the tables:
+
+```sh
+python grammar.py
+```
+
+**`main.py`** — import and use:
+
+```python
+from lexertab  import ArithLexer
+from parsertab import ArithParser
+
+lexer  = ArithLexer()
+parser = ArithParser()
+
+while True:
+    line = input('> ')
+    print(parser(lexer(line)))
+```
+
+---
+
+## Regenerating the Tables
+
+The generated files must be regenerated whenever the grammar changes. A simple convention is to commit the grammar file (`grammar.py`) but add `lexertab.py` and `parsertab.py` to `.gitignore` and generate them as a build step.
+
+```gitignore
+# .gitignore
+lexertab.py
+parsertab.py
+```
+
+---
+
+## Caveats
+
+- **Semantic actions are not serialised.** The generated parser still imports the original grammar module (`grammar_module_name`) at runtime to access production rules and semantic actions.
+- **The grammar module must be importable.** Make sure `grammar.py` (or whatever you named it) is on the Python path when running the generated classes.
+- **Files are written to the current working directory.** Run the serialisation script from the directory where you want the files to be created.
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 0000000..08000be
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,78 @@
+site_name: PyJapt
+site_description: A lexer and LR parser generator written in Python
+site_author: Alejandro Klever
+site_url: https://alejandroklever.github.io/PyJapt
+repo_name: alejandroklever/PyJapt
+repo_url: https://github.com/alejandroklever/PyJapt
+edit_uri: edit/main/docs/
+
+theme:
+  name: material
+  palette:
+    - scheme: default
+      primary: deep purple
+      accent: purple
+      toggle:
+        icon: material/brightness-7
+        name: Switch to dark mode
+    - scheme: slate
+      primary: deep purple
+      accent: purple
+      toggle:
+        icon: material/brightness-4
+        name: Switch to light mode
+  features:
+    - navigation.tabs
+    - navigation.sections
+    - navigation.top
+    - navigation.footer
+    - toc.integrate
+    - search.suggest
+    - search.highlight
+    - content.code.annotate
+    - content.code.copy
+  icon:
+    repo: fontawesome/brands/github
+
+nav:
+  - Home: index.md
+  - Getting Started: getting-started.md
+  - User Guide:
+    - Defining a Grammar: defining-grammar.md
+    - Configuring the Lexer: lexer.md
+    - Building a Parser: parser.md
+    - Error Handling: error-handling.md
+    - Serialization: serialization.md
+  - API Reference: api-reference.md
+  - Changelog: changelog.md
+
+markdown_extensions:
+  - admonition
+  - pymdownx.details
+  - pymdownx.superfences:
+      custom_fences:
+        - name: mermaid
+          class: mermaid
+          format: !!python/name:pymdownx.superfences.fence_code_format
+  - pymdownx.highlight:
+      anchor_linenums: true
+  - pymdownx.inlinehilite
+  - pymdownx.snippets
+  - pymdownx.tabbed:
+      alternate_style: true
+  - attr_list
+  - md_in_html
+  - toc:
+      permalink: true
+
+plugins:
+  - search
+
+extra:
+  social:
+    - icon: fontawesome/brands/github
+      link: https://github.com/alejandroklever/PyJapt
+    - icon: fontawesome/brands/python
+      link: https://pypi.org/project/pyjapt/
+  version:
+    provider: mike

From 131f20c398da8590065dbcd51fd7a69755bf4e30 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 26 Feb 2026 15:59:33 +0000
Subject: [PATCH 2/6] Initial plan


From 29b2bcc7c32637e16e6d3db8909e51e610831ad5 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 26 Feb 2026 16:00:04 +0000
Subject: [PATCH 3/6] Initial plan


From e7e830e3cabb2fcaa976eb5eaab34d421575ae4b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 26 Feb 2026 16:01:03 +0000
Subject: [PATCH 4/6] Fix British spellings to American in all docs

Co-authored-by: alejandroklever <45394625+alejandroklever@users.noreply.github.com>
---
 docs/api-reference.md    |  4 ++--
 docs/changelog.md        |  6 +++---
 docs/defining-grammar.md |  2 +-
 docs/error-handling.md   |  4 ++--
 docs/index.md            |  2 +-
 docs/parser.md           |  4 ++--
 docs/serialization.md    | 14 +++++++-------
 7 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/docs/api-reference.md b/docs/api-reference.md
index 4fa9a3c..836ed20 100644
--- a/docs/api-reference.md
+++ b/docs/api-reference.md
@@ -203,7 +203,7 @@ Raises `ValueError` for unknown names.
 
 ---
 
-### Serialisation
+### Serialization
 
 ---
 
@@ -225,7 +225,7 @@ Generate `parsertab.py` in the current working directory.
 
 #### `Grammar.to_json() -> str`
 
-Serialise the grammar structure (terminals, non-terminals, productions) to a JSON string. Semantic actions and regexes are **not** included.
+Serialize the grammar structure (terminals, non-terminals, productions) to a JSON string. Semantic actions and regexes are **not** included.
 
 ---
 
diff --git a/docs/changelog.md b/docs/changelog.md
index 72c0293..9cab546 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -28,7 +28,7 @@ This project follows [Semantic Versioning](https://semver.org) and the
 ### Planned — Testing
 - Add tests for LR(1) and LALR(1) parsers.
 - Add tests for lexer and parser error handling.
-- Add tests for serialisation round-trips.
+- Add tests for serialization round-trips.
 - Add edge-case tests (empty grammar, duplicate symbols, epsilon productions).
 - Enforce minimum test coverage threshold.
 
@@ -62,7 +62,7 @@ This project follows [Semantic Versioning](https://semver.org) and the
 - Improved `RuleList` error API.
 
 ### Fixed
-- Reset lexer parameters when analysing a new string (`Lexer.__call__`).
+- Reset lexer parameters when analyzing a new string (`Lexer.__call__`).
 
 ---
 
@@ -77,7 +77,7 @@ This project follows [Semantic Versioning](https://semver.org) and the
 
 ### Added
 - SLR, LR(1), and LALR(1) parsers.
-- Serialisation of lexer and parser to Python source files.
+- Serialization of lexer and parser to Python source files.
 - `@g.terminal` decorator for inline rule definition.
 - `@g.production` decorator for inline production rules.
 - `@g.lexical_error` and `@g.parsing_error` decorators.
diff --git a/docs/defining-grammar.md b/docs/defining-grammar.md
index 68d6459..e289f3a 100644
--- a/docs/defining-grammar.md
+++ b/docs/defining-grammar.md
@@ -224,4 +224,4 @@ g2 = Grammar.from_json(json_str)
 ```
 
 !!! note
-    JSON serialisation does not preserve terminal regexes, terminal rules, or semantic actions. It is useful for inspecting grammar structure, not for production use. Use [Python file serialisation](serialization.md) for production scenarios.
+    JSON serialization does not preserve terminal regexes, terminal rules, or semantic actions. It is useful for inspecting grammar structure, not for production use. Use [Python file serialization](serialization.md) for production scenarios.
diff --git a/docs/error-handling.md b/docs/error-handling.md
index 43262e9..c07d93f 100644
--- a/docs/error-handling.md
+++ b/docs/error-handling.md
@@ -6,7 +6,7 @@ Good error handling is one of PyJapt's core design goals. This page describes ho
 
 ## Lexical Error Handling
 
-### Default behaviour
+### Default behavior
 
 When the lexer encounters a character that matches no terminal pattern, it calls the *lexical error handler*. By default, this adds an error message to the internal errors list and advances past the bad character.
 
@@ -70,7 +70,7 @@ if lexer.contain_errors:
 
 ## Syntactic Error Handling
 
-### Default behaviour
+### Default behavior
 
 When the parser cannot find an action for the current `(state, token)` pair it enters *panic-mode recovery*: it calls the error handler and then skips input tokens until it finds one that fits the current state.
 
diff --git a/docs/index.md b/docs/index.md
index 6836a2b..6a16168 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -16,7 +16,7 @@
 | **Three LR parser types** | SLR, LR(1), and LALR(1) — choose the power level you need. |
 | **Custom error handling** | Lexical and syntactic error handlers are first-class citizens. |
 | **Semantic actions** | Attach a lambda or a decorated function to any production rule. |
-| **Serialisation** | Pre-build the parsing tables and serialise them to a Python module for faster startup. |
+| **Serialization** | Pre-build the parsing tables and serialize them to a Python module for faster startup. |
 | **Decorator-based API** | Define terminals and production rules without leaving Python. |
 
 ---
diff --git a/docs/parser.md b/docs/parser.md
index 2944285..6ffa344 100644
--- a/docs/parser.md
+++ b/docs/parser.md
@@ -38,7 +38,7 @@ The parsing tables (ACTION and GOTO) encode which action to take for every (stat
 
 When two actions are valid for the same (state, lookahead) pair, a conflict arises:
 
-- **Shift-reduce (SR)** — the parser can either shift or reduce. PyJapt resolves SR conflicts in favour of **shift** (same as most tools, because it handles `if-else` correctly).
+- **Shift-reduce (SR)** — the parser can either shift or reduce. PyJapt resolves SR conflicts in favor of **shift** (same as most tools, because it handles `if-else` correctly).
 - **Reduce-reduce (RR)** — two different reductions are possible. PyJapt keeps whichever was registered first.
 
 Conflicts are printed to `stderr` and stored in `parser.conflicts`:
@@ -141,4 +141,4 @@ print(parser.goto)
 print(parser.augmented_grammar)
 ```
 
-These are Python dicts and can be serialised — see [Serialisation](serialization.md).
+These are Python dicts and can be serialized — see [Serialization](serialization.md).
diff --git a/docs/serialization.md b/docs/serialization.md
index bf7b3b5..dec32b8 100644
--- a/docs/serialization.md
+++ b/docs/serialization.md
@@ -1,6 +1,6 @@
-# Serialisation
+# Serialization
 
-For large grammars, building the parsing tables from scratch on every run can take seconds. PyJapt lets you *serialise* the pre-computed tables into plain Python modules so that subsequent runs skip the construction step entirely.
+For large grammars, building the parsing tables from scratch on every run can take seconds. PyJapt lets you *serialize* the pre-computed tables into plain Python modules so that subsequent runs skip the construction step entirely.
 
 ---
 
@@ -12,7 +12,7 @@ The generated classes extend `Lexer` and `ShiftReduceParser` respectively, so th
 
 ---
 
-## Serialising the Lexer
+## Serializing the Lexer
 
 ```python
 import inspect
@@ -48,7 +48,7 @@ class MyLexer(Lexer):
 
 ---
 
-## Serialising the Parser
+## Serializing the Parser
 
 ```python
 if __name__ == '__main__':
@@ -96,7 +96,7 @@ result = parser(lexer(source_code))
 
 ## Full Example
 
-**`grammar.py`** — define the grammar and conditionally serialise:
+**`grammar.py`** — define the grammar and conditionally serialize:
 
 ```python
 import inspect
@@ -165,6 +165,6 @@ parsertab.py
 
 ## Caveats
 
-- **Semantic actions are not serialised.** The generated parser still imports the original grammar module (`grammar_module_name`) at runtime to access production rules and semantic actions.
+- **Semantic actions are not serialized.** The generated parser still imports the original grammar module (`grammar_module_name`) at runtime to access production rules and semantic actions.
 - **The grammar module must be importable.** Make sure `grammar.py` (or whatever you named it) is on the Python path when running the generated classes.
-- **Files are written to the current working directory.** Run the serialisation script from the directory where you want the files to be created.
+- **Files are written to the current working directory.** Run the serialization script from the directory where you want the files to be created.

From b1cc15b38a501c76b0c758288e1a1b8332f5949b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 26 Feb 2026 16:01:45 +0000
Subject: [PATCH 5/6] Add RuleList to pyjapt top-level exports and update API
 reference docs

Co-authored-by: alejandroklever <45394625+alejandroklever@users.noreply.github.com>
---
 docs/api-reference.md | 1 +
 pyjapt/__init__.py    | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/api-reference.md b/docs/api-reference.md
index 4fa9a3c..50d83cd 100644
--- a/docs/api-reference.md
+++ b/docs/api-reference.md
@@ -15,6 +15,7 @@ from pyjapt import (
     SLRParser,
     LR1Parser,
     LALR1Parser,
+    RuleList,
 )
 ```
 
diff --git a/pyjapt/__init__.py b/pyjapt/__init__.py
index 3252226..426c9bc 100644
--- a/pyjapt/__init__.py
+++ b/pyjapt/__init__.py
@@ -1,4 +1,4 @@
 from pyjapt.lexing import Lexer, Token
-from pyjapt.parsing import ShiftReduceParser, LR1Parser, LALR1Parser, SLRParser, Grammar
+from pyjapt.parsing import ShiftReduceParser, LR1Parser, LALR1Parser, SLRParser, Grammar, RuleList
 
 __version__ = "0.4.1"

From c8a74fe12d9a2f36c1e6983881675c56b6b3d9bb Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 3 Mar 2026 17:04:22 +0000
Subject: [PATCH 6/6] Initial plan