reqstool
diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
Lines changed: 2 additions & 0 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 15 additions & 0 deletions
diff --git a/PLAN_CODE_SMELLS.md b/PLAN_CODE_SMELLS.md
Lines changed: 61 additions & 0 deletions
diff --git a/baselines/README.md b/baselines/README.md
Lines changed: 137 additions & 0 deletions
@@ -23,3 +23,5 @@ jobs:
         run: pip install check-jsonschema
       - name: Validate JSON Schemas
         run: check-jsonschema --verbose --schemafile "https://json-schema.org/draft/2020-12/schema" src/reqstool/resources/schemas/*/*
+      - name: Validate generated Pydantic models are up to date
+        run: hatch run dev:codegen-check
@@ -39,3 +39,18 @@ hatch run dev:black src tests
 # Lint with flake8
 hatch run dev:flake8
 ```
+
+## Model Generation
+
+Pydantic data models in `src/reqstool/models/generated/` are auto-generated from the JSON Schemas
+in `src/reqstool/resources/schemas/v1/`. **JSON Schema is the source of truth** — never edit the
+generated files directly.
+
+To regenerate after modifying a schema:
+
+```bash
+hatch run dev:codegen
+```
+
+This runs `datamodel-codegen` to produce Pydantic v2 `BaseModel` classes from all schema files.
+The generated files should be committed alongside the schema changes.
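
The `codegen-check` script invoked by the lint workflow is not shown in this diff. A minimal sketch of how such an up-to-date check could work, assuming it regenerates the models into a scratch directory and compares them with the committed files; `find_stale_files` and both directory arguments are hypothetical names used only for illustration:

```python
from pathlib import Path


def find_stale_files(committed_dir: Path, fresh_dir: Path) -> list[str]:
    """Return relative paths whose committed content differs from a fresh generation.

    Hypothetical helper: the real `codegen-check` script is not shown in this diff.
    """
    committed = {
        p.relative_to(committed_dir).as_posix(): p.read_bytes()
        for p in committed_dir.rglob("*.py")
    }
    fresh = {
        p.relative_to(fresh_dir).as_posix(): p.read_bytes()
        for p in fresh_dir.rglob("*.py")
    }
    # A file is stale if it is missing on either side or its bytes differ.
    return sorted(
        name
        for name in committed.keys() | fresh.keys()
        if committed.get(name) != fresh.get(name)
    )
```

Under that assumption, CI would populate `fresh_dir` by running `datamodel-codegen` against the schemas and fail the job whenever the returned list is non-empty.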
@@ -0,0 +1,61 @@
+# Code Smells — Future Issues
+
+Identified during the Pydantic v2 migration (PR #306). These are **not** related to the migration itself and should be addressed in separate issues/PRs.
+
+---
+
+## High Severity
+
+### Scattered `sys.exit()` in location classes
+- **Files:** `maven_location.py:42-53`, `pypi_location.py:39-58`, `command.py:379-402`
+- **Problem:** Location classes call `sys.exit(1)` directly instead of raising exceptions. Unit tests must therefore catch `SystemExit` or mock `sys.exit()`. The error context is also inconsistent (`RuntimeError` vs `RequestException`).
+- **Fix:** Create exception hierarchy (`LocationException`, `ArtifactNotFoundError`, etc.). Only `command.py` should call `sys.exit()`.
+
+### Mutable singleton `TempDirectoryUtil`
+- **Files:** `common/utils.py:346-372`
+- **Problem:** Mutable class-level state (`tmpdir`, `count`), lazy init singleton, no cleanup guarantee, thread-unsafe counter. Test pollution risk.
+- **Fix:** Replace with dependency-injected `TempDirectoryManager` using context managers for guaranteed cleanup.
+
+---
+
+## Medium Severity
+
+### Duplicated filter parsing logic
+- **Files:** `requirements_model_generator.py:250-299`, `svcs_model_generator.py:72-115`
+- **Problem:** Two near-identical ~50-line methods parse filters with the same dict keys, the same nesting, and the same UrnId conversion. Both carry a `# NOSONAR` suppression.
+- **Fix:** Extract into a shared `FilterParser` utility. Both methods reduce to 3-line calls.
+
+### Monolithic `status.py` with manual table rendering (428 lines)
+- **Files:** `commands/status/status.py:78-200+`
+- **Problem:** Mixes statistics calculation, manual Unicode box-drawing, and ANSI color handling. 70-line function just for merged headers. Works around `tabulate` library limitations.
+- **Fix:** Replace manual table rendering with **Rich** library. Eliminates ~300 lines of string manipulation.
+
+### No unit tests for CLI entry point
+- **Files:** `src/reqstool/command.py` (408 lines)
+- **Problem:** Main CLI entry point has zero unit tests. Only legacy E2E test exists for deprecated `generate-json`. Contains a `TODO $$$` comment.
+- **Fix:** Add `tests/unit/reqstool/test_command.py` covering argument parsing, location routing, error handling, deprecation warnings.
+
+---
+
+### Remaining `@dataclass` in generators/validators
+- **Files:** `combined_indexed_dataset_generator.py`, `indexed_dataset_filter_processor.py`, `syntax_validator.py`
+- **Problem:** Three infrastructure/service classes still use `@dataclass` after the Pydantic migration, which is inconsistent. They contain 100+ references to internal `_`-prefixed fields, which Pydantic treats as private attributes (`PrivateAttr`) rather than model fields, so converting requires renaming all of them.
+- **Fix:** Convert to `BaseModel`, rename `_` prefixed fields to public names. Large diff but purely cosmetic — these are behavioral classes, not data models.
+
+---
+
+## Low Severity
+
+### Repetitive CLI parser setup
+- **Files:** `command.py:115-169`
+- **Problem:** Four location types repeat ~8 lines each of parser registration. Deprecated wrappers (`report-asciidoc`, `generate-json`) still exist as code.
+- **Fix:** Config-driven builder pattern with a location registry dict.
+
+### Expression language — minor inefficiencies
+- **Files:** `expression_languages/generic_el.py`, `requirements_el.py`, `svcs_el.py`
+- **Assessment:** Lark is the right tool — 26-line grammar, LALR parser, handles operator precedence and regex. **Not over-engineered.**
+- **Minor issues:**
+  1. `RequirementsELTransformer` and `SVCsELTransformer` are empty `pass` subclasses — no runtime value
+  2. New transformer instance created per item evaluated — could cache and reuse
+  3. Regex and nested boolean logic exist in grammar but production YAML only uses simple expressions
+- **Fix:** Remove empty subclasses, optimize transformer reuse. Low effort.
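
The fix proposed for the scattered `sys.exit()` calls can be illustrated with a small sketch. `LocationException` and `ArtifactNotFoundError` are the names suggested in the plan above; `fetch_artifact` and `main` are hypothetical stand-ins for a location class and the CLI entry point, not the project's actual code:

```python
import sys


class LocationException(Exception):
    """Base class for errors raised while resolving a location."""


class ArtifactNotFoundError(LocationException):
    """Raised when a requested artifact cannot be found."""


def fetch_artifact(name: str, available: set[str]) -> str:
    # Hypothetical stand-in for a location class: raise instead of calling sys.exit().
    if name not in available:
        raise ArtifactNotFoundError(f"artifact not found: {name}")
    return name


def main(argv: list[str]) -> int:
    # Only the CLI entry point maps exceptions to an exit code.
    try:
        fetch_artifact(argv[0], available={"reqstool"})
    except LocationException as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

With this shape, a call such as `sys.exit(main(sys.argv[1:]))` would be the only place that terminates the process, and unit tests can assert on the raised exception type directly.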
@@ -0,0 +1,137 @@
+# Regression Baselines
+
+This directory contains CLI output captured from `main` **before** the Pydantic v2 migration.
+Use these baselines to verify that code changes don't produce unintended output differences.
+
+**This directory should be removed before merging to `main`.**
+
+## How baselines were captured
+
+All baselines were captured on `main` at commit `784f77b` (the first commit on the feature branch).
+
+### Naming convention
+
+```
+<dataset>__<command>.txt       — CLI stdout (+ stderr for status)
+<dataset>__<command>.exitcode  — exit code of the command
+```
+
+### Datasets
+
+| Baseline prefix | Fixture path |
+|---|---|
+| `test_standard_baseline_ms-001` | `tests/resources/test_data/data/local/test_standard/baseline/ms-001` |
+| `test_standard_baseline_sys-001` | `tests/resources/test_data/data/local/test_standard/baseline/sys-001` |
+| `test_standard_empty_ms_ms-001` | `tests/resources/test_data/data/local/test_standard/empty_ms/ms-001` |
+| `test_standard_empty_ms_sys-001` | `tests/resources/test_data/data/local/test_standard/empty_ms/sys-001` |
+| `test_basic_baseline_ms-101` | `tests/resources/test_data/data/local/test_basic/baseline/ms-101` |
+| `test_basic_lifecycle_ms-101` | `tests/resources/test_data/data/local/test_basic/lifecycle/ms-101` |
+| `test_basic_lifecycle_validation_error` | `tests/resources/test_data/data/local/test_basic/lifecycle/validation_error` |
+| `test_basic_no_impls_basic_ms-101` | `tests/resources/test_data/data/local/test_basic/no_impls/basic/ms-101` |
+| `test_basic_no_impls_with_error_ms-101` | `tests/resources/test_data/data/local/test_basic/no_impls/with_error/ms-101` |
+| `test_delete_mvr_ms-001` | `tests/resources/test_data/data/local/test_delete_mvr/ms-001` |
+| `test_delete_mvr_sys-001` | `tests/resources/test_data/data/local/test_delete_mvr/sys-001` |
+| `test_errors_ms-101` | `tests/resources/test_data/data/local/test_errors/ms-101` |
+| `reqstool_demo` | `../reqstool-demo/docs/reqstool` (sibling project, requires `./mvnw verify` first) |
+
+## Running regression tests
+
+### 1. Run pytest
+
+```bash
+hatch run dev:pytest --override-ini="log_cli=false" -q
+```
+
+### 2. Compare CLI output against baselines
+
+For each in-repo dataset, run all four commands and diff each output against its baseline:
+
+```bash
+# Status (strip ANSI color codes)
+hatch run dev:python src/reqstool/command.py status local -p <FIXTURE_PATH> 2>&1 \
+  | sed 's/\x1b\[[0-9;]*m//g' > /tmp/feature-status.txt
+diff baselines/<DATASET>__status.txt /tmp/feature-status.txt
+
+# Report (AsciiDoc)
+hatch run dev:python src/reqstool/command.py report --format asciidoc local -p <FIXTURE_PATH> \
+  > /tmp/feature-report-adoc.txt 2>&1
+diff baselines/<DATASET>__report_adoc.txt /tmp/feature-report-adoc.txt
+
+# Report (Markdown)
+hatch run dev:python src/reqstool/command.py report --format markdown local -p <FIXTURE_PATH> \
+  > /tmp/feature-report-md.txt 2>&1
+diff baselines/<DATASET>__report_md.txt /tmp/feature-report-md.txt
+
+# Export (JSON)
+hatch run dev:python src/reqstool/command.py export local -p <FIXTURE_PATH> \
+  > /tmp/feature-export.txt 2>&1
+diff baselines/<DATASET>__export.txt /tmp/feature-export.txt
+```
+
+### 3. Quick full regression script
+
+Run all in-repo datasets at once (copy-paste friendly):
+
+```bash
+cd /path/to/reqstool-client
+
+# Key datasets to check
+for ds in \
+  "test_standard_baseline_ms-001 tests/resources/test_data/data/local/test_standard/baseline/ms-001" \
+  "test_standard_baseline_sys-001 tests/resources/test_data/data/local/test_standard/baseline/sys-001" \
+  "test_basic_baseline_ms-101 tests/resources/test_data/data/local/test_basic/baseline/ms-101"; do
+
+  name=$(echo "$ds" | cut -d' ' -f1)
+  path=$(echo "$ds" | cut -d' ' -f2)
+
+  echo "=== $name ==="
+
+  hatch run dev:python src/reqstool/command.py status local -p $path 2>&1 \
+    | sed 's/\x1b\[[0-9;]*m//g' > /tmp/f.txt
+  diff -q baselines/${name}__status.txt /tmp/f.txt && echo "  status: OK" || echo "  status: DIFF"
+
+  hatch run dev:python src/reqstool/command.py report --format asciidoc local -p $path > /tmp/f.txt 2>&1
+  diff -q baselines/${name}__report_adoc.txt /tmp/f.txt && echo "  report_adoc: OK" || echo "  report_adoc: DIFF"
+
+  hatch run dev:python src/reqstool/command.py report --format markdown local -p $path > /tmp/f.txt 2>&1
+  diff -q baselines/${name}__report_md.txt /tmp/f.txt && echo "  report_md: OK" || echo "  report_md: DIFF"
+
+  hatch run dev:python src/reqstool/command.py export local -p $path > /tmp/f.txt 2>&1
+  diff -q baselines/${name}__export.txt /tmp/f.txt && echo "  export: OK" || echo "  export: DIFF"
+done
+```
+
+### 4. reqstool-demo regression
+
+Requires the sibling `../reqstool-demo` project with Maven artifacts built (`./mvnw verify`):
+
+```bash
+hatch run dev:python src/reqstool/command.py status local -p ../reqstool-demo/docs/reqstool 2>&1 \
+  | sed 's/\x1b\[[0-9;]*m//g' > /tmp/f.txt
+diff baselines/reqstool_demo__status.txt /tmp/f.txt
+
+hatch run dev:python src/reqstool/command.py report --format asciidoc local -p ../reqstool-demo/docs/reqstool \
+  > /tmp/f.txt 2>&1
+diff baselines/reqstool_demo__report_adoc.txt /tmp/f.txt
+
+hatch run dev:python src/reqstool/command.py report --format markdown local -p ../reqstool-demo/docs/reqstool \
+  > /tmp/f.txt 2>&1
+diff baselines/reqstool_demo__report_md.txt /tmp/f.txt
+
+hatch run dev:python src/reqstool/command.py export local -p ../reqstool-demo/docs/reqstool \
+  > /tmp/f.txt 2>&1
+diff baselines/reqstool_demo__export.txt /tmp/f.txt
+```
+
+## Known expected diffs from Pydantic v2 migration
+
+1. **Export JSON — enum serialization**: Enums now serialize as clean values (`"effective"`)
+   instead of verbose jsonpickle format (`{"_value_": "effective", "_name_": "EFFECTIVE", ...}`).
+   This affects all datasets' `__export.txt` files.
+
+2. **Standard dataset — implementation counts**: The annotations nesting bug fix
+   (`List[List[AnnotationData]]` → `List[AnnotationData]`) changes implementation counts
+   in `test_standard_*` status and report outputs where requirements have multiple annotations.
+   Basic datasets are unaffected (1 annotation per requirement).
+
+If a diff is expected due to an intentional change, note it in the PR description.
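
The strip-and-diff step from "Compare CLI output against baselines" can also be sketched in Python, which may help on systems where `sed` does not understand the `\x1b` escape. This is a minimal illustration and not part of the PR; `strip_ansi` and `diff_against_baseline` are hypothetical helper names, and the regular expression mirrors the `sed` pattern used in the README:

```python
import difflib
import re

# Same pattern as the sed expression in the README: ESC [ digits/semicolons m
ANSI_COLOR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI color sequences from CLI output."""
    return ANSI_COLOR.sub("", text)


def diff_against_baseline(baseline: str, actual: str) -> list[str]:
    """Return unified-diff lines; an empty list means the output matches."""
    return list(
        difflib.unified_diff(
            baseline.splitlines(),
            strip_ansi(actual).splitlines(),
            fromfile="baseline",
            tofile="feature",
            lineterm="",
        )
    )
```

Here `baseline` would be the text of a `baselines/<DATASET>__<command>.txt` file and `actual` the captured output of the corresponding command on the feature branch.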