
Commit 10435bf

Update README for eval coverage
Author: amar-python
1 parent cc6d6dd, commit 10435bf

1 file changed: README.md (40 additions & 4 deletions)
@@ -6,6 +6,7 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Environments](https://img.shields.io/badge/Environments-Dev%20%7C%20Test%20%7C%20Staging%20%7C%20Prod-blue)]()
 [![Test Suites](https://img.shields.io/badge/Tests-5%20suites%20%7C%2085%20assertions-brightgreen)]()
+[![Evals](https://img.shields.io/badge/Evals-23%20CSV%20scenarios-brightgreen)]()
 [![Terraform](https://img.shields.io/badge/Terraform-1.5%2B-7B42BC?logo=terraform&logoColor=white)](https://developer.hashicorp.com/terraform)
 
 ---
@@ -20,6 +21,7 @@ This project provides a **production-grade SQL framework** to stand up a T&E man
 - **Defect reporting** — deficiency reports (DRs) linked directly to failed results
 - **Multi-environment isolation** — separate databases, schemas, and users for Dev, Test, Staging, and Prod
 - **Automated data testing** — 85 assertions across 5 SQL test suites, all written in pure PostgreSQL
+- **Data-driven evals** — 23 offline CSV validator scenarios plus PostgreSQL-backed idempotency and full-suite checks
 
 All names (database, schema, users, every table) are controlled by a single `\set` configuration block at the top of each environment file. Rename anything in one place and the entire script updates automatically.
 
@@ -84,6 +86,12 @@ PostgreDataMigrationApp/
 │ ├── run_all_tests.sql ← Master test orchestrator
 │ └── run_tests.sh ← Bash wrapper (reads config.local.env)
 
+├── evals/ ← Data-driven operational evals
+│ ├── runner.py ← Scenario discovery, diff engine, JSON reports
+│ ├── datasets/tier_p/ ← 23 offline CSV validator scenarios
+│ ├── expected/tier_p/ ← Expected outputs for validator evals
+│ └── reports/ ← Generated reports (gitignored)
+
 ├── terraform-github-repos/ ← GitHub repos as Infrastructure as Code
 │ ├── main.tf
 │ ├── variables.tf
@@ -331,6 +339,12 @@ Realistic Australian T&E data is loaded automatically when `include_seed_data` i
 # Run Python tests
 python -m unittest discover -s tests -p "test*.py" -v
 
+# Run data-driven CSV validator evals
+python evals/runner.py --tiers p
+
+# Run all eval tiers; PostgreSQL-backed tiers skip cleanly if PG is unavailable
+python evals/runner.py --tiers p,i,s
+
 # Manually via psql
 psql -U postgres -d te_mgmt_dev \
 --set schema_name=te_dev \
@@ -350,7 +364,7 @@ psql -U postgres -d te_mgmt_dev \
 ```
 
 ```powershell
-# Windows/PowerShell runner for Python validator tests
+# Windows/PowerShell runner for Python tests
 powershell -NoProfile -ExecutionPolicy Bypass -File "tests/run_python_tests.ps1"
 # Optional: run a custom test path
 powershell -NoProfile -ExecutionPolicy Bypass -File "tests/run_python_tests.ps1" -TestPath "tests/test_csv_validator.py"
@@ -360,19 +374,21 @@ powershell -NoProfile -ExecutionPolicy Bypass -File "tests/run_python_tests.ps1"
 
 - `setup.sh`, `deploy_all.sh`, and `tests/run_tests.sh` are bash scripts.
 - On Windows, run shell scripts via WSL2 or Git Bash.
-- The Python validator tests are Windows-native and do not require WSL.
-- Required for Python validator tests:
+- The Python unit tests and Tier P evals are Windows-native and do not require WSL.
+- Tier I and Tier S evals require a reachable PostgreSQL instance and `psql` on PATH; if unavailable, they skip cleanly.
+- Required for Python tests and offline evals:
 - Python on PATH (`python --version`)
 - PowerShell available (`pwsh` or `powershell`)
 - Recommended Windows command:
 
 ```powershell
 powershell -NoProfile -ExecutionPolicy Bypass -File "tests/run_python_tests.ps1"
+python evals\runner.py --tiers p
 ```
 
 ### CI validation (Windows)
 
-A GitHub Actions workflow is included for Windows validation of the Python validator tests:
+A GitHub Actions workflow is included for Windows validation of the Python tests:
 
 - Workflow file: `.github/workflows/python-validator-tests.yml`
 - Runner: `windows-latest`
@@ -383,6 +399,26 @@ A GitHub Actions workflow is included for Windows validation of the Python valid
 powershell -NoProfile -ExecutionPolicy Bypass -File "tests/run_python_tests.ps1"
 ```
 
+### Data-driven evals
+
+The `evals/` package complements the SQL and unit tests with scenario fixtures and expected JSON outputs.
+
+| Tier | What it validates | Database required? |
+|---|---|---|
+| P | `csv/validator.py` across 23 CSV edge cases, including malformed rows, BOM, CRLF, Unicode, quoted newlines, long fields, missing env vars, and invalid UTF-8 bytes | No |
+| I | Dev deployment idempotency by deploying twice and comparing seed row counts | Yes |
+| S | Fresh Dev deploy followed by the full SQL suite, expecting all 85 assertions to pass | Yes |
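
Reviewer note: the Tier I row describes an idempotency check (deploy twice, then compare seed row counts). The diff does not show how `runner.py` implements this, so the following is only a minimal sketch of the comparison step, assuming the row counts have already been collected into plain dictionaries. `diff_row_counts` and the table names are hypothetical and do not come from the project.

```python
def diff_row_counts(first: dict[str, int], second: dict[str, int]) -> dict[str, tuple]:
    """Return the tables whose row counts differ between two deploys.

    Hypothetical helper: maps table name -> (count_after_first, count_after_second).
    A table missing from one side is reported with None for that side.
    """
    drift = {}
    for table in sorted(set(first) | set(second)):
        before, after = first.get(table), second.get(table)
        if before != after:
            drift[table] = (before, after)
    return drift


# Illustrative counts only; these are not taken from the project's seed data.
first_deploy = {"test_cases": 10, "test_results": 25}
second_deploy = {"test_cases": 10, "test_results": 50}  # seed rows were duplicated

drift = diff_row_counts(first_deploy, second_deploy)
print(drift)  # {'test_results': (25, 50)}
```

An empty result would indicate that the second deploy left every count unchanged, which is the outcome an idempotent deployment should produce.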
+
+Run examples:
+
+```powershell
+python evals\runner.py # Tier P only
+python evals\runner.py --tiers p,i,s # all tiers; I/S skip if PostgreSQL is unavailable
+python evals\runner.py --only 14_quoted_newline --tiers p
+```
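
Reviewer note: the comment on the `--tiers p,i,s` example says Tiers I and S skip when PostgreSQL is unavailable. One plausible way to get that behavior is to check for `psql` on PATH before scheduling the database-backed tiers. The sketch below assumes that approach; `psql_skip_reason` and `select_tiers` are hypothetical names, and the real runner may also probe whether the server is reachable, which this sketch does not do.

```python
import shutil


def psql_skip_reason(which=shutil.which):
    """Return a skip message if `psql` is not on PATH, otherwise None.

    `which` is injectable so the check can be exercised without changing PATH.
    """
    if which("psql") is None:
        return "psql not found on PATH; skipping PostgreSQL-backed tiers"
    return None


def select_tiers(requested, which=shutil.which):
    """Split requested tiers into (runnable, skipped).

    Assumption: only tier 'p' is offline; every other tier needs psql.
    """
    reason = psql_skip_reason(which)
    runnable, skipped = [], {}
    for tier in requested:
        if tier != "p" and reason is not None:
            skipped[tier] = reason
        else:
            runnable.append(tier)
    return runnable, skipped


# Simulate a machine without psql by injecting a lookup that finds nothing.
runnable, skipped = select_tiers(["p", "i", "s"], which=lambda name: None)
print(runnable, sorted(skipped))
```

Recording a reason per skipped tier, rather than silently dropping it, keeps a skip distinguishable from a pass in whatever report the run produces.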
+
+Each eval run writes a JSON report under `evals/reports/<run_id>/summary.json`; that folder is intentionally gitignored.
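
Reviewer note: the README states where the report lands but not what it contains. As a rough illustration of the `<run_id>/summary.json` layout, the sketch below writes a summary into a temporary directory. The field names (`total`, `passed`, `failed`, `skipped`, `results`), the `status` values, and the `write_summary` helper are assumptions for illustration, not the runner's actual schema.

```python
import json
import tempfile
from pathlib import Path


def write_summary(reports_root: Path, run_id: str, results: list[dict]) -> Path:
    """Write <reports_root>/<run_id>/summary.json and return its path.

    Hypothetical schema: each result is {"scenario": str, "status": str}.
    """
    run_dir = reports_root / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    def count(status: str) -> int:
        return sum(1 for r in results if r["status"] == status)

    summary = {
        "run_id": run_id,
        "total": len(results),
        "passed": count("pass"),
        "failed": count("fail"),
        "skipped": count("skip"),
        "results": results,
    }
    path = run_dir / "summary.json"
    path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return path


# Use a temporary directory so the example never touches evals/reports/.
with tempfile.TemporaryDirectory() as tmp:
    path = write_summary(
        Path(tmp),
        "example-run",  # placeholder; the real run_id format is not documented here
        [
            {"scenario": "14_quoted_newline", "status": "pass"},
            {"scenario": "tier_i", "status": "skip"},
        ],
    )
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(loaded["total"], loaded["passed"], loaded["skipped"])
```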
+
 ### Coverage — 85 assertions across 5 suites
 
 | Suite | Assertions | What is tested |
