62 changes: 62 additions & 0 deletions docs/runbooks/redacted-route-quality-review.md
@@ -0,0 +1,62 @@
# Redacted route quality review runbook

Use this workflow to review route decisions without exposing raw prompts or private logs.

## 1) Collect only redacted samples

- Input must be JSONL, one object per line.
- Keep only redacted text snippets suitable for internal sharing.
- Every line must include `"redacted": true`.
- Do **not** include raw conversation logs, credentials, tokens, or user identifiers.
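The collection rules above can be sketched as a small filter that keeps only records explicitly marked `"redacted": true`. This is a hypothetical helper for illustration, not part of the repo's tooling; the real collection pipeline may differ.

```python
import json


def filter_redacted(lines):
    """Yield only JSONL records explicitly marked redacted:true.

    Illustrative sketch: anything without the flag (or with it set to
    false) is silently dropped rather than shared.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("redacted") is True:
            yield record


samples = list(filter_redacted([
    '{"text":"[REDACTED] ok","expect":"fast","redacted":true}',
    '{"text":"raw prompt, do not share","expect":"fast","redacted":false}',
]))
# only the first record survives the filter
```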

## 2) Required JSONL fields

Each sample line must include:

- `text` (string): redacted prompt text
- `expect` (string): expected **route_id**
- `redacted` (boolean): must be `true`

Optional fields:

- `source` (string)
- `note` (string)

Example:

```json
{"text":"[REDACTED] payment flow timed out in prod","expect":"strong","redacted":true,"source":"incident_review"}
```

> `expect` must be a configured `route_id` (for example `fast`, `strong`), **not** a deployment `target_model` name.
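A per-line validator for the schema above might look like the following sketch. The `ALLOWED_ROUTE_IDS` set is an assumption standing in for whatever `config/routes.yaml` defines; the field checks mirror the required-field list.

```python
import json

# Assumed route ids; in practice these come from config/routes.yaml.
ALLOWED_ROUTE_IDS = {"fast", "strong"}

REQUIRED_FIELDS = {"text": str, "expect": str, "redacted": bool}


def validate_sample(line: str) -> dict:
    """Parse one JSONL line and enforce the required-field contract."""
    sample = json.loads(line)
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(sample.get(field), expected_type):
            raise ValueError(f"{field!r} must be a {expected_type.__name__}")
    if sample["redacted"] is not True:
        raise ValueError('"redacted" must be true')
    if sample["expect"] not in ALLOWED_ROUTE_IDS:
        raise ValueError(
            f'"expect" must be a configured route_id, got {sample["expect"]!r}'
        )
    return sample


sample = validate_sample(
    '{"text":"[REDACTED] payment flow timed out in prod",'
    '"expect":"strong","redacted":true}'
)
```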

## 3) Import with route-config validation

Convert JSONL to eval YAML and validate expected routes against your active route config:

```bash
uv run python scripts/import_review_samples.py \
--input tests/samples/redacted_review_fixture.jsonl \
--output /tmp/redacted_review_cases.yaml \
--routes config/routes.yaml
```

If a sample's `expect` value is a deployment `target_model` name rather than a configured `route_id`, the import fails with a route-id validation error.
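The route-id check can be illustrated as follows. The parsed structure of `config/routes.yaml` is an assumption here (a list of entries each carrying a `route_id` and a `target_model`); the actual schema and the import script's internals may differ.

```python
# Stand-in for the parsed contents of config/routes.yaml (structure assumed).
routes_config = {
    "routes": [
        {"route_id": "fast", "target_model": "small-model-v1"},
        {"route_id": "strong", "target_model": "large-model-v1"},
    ]
}

allowed_route_ids = {r["route_id"] for r in routes_config["routes"]}
target_models = {r["target_model"] for r in routes_config["routes"]}


def check_expect(expect: str) -> None:
    """Fail fast when `expect` names a target_model instead of a route_id."""
    if expect in target_models:
        raise ValueError(
            f"{expect!r} is a target_model, not a route_id; "
            f"use one of {sorted(allowed_route_ids)}"
        )
    if expect not in allowed_route_ids:
        raise ValueError(f"unknown route_id {expect!r}")
```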

## 4) Run `review_decisions` against the decision endpoint

Point the review script at the sidecar decision endpoint:

```bash
uv run python scripts/review_decisions.py \
--cases /tmp/redacted_review_cases.yaml \
--endpoint http://127.0.0.1:8080/v1/route/decision \
--routes config/routes.yaml
```

## 5) Interpret PASS/FAIL safely

- `PASS`: returned `route_id` matches expected `route_id`.
- `FAIL`: returned `route_id` differs from expected `route_id`.
- Use route-level aggregates and mismatch counts for audits.
- Do not copy raw prompts into tickets; reference sample IDs/notes instead.
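The route-level aggregation described above can be sketched like this. The input shape (pairs of expected and returned `route_id`) is an assumption; `review_decisions.py` may report results differently. Note that the summary carries counts only, never prompt text.

```python
from collections import Counter


def summarize(results):
    """Aggregate (expected_route_id, returned_route_id) pairs per route.

    Returns pass/fail counts keyed by expected route, suitable for an
    audit ticket; no prompt text is included.
    """
    summary = {}
    for expected, returned in results:
        counts = summary.setdefault(expected, Counter())
        counts["PASS" if returned == expected else "FAIL"] += 1
    return {route: dict(c) for route, c in summary.items()}


report = summarize([
    ("strong", "strong"),
    ("strong", "fast"),
    ("fast", "fast"),
])
# {'strong': {'PASS': 1, 'FAIL': 1}, 'fast': {'PASS': 1}}
```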
1 change: 1 addition & 0 deletions tests/samples/redacted_review_fixture.jsonl
@@ -0,0 +1 @@
{"text":"[REDACTED] checkout incident summary","expect":"strong","redacted":true,"source":"synthetic_fixture","note":"synthetic sample for tooling checks"}
22 changes: 22 additions & 0 deletions tests/test_import_review_samples.py
@@ -1,6 +1,8 @@
from __future__ import annotations

import json
from pathlib import Path

import pytest
import yaml

@@ -197,3 +199,23 @@ def test_main_invalid_unredacted_input_fails_before_writing(tmp_path, monkeypatc
    with pytest.raises(ReviewSampleError, match="redacted=true"):
        import_review_samples.main()
    assert not output_path.exists()


def test_redacted_fixture_is_redacted_and_uses_route_id_expectation():
    fixture_path = Path("tests/samples/redacted_review_fixture.jsonl")
    line = fixture_path.read_text(encoding="utf-8").strip()
    sample = json.loads(line)

    assert sample["redacted"] is True
    assert sample["expect"] == "strong"
    assert sample["expect"] not in {"cheap-router", "pro-router", "free-probe-router"}


def test_redacted_fixture_converts_with_route_validation():
    fixture_path = Path("tests/samples/redacted_review_fixture.jsonl")
    raw_lines = fixture_path.read_text(encoding="utf-8").splitlines()

    result = convert_review_samples(raw_lines, allowed_route_ids={"fast", "strong", "experimental"})

    assert result["cases"][0]["expect"] == "strong"
    assert result["cases"][0]["source"] == "production_review:synthetic_fixture"