kehoej · Apr 13, 2026
@@ -59,7 +59,7 @@ make help       # List all targets
 cmd/contextception/    CLI entrypoint
 cmd/gen-schema/        JSON schema generator
 internal/
-  analyzer/            Core analysis engine (scoring, categorization, cycles)
+  analyzer/            Core analysis engine (scoring, categorization, risk triage, cycles)
   change/              PR/branch diff analysis
   classify/            File role classification
   cli/                 Command handlers (cobra subcommands)

@@ -31,16 +31,30 @@ make lint     # Run golangci-lint
 
 ```
 cmd/contextception/    CLI entrypoint
+cmd/gen-schema/        JSON schema generator
 internal/
-  analyzer/            Core analysis engine
+  analyzer/            Core analysis engine (scoring, categorization, risk triage)
   change/              PR/branch diff analysis
-  cli/                 Command handlers
-  config/              Configuration parsing
-  db/                  SQLite database layer
+  classify/            File role classification
+  cli/                 Command handlers (cobra subcommands)
+  config/              Configuration parsing (per-repo + global)
+  db/                  SQLite database layer (migrations, store, search)
   extractor/           Language-specific extractors (python, typescript, golang, java, rust)
-  git/                 Git history signals
-  indexer/             Incremental indexing
+  git/                 Git history signal extraction
+  grader/              Internal quality evaluation framework
+  history/             Historical analysis, usage tracking, and feedback storage
+  indexer/             Incremental indexing pipeline
+  mcpserver/           MCP server (tools, stdio transport)
+  model/               Shared data types
   resolver/            Module resolution (per-language)
+  session/             Claude Code session parser (discover, adoption)
+  update/              Version check, self-update, install method detection
+  validation/          Fixture-based validation framework
+  version/             Version injection (set via ldflags)
+protocol/              JSON Schema specifications
+schema/                Go types for schema generation
+integrations/          MCP config examples and slash commands
+testdata/              Test fixtures (synthetic repos + expected outputs)
 ```
 
 ## Adding a New Language

@@ -172,13 +172,47 @@ $ contextception analyze-change origin/main
 
 Returns a PR-level impact report with:
 
+- **Per-file risk scoring:** every changed file gets a risk score (0–100) and tier (SAFE/REVIEW/TEST/CRITICAL)
+- **Risk triage:** files grouped by tier with human-readable risk narratives explaining why each file is flagged
 - **Test gaps:** changed files with no test coverage, flagged before merge
+- **Test suggestions:** auto-generated recommendations for high-risk untested files (which test file to create, what to test)
 - **Coupling detection:** pairs of changed files that depend on each other
 - **Hidden coupling:** co-change partners *not in your diff* that may need updating
-- **Per-file blast radius:** which specific changes carry the most risk
+- **Aggregate risk:** overall PR risk score with percentile ranking against historical baselines (after 10+ analyses)
 - **Aggregated must_read:** merged context across all changed files
 
-Use `--ci --fail-on high` to gate PRs automatically. Results are stored in a local history database, enabling trend tracking with `contextception history`:
+### Risk Tiers
+
+| Tier | Score | Meaning |
+|------|-------|---------|
+| **SAFE** | 0–20 | New files, well-tested utilities, low coupling |
+| **REVIEW** | 21–50 | Moderate risk, standard code review sufficient |
+| **TEST** | 51–75 | High risk, targeted testing recommended |
+| **CRITICAL** | 76–100 | Maximum risk, regressions likely without careful review |
+
+Risk scores combine change status, structural factors (importer count, co-change frequency, fragility, mutual dependencies), and test coverage adjustments.
+
+### Token-optimized output
+
+Use `--compact` for a text summary optimized for LLM consumption (~60–75% fewer tokens than JSON):
+
+```bash
+$ contextception analyze-change --compact
+```
+
+### CI integration
+
+Use `--ci --fail-on` to gate PRs automatically. A risk badge is printed to stderr:
+
+```bash
+# Fail on high blast radius
+contextception analyze-change --ci --fail-on high
+
+# Fail only if risk triage has CRITICAL files
+contextception analyze-change --ci --fail-on critical
+```
+
+Results are stored in a local history database, enabling trend tracking with `contextception history`:
 
 ```bash
 $ contextception history hotspots     # Files that repeatedly appear as hotspots
@@ -249,7 +283,20 @@ Contextception averages ~1,000 tokens per analysis vs. Repomix's full-repo outpu
 
 ## MCP Setup (30 seconds)
 
-Make your AI agent smarter. Add to your `~/.claude.json` (Claude Code) or equivalent MCP config:
+Make your AI agent smarter. The `setup` command auto-detects your editor and configures everything:
+
+```bash
+# Claude Code (MCP server + hooks + slash commands)
+contextception setup
+
+# Cursor or Windsurf
+contextception setup --editor cursor
+contextception setup --editor windsurf
+```
+
+Use `--dry-run` to preview changes, or `--uninstall` to reverse.
+
+To configure manually instead, add the following to your `~/.claude.json` (Claude Code) or equivalent MCP config:
 
 ```json
 {
@@ -273,11 +320,22 @@ This exposes nine tools to the AI agent:
 | `get_entrypoints` | Return entrypoint and foundation files for project orientation |
 | `get_structure` | Return directory structure with file counts and language distribution |
 | `get_archetypes` | Detect representative files across architectural layers |
-| `analyze_change` | Analyze the impact of a git diff / PR (blast radius, test gaps, coupling) |
+| `analyze_change` | Analyze the impact of a git diff / PR (risk scoring, triage, test gaps, coupling) |
 | `rate_context` | Rate how useful a previous `get_context` result was (feedback for accuracy tracking) |
 
 Works with **Claude Code**, **Cursor**, **Windsurf**, and any MCP-compatible tool.
 
+### Slash Commands
+
+Two built-in slash commands support AI-assisted PR review. Both are installed automatically by `contextception setup`:
+
+| Command | Description |
+|---------|-------------|
+| `/pr-risk` | Run risk analysis on the current branch and present an actionable, human-friendly review |
+| `/pr-fix` | Analyze risk, then build an ordered plan to fix every issue found (test gaps, coupling, fragility) |
+
+These commands pair contextception's deterministic risk analysis with the LLM's ability to explain results: contextception computes the scores, and the LLM presents them in plain language. See [`integrations/`](integrations/) for setup details.
+
 ---
 
 ## Language Support
@@ -320,7 +378,7 @@ contextception session                  Show contextception adoption across Clau
 | `--mode plan\|implement\|review` | Shape output for AI workflow stage |
 | `--token-budget N` | Cap output to fit token limits |
 | `--compact` | Token-optimized text summary (~60-75% fewer tokens than JSON) |
-| `--ci --fail-on high\|medium` | Exit codes for CI pipelines |
+| `--ci --fail-on high\|medium\|critical` | Exit codes for CI pipelines |
 | `--cap N` | Limit must_read entries (overflow to related) |
 | `--no-external` | Exclude external dependencies |
 | `--no-update-check` | Disable automatic update version check |

@@ -119,11 +119,23 @@ The historical cap prevents co-change signals from overwhelming structural evide
 Analyzes the impact of a git diff (PR or branch):
 
 1. Diffs `base..head` to find changed files
-2. Analyzes each changed file independently
-3. Detects coupling between changed files (structural edges)
-4. Identifies test gaps (changed files with no test coverage)
-5. Surfaces hidden coupling (co-change partners not in the diff)
-6. Aggregates blast radius across all changed files
+2. Analyzes each changed file independently (full per-file AnalysisOutput)
+3. Computes per-file risk scores (0--100) with tier classification (SAFE/REVIEW/TEST/CRITICAL)
+4. Detects coupling between changed files (structural edges)
+5. Identifies test gaps (changed files with no test coverage)
+6. Surfaces hidden coupling (co-change partners not in the diff)
+7. Aggregates blast radius and risk triage across all changed files
+8. Generates test suggestions for high-risk untested files
+
+### Risk Scoring Engine (`internal/analyzer/risk.go`)
+
+Per-file risk scoring for change analysis. Formula: `base_score + structural_risk * coverage_multiplier`, clamped to [0, 100].
+
+- **Base score**: added=10 (20 with exports), modified=30, deleted=5, renamed=5
+- **Structural risk**: normalized importer count, co-change frequency, fragility (Ce/(Ca+Ce)), mutual deps, cycles
+- **Coverage adjustment**: direct tests ×0.7, dependency tests ×0.85, no tests ×1.2
+- **Evidence gating**: same-package siblings filtered unless they have import edges, co-change ≥2, or prefix match
+- **Percentile ranking**: stored in `history.sqlite` `risk_scores` table, computed after 10+ records
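
The formula and multipliers listed above can be illustrated with a minimal Go sketch. It is not the code in `internal/analyzer/risk.go`: the type and function names are hypothetical, the structural risk is passed in as one pre-combined number because the per-factor weights are not given here, and rounding to the nearest integer is an assumption. The formula is read literally, so the coverage multiplier scales only the structural term.

```go
package main

import (
	"fmt"
	"math"
)

// ChangeStatus is a hypothetical name for how a file changed in the diff.
type ChangeStatus string

const (
	Added    ChangeStatus = "added"
	Modified ChangeStatus = "modified"
	Deleted  ChangeStatus = "deleted"
	Renamed  ChangeStatus = "renamed"
)

// Coverage is a hypothetical name for the test-coverage category of a file.
type Coverage int

const (
	NoTests Coverage = iota
	DependencyTests
	DirectTests
)

// baseScore returns the documented base score for a change status.
func baseScore(s ChangeStatus, hasExports bool) float64 {
	switch s {
	case Added:
		if hasExports {
			return 20
		}
		return 10
	case Modified:
		return 30
	case Deleted, Renamed:
		return 5
	default:
		return 0 // unknown status: an assumption, not documented behavior
	}
}

// coverageMultiplier returns the documented coverage adjustment.
func coverageMultiplier(c Coverage) float64 {
	switch c {
	case DirectTests:
		return 0.7
	case DependencyTests:
		return 0.85
	default:
		return 1.2
	}
}

// RiskScore applies base_score + structural_risk * coverage_multiplier
// and clamps the result to [0, 100]. Rounding is an assumption.
func RiskScore(s ChangeStatus, hasExports bool, structural float64, c Coverage) int {
	raw := baseScore(s, hasExports) + structural*coverageMultiplier(c)
	return int(math.Round(math.Min(100, math.Max(0, raw))))
}

func main() {
	fmt.Println(RiskScore(Modified, false, 40, NoTests))     // 30 + 40*1.2
	fmt.Println(RiskScore(Modified, false, 40, DirectTests)) // 30 + 40*0.7
	fmt.Println(RiskScore(Added, true, 0, NoTests))          // base score only
}
```

Under these assumptions, a modified file with a structural risk of 40 scores 78 without tests and 58 with direct tests, which shows how coverage alone can move a file across a tier boundary.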
 
 ### Database Layer (`internal/db/`)
 

@@ -98,6 +98,57 @@ Where Ce = files the subject imports (outdegree), Ca = files that import the sub
 
 ---
 
+## Risk Triage
+
+The `analyze-change` command assigns a per-file **risk score** (0--100) and groups files into tiers:
+
+| Tier | Score Range | Meaning |
+|------|-------------|---------|
+| `SAFE` | 0--20 | New files, well-tested utilities, low coupling |
+| `REVIEW` | 21--50 | Moderate risk, standard code review sufficient |
+| `TEST` | 51--75 | High risk, targeted testing recommended |
+| `CRITICAL` | 76--100 | Maximum risk, regressions likely without careful review |
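
As a rough illustration of the table above, the score-to-tier mapping can be written as a single switch. The function name `TierFor` is hypothetical, and the sketch assumes the score has already been clamped to the 0 to 100 range.

```go
package main

import "fmt"

// TierFor maps a clamped risk score to the tier names from the table.
// The name is illustrative; the real implementation may differ.
func TierFor(score int) string {
	switch {
	case score <= 20:
		return "SAFE"
	case score <= 50:
		return "REVIEW"
	case score <= 75:
		return "TEST"
	default:
		return "CRITICAL"
	}
}

func main() {
	// Print the tier on each side of every boundary.
	for _, s := range []int{0, 20, 21, 50, 51, 75, 76, 100} {
		fmt.Printf("%3d -> %s\n", s, TierFor(s))
	}
}
```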
+
+### Risk Score Formula
+
+The score is computed in four steps:
+
+1. **Base score** by change status: added=10 (20 with exports), modified=30, deleted=5, renamed=5
+2. **Structural risk** (modified files only): normalized importer count, co-change frequency, fragility, mutual dependencies, circular dependencies
+3. **Coverage adjustment**: direct tests ×0.7, dependency tests ×0.85, no tests ×1.2
+4. **Clamp** to [0, 100]
+
+### Per-file fields
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `risk_score` | int | Computed risk score (0--100) |
+| `risk_tier` | string | `SAFE`, `REVIEW`, `TEST`, or `CRITICAL` |
+| `risk_factors` | []string | Factors contributing to the score |
+| `risk_narrative` | string | Human-readable risk explanation |
+
+### Report-level fields
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `risk_triage` | object | Files grouped by tier (critical, test, review, safe) |
+| `aggregate_risk.score` | int | Max per-file score across the PR |
+| `aggregate_risk.percentile` | int | Percentile vs. historical scores (after 10+ analyses) |
+| `aggregate_risk.regression_risk` | string | Summary of regression risk from critical files |
+| `aggregate_risk.test_coverage_ratio` | float | Ratio of changed files with direct tests |
+| `test_suggestions` | []object | Suggested tests for high-risk untested files |
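
The report-level fields can be derived from the per-file results roughly as follows. This is a sketch under assumptions: the struct and function names are hypothetical, the triage map is keyed by the tier string as given, and only the aggregates whose definitions appear in the table (maximum score, direct-test ratio, grouping by tier) are computed.

```go
package main

import "fmt"

// FileRisk is a hypothetical per-file record holding the documented fields.
type FileRisk struct {
	Path           string
	RiskScore      int
	RiskTier       string
	HasDirectTests bool
}

// Aggregate is a hypothetical container for the report-level values.
type Aggregate struct {
	Score             int                 // max per-file score across the PR
	TestCoverageRatio float64             // share of changed files with direct tests
	Triage            map[string][]string // file paths grouped by tier
}

// Summarize folds per-file results into report-level values.
func Summarize(files []FileRisk) Aggregate {
	agg := Aggregate{Triage: map[string][]string{}}
	if len(files) == 0 {
		return agg // avoid dividing by zero for an empty diff
	}
	tested := 0
	for _, f := range files {
		if f.RiskScore > agg.Score {
			agg.Score = f.RiskScore
		}
		if f.HasDirectTests {
			tested++
		}
		agg.Triage[f.RiskTier] = append(agg.Triage[f.RiskTier], f.Path)
	}
	agg.TestCoverageRatio = float64(tested) / float64(len(files))
	return agg
}

func main() {
	files := []FileRisk{
		{Path: "a.go", RiskScore: 80, RiskTier: "CRITICAL"},
		{Path: "b.go", RiskScore: 35, RiskTier: "REVIEW", HasDirectTests: true},
	}
	agg := Summarize(files)
	fmt.Printf("score=%d ratio=%.2f critical=%v\n",
		agg.Score, agg.TestCoverageRatio, agg.Triage["CRITICAL"])
}
```

Taking the maximum rather than the mean means a single high-risk file sets the score for the whole PR, regardless of how many low-risk files accompany it.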
+
+### Evidence-Gated Same-Package Filtering
+
+Same-package siblings (Go, Java, Rust) are only included in `must_read` if they have structural evidence:
+- Direct import/call edge
+- Co-change frequency >= 2
+- Filename prefix match
+
+This reduces noise in large packages where most siblings are irrelevant.
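
The three evidence rules can be expressed as one predicate. The sketch below is illustrative only: the `Sibling` type, the function names, and the example file names are hypothetical, and "filename prefix match" is interpreted as one filename stem being a prefix of the other, which is an assumption about the actual rule.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Sibling is a hypothetical record for a file in the subject's package.
type Sibling struct {
	Path          string
	HasImportEdge bool // direct import/call edge with the subject
	CoChangeCount int  // number of commits shared with the subject
}

// stem returns the base filename without its extension.
func stem(path string) string {
	base := filepath.Base(path)
	return strings.TrimSuffix(base, filepath.Ext(base))
}

// sharesPrefix is one assumed reading of "filename prefix match":
// either stem is a prefix of the other.
func sharesPrefix(a, b string) bool {
	sa, sb := stem(a), stem(b)
	return strings.HasPrefix(sa, sb) || strings.HasPrefix(sb, sa)
}

// KeepSibling reports whether a same-package sibling has any of the
// three kinds of structural evidence.
func KeepSibling(subject string, s Sibling) bool {
	return s.HasImportEdge || s.CoChangeCount >= 2 || sharesPrefix(subject, s.Path)
}

func main() {
	subject := "internal/analyzer/risk.go"
	siblings := []Sibling{
		{Path: "internal/analyzer/risk_narrative.go"},           // prefix match
		{Path: "internal/analyzer/cycles.go", CoChangeCount: 1}, // below threshold
		{Path: "internal/analyzer/scoring.go", CoChangeCount: 3},
	}
	for _, s := range siblings {
		fmt.Printf("%-40s keep=%v\n", s.Path, KeepSibling(subject, s))
	}
}
```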
+
+---
+
 ## Hotspot Detection
 
 Identifies files that are both high-churn AND structural bottlenecks:
@@ -370,19 +421,24 @@ contextception session                  Show adoption across Claude Code session
 | `--signatures` | false | Include code signatures for must_read symbols |
 | `--stable-threshold` | adaptive | Indegree threshold for the stable flag |
 | `--ci` | false | CI mode: suppress output, exit code reflects blast radius |
-| `--fail-on` | high | Blast radius level that triggers non-zero exit (`high` or `medium`) |
+| `--fail-on` | high | Trigger non-zero exit: `high`, `medium`, or `critical` (risk triage) |
 | `--mode` | (none) | Workflow mode: `plan`, `implement`, or `review` |
 | `--token-budget` | 0 | Target token budget (auto-adjusts caps) |
 | `--compact` | false | Token-optimized text summary (~60-75% fewer tokens than JSON) |
 
 ### CI mode
 
-When `--ci` is set, output is suppressed and the exit code reflects blast radius:
+When `--ci` is set, normal output is suppressed and the exit code reflects blast radius, or risk triage when `--fail-on critical` is used. A risk badge is also printed to stderr:
+
+```
+contextception: main..HEAD blast_radius=high files=27
+RISK: 72/100 | 1 CRITICAL | 2 TEST | 5 REVIEW | 19 SAFE
+```
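
A CI wrapper that wants the numbers from the badge could parse the `RISK:` line along these lines. This is a sketch that assumes the badge always follows the layout in the sample above. Whether tiers with a zero count are printed is not stated here, so the parser accepts any subset of tiers. The `Badge` type and `ParseBadge` function are hypothetical helpers, not part of contextception.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Badge holds the values from one "RISK:" line (hypothetical type).
type Badge struct {
	Score  int
	Counts map[string]int // tier name -> number of files
}

// ParseBadge parses a line shaped like
// "RISK: 72/100 | 1 CRITICAL | 2 TEST | 5 REVIEW | 19 SAFE".
func ParseBadge(line string) (Badge, error) {
	if !strings.HasPrefix(line, "RISK:") {
		return Badge{}, fmt.Errorf("not a risk badge: %q", line)
	}
	parts := strings.Split(strings.TrimPrefix(line, "RISK:"), "|")

	// The first segment is "<score>/100"; keep the part before the slash.
	scoreText := strings.SplitN(strings.TrimSpace(parts[0]), "/", 2)[0]
	score, err := strconv.Atoi(scoreText)
	if err != nil {
		return Badge{}, fmt.Errorf("bad score in %q: %w", line, err)
	}

	b := Badge{Score: score, Counts: map[string]int{}}
	for _, p := range parts[1:] {
		fields := strings.Fields(p) // e.g. ["19", "SAFE"]
		if len(fields) != 2 {
			return Badge{}, fmt.Errorf("bad tier segment %q", p)
		}
		n, err := strconv.Atoi(fields[0])
		if err != nil {
			return Badge{}, fmt.Errorf("bad count in %q: %w", p, err)
		}
		b.Counts[fields[1]] = n
	}
	return b, nil
}

func main() {
	b, err := ParseBadge("RISK: 72/100 | 1 CRITICAL | 2 TEST | 5 REVIEW | 19 SAFE")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(b.Score, b.Counts["CRITICAL"], b.Counts["SAFE"])
}
```

In the sample badge the tier counts (1 + 2 + 5 + 19) add up to the `files=27` value on the preceding line, which a wrapper could use as a sanity check.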
 
 | Exit code | Meaning |
 |-----------|---------|
 | 0 | Blast radius below threshold |
-| 1 | Medium blast radius (with `--fail-on medium`) |
+| 1 | Medium blast radius (with `--fail-on medium`) or CRITICAL files (with `--fail-on critical`) |
 | 2 | High blast radius |
 
 ```bash
@@ -391,6 +447,9 @@ contextception analyze-change --ci --fail-on high
 
 # Fail on medium or high
 contextception analyze-change --ci --fail-on medium
+
+# Fail only if risk triage has CRITICAL files
+contextception analyze-change --ci --fail-on critical
 ```
 
 ### Workflow modes

@@ -38,7 +38,10 @@ contextception setup --editor cursor
 contextception setup --editor windsurf
 ```
 
-Use `--dry-run` to preview changes, or `--uninstall` to reverse. For Claude Code, this also installs hooks that remind the AI to call `get_context` before editing files.
+Use `--dry-run` to preview changes, or `--uninstall` to reverse. For Claude Code, this installs:
+- MCP server configuration
+- PreToolUse hooks that remind the AI to call `get_context` before editing files
+- `/pr-risk` and `/pr-fix` slash commands for AI-assisted PR review
 
 ## Manual Configuration
 
@@ -151,6 +154,17 @@ All integrations expose the same nine tools:
 
 Contextception supports repositories using: Python, TypeScript/JavaScript, Go, Java, Rust.
 
+## Slash Commands
+
+Two slash commands are included for AI-assisted PR review. These are installed automatically by `contextception setup` for Claude Code.
+
+| Command | File | Description |
+|---------|------|-------------|
+| `/pr-risk` | [`claude-code/pr-risk.md`](claude-code/pr-risk.md) | Run risk analysis and present a human-friendly review with verdicts, test coverage, and next steps |
+| `/pr-fix` | [`claude-code/pr-fix.md`](claude-code/pr-fix.md) | Analyze risk, then build an ordered fix plan for every issue (test gaps, coupling, fragility) |
+
+For Cursor/Windsurf, place the command files in `.cursor/rules/` or `.windsurf/rules/` respectively. For other agents, see [`pr-risk-review.md`](pr-risk-review.md) for the full prompt template.
+
 ## Further Reading
 
 - [MCP Tutorial](../docs/mcp-tutorial.md) — step-by-step guide to adding context intelligence to any AI agent