From 87e4dac41289ae1d41534e8ca40117cd40bbd563 Mon Sep 17 00:00:00 2001 From: csoceanu Date: Thu, 7 May 2026 13:47:10 +0300 Subject: [PATCH 1/4] feat: add docs-review skill for documentation staleness detection Add a new docs-review skill to the review agent that detects stale documentation caused by code changes. The skill builds an identifier checklist from every changed file, greps documentation for matches, and evaluates candidates in two passes (quick scan then deep read). Changes: - Create skills/docs-review/SKILL.md with 6-step process - Add docs-review to review harness and agent definition - Add documentation currency as 7th review dimension - Update pr-review to delegate to docs-review after code-review Co-Authored-By: Claude Opus 4.6 (1M context) --- .../scaffold/fullsend-repo/agents/review.md | 23 ++- .../fullsend-repo/harness/review.yaml | 1 + .../fullsend-repo/skills/docs-review/SKILL.md | 172 ++++++++++++++++++ .../fullsend-repo/skills/pr-review/SKILL.md | 44 +++-- 4 files changed, 222 insertions(+), 18 deletions(-) create mode 100644 internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md diff --git a/internal/scaffold/fullsend-repo/agents/review.md b/internal/scaffold/fullsend-repo/agents/review.md index 7a736e7b3..14ba915a9 100644 --- a/internal/scaffold/fullsend-repo/agents/review.md +++ b/internal/scaffold/fullsend-repo/agents/review.md @@ -11,6 +11,7 @@ model: opus skills: - code-review - pr-review + - docs-review --- # Review Agent @@ -33,7 +34,7 @@ push commits, or merge PRs — you evaluate and report. ## Identity -You evaluate code changes across six review dimensions: +You evaluate code changes across seven review dimensions: 1. **Correctness** — logic errors, edge cases, test adequacy, test integrity @@ -47,21 +48,31 @@ You evaluate code changes across six review dimensions: non-rendering Unicode, bidirectional overrides 6. **Style/conventions** — naming, patterns, documentation beyond what linters catch +7. **Documentation currency** — whether the PR's code changes have + made in-repo documentation stale, incomplete, or misleading -The `code-review` skill defines the full evaluation procedure for each -dimension. +The `code-review` skill defines the evaluation procedure for dimensions +1–6. The `docs-review` skill handles dimension 7 (documentation +currency). ## Skill routing -This agent has two skills. Select based on invocation context: +This agent has three skills. Select based on invocation context: - **`pr-review`** — the prompt references a PR number, PR URL, or GitHub PR context. This skill gathers PR metadata, delegates code - evaluation to `code-review`, adds PR-specific checks, and posts a - review via the GitHub API. + evaluation to `code-review` and documentation staleness checks to + `docs-review`, adds PR-specific checks, and posts a review via + the GitHub API. - **`code-review`** — the prompt is about a local branch diff with no PR, or another skill is delegating code evaluation. This skill evaluates the diff and source files directly. +- **`docs-review`** — delegated by `pr-review` after code evaluation + completes. Evaluates whether in-repo documentation has been made + stale by the code changes. Follow the skill's checklist and + two-pass evaluation process completely — do not skip entries or + shortcut the evaluation. Read-only — produces findings but does + not update docs. When invoked via `--print` for pre-push review, use `code-review`. When invoked for a GitHub PR, use `pr-review`. diff --git a/internal/scaffold/fullsend-repo/harness/review.yaml b/internal/scaffold/fullsend-repo/harness/review.yaml index 71ddc4df2..d3c958adf 100644 --- a/internal/scaffold/fullsend-repo/harness/review.yaml +++ b/internal/scaffold/fullsend-repo/harness/review.yaml @@ -6,6 +6,7 @@ policy: policies/review.yaml skills: - skills/pr-review - skills/code-review + - skills/docs-review pre_script: scripts/pre-review.sh diff --git a/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md new file mode 100644 index 000000000..ef642ed29 --- /dev/null +++ b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md @@ -0,0 +1,172 @@ +--- +name: docs-review +description: >- + Evaluates whether a PR's code changes have made in-repo documentation + stale. Discovers documentation files, matches changed identifiers + against doc content, and produces findings for stale or missing + documentation. Read-only — flags staleness but does not update docs. + Can be delegated to by the pr-review skill. +--- + +# Docs Review + +Code changes can silently invalidate documentation. A renamed function, +a changed API signature, a removed configuration option — each can leave +docs describing behavior that no longer exists. This skill detects that +drift by matching the PR's code changes against in-repo documentation +and flagging docs whose descriptions contradict the new code. + +## Process + +Follow these steps in order. Do not skip steps. + +### 1. Receive the change context + +Use the diff and changed file list provided by the invoking skill +(typically `pr-review`). If invoked standalone, obtain the diff: + +```bash +DEFAULT_BRANCH=$(git rev-parse --abbrev-ref origin/HEAD | cut -d/ -f2) +git diff $(git merge-base HEAD "$DEFAULT_BRANCH")..HEAD +``` + +Record the list of files changed in the PR: + +```bash +git diff --name-only $(git merge-base HEAD "$DEFAULT_BRANCH")..HEAD +``` + +If no change can be identified, stop and report — do not guess. + +### 2. Build the identifier checklist + +Go through **every** changed file in the PR. For each file, extract +identifiers from the modified lines (lines starting with `+` or `-`) +and from diff hunk headers (`@@` lines). Write them down as a numbered +checklist — one entry per changed file, with all identifiers from that +file. + +Use the most specific form of each identifier. CLI flag names, +configuration keys, full function names, and type names are good — +they match only relevant docs. Avoid generic short words that would +match hundreds of unrelated files. If a generic term is the only +identifier available for a change, include it, but prefer specific +forms when they exist. + +Do not skip files. Do not prioritize some files over others. Every +changed file gets an entry in the checklist. This checklist is your +contract — you will search docs for every identifier on it. + +### 3. Discover documentation files + +Find all documentation files in the repository. This includes files in +dedicated documentation directories and standalone documentation files +like README.md at any level. Explore the repository structure to +understand where documentation lives — different repos organize docs +differently. + +Search for documentation files (`.md`, `.rst`, `.adoc`) across the +repository. Exclude generated or dependency directories (`.git`, +`vendor`, `node_modules`) but do not exclude documentation files based +on their name or location. + +If the repository has no documentation files, produce zero findings +and exit — there is nothing to check. + +### 4. Search docs for every identifier + +Write a shell script that takes the identifiers from step 2 and +greps for each one across the documentation files from step 3. +Run the script in a single Bash call: + +```bash +for id in "identifier1" "identifier2" "identifier3"; do + matches=$(grep -rl "$id" --include="*.md" --include="*.rst" --include="*.adoc" . 2>/dev/null) + if [ -n "$matches" ]; then + echo "MATCH: $id -> $matches" + else + echo "NO_MATCH: $id" + fi +done +``` + +Include every identifier from every checklist entry in the `for` +loop. The script handles the searching mechanically — no identifiers +are skipped. + +From the script output, collect all matched doc files into a +candidate list. Exclude documentation files that are already modified +in the same PR — those are being actively updated. + +### 5. Evaluate every candidate (two passes) + +**Pass 1 — Quick scan.** For each candidate doc file from step 4, +view only the lines that matched the grep (use `grep -n` to see them +in context). Based on the matching lines alone, decide whether the +doc might be stale. Record a verdict for every candidate: + +``` +- path/to/doc.md → possibly stale (describes behavior that changed) +- path/to/other.md → not stale (mentions identifier in passing) +- path/to/another.md → not stale (changelog entry) +... +``` + +Every candidate must have a verdict. Do not skip candidates. + +**Pass 2 — Deep read.** For each candidate marked "possibly stale" +in pass 1, read the full file alongside the relevant section of the +diff. Confirm whether the doc is actually stale and write a specific +remediation describing what should be updated. + +When evaluating: + +- **Only flag docs whose content is now incorrect.** A doc that + mentions an identifier is not stale if the described behavior is + unchanged. It is stale only if the behavior, signature, or semantics + changed in a way that makes the doc misleading. +- **Do not flag changelog entries or release notes that describe past + releases.** Historical entries are not stale because the code evolved. + +### 6. Compile findings + +For each confirmed stale doc, record a separate finding. Do not +consolidate multiple stale files into one finding — each stale file +gets its own finding with its own file path: + +- **Severity:** `high` if the doc describes behavior that is now + *incorrect* and would mislead a reader (e.g., wrong API signature, + removed config key still documented). `medium` if the doc is + *incomplete* rather than incorrect (e.g., new parameter not + documented, but existing content is still accurate). `low` if the + doc is slightly outdated but not misleading. Never use `critical` — + stale docs do not constitute a security or correctness risk in the + code itself. Do not produce `info`-level findings — if the staleness + is not worth flagging at `low`, do not flag it at all. +- **Category:** a descriptive label for the type of staleness, e.g., + `stale-doc`, `missing-doc`. Use whatever category best describes the + finding. +- **File:** path to the stale doc file. +- **Line:** line number of the stale reference, if identifiable. +- **Description:** what is stale and why — reference the specific code + change that caused the staleness. +- **Remediation:** what should be updated in the doc. + +If no stale docs are found, produce zero findings. Do not generate +findings to justify having run the skill. + +## Constraints + +The agent definition (`agents/review.md`) is the authoritative list of +prohibitions. This skill does not restate them. If a step in this skill +appears to conflict with the agent definition, the agent definition +wins. + +- **Do not flag docs modified in the same PR.** If the PR already + updates a doc file, that file is being actively maintained and should + not appear in findings. +- **Severity is capped at high.** Stale documentation does not + constitute a critical security or correctness risk in the code. + A `high` finding means the doc would actively mislead readers. +- **This skill is read-only.** It produces findings but does not + modify documentation files. diff --git a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md index d2a8a5942..a88a6117e 100644 --- a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md +++ b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md @@ -9,8 +9,9 @@ description: >- # PR Review This skill orchestrates a pull request review by gathering GitHub -context, delegating code evaluation to the `code-review` skill, adding -PR-specific checks, and producing a structured result. In pipeline mode +context, delegating code evaluation to the `code-review` skill, +delegating documentation staleness checks to the `docs-review` skill, +adding PR-specific checks, and producing a structured result. In pipeline mode (`$FULLSEND_OUTPUT_DIR` set), it writes JSON for the post-script to post. In interactive mode, it posts directly via `gh pr review`. It does not evaluate code directly — that is the `code-review` skill's @@ -84,10 +85,28 @@ Pass the diff obtained in step 2 and use the PR metadata and linked issues as additional context for the intent-alignment dimension. The `code-review` skill produces findings and an outcome. Carry those -forward to steps 4 and 5. Proceed to step 4 regardless of outcome — -it covers PR-specific inputs not examined by code-review. +forward to steps 4, 5, and 6. Proceed to step 4 regardless of outcome. -### 4. PR-specific checks +### 4. Check documentation currency + +Invoke the `docs-review` skill to evaluate whether the code changes +in this PR have made any in-repo documentation stale. The docs-review +skill has its own multi-step process (build identifier checklist, +grep for every identifier, two-pass evaluation). Follow that process +completely — do not substitute ad-hoc grep searches. + +Merge the docs-review findings into the findings list from step 3. +Documentation staleness findings are capped at `high` severity (never +`critical`), so they contribute to the outcome but do not dominate it. + +Skip this step if: +- The PR contains no code changes (docs-only or config-only PRs) +- The PR is a revert +- FILE_COUNT exceeds 200 (the diff was already too large for review) + +Proceed to step 5 regardless of outcome. + +### 5. PR-specific checks These checks apply only in the PR context and augment the findings from step 3. @@ -114,10 +133,10 @@ Verify the change scope matches the linked issue's authorization. A PR labeled "bug fix" that adds new capability is a feature, regardless of the label. Add a finding if the scope exceeds authorization. -Merge any new findings into the findings list from step 3 and -re-evaluate the overall outcome. +Merge any new findings into the findings list from steps 3 and 4, +and re-evaluate the overall outcome. -### 5. Produce the review result +### 6. Produce the review result Compose the review comment using this structure: @@ -254,13 +273,14 @@ wins. - **Never approve with unresolved critical or high findings.** If any critical or high finding exists, the outcome must be `request-changes`. -- **Never post without completing the `code-review` skill first.** - Partial reviews miss context and produce unreliable verdicts. +- **Never post without completing the `code-review` and `docs-review` + skills first.** Partial reviews miss context and produce unreliable + verdicts. - **Always include the PR head SHA in the review comment.** The review is only valid for the SHA evaluated; new pushes require a new review. - **Report failure rather than posting a partial review.** If you cannot - complete all six dimensions (tool failure, missing context, ambiguous - findings), produce a failure result (see step 5) rather than posting + complete all seven dimensions (tool failure, missing context, ambiguous + findings), produce a failure result (see step 6) rather than posting an incomplete result. - **In pipeline mode, `gh pr review` is reserved for the post-script.** The sandbox token is read-only. Write JSON to From 20831f571ec5bdf41056ef5355930fcdd3b0a412 Mon Sep 17 00:00:00 2001 From: csoceanu Date: Thu, 7 May 2026 14:21:41 +0300 Subject: [PATCH 2/4] fix: address review feedback - remove dir examples, update description - Remove specific directory names from step 3 exclusion instruction - Update agent description to include documentation currency Addresses review feedback from fullsend-ai-review. Co-Authored-By: Claude Opus 4.6 (1M context) --- internal/scaffold/fullsend-repo/agents/review.md | 2 +- internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md | 4 +--- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/internal/scaffold/fullsend-repo/agents/review.md b/internal/scaffold/fullsend-repo/agents/review.md index 14ba915a9..eed98a711 100644 --- a/internal/scaffold/fullsend-repo/agents/review.md +++ b/internal/scaffold/fullsend-repo/agents/review.md @@ -2,7 +2,7 @@ name: review description: >- Code review specialist. Reviews for correctness, security, intent - alignment, and style. + alignment, style, and documentation currency. tools: >- Read, Grep, Glob, Bash disallowedTools: >- diff --git a/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md index ef642ed29..79de6dab4 100644 --- a/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md +++ b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md @@ -66,9 +66,7 @@ understand where documentation lives — different repos organize docs differently. Search for documentation files (`.md`, `.rst`, `.adoc`) across the -repository. Exclude generated or dependency directories (`.git`, -`vendor`, `node_modules`) but do not exclude documentation files based -on their name or location. +repository. If the repository has no documentation files, produce zero findings and exit — there is nothing to check. From b27626b084e449fbe234408d1aadd24f19a432b0 Mon Sep 17 00:00:00 2001 From: csoceanu Date: Sun, 10 May 2026 11:14:12 +0300 Subject: [PATCH 3/4] fix: add subagent guidance to docs-review skill Encourage the agent to dispatch a subagent for the docs-review process to avoid consuming the main review context window. Follows the pattern established by the retro-analysis skill. Addresses review feedback from ralphbean. Co-Authored-By: Claude Opus 4.6 (1M context) --- .../scaffold/fullsend-repo/skills/docs-review/SKILL.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md index 79de6dab4..196a1afe7 100644 --- a/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md +++ b/internal/scaffold/fullsend-repo/skills/docs-review/SKILL.md @@ -16,6 +16,14 @@ docs describing behavior that no longer exists. This skill detects that drift by matching the PR's code changes against in-repo documentation and flagging docs whose descriptions contradict the new code. +## Context management + +This skill involves scanning documentation files across the repository, +which can consume significant context. Dispatch a subagent to carry out +the process below, keeping the main review context free for code-review +and PR-specific checks. The subagent should return a list of findings +(or an empty list if no stale docs were found). + ## Process Follow these steps in order. Do not skip steps. From ce183f558b367a06aa416c756145dd9cc700084a Mon Sep 17 00:00:00 2001 From: csoceanu Date: Sun, 10 May 2026 11:24:30 +0300 Subject: [PATCH 4/4] fix: update pr-review frontmatter to mention docs-review Addresses review feedback from fullsend-ai-review. Co-Authored-By: Claude Opus 4.6 (1M context) --- internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md index 9bbbaabec..9920a1b1a 100644 --- a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md +++ b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md @@ -2,8 +2,9 @@ name: pr-review description: >- PR-specific review procedure. Gathers GitHub context, delegates code - evaluation to the code-review skill, adds PR-specific checks, and - writes a structured review result. + evaluation to the code-review skill, delegates documentation + staleness checks to the docs-review skill, adds PR-specific checks, + and writes a structured review result. --- # PR Review