feat(review agent): anchor severity across re-reviews#729
Conversation
Site previewPreview: https://d7e81e3f-site.fullsend-ai.workers.dev Commit: |
Review: #729Head SHA: 17c58ca SummaryThis PR implements severity anchoring for re-reviews via a well-structured three-layer approach: workflow plumbing (pre-fetch script + env propagation), skill instructions (anchoring rules + finding matching), and agent identity (zero-trust exception documentation). The design is sound and the implementation is clean, with one medium finding: the skill instructions reference an env var ( FindingsMedium
Low
FooterOutcome: comment-only Previous runReview: #729Head SHA: b0ba168 SummaryThe design intent is sound — anchoring severity across re-reviews addresses a real pain point documented in #681. However, FindingsCritical
High
Medium
Low
Info
FooterOutcome: request-changes Previous run (2)Review: #729Head SHA: b68e77d SummaryThis PR implements severity anchoring for re-reviews, directly addressing issue #681 where findings escalated across re-reviews without code changes. The implementation is a well-structured three-layer fix: workflow pre-fetch with provenance validation, harness file plumbing, and skill-level instructions for severity anchoring. All edge cases are handled gracefully — first reviews, provenance failures, oversized bodies, compare API failures, and force-pushes all degrade safely. No critical, high, or medium findings. FindingsCriticalNone. HighNone. MediumNone. LowNone. Info
FooterOutcome: approve Previous run (3)Review: #729Head SHA: eee3134 SummaryThis PR correctly implements severity anchoring for re-reviews by following the established pre-fetch pattern from the fix agent workflow. The three-layer approach — workflow plumbing, skill instructions, and agent documentation — is well-structured and handles the first-review case gracefully via optional file handling. One low-severity robustness improvement is noted below but does not block approval. FindingsCriticalNone. HighNone. MediumNone. Low
InfoNone. FooterOutcome: approve Previous run (4)Review: #729Head SHA: f247ed0 SummaryThis PR implements severity anchoring for the review agent across re-reviews, addressing the observed severity escalation on unchanged code (issue #681). The three-layer approach — workflow plumbing to pre-fetch prior review context, skill instructions for anchoring rules, and agent identity documentation — is well-structured and follows existing patterns throughout. First reviews are unaffected, and safe defaults apply when the compare API fails or the prior review is absent. No security concerns: the prior review is fetched with the existing read-only token, filtered by bot identity, and capped at 1 MB. FindingsLow
FooterOutcome: approve |
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring across re-reviews directly addresses the trust problem documented in #681, and the three-layer approach (workflow plumbing → skill instructions → agent docs) is clean and follows the fix agent's established pattern.
One change requested, plus a few notes for future consideration.
Strategic assessment: This is well-scoped, directly closes a filed issue with concrete evidence, and the implementation is architecturally sound. The safe defaults (first reviews unaffected, compare API failures degrade to no anchoring) are exactly right.
|
fullsend review is working on this — view logs |
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring directly addresses the trust problem in #681, and the three-layer approach is clean. The follow-up commit addresses all four prior inline comments (thumbs-upped each). One change requested around provenance validation, plus two notes for future work.
All four prior review findings from the first pass have been addressed by the follow-up commit:
prior_review_filenow set in both branches--paginatedropped to avoid per-page jq issue- Bidirectional anchoring now explicitly documented
total_commitsguard added for compare API truncation
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring directly addresses the trust problem in #681 and the three-layer approach is clean. The follow-up commit addressed all four round-1 findings. Two new issues surfaced plus an additional angle on the existing provenance discussion.
Strategic assessment: Well-scoped, closes a filed issue with concrete evidence, and the implementation follows established patterns. The safe defaults (first reviews unaffected, compare failures degrade to no anchoring) are exactly right. The outstanding items are about robustness, not direction.
eee3134 to
b68e77d
Compare
|
fullsend review is working on this — view logs |
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring directly addresses the trust problem in #681. The three-layer approach is clean and follows established patterns. One change requested (jq argument injection hardening), plus three notes for future consideration.
Three existing findings from prior reviews were thumbs-upped (+1'd) as they remain relevant and unaddressed:
- Pagination gap after
--paginateremoval (comment by ralphbean) - Compare API inner subcommand failure handling (comment by ralphbean)
- Multi-file findings anchoring gap (comment by ralphbean, deferred)
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring directly addresses the trust problem in #681 (documented severity flip-flopping on unchanged code). The three-layer approach (workflow plumbing → skill instructions → agent docs) follows the fix agent's established pattern and is architecturally sound.
Strategic assessment: Well-scoped, closes a filed issue with concrete evidence, and the implementation direction is right. The GitHub App provenance validation added in this commit is exactly the right move.
However, this is the third review round, and while each iteration has addressed prior findings, new issues continue to surface:
- 6 new issues requiring changes (2 important workflow bugs, 4 high-severity instruction gaps)
- 8 new issues flagged for future work (portability, conventions, ambiguities)
- 6 existing issues remain unresolved from prior review rounds (per the thumbs-upped comments)
The core idea is solid, but the implementation needs hardening — particularly around shell script safety (set -euo pipefail, API error detection), instruction correctness (sentinel parsing, undefined variables), and resolving the conflict between severity anchoring and the zero-trust principle.
Breakdown by category:
Workflow shell script (review.yml):
- Missing
set -euo pipefail— errors silently swallowed (IMPORTANT, request changes) - API error masking via
2>/dev/null || echo ""— auth failures indistinguishable from "no prior review" (IMPORTANT, request changes) grep -oPnon-portable to macOS (MODERATE, note for future)- File written to
$GITHUB_WORKSPACEnot$RUNNER_TEMP(MODERATE, note for future) wc -cwhitespace handling deviates from repo convention (MODERATE, note for future)prior_shawritten toGITHUB_OUTPUTwithout heredoc delimiter (MODERATE, note for future)- Sentinel-on-first-line edge case (MINOR, note for future)
- Hardcoded dest path collision potential (MINOR, note for future)
Skills and agent docs:
- Sentinel parsing instruction wrong relative to
sticky.gobehavior (HIGH, request changes) REPO_FULL_NAMEvariable used but never defined (HIGH, request changes)PRIOR_REVIEW_PROVENANCE=unverifiable-*glob pattern never enumerated or referenced in skills (HIGH, request changes)- Severity anchoring "MUST match" conflicts with
agents/review.mdzero-trust principle (MEDIUM, request changes) - "Substantially same code area/function" has no operational definition (MEDIUM, note for future)
- Provenance failure output location unspecified (MEDIUM, note for future)
Summary: Six changes requested (shell safety + instruction correctness), eight notes for future consideration (conventions + ambiguities). The direction is excellent — once these gaps are addressed, this will be a solid addition to the review agent's capabilities.
b68e77d to
b0ba168
Compare
|
fullsend review is working on this — view logs |
b0ba168 to
1d8e55b
Compare
|
fullsend review is working on this — view logs |
1d8e55b to
17c58ca
Compare
|
fullsend review is working on this — view logs |
ralphbean
left a comment
There was a problem hiding this comment.
Good change — severity anchoring directly addresses the trust problem in #681, and the three-layer approach is clean and well-executed. The squashed commit addresses the majority of prior review findings (jq injection, set -euo pipefail, standalone script extraction, pagination, provenance validation, multi-file findings).
One change requested — and I owe an apology on this one.
Strategic assessment: Well-scoped, closes a filed issue with concrete evidence, follows established patterns. The provenance validation via GitHub App client_id is a strong addition. The safe defaults (first reviews unaffected, compare failures degrade to no anchoring) are exactly right.
|
Also, there's a merge conflict now. needs rebase and resolution. |
17c58ca to
55e32e2
Compare
|
fullsend review is working on this — view logs |
The review agent was stateless across runs — on re-review it could escalate or de-escalate severity for findings in unchanged code. Inject prior review context using the same pattern the fix agent uses for its pre-fetched review body. Closes: fullsend-ai#681 Assisted-by: Claude Code (Opus 4.6) Signed-off-by: Ben Alkov <ben.alkov@redhat.com>
55e32e2 to
7850742
Compare
|
fullsend review is working on this — view logs |
The Review Agent was stateless across runs — on re-review it could escalate or de-escalate severity for findings in unchanged code. Inject prior review context using the same pattern the Fix Agent uses for its pre-fetched review body.
Three-layer fix:
First reviews are unaffected — prior-review.txt is absent and PRIOR_REVIEW_SHA is empty, so the agent assesses everything fresh.
Closes: #681
Assisted-by: Claude Code (Opus 4.6)