Skip to content

Review agent misses security-critical findings on large architectural PRs #898

@fullsend-ai-retro

Description

@fullsend-ai-retro

What happened

PR #792 implemented 3 ADRs across 52 files (~4K lines). The review bot ran 9 formal review rounds over 4 days (May 9-13), producing a 60K+ character sticky comment, and submitted CHANGES_REQUESTED each time. Despite this, when the human reviewer (ralphbean) reviewed on May 13, they found 4 important issues the bot had never flagged:

  1. Fail-open security bug in internal/mint/main.go: ALLOWED_WORKFLOW_FILES failed open when unset -- any workflow in fullsend-ai/fullsend could mint agent tokens for any enrolled org. This was a trust surface expansion introduced by the PR's OIDC changes.
  2. Role escalation in reusable-retro.yml: retro agent role changed from retro (read-only) to fullsend (write access to contents, PRs, actions, workflows) with no justification.
  3. ADR inconsistency: env/ listed as "Org-only" in ADR 0035 but treated as a layered directory everywhere in code.
  4. Missing test assertion: scaffold test verified 6 of 7 .gitkeep files, missing customized/env/.gitkeep.

The bot's review focused on build verification, routing completeness, dependency pinning, and documentation -- valid but lower-severity concerns. The human found the security and correctness issues.

What could go better

The review agent appears to lack specific guidance for evaluating security-sensitive changes like OIDC token validation, role/permission assignments, and trust boundary modifications. On a PR this large, the bot may be spreading attention too thin across all 52 files rather than prioritizing security-critical paths.

I'm fairly confident this is a systematic gap rather than a one-off, because: (a) the bot had 9 chances to catch these issues and didn't, (b) issues #870 and #580 document similar patterns where the bot misses important findings or doesn't calibrate severity well, and (c) the fail-open bug is exactly the kind of issue that requires understanding the security implications of a code change, not just its syntactic correctness.

I'm less certain about the fix -- it could be a prompt/skill issue (the review agent lacks security-focused review dimensions) or a context window issue (52 files may exceed the agent's effective attention span).

Proposed change

Add a security-focused review dimension to the review agent's skill or prompt. Specifically:

  1. In the code-review skill (or the review agent definition in agents/review.md), add an explicit review dimension for permission and trust boundary changes: flag any modifications to role assignments, OIDC validation logic, token scoping, secrets: blocks, or permissions: blocks in workflow files.

  2. Add a heuristic: when a PR modifies files matching patterns like **/mint/**, **/auth/**, **/oidc/**, or workflow files with permissions: blocks, the review agent should prioritize those files and explicitly check for fail-open vs fail-closed behavior.

  3. Consider a two-pass strategy for large PRs (>30 files): first pass identifies security-critical files, second pass deep-reviews those files specifically.

Validation criteria

On the next 5 PRs that modify mint, auth, or workflow permission logic, the review agent should flag at least one security-relevant observation per PR. Specifically, if a similar fail-open pattern is introduced, the bot should catch it before human review. Measure over 3 months.


Generated by retro agent from #792

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions