Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 55 additions & 3 deletions .github/workflows/review-pr.yml
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,20 @@ permissions:
id-token: write
actions: read # download-artifact across workflow_run boundary

# Rate-anomaly safeguard: serialize same-trigger bursts on a PR (e.g. rapid
# force-pushes firing repeated auto-reviews) instead of running N in parallel.
# The group is keyed per PR AND per trigger intent: the comment id (or event
# name) suffix keeps distinct comments/replies in distinct groups, so a quick
# conversational reply is never queued behind a 45-minute review. cancel-in-
# progress is false so an in-flight review/reply is never killed mid-post.
# Per-PR request *frequency* is enforced by the rate-limit check below, and the
# in-action cache lock (review-pr/action.yml) prevents concurrent reviews; the
# workflow_run/fork path (PR number only in the artifact) falls back to a per-run
# group, where those two mechanisms still bound abuse.
concurrency:
group: pr-review-${{ github.event.pull_request.number || github.event.issue.number || inputs.pr-number || github.run_id }}-${{ github.event.comment.id || github.event_name }}
cancel-in-progress: false
Comment on lines +83 to +95

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually already exists, but doesn't use GH's concurrency, and only 1 review-per-PR can run within a window of time: https://github.com/docker/docker-agent-action/blob/main/review-pr/action.yml#L142-L146

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct that a per-PR collapse already exists via the cache lock. The new concurrency: group is additive rather than equivalent: it is keyed per (PR + trigger intent), queues instead of skipping, applies before a runner spins up, and has no race window (the lock has a documented one). The trigger workflow also uses cancel-in-progress to drop superseded context-capture runs. If relying on the lock plus the rate-limit check is preferred, the group can be dropped; it is left in place for now as a cheap, race-free pre-resource layer.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks for the explanation!


jobs:
resolve-context:
if: inputs.trigger-run-id != ''
Expand Down Expand Up @@ -315,10 +329,36 @@ jobs:
REQUESTER: ${{ github.event.sender.login }}
run: node "$DOCKER_AGENT_ACTION_ROOT/dist/check-org-membership.js"

# Rate-anomaly safeguard: count how many docker-agent reviews and replies
# landed on this PR in the recent window (full reviews via the Reviews API
# plus marker-bearing reply comments). An authorized account can still drive
# the bot at high frequency (each request costs an LLM run), so a burst above
# the threshold is flagged and the expensive review is skipped. Runs BEFORE
# "Create check run" so a throttled request creates no check run (a skipped
# review must not surface as a green "PR Review" check).
#
# Scope: this gate guards the review run only. The conversational reply jobs
# (reply-to-feedback, reply-to-mention) are intentionally left unthrottled
# (see the note on each). The window count itself already spans reviews AND
# replies, so heavy reply traffic still throttles subsequent review runs.
- name: Check rate anomaly

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Check rate anomaly is scoped to review requests only — @docker-agent mention floods are unthrottled

The Check rate anomaly step has:

if: |
  steps.membership.outputs.is_member == 'true' &&
  steps.command.outputs.is_review != 'false' &&
  steps.draft.outputs.skip != 'true'

When an @docker-agent mention arrives (not a /review command), steps.command.outputs.is_review is 'false', so the rate check is skipped. The reply-to-mention job (line ~785) is a separate job with no rate-limit integration at all.

This means an org member can flood @docker-agent mentions on a PR and drive up LLM inference costs with no throttle. The PR title scopes itself to "review requests", so this may be an intentional deferral — if so, a comment in the workflow noting the known gap would be helpful. If unintentional, the reply-to-mention job should also consult dist/rate-limit.js with its own window/threshold (likely a looser one, since mention replies are cheaper than full reviews).

id: rate
if: |
steps.membership.outputs.is_member == 'true' &&
steps.command.outputs.is_review != 'false' &&
steps.draft.outputs.skip != 'true'
continue-on-error: true # fail-open: a rate-check error must not block reviews
shell: bash
env:
GITHUB_TOKEN: ${{ env.GITHUB_APP_TOKEN || github.token }}
RATE_PR_NUMBER: ${{ steps.pr.outputs.number }}
run: node "$DOCKER_AGENT_ACTION_ROOT/dist/rate-limit.js"

- name: Create check run
if: |
(steps.pr.outputs.source == 'event' || steps.pr.outputs.source == 'trigger') &&
steps.membership.outputs.is_member == 'true'
steps.membership.outputs.is_member == 'true' &&
steps.rate.outputs.anomalous != 'true'
id: create-check
continue-on-error: true
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
Expand Down Expand Up @@ -349,7 +389,8 @@ jobs:
if: |
steps.membership.outputs.is_member == 'true' &&
steps.command.outputs.is_review != 'false' &&
steps.draft.outputs.skip != 'true'
steps.draft.outputs.skip != 'true' &&
steps.rate.outputs.anomalous != 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
Expand All @@ -359,7 +400,8 @@ jobs:
if: |
steps.membership.outputs.is_member == 'true' &&
steps.command.outputs.is_review != 'false' &&
steps.draft.outputs.skip != 'true'
steps.draft.outputs.skip != 'true' &&
steps.rate.outputs.anomalous != 'true'
id: run-review
continue-on-error: true
uses: docker/docker-agent-action/review-pr@e96a4bb40cac114f64358621e1d08346c8eadc8c # v2.0.1
Expand Down Expand Up @@ -404,6 +446,12 @@ jobs:
core.warning(`Failed to update check run: ${error.message}`);
}

# No rate-anomaly gate here (unlike the review job): conversational replies are
# gated by org membership and serialized per-PR by the `concurrency:` group, and
# each reply requires a distinct human comment to trigger it, so they cannot be
# fanned out the way review requests can. The rate-limit window count does still
# include these replies, so a reply burst throttles subsequent review runs.
# Extending the same gate to the reply jobs is a deferred follow-up.
reply-to-feedback:
needs: [resolve-context]
if: |
Expand Down Expand Up @@ -746,6 +794,10 @@ jobs:
path: feedback/
retention-days: 90

# No rate-anomaly gate here, same rationale as reply-to-feedback above:
# org-gated, per-PR-serialized, and trigger-bound to a human comment. The
# rate-limit window count includes these replies, so they still feed the
# review job's throttle. Gating the reply jobs themselves is a deferred follow-up.
reply-to-mention:
needs: [resolve-context]
if: |
Expand Down
9 changes: 9 additions & 0 deletions .github/workflows/self-review-pr-trigger.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,15 @@ on:
pull_request_review_comment:
types: [ created ]

# Rate-anomaly safeguard: collapse a burst of review_requested events on the same
# PR at the source — re-requesting a review is redundant, so cancel-in-progress
# drops superseded context-capture runs and only the latest fans out to a review.
# Distinct review comments stay independent (the comment id keys them into their
# own group) so a real reply is never dropped by another comment on the same PR.
concurrency:
group: pr-review-trigger-${{ github.event.pull_request.number || github.run_id }}-${{ github.event.comment.id || 'review-request' }}
cancel-in-progress: true

permissions: {}

jobs:
Expand Down
51 changes: 51 additions & 0 deletions SECURITY.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ This action includes **built-in security features for all agent executions**:
- **Suspicious patterns** (strip + warn): Behavioral/natural-language injection attempts ("ignore previous instructions", "system mode", "reveal the token", base64/hex obfuscation, etc.) — matching lines are removed from the prompt before it reaches the agent
- **Medium-risk patterns** (warn only): API key variable names in configuration (`ANTHROPIC_API_KEY`, `GITHUB_TOKEN`, etc.)

4. **Review-Request Abuse Safeguards** — the comment/event-triggered PR review pipeline is hardened against abuse on every trigger path: org-membership validation and rate-anomaly throttling. See [Review-Request Abuse Safeguards](#review-request-abuse-safeguards).

## Security Architecture

The action implements a defense-in-depth approach:
Expand Down Expand Up @@ -69,6 +71,55 @@ The action implements a defense-in-depth approach:
└────────────────────────────────────────────────────────────────┘
```

## Review-Request Abuse Safeguards

The PR review pipeline (`.github/workflows/review-pr.yml` and the `review-pr/`
composite actions) is comment- and event-triggered, which makes it the main
abuse vector for cost/spam. Two safeguards harden the review-request flow on
every trigger path (`review_requested`, the deprecated `/review` comment,
`@docker-agent` mentions, automatic `opened`/`synchronize`, and the
`workflow_run` fork path).

### 1. Review requests come from org members

Membership is verified before any review work runs, and which actor is checked
depends on the trigger:

| Trigger | Actor verified | Mechanism |
|---------|----------------|-----------|
| `/review` comment, `@docker-agent` mention, reply-to-feedback | The commenter | `check-org-membership` against the `docker` org (OIDC `org-membership-token`) |
| Automatic review (`opened`/`synchronize`/`ready_for_review`) | The PR author | `check-org-membership` resolves the PR author live via the GitHub API |
| `review_requested` via the PR sidebar | The requester | The requesting org member is verified, so an external contributor's PR can be reviewed on request. The requester is taken only from a trusted source (`github.event.sender.login` on the direct same-repo path, re-derived from the PR timeline on the fork/`workflow_run` path), never from the trigger artifact. Also relies on **GitHub-native enforcement** (only users with triage/write access can request a reviewer) and gates on `requested_reviewer.login == 'docker-agent'` |

The membership check **fails closed**: if no actor login can be resolved, or the
actor is not an org member, the review is skipped. `src/check-org-membership`
evaluates the authorization paths in order (the PR author for automatic review,
then the trusted requester for `review_requested`), resolving the PR author live
via the API so the directly-wired `pull_request` path verifies the author
instead of an empty comment author.

### 2. Rate anomalies are detected and throttled

Authorization gates *who* may trigger a review, not *how often*. Three layers
bound request frequency:

- **Per-PR `concurrency:` groups** on `review-pr.yml` and the trigger workflow
collapse same-trigger bursts (e.g. rapid force-pushes, repeated review
requests) instead of running them in parallel. Groups are scoped per trigger
intent so a quick conversational reply is never queued behind a long review.
- **`src/rate-limit`** counts docker-agent review/reply outputs on the PR within
a sliding window (default 600 s) and, when the count crosses a threshold
(default 8), flags a rate anomaly. It counts one unit per LLM run: full reviews
via the Reviews API (`pulls.listReviews`, by bot author, covering findings,
zero-finding APPROVEs, and timeout/error/LGTM fallbacks, none of which carry an
inline marker) plus marker-bearing reply comments. The review job skips the
expensive review on a flagged anomaly; the conversational reply jobs are not
gated (they are org-gated and per-PR serialized), though their replies still
count toward the window. The check **fails open** so an API error never blocks a
legitimate review.
- The existing in-action **cache lock** (`pr-review-lock-<repo>-<pr>-*`, 600 s
TTL) prevents concurrent reviews from racing on the same PR.

## Security Modules

All security logic lives under `src/security/` and is compiled into `dist/security.js` by
Expand Down
4 changes: 3 additions & 1 deletion review-pr/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,8 @@ with:
>
> 1. **Verifies org membership** before every review. Auto-review checks the **PR author** (so only org members' PRs are reviewed automatically); a **requested review** checks the **requester**, so a maintainer can pull an external contributor's PR into review on demand; `/review` checks the commenter
> 2. **Prevents bot cascades** — replies from bots (except `docker-agent`) are ignored
> 3. **Fork PRs work automatically** with the two-workflow setup — the trigger → `workflow_run` pattern provides OIDC/secret access regardless of fork status
> 3. **Throttles rate anomalies** — per-PR `concurrency:` groups collapse same-trigger bursts, and a rate-limit check skips the review when too many requests land on one PR in a short window
> 4. **Fork PRs work automatically** with the two-workflow setup — the trigger → `workflow_run` pattern provides OIDC/secret access regardless of fork status

### What you don't need to add

Expand All @@ -204,6 +205,7 @@ The workflow YAML examples above are the complete, recommended setup. The reusab
| **PR vs issue comment** | The reusable workflow checks `github.event.issue.pull_request` internally. Plain issue comments on non-PR issues are silently ignored. |
| **Draft PR skipping** | Draft PRs are skipped internally — no caller condition needed. |
| **Concurrent review guard** | A cache-based lock (`pr-review-lock-<repo>-<pr>-*`) prevents duplicate reviews from racing on the same PR. |
| **Rate-anomaly throttling** | Per-PR `concurrency:` groups serialize same-trigger bursts, and a rate-limit check skips the review when too many docker-agent reviews and replies land on one PR within the window. No caller configuration needed. |

**The only decision callers make** is which setup pattern to use: 1-workflow for same-repo PRs, 2-workflow for repos that accept fork PRs. That distinction is the caller's responsibility because it controls which event path delivers OIDC credentials to the reusable workflow.

Expand Down
5 changes: 3 additions & 2 deletions src/post-mention-reply/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -101,8 +101,9 @@ export async function run(config: PostMentionReplyConfig): Promise<void> {
const prNum = parseInt(prNumber, 10);

// Guard 7: dedup check (inline only — top-level has no dedup, see C1).
// Workflow-level concurrency lock and should-reply guards prevent double-posting
// for top-level mentions; only inline threads need per-thread deduplication.
// Top-level mentions rely on the per-PR `concurrency:` group in review-pr.yml
// (which serializes runs for one PR) plus the should-reply guards to bound
// double-posting; only inline threads need this per-thread deduplication.
if (isInline) {
let isDuplicate = false;
try {
Expand Down
Loading