2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
@@ -24,7 +24,7 @@
{
"name": "bitwarden-code-review",
"source": "./plugins/bitwarden-code-review",
-"version": "1.11.0",
+"version": "1.12.0",
"description": "Comprehensive code review system with organization-wide standards."
},
{
2 changes: 1 addition & 1 deletion README.md
@@ -9,7 +9,7 @@ A curated collection of plugins for AI-assisted development at Bitwarden. Enable
| [bitwarden-tech-lead](plugins/bitwarden-tech-lead/) | 2.3.1 | Tech lead for technical planning, architecture coherence, and surfacing patterns to Technical Strategy Ideas |
| [bitwarden-shepherd](plugins/bitwarden-shepherd/) | 1.0.0 | Champion of a technical strategy: shepherds a TSI through evaluation into the funnel, then through to adoption |
| [bitwarden-atlassian-tools](plugins/bitwarden-atlassian-tools/) | 2.2.7 | Read-only Atlassian access via MCP server with deep Jira issue research skill |
-| [bitwarden-code-review](plugins/bitwarden-code-review/) | 1.11.0 | Autonomous code review agent following Bitwarden engineering standards with GitHub integration |
+| [bitwarden-code-review](plugins/bitwarden-code-review/) | 1.12.0 | Autonomous code review agent following Bitwarden engineering standards with GitHub integration |
| [bitwarden-delivery-tools](plugins/bitwarden-delivery-tools/) | 1.5.0 | Delivery lifecycle skills: initiative funnel navigation, work transitions, tech breakdowns and cross-team signoffs, commits, PRs, preflight, labeling |
| [bitwarden-designer](plugins/bitwarden-designer/) | 0.1.0 | Product designer persona: Code of Conduct and 30/60/90 critique, critique facilitation; dispatches into bitwarden-design-tools |
| [bitwarden-design-tools](plugins/bitwarden-design-tools/) | 0.1.0 | Design toolkit: content style guide, Figma Dev Mode MCP, Bitwarden brand application, handoff prep, Design System governance, Product and Design Jira |
2 changes: 1 addition & 1 deletion plugins/bitwarden-code-review/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "bitwarden-code-review",
-"version": "1.11.0",
+"version": "1.12.0",
"description": "Comprehensive code review system with organization-wide standards.",
"author": {
"name": "Bitwarden",
11 changes: 11 additions & 0 deletions plugins/bitwarden-code-review/CHANGELOG.md
@@ -5,6 +5,17 @@ All notable changes to the Bitwarden Code Review Plugin will be documented in th
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.12.0] - 2026-06-18

### Changed

- `performing-multi-agent-code-review`: per-stage model flags (`--model-analysis`, `--model-security`, `--model-validation`, `--model-audit`) with a security floor rule; subagents inherit the session model instead of forcing Opus; severity audit defaults to Sonnet.
- `performing-multi-agent-code-review`: changed the architecture subagent to a general agent type with a stronger prompt. This reduces complexity by not requiring engineers to install plugins they don't need; using the tech-lead agent showed minimal to no actual benefit.
- `performing-multi-agent-code-review`: `{model}` in file names and report headers is the resolved model nickname, with a `-mixed` suffix when stage flags differ.
- `performing-multi-agent-code-review`: enhanced the reference documents to provide example shapes of the DTOs that pass data between the subagents and the orchestration agent.
- `performing-multi-agent-code-review`: removed duplicate and unnecessarily verbose instructions.
- Improved the README.md to better describe the purpose and usage of the multi-agent review.

## [1.11.0] - 2026-05-12

### Added
51 changes: 49 additions & 2 deletions plugins/bitwarden-code-review/README.md
@@ -1,11 +1,13 @@
# Bitwarden Code Review Plugin

-Comprehensive AI-powered code review agent following Bitwarden engineering standards.
+AI-powered code review for Bitwarden: an autonomous review agent for everyday PRs, plus a rigorous multi-agent pipeline for the changes that warrant a deeper look.

## Overview

This plugin provides an autonomous code review agent that conducts thorough, professional code reviews following Bitwarden's organizational standards. The agent focuses on security, correctness, and high-value feedback while maintaining a high signal-to-noise ratio.

It offers two complementary lenses. The autonomous `bitwarden-code-reviewer` agent reviews a pull request the way a human reviewer would and posts inline comments to GitHub. The `performing-multi-agent-code-review` skill takes a different approach: it has Claude evaluate code _as Claude_ across a pipeline of specialized sub-agents, trading human-style commentary for depth and high-signal findings written to a local report.

## Features

- **Autonomous Review Agent**: Single agent handles all code review tasks without manual invocation
@@ -14,6 +16,7 @@ This plugin provides an autonomous code review agent that conducts thorough, pro
- **Security-First Approach**: Prioritizes security vulnerabilities, data exposure, and authentication issues
- **Structured Thinking**: Uses explicit reasoning blocks to improve review quality and consistency
- **Confidence Scoring**: Pre-filters findings with a 0-100 confidence score (≥75 threshold) before validation to reduce false positives
- **Multi-Agent Review Pipeline**: A separate `performing-multi-agent-code-review` skill runs six specialized sub-agents (architecture compliance, code quality, bug analysis, security & logic, validation, and severity audit) for depth on complex changes

## Skills

@@ -39,7 +42,7 @@ See [`classifying-review-findings`](./skills/classifying-review-findings/SKILL.m

### Directory Structure

-```
+```bash
bitwarden-code-review/
├── .claude/
│   └── settings.json # Security boundaries
@@ -84,6 +87,50 @@ The agent is automatically invoked by Claude when:
Use the bitwarden-code-reviewer agent to review this PR
```

### Multi-agent code review skill

The `performing-multi-agent-code-review` skill is built for complex changes where one reviewer, human or AI, can't hold every concern in view at once. Instead of a single pass, it puts a team of specialized agents on the same diff, each reviewing from its own perspective (architecture and pattern compliance, code quality, bugs, and security & logic), then validates and severity-audits everything they surface. It started from [Anthropic's `code-review` command](https://github.com/anthropics/claude-code/blob/main/plugins/code-review/commands/code-review.md) and was rebuilt for the control that command lacks: it wires in our own `bitwarden-security-engineer` plugin, runs on any model you choose, reviews **draft or published** PRs (plus local changes, branch comparisons, and commit ranges), and writes its report to a local file instead of posting to GitHub.

It's a deliberately different lens from the `bitwarden-code-reviewer` agent. That agent reviews the way a human reviewer would; this skill has **Claude evaluate code as Claude**, optimized for signal, not parity. It spotlights blockers, real bugs, and known-bad patterns, and stays out of the nit-picking lane. Each potential finding is scored 0–100, and only those clearing an 80-confidence bar are raised; every one it keeps is cited with file, line, and the agent that caught it. What clears the bar still gets challenged: the validation and severity-audit agents can overturn a finding. An overturned finding is never dropped, though; it moves into a collapsed **Reviewed and Dismissed** section, tagged with its original severity, original confidence, and the reason it was set aside. That section is deliberate: a human sees everything the first pass surfaced and can judge when a dismissal was wrong and the original call was right; without it, that signal is lost.
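
The triage described above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's actual implementation: the `Finding` dataclass, its field names, and the `triage` function are hypothetical, and treating a score of exactly 80 as clearing the bar is an assumption. Only the 0-100 score, the 80 bar, and the fields kept on a dismissed finding come from the text.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_BAR = 80  # from the README; ">= 80 clears the bar" is an assumption


@dataclass
class Finding:
    # Hypothetical shape; the real DTOs live in the skill's reference documents.
    file: str
    line: int
    agent: str
    severity: str
    confidence: int  # 0-100
    dismissal_reason: Optional[str] = None  # set when validation or audit overturns it


def triage(findings):
    """Split findings into (raised, dismissed)."""
    raised, dismissed = [], []
    for finding in findings:
        if finding.confidence < CONFIDENCE_BAR:
            continue  # never cleared the bar, so it was never raised
        if finding.dismissal_reason is not None:
            # Overturned findings are kept, with original severity and confidence intact.
            dismissed.append(finding)
        else:
            raised.append(finding)
    return raised, dismissed
```

The point of returning two lists rather than filtering in place is that the dismissed list feeds the **Reviewed and Dismissed** section instead of disappearing.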

Agents start cold (a sub-agent inherits none of the main session's context), so the skill briefs every one of them explicitly. Each receives Bitwarden's zero-knowledge invariant and the P01–P06 threat-model directive verbatim. Feature context (the PR's intent, the ticket, the product framing) is handed out deliberately: the architecture and security agents get the full "why" so they can reason adversarially from intent, while the quality and bug agents see only the diff, so their first read stays unbiased.
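
As a rough illustration of that briefing rule, the sketch below builds a per-agent brief. The agent keys (`architecture`, `security`, `quality`, `bugs`), the `build_brief` function, and the dictionary layout are all hypothetical names chosen for this example; the skill's real prompt assembly may look quite different.

```python
# Hypothetical agent keys; only the split itself comes from the README.
FULL_CONTEXT_AGENTS = {"architecture", "security"}


def build_brief(agent, diff, feature_context, invariants):
    """Assemble what a cold-started sub-agent is told.

    Every agent gets the diff and the invariants text verbatim; only the
    architecture and security agents also get the feature context.
    """
    brief = {"agent": agent, "invariants": invariants, "diff": diff}
    if agent in FULL_CONTEXT_AGENTS:
        brief["feature_context"] = feature_context
    return brief
```

Leaving the `feature_context` key out entirely, rather than passing an empty value, mirrors the intent that the quality and bug agents never see the "why" at all.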

The `performing-multi-agent-code-review` skill slots cleanly into a [Claude Code agent-teams](https://code.claude.com/docs/en/agent-teams) loop (run it at the end of a coding session, address what it finds, refactor) or serves as a depth-of-review gate on a draft PR before you publish.

#### Examples

Invoke the skill explicitly with the slash command, or with natural language: it triggers on phrasing like "thorough", "deep", "multi-pass", or "multi-agent" review, and on commit-range framing like "review the last week of commits". With no model flags, the review runs on your session's model, with the severity audit defaulting to Sonnet.

**1. Local changes, all defaults.** The simplest run reviews your uncommitted work before you commit:

> Perform a thorough multi-agent code review of my local changes.

**2. A specific pull request.** Pass a PR number or URL; it reviews draft and published PRs alike:

```markdown
/bitwarden-code-review:performing-multi-agent-code-review https://github.com/bitwarden/ios/pull/1234567 --model opus
```

**3. Changes over a period of time.** Commit-range mode reviews the cumulative diff across a time window, commit count, or explicit ref pair. Run it from inside the target repo; it confirms the resolved range with you before spending tokens:

```markdown
Review the last week of commits using /bitwarden-code-review:performing-multi-agent-code-review --model opus
```

**4. Full control, per stage.** Tune each stage independently (opus for analysis, validation, and security; sonnet for the audit) and send the report to a specific directory:

```bash
/bitwarden-code-review:performing-multi-agent-code-review \
https://github.com/bitwarden/gh-actions/pull/604 \
--model-analysis opus \
--model-security opus \
--model-validation opus \
--model-audit sonnet \
--output-dir ./reviews
```

A **security floor** keeps `--model-security` from ever dropping below your global model, so threat-model evaluation never silently degrades. Omit `--output-dir` and the report lands in `${CLAUDE_PLUGIN_DATA}/code-reviews/`, organized across projects and never git-tracked.
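
One way to picture the flag resolution and the security floor is the Python sketch below. It is a sketch under stated assumptions, not the skill's logic: the `resolve_models` and `default_output_dir` helpers are hypothetical, the `haiku < sonnet < opus` ordering is assumed (the README does not define one), and the audit stage is assumed to stay on sonnet unless `--model-audit` is given.

```python
import os
from pathlib import Path

# Assumed capability order, lowest to highest; not defined by the README.
MODEL_RANK = {"haiku": 0, "sonnet": 1, "opus": 2}


def resolve_models(session_model, model=None, analysis=None,
                   security=None, validation=None, audit=None):
    """Resolve one model per stage from the global and per-stage flags."""
    base = model or session_model  # no --model flag: inherit the session model
    stages = {
        "analysis": analysis or base,
        "security": security or base,
        "validation": validation or base,
        "audit": audit or "sonnet",  # severity audit defaults to sonnet
    }
    # Security floor: never let the security stage rank below the global model.
    if MODEL_RANK[stages["security"]] < MODEL_RANK[base]:
        stages["security"] = base
    return stages


def default_output_dir(env=None):
    """Report directory used when --output-dir is omitted."""
    env = os.environ if env is None else env
    return Path(env["CLAUDE_PLUGIN_DATA"]) / "code-reviews"
```

Under these assumptions, `resolve_models("sonnet", model="opus", security="haiku")` would raise the security stage back to `opus` rather than honoring the lower flag.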

### In GitHub Actions

See the production implementation: [bitwarden/gh-actions `_review-code.yml`](https://github.com/bitwarden/gh-actions/blob/main/.github/workflows/_review-code.yml)
@@ -1,6 +1,6 @@
---
name: bitwarden-code-reviewer
-version: 1.10.1
+version: 1.12.0
description: Conducts thorough code reviews following Bitwarden standards. Finds all issues first pass, avoids false positives, respects codebase conventions. Invoke when user mentions "code review", "review code", "review", "PR", or "pull request".
model: opus
skills: avoiding-false-positives, classifying-review-findings, posting-bitwarden-review-comments, posting-review-summary, reviewing-dependency-changes