GitClaw — AI-Powered PR Review Agent

A senior engineer embedded in your git workflow. Reviews pull requests for security, quality, and compliance — with full observability, drift detection, and a 5-provider LLM fallback chain. Everything lives in git.

What It Does

GitClaw is a provider-agnostic AI agent that runs on every pull request and:

Narrates the diff in plain English before reviewing — so human reviewers know exactly where to focus
Reviews for security — hardcoded secrets, SQL injection, broken auth, unsafe dependencies
Scores code quality across 4 dimensions: complexity, test coverage gap, duplication, maintainability (0–100 each)
Cites authoritative references — OWASP, CVE, ESLint, NIST — for every HIGH/CRITICAL finding
Posts a structured GitHub comment with verdict, risk score, and actionable fixes
Emits OTel trace spans — latency, token count, cost per run — to .gitagent/traces.jsonl
Writes structured daily logs at DEBUG/INFO/WARN/ERROR to .gitagent/logs/YYYY-MM-DD.ndjson
Detects behavioral drift — compares rolling reviews against a baseline, fires alerts if the agent becomes lenient
Monitors its own health — p50/p95/p99 latency, cost trends, escalation rate — every 5 reviews
Learns your codebase — memory system tracks recurring patterns and hot paths across PRs
Signs every audit entry with Ed25519 — tamper-evident compliance artifact
Escalates to humans when PRs touch auth, payments, DB migrations, or secrets
Falls back across 5 LLM providers — Anthropic → OpenAI → Groq → NVIDIA NIM → Gemini

How It Works

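When triggered (manually, via CLI, or by a GitHub webhook), the agent runs seven of its nine skills in sequence. The other two, health-check and detect-drift, run periodically afterwards (every 5 and every 10 reviews, respectively):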

PR opened / updated
       │
       ▼
┌──────────────────┐
│  narrate-diff    │  ← Plain-English summary, identifies highest-risk file:line
└────────┬─────────┘
         ▼
┌──────────────────┐
│  review-pr       │  ← Fetches diff, scans CRITICAL→INFO, formats comment
└────────┬─────────┘
         ▼
┌──────────────────┐
│  quality-score   │  ← Complexity, test gap, duplication, maintainability (0–100)
└────────┬─────────┘
         ▼
┌──────────────────┐
│ justify-decision │  ← HIGH/CRITICAL only: OWASP/CVE/ESLint citations
└────────┬─────────┘
         ▼
┌──────────────────┐
│  audit-log       │  ← Signed JSON → .gitagent/audit.jsonl
└────────┬─────────┘
         ▼
┌──────────────────┐
│  emit-log        │  ← Structured NDJSON → .gitagent/logs/YYYY-MM-DD.ndjson
└────────┬─────────┘
         ▼
┌──────────────────┐
│  observe-trace   │  ← OTel span → .gitagent/traces.jsonl
└────────┬─────────┘
         ▼
  Post GitHub comment · escalate if CRITICAL · run health/drift checks

Each step is logged. Nothing is auto-merged. The agent recommends; humans decide.

Architecture

GitClaw/
├── agent.yaml                             # Manifest — model, 9 skills, human-in-the-loop, compliance
├── SOUL.md                                # Agent identity and communication style
├── RULES.md                               # Hard constraints (must/must-never)
├── index.js                               # Orchestrator — provider selection, tracing, memory, health
├── providers.js                           # 5-provider fallback chain (Anthropic→OpenAI→Groq→NIM→Gemini)
├── test.js                                # Provider connectivity + review smoke tests
├── clawless.config.js                     # Serverless deployment (webhooks, secrets, volumes)
├── skills/
│   ├── narrate-diff/SKILL.md              # Plain-English PR summary, identifies risk focus area
│   ├── review-pr/SKILL.md                 # CRITICAL→INFO security & quality scan
│   ├── quality-score/SKILL.md             # 4-dimension code quality scorer (0–100 each)
│   ├── justify-decision/SKILL.md          # OWASP/CVE/ESLint citations for HIGH+ findings
│   ├── audit-log/SKILL.md                 # Append-only compliance trail
│   ├── emit-log/SKILL.md                  # Structured daily NDJSON logs
│   ├── observe-trace/SKILL.md             # OTel-compatible trace spans
│   ├── health-check/SKILL.md              # Agent health metrics — every 5 reviews
│   └── detect-drift/SKILL.md             # Behavioral drift detection — every 10 reviews
├── tools/
│   ├── github-pr.yaml                     # Tool schema: get_diff, post_comment, get_files
│   └── github-pr.js                       # Implementation using fetch() — WebContainer-safe
├── .github/
│   └── workflows/review.yml               # GitHub Actions trigger on PR open/update
├── dashboard/
│   └── index.html                         # Standalone audit dashboard (no server, drag-drop)
├── memory/
│   └── patterns.json                      # Codebase memory — hot paths, recurring issues
└── metrics/
    ├── health.json                         # Current agent health snapshot
    ├── baseline.json                       # Drift detection baseline (written at review #10)
    └── drift.json                          # Latest drift signal check result

Provider Fallback Chain

GitClaw is provider-agnostic. Set any combination of keys and it automatically picks the highest-priority provider (lowest tier number) whose key is set:

Tier	Provider	Model	Env Var	Free tier?
1	Anthropic Claude	claude-sonnet-4-5	`ANTHROPIC_API_KEY`	No
2	OpenAI	gpt-4.1	`OPENAI_API_KEY`	No
3	Groq Llama	llama-3.3-70b-versatile	`GROQ_API_KEY`	Yes
4	NVIDIA NIM	llama-3.1-70b-instruct	`NVIDIA_API_KEY`	Yes (limited)
5	Google Gemini	gemini-1.5-pro	`GEMINI_API_KEY`	Yes

# Force a specific tier by unsetting higher-priority keys
ANTHROPIC_API_KEY="" node index.js 42 owner/repo   # uses OpenAI
ANTHROPIC_API_KEY="" OPENAI_API_KEY="" node index.js 42 owner/repo  # uses Groq
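The selection rule in the table can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions: the names `PROVIDERS`, `availableProviders`, and `pickProvider` are hypothetical and are not taken from providers.js, and an empty string is treated as "unset" to match the shell examples above.

```javascript
// Hypothetical sketch of tier-ordered provider selection.
// The real logic lives in providers.js and may differ.
const PROVIDERS = [
  { tier: 1, name: "Anthropic", model: "claude-sonnet-4-5", envVar: "ANTHROPIC_API_KEY" },
  { tier: 2, name: "OpenAI", model: "gpt-4.1", envVar: "OPENAI_API_KEY" },
  { tier: 3, name: "Groq", model: "llama-3.3-70b-versatile", envVar: "GROQ_API_KEY" },
  { tier: 4, name: "NVIDIA NIM", model: "llama-3.1-70b-instruct", envVar: "NVIDIA_API_KEY" },
  { tier: 5, name: "Gemini", model: "gemini-1.5-pro", envVar: "GEMINI_API_KEY" },
];

// Return every provider whose key is set to a non-empty value, in tier order.
function availableProviders(env = process.env) {
  return PROVIDERS.filter((p) => (env[p.envVar] || "").trim() !== "");
}

// Pick the highest-priority provider, or throw if no key is configured.
function pickProvider(env = process.env) {
  const [first] = availableProviders(env);
  if (!first) {
    throw new Error("No LLM API key set; configure at least one provider key");
  }
  return first;
}

console.log(pickProvider({ OPENAI_API_KEY: "sk-example", GROQ_API_KEY: "gsk-example" }).name);
// prints "OpenAI"
```

Keeping the full ordered list (rather than only the first match) would also let a caller retry on the next tier if a request fails at runtime.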

Prerequisites

Node.js v18 or higher (node --version)
npm v9 or higher
At least one LLM API key (see table above — Groq has a free tier)
A GitHub Personal Access Token with repo scope (optional for dry-run on public repos)

Installation

# Clone the repo
git clone https://github.com/your-org/gitclaw.git
cd gitclaw

# Install dependencies
npm install

# Set up credentials
cp .env.example .env

Open .env and fill in your GitHub token and at least one provider key, for example:

GITHUB_TOKEN=
ANTHROPIC_API_KEY=

Configuration

`agent.yaml`

Controls the model, which skills are active, and when to escalate to a human:

model:
  preferred: claude-sonnet-4-5-20250929  # swap to claude-opus for deeper reviews

human_in_the_loop:
  enabled: true
  trigger: "when PR touches auth, secrets, DB migrations, or billing logic"

`RULES.md`

Hard behavioral constraints — the agent reads this on every run. Edit it to add project-specific rules (e.g. "always flag usage of our deprecated internal SDK").

`SOUL.md`

Defines the agent's tone and expertise. You can tune it to match your team's review culture.

Usage

Basic

# Review PR #42 in owner/repo
node index.js 42 owner/repo

Using environment variables

PR_NUMBER=42 GITHUB_REPO=owner/repo npm start
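Both invocation styles resolve to the same two inputs: a PR number and an `owner/repo` string. A minimal sketch of that resolution follows. `resolveTarget` is a hypothetical helper, and the precedence (CLI arguments before environment variables) is an assumption rather than documented behavior of index.js.

```javascript
// Hypothetical sketch: resolve the review target from CLI args or env vars.
// argv is expected to be process.argv.slice(2); CLI-first precedence is assumed.
function resolveTarget(argv, env) {
  const positional = argv.filter((a) => !a.startsWith("--"));
  const prRaw = positional[0] ?? env.PR_NUMBER;
  const repo = positional[1] ?? env.GITHUB_REPO;

  const prNumber = Number.parseInt(prRaw, 10);
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prRaw}`);
  }
  if (!/^[^/\s]+\/[^/\s]+$/.test(repo ?? "")) {
    throw new Error(`Repo must be in owner/repo format, got: ${repo}`);
  }
  return { prNumber, repo, dryRun: argv.includes("--dry-run") };
}

console.log(resolveTarget(["42", "owner/repo"], {}));
console.log(resolveTarget([], { PR_NUMBER: "42", GITHUB_REPO: "owner/repo" }));
```

Validating early keeps a malformed repo string from reaching the GitHub API as a confusing 404.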

Dry-run mode (any public repo, no write access needed)

# Fetches the real diff, runs the full review, prints the comment instead of posting
node index.js 123 vercel/next.js --dry-run
node index.js 42 expressjs/express --dry-run
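Dry-run works without write access because a public PR's diff can be read anonymously from the GitHub REST API. The sketch below shows one way to build that request with the global `fetch()` in Node.js 18+. `buildDiffRequest` and `fetchDiff` are hypothetical names, not the actual contents of tools/github-pr.js.

```javascript
// Sketch: build the GitHub REST request for a PR's unified diff.
// Hypothetical helpers; the real tool implementation may differ.
function buildDiffRequest(repo, prNumber, token) {
  const headers = {
    // The diff media type asks GitHub for the raw diff instead of JSON.
    Accept: "application/vnd.github.diff",
    "User-Agent": "gitclaw-example",
  };
  // Public repos can be read anonymously (at a lower rate limit),
  // so the token is only attached when one is provided.
  if (token) headers.Authorization = `Bearer ${token}`;
  return { url: `https://api.github.com/repos/${repo}/pulls/${prNumber}`, headers };
}

// Fetch the diff text; throws on any non-2xx response.
async function fetchDiff(repo, prNumber, token) {
  const { url, headers } = buildDiffRequest(repo, prNumber, token);
  const res = await fetch(url, { headers });
  if (!res.ok) throw new Error(`GitHub API returned ${res.status} for ${url}`);
  return res.text();
}
```

Calling `fetchDiff("expressjs/express", 42)` without a token would exercise the same read-only path that dry-run relies on.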

npm scripts

npm start              # node index.js (reads PR_NUMBER + GITHUB_REPO from env)
npm test               # provider connectivity + review smoke test (no repo needed)
npm run test:quick     # ping all configured providers only
npm run test:anthropic # test Anthropic provider specifically
npm run test:groq      # test Groq specifically
npm run validate       # validates agent.yaml and skill manifests

What you'll see

▶ Running skill: review-pr
  🔧 Tool: github-pr({"action":"get_diff","repo":"owner/repo","pr_number":42})
  🔧 Tool: github-pr({"action":"post_comment","repo":"owner/repo","pr_number":42,...})
▶ Running skill: justify-decision
▶ Running skill: audit-log
  🔧 Tool: Write({".gitagent/audit.jsonl"})
🚨 Escalating to human — CRITICAL finding: hardcoded secret in src/auth.js:42
✅ Done. Audit written to .gitagent/audit.jsonl

Reading the audit log

# Pretty-print all entries
jq '.' .gitagent/audit.jsonl

# Show only blocked PRs
jq 'select(.verdict == "BLOCKED")' .gitagent/audit.jsonl

# Count reviews per day
jq -r '.timestamp[:10]' .gitagent/audit.jsonl | sort | uniq -c
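If jq is not installed, the same queries can be approximated in Node.js. This is a small sketch that assumes only the `verdict` and `timestamp` fields documented in the Audit Log section; `parseAuditLog` and `summarize` are illustrative names.

```javascript
// Sketch: summarize .gitagent/audit.jsonl (one JSON object per line).
const fs = require("node:fs");

// Parse JSONL text into entries, skipping blank lines.
function parseAuditLog(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Count reviews per verdict and per UTC day, mirroring the jq queries above.
function summarize(entries) {
  const byVerdict = {};
  const byDay = {};
  for (const e of entries) {
    byVerdict[e.verdict] = (byVerdict[e.verdict] || 0) + 1;
    const day = e.timestamp.slice(0, 10);
    byDay[day] = (byDay[day] || 0) + 1;
  }
  return { total: entries.length, byVerdict, byDay };
}

// Usage, guarded so the snippet also runs where no audit log exists yet.
const logPath = ".gitagent/audit.jsonl";
if (fs.existsSync(logPath)) {
  console.log(summarize(parseAuditLog(fs.readFileSync(logPath, "utf8"))));
}
```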

CI/CD Integration

GitHub Actions

Add this workflow to trigger GitClaw automatically on every PR:

# .github/workflows/gitclaw-review.yml
name: GitClaw PR Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm install

      - name: Run GitClaw review
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_REPO: ${{ github.repository }}
        run: node index.js

Add ANTHROPIC_API_KEY to your repo's Actions secrets under Settings → Secrets and variables → Actions.

GitLab CI

gitclaw-review:
  image: node:20
  script:
    - npm install
    - node index.js $CI_MERGE_REQUEST_IID $CI_PROJECT_PATH
  variables:
    GITHUB_TOKEN: $GITHUB_TOKEN
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

Audit Dashboard

dashboard/index.html is a self-contained, zero-dependency dashboard that runs entirely in the browser — no server, no build step.

To open it:

open dashboard/index.html       # macOS
xdg-open dashboard/index.html   # Linux
# or just double-click the file in your file manager

To load data: drag and drop your .gitagent/audit.jsonl file onto the page, or click to browse.

You'll see:

Summary stat cards — total reviews, blocked PRs, escalations, critical findings, average risk score
Full review history table with verdict badges, risk scores, finding counts, escalation flags, and signature previews
Color-coded risk levels (🔴 ≥60, 🟡 ≥30, 🟢 <30)

All data is processed locally — nothing leaves your machine.

Serverless Deployment

Deploy to Clawless for zero-infra, webhook-triggered reviews — no server needed:

# Deploy
npx clawless deploy

# View deployment status
npx clawless status

# Tail live logs
npx clawless logs --follow

After deploying, register the Clawless webhook URL in your GitHub repo under Settings → Webhooks. Select the Pull requests event. GitClaw will fire automatically on every PR open/update.

The clawless.config.js file handles:

Webhook payload mapping (PR number, repo, commit SHA → env vars)
Secret injection from Clawless secrets store
Persistent volume for the audit log across invocations

Example Output

Posted directly as a GitHub PR comment:

## ReviewAgent Report 🤖

**PR:** #42 · **Files changed:** 4 · **Verdict:** 🚫 BLOCKED

---

### 🚫 Critical (must fix before merge)
- **[CRITICAL]** `src/auth.js:42` — Hardcoded API key detected. Move to environment variable.
  > Rule: OWASP A02:2021 Cryptographic Failures
  > Fix: `const key = process.env.API_KEY`
  > Ref: https://owasp.org/Top10/A02_2021-Cryptographic_Failures/

### ⚠️ High
- **[HIGH]** `src/db.js:17` — Raw SQL string concatenation. SQL injection risk.
  > Rule: OWASP A03:2021 Injection
  > Fix: Use parameterized queries — `db.query('SELECT * FROM users WHERE id = ?', [id])`
  > Ref: https://owasp.org/Top10/A03_2021-Injection/

### 🔶 Medium
- **[MEDIUM]** `src/api.js:88` — `console.log` with user data left in production path.
  > Fix: Remove or replace with a structured logger that respects log levels.

### 💡 Suggestions
- **[LOW]** `src/utils.js:12` — Unused import `lodash`. Remove to reduce bundle size.
- **[INFO]** `src/api.js:34` — Consider extracting this 40-line function for testability.

---
*Reviewed by ReviewAgent v1.0.0 · Audit entry written to `.gitagent/audit.jsonl`*

Severity Levels

Level	Badge	Meaning	Blocks merge?
CRITICAL	🚫	Security vulnerability, hardcoded secret, data exposure	Yes
HIGH	⚠️	Injection risk, broken auth, unsafe dependency	Yes
MEDIUM	🔶	Missing tests, debug code in prod, deprecated API	Recommended fix
LOW	💬	Style, naming, unused imports	No
INFO	💡	Minor refactor suggestions	No

Risk Score

Every review computes a weighted risk score (0–100):

score = min(100, CRITICAL×40 + HIGH×15 + MEDIUM×5 + LOW×1)

Score	Badge	Label
60–100	🔴	CRITICAL RISK — human escalation triggered
30–59	🟡	ELEVATED RISK — changes requested
0–29	🟢	LOW RISK — likely approvable
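The formula and bands above translate directly into code. The sketch below uses the finding counts from the Example Output section (one finding per severity level); the function names `riskScore` and `riskBand` are illustrative only.

```javascript
// Weights from the formula above; INFO findings do not contribute.
const WEIGHTS = { CRITICAL: 40, HIGH: 15, MEDIUM: 5, LOW: 1 };

// Compute the capped risk score from per-severity finding counts.
function riskScore(findings) {
  const raw = Object.entries(WEIGHTS).reduce(
    (sum, [level, weight]) => sum + weight * (findings[level] || 0),
    0
  );
  return Math.min(100, raw);
}

// Map a score to the badge and label from the table above.
function riskBand(score) {
  if (score >= 60) return { badge: "🔴", label: "CRITICAL RISK" };
  if (score >= 30) return { badge: "🟡", label: "ELEVATED RISK" };
  return { badge: "🟢", label: "LOW RISK" };
}

const findings = { CRITICAL: 1, HIGH: 1, MEDIUM: 1, LOW: 1, INFO: 1 };
console.log(riskScore(findings)); // 40 + 15 + 5 + 1 = 61
console.log(riskBand(riskScore(findings)).label); // "CRITICAL RISK"
```

Note that two CRITICAL findings alone (80) already land in the top band, and three or more are capped at 100.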

The score is included in:

The PR comment header (visible to all reviewers)
The audit log entry (risk_score field)
The git commit message (audit: PR #42 — BLOCKED (risk: 61/100))
The dashboard stat cards

Audit Log

Every review appends a structured entry to .gitagent/audit.jsonl:

{
  "timestamp": "2025-09-15T14:32:00Z",
  "agent": "pr-review-agent",
  "version": "1.0.0",
  "event": "pr_reviewed",
  "pr_number": 42,
  "repo": "owner/repo",
  "verdict": "BLOCKED",
  "findings": {
    "CRITICAL": 1,
    "HIGH": 1,
    "MEDIUM": 1,
    "LOW": 1,
    "INFO": 1
  },
  "risk_score": 61,
  "human_escalated": true,
  "skill_invoked": "review-pr",
  "commit_sha": "abc123def456",
  "reviewer": "ReviewAgent/claude-sonnet-4-5"
}

Properties:

Field	Type	Description
`timestamp`	ISO 8601 UTC	When the review ran
`verdict`	string	`APPROVED`, `CHANGES_REQUESTED`, or `BLOCKED`
`findings`	object	Count of findings per severity level
`human_escalated`	boolean	Whether a human reviewer was paged
`commit_sha`	string	Head commit of the PR at review time
`reviewer`	string	Agent + model that produced the review

The log is append-only and version-controlled: it travels with every clone, is diff-able in git history, and can serve as supporting evidence in SOC 2 / ISO 27001 audits.

Cryptographic Signatures

Each audit entry is signed with Ed25519 via gitclaw identity sign before it is written to disk. A signed entry carries two extra fields:

{
  "...": "...",
  "signature": "ed25519:base64encodedSignatureHere==",
  "public_key": "SHA256:fingerprint"
}

To verify an entry:

npx gitclaw identity verify --entry "$(tail -1 .gitagent/audit.jsonl)"

If gitclaw identity is unavailable (e.g. in a minimal CI environment), the agent falls back to a deterministic UNSIGNED:<hash> placeholder so the schema stays consistent and the field is always present. You can grep for UNSIGNED: to detect unverified entries.
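To make the signature fields concrete, here is a sketch of Ed25519 signing and verification using Node's built-in crypto module. It is not the `gitclaw identity` implementation: the sorted-key canonical form, the SPKI-based fingerprint, and the helper names `canonicalize`, `signEntry`, and `verifyEntry` are all assumptions made for illustration.

```javascript
// Sketch of Ed25519 signing with Node's built-in crypto module.
// Canonical form and field layout are assumptions; `gitclaw identity` may differ.
const crypto = require("node:crypto");

// Serialize with sorted keys so the same entry always yields the same bytes.
function canonicalize(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Strip any existing signature fields, then sign the canonical bytes.
function signEntry(entry, privateKey, publicKey) {
  const { signature, public_key, ...payload } = entry;
  const sig = crypto.sign(null, Buffer.from(canonicalize(payload)), privateKey);
  const der = publicKey.export({ type: "spki", format: "der" });
  const fingerprint = crypto.createHash("sha256").update(der).digest("base64");
  return {
    ...payload,
    signature: `ed25519:${sig.toString("base64")}`,
    public_key: `SHA256:${fingerprint}`,
  };
}

// Returns false for tampered entries and for UNSIGNED: placeholders.
function verifyEntry(entry, publicKey) {
  const { signature, public_key, ...payload } = entry;
  if (typeof signature !== "string" || !signature.startsWith("ed25519:")) return false;
  const sig = Buffer.from(signature.slice("ed25519:".length), "base64");
  return crypto.verify(null, Buffer.from(canonicalize(payload)), publicKey, sig);
}

const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");
const signed = signEntry({ pr_number: 42, verdict: "BLOCKED" }, privateKey, publicKey);
console.log(verifyEntry(signed, publicKey)); // true
console.log(verifyEntry({ ...signed, verdict: "APPROVED" }, publicKey)); // false
```

Signing a canonical serialization rather than the raw line means re-ordering keys or re-indenting the JSON does not invalidate the signature, while any change to a value does.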

Memory System

memory/patterns.json is a local learning file that grows with each PR review. It stores:

hot_paths — directories with the most frequent findings (top 10), e.g. src/auth, db/migrations
recurring_issues — issue patterns seen more than once, with counts and last-seen timestamps (top 20)
version — increments on every update so you can track drift

Before each review, the agent reads this file and injects the context into its task prompt:

"Recurring issues to watch: "sql injection in db.js" (seen 4x, HIGH); "hardcoded token" (seen 2x, CRITICAL)"

This means the agent gets progressively more focused on your codebase's specific weaknesses over time.
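A minimal sketch of how the stored patterns could be turned into the prompt line quoted above. The schema shown here (`pattern`, `count`, `severity` fields) and the helper name `buildMemoryContext` are assumptions; the real layout of memory/patterns.json may differ.

```javascript
// Sketch: turn memory/patterns.json into the context line quoted above.
// Field names (pattern, count, severity) are assumed, not taken from the real file.
function buildMemoryContext(memory, limit = 5) {
  const issues = [...(memory.recurring_issues || [])]
    .filter((i) => i.count > 1) // "recurring" means seen more than once
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
  if (issues.length === 0) return "";
  const parts = issues.map((i) => `"${i.pattern}" (seen ${i.count}x, ${i.severity})`);
  return `Recurring issues to watch: ${parts.join("; ")}`;
}

const memory = {
  version: 7,
  hot_paths: ["src/auth", "db/migrations"],
  recurring_issues: [
    { pattern: "hardcoded token", count: 2, severity: "CRITICAL" },
    { pattern: "sql injection in db.js", count: 4, severity: "HIGH" },
    { pattern: "unused import", count: 1, severity: "LOW" },
  ],
};
console.log(buildMemoryContext(memory));
// Recurring issues to watch: "sql injection in db.js" (seen 4x, HIGH); "hardcoded token" (seen 2x, CRITICAL)
```

Capping the number of injected issues keeps the prompt short even as the memory file grows.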

To reset memory:

rm memory/patterns.json

To inspect it:

jq '.recurring_issues[:5]' memory/patterns.json

Human-in-the-Loop

GitClaw escalates automatically when a PR touches:

Authentication or session logic
Payment or billing code
Database migrations
Cryptographic primitives
Environment secrets or .env files

When escalated, the agent logs human_escalated: true in the audit entry and outputs a 🚨 line to the console (or triggers a Clawless notification if deployed). It never auto-merges — it only recommends.
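One simple way to approximate these triggers is a path-based check over the PR's changed files. The sketch below is hypothetical: the patterns and the `escalationReasons` helper are illustrative, and the agent's actual escalation is driven by agent.yaml and RULES.md rather than a fixed regex list.

```javascript
// Hypothetical path-based escalation check; real triggers are configured
// in agent.yaml and RULES.md and may also consider the diff content.
const SENSITIVE_PATTERNS = [
  { reason: "authentication or session logic", re: /(^|\/)(auth|session)/i },
  { reason: "payment or billing code", re: /(^|\/)(payment|billing)/i },
  { reason: "database migration", re: /(^|\/)migrations?\// },
  { reason: "cryptographic primitives", re: /(^|\/)crypto/i },
  { reason: "environment secrets", re: /(^|\/)\.env(\.|$)/ },
];

// Return the distinct reasons a PR should be escalated, given its changed paths.
function escalationReasons(changedFiles) {
  const reasons = new Set();
  for (const file of changedFiles) {
    for (const { reason, re } of SENSITIVE_PATTERNS) {
      if (re.test(file)) reasons.add(reason);
    }
  }
  return [...reasons];
}

console.log(escalationReasons(["src/auth.js", "db/migrations/001_init.sql", "README.md"]));
// [ 'authentication or session logic', 'database migration' ]
```

A path check like this is deliberately broad: it will over-match some files (for example `author.js`), which is usually the safer failure mode for a human-escalation gate.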

Skills Reference

`review-pr`

The core skill. Fetches the PR diff via the github-pr tool, scans for findings at all five severity levels (CRITICAL through INFO), and formats a structured Markdown comment. Runs on every review, immediately after narrate-diff.

`justify-decision`

Runs after review-pr and quality-score for any HIGH or CRITICAL finding. Maps the finding to an authoritative source (OWASP Top 10, CVE database, NIST, CWE, ESLint docs) and appends a one-line citation, so each high-severity finding is traceable to a published standard.

`audit-log`

Runs once the review skills have finished, ahead of emit-log and observe-trace. Appends a structured JSON entry to .gitagent/audit.jsonl; it never overwrites existing entries. The file is committed to the repo on each run so the trail is version-controlled.

Environment Variables

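Variable	Required	Description
`GITHUB_TOKEN`	Yes*	GitHub PAT with `repo` scope (*optional for dry-run on public repos)
`ANTHROPIC_API_KEY`	One of five*	Anthropic API key (*at least one provider key must be set)
`OPENAI_API_KEY`	One of five*	OpenAI API key
`GROQ_API_KEY`	One of five*	Groq API key
`NVIDIA_API_KEY`	One of five*	NVIDIA NIM API key
`GEMINI_API_KEY`	One of five*	Google Gemini API key
`PR_NUMBER`	Yes*	PR number to review (*or first CLI arg)
`GITHUB_REPO`	Yes*	`owner/repo` format (*or second CLI arg)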

Troubleshooting

gitclaw: command not found / Cannot find package 'gitclaw'
Run npm install first. Both the gitclaw and clawless packages must be installed.

Error: GITHUB_TOKEN is not set
Copy .env.example to .env and fill in your token. Make sure it has the repo scope.

npm run validate fails
Check that agent.yaml references skill names that exactly match the name: field in each SKILL.md frontmatter.

Agent posts no comment on the PR
Verify your GITHUB_TOKEN has write access to the target repo. Tokens for forks won't have permission to post on the upstream repo by default.

Audit log not persisting between Clawless runs
Confirm the audit-trail volume is configured in clawless.config.js and that .gitagent is in the mountPath.

Built With

Claude Sonnet — core reasoning model
GitClaw — agent runtime and skill orchestration
Clawless — serverless deployment and webhook triggers

License

MIT
