SAQLAINAP/GitClaw-Agent

GitClaw — AI-Powered PR Review Agent

A senior engineer embedded in your git workflow. Reviews pull requests for security, quality, and compliance — with full observability, drift detection, and a 5-provider LLM fallback chain. Everything lives in git.

What It Does

GitClaw is a provider-agnostic AI agent that runs on every pull request and:

  • Narrates the diff in plain English before reviewing — so human reviewers know exactly where to focus
  • Reviews for security — hardcoded secrets, SQL injection, broken auth, unsafe dependencies
  • Scores code quality across 4 dimensions: complexity, test coverage gap, duplication, maintainability (0–100 each)
  • Cites authoritative references — OWASP, CVE, ESLint, NIST — for every HIGH/CRITICAL finding
  • Posts a structured GitHub comment with verdict, risk score, and actionable fixes
  • Emits OTel trace spans — latency, token count, cost per run — to .gitagent/traces.jsonl
  • Writes structured daily logs at DEBUG/INFO/WARN/ERROR to .gitagent/logs/YYYY-MM-DD.ndjson
  • Detects behavioral drift — compares rolling reviews against a baseline, fires alerts if the agent becomes lenient
  • Monitors its own health — p50/p95/p99 latency, cost trends, escalation rate — every 5 reviews
  • Learns your codebase — memory system tracks recurring patterns and hot paths across PRs
  • Signs every audit entry with Ed25519 — tamper-evident compliance artifact
  • Escalates to humans when PRs touch auth, payments, DB migrations, or secrets
  • Falls back across 5 LLM providers — Anthropic → OpenAI → Groq → NVIDIA NIM → Gemini

How It Works

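When triggered (manually, via CLI, or by a GitHub webhook), the agent runs seven skills in sequence on every review. The other two of its nine skills, health-check and detect-drift, run periodically (every 5 and every 10 reviews respectively):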

PR opened / updated
       │
       ▼
┌──────────────────┐
│  narrate-diff    │  ← Plain-English summary, identifies highest-risk file:line
└────────┬─────────┘
         ▼
┌──────────────────┐
│  review-pr       │  ← Fetches diff, scans CRITICAL→INFO, formats comment
└────────┬─────────┘
         ▼
┌──────────────────┐
│  quality-score   │  ← Complexity, test gap, duplication, maintainability (0–100)
└────────┬─────────┘
         ▼
┌──────────────────┐
│ justify-decision │  ← HIGH/CRITICAL only: OWASP/CVE/ESLint citations
└────────┬─────────┘
         ▼
┌──────────────────┐
│  audit-log       │  ← Signed JSON → .gitagent/audit.jsonl
└────────┬─────────┘
         ▼
┌──────────────────┐
│  emit-log        │  ← Structured NDJSON → .gitagent/logs/YYYY-MM-DD.ndjson
└────────┬─────────┘
         ▼
┌──────────────────┐
│  observe-trace   │  ← OTel span → .gitagent/traces.jsonl
└────────┬─────────┘
         ▼
  Post GitHub comment · escalate if CRITICAL · run health/drift checks

Each step is logged. Nothing is auto-merged. The agent recommends; humans decide.


Architecture

GitClaw/
├── agent.yaml                             # Manifest — model, 9 skills, human-in-the-loop, compliance
├── SOUL.md                                # Agent identity and communication style
├── RULES.md                               # Hard constraints (must/must-never)
├── index.js                               # Orchestrator — provider selection, tracing, memory, health
├── providers.js                           # 5-provider fallback chain (Anthropic→OpenAI→Groq→NIM→Gemini)
├── test.js                                # Provider connectivity + review smoke tests
├── clawless.config.js                     # Serverless deployment (webhooks, secrets, volumes)
├── skills/
│   ├── narrate-diff/SKILL.md              # Plain-English PR summary, identifies risk focus area
│   ├── review-pr/SKILL.md                 # CRITICAL→INFO security & quality scan
│   ├── quality-score/SKILL.md             # 4-dimension code quality scorer (0–100 each)
│   ├── justify-decision/SKILL.md          # OWASP/CVE/ESLint citations for HIGH+ findings
│   ├── audit-log/SKILL.md                 # Append-only compliance trail
│   ├── emit-log/SKILL.md                  # Structured daily NDJSON logs
│   ├── observe-trace/SKILL.md             # OTel-compatible trace spans
│   ├── health-check/SKILL.md              # Agent health metrics — every 5 reviews
│   └── detect-drift/SKILL.md              # Behavioral drift detection — every 10 reviews
├── tools/
│   ├── github-pr.yaml                     # Tool schema: get_diff, post_comment, get_files
│   └── github-pr.js                       # Implementation using fetch() — WebContainer-safe
├── .github/
│   └── workflows/review.yml               # GitHub Actions trigger on PR open/update
├── dashboard/
│   └── index.html                         # Standalone audit dashboard (no server, drag-drop)
├── memory/
│   └── patterns.json                      # Codebase memory — hot paths, recurring issues
└── metrics/
    ├── health.json                         # Current agent health snapshot
    ├── baseline.json                       # Drift detection baseline (written at review #10)
    └── drift.json                          # Latest drift signal check result
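
The p50/p95/p99 latency figures stored in metrics/health.json can be derived from recorded review durations. The following is a minimal sketch using the nearest-rank method. The helper names `percentile` and `latencySummary` and the input shape (an array of latencies in milliseconds) are assumptions for illustration, not GitClaw's actual implementation.

```javascript
// Hypothetical helper: nearest-rank percentile over latency samples (ms).
function percentile(samples, p) {
  if (samples.length === 0) return null;
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: smallest value with at least p% of samples at or below it.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical summary shape; the real health.json fields may differ.
function latencySummary(samples) {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

With only a handful of reviews, p95 and p99 collapse to the slowest sample, so these figures become informative only after enough runs have accumulated.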

Provider Fallback Chain

GitClaw is provider-agnostic. Set any combination of keys — it picks the first available one automatically:

| Tier | Provider | Model | Env Var | Free tier? |
|------|----------|-------|---------|------------|
| 1 | Anthropic Claude | claude-sonnet-4-5 | ANTHROPIC_API_KEY | No |
| 2 | OpenAI | gpt-4.1 | OPENAI_API_KEY | No |
| 3 | Groq Llama | llama-3.3-70b-versatile | GROQ_API_KEY | Yes |
| 4 | NVIDIA NIM | llama-3.1-70b-instruct | NVIDIA_API_KEY | Yes (limited) |
| 5 | Google Gemini | gemini-1.5-pro | GEMINI_API_KEY | Yes |

# Force a specific tier by unsetting higher-priority keys
ANTHROPIC_API_KEY="" node index.js 42 owner/repo   # uses OpenAI
ANTHROPIC_API_KEY="" OPENAI_API_KEY="" node index.js 42 owner/repo  # uses Groq
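
The "first available key wins" rule can be sketched as a lookup over an ordered list. This is an illustration only: the names `PROVIDER_CHAIN` and `pickProvider` are hypothetical, and the real providers.js may also fall through to the next tier on request errors, which this sketch does not cover.

```javascript
// Hypothetical tier list mirroring the table above (order = priority).
const PROVIDER_CHAIN = [
  { name: "anthropic", envVar: "ANTHROPIC_API_KEY", model: "claude-sonnet-4-5" },
  { name: "openai", envVar: "OPENAI_API_KEY", model: "gpt-4.1" },
  { name: "groq", envVar: "GROQ_API_KEY", model: "llama-3.3-70b-versatile" },
  { name: "nvidia", envVar: "NVIDIA_API_KEY", model: "llama-3.1-70b-instruct" },
  { name: "gemini", envVar: "GEMINI_API_KEY", model: "gemini-1.5-pro" },
];

// Returns the highest-priority provider whose key is set and non-empty,
// or null when no key is configured. An empty string counts as unset,
// which is why ANTHROPIC_API_KEY="" forces the next tier.
function pickProvider(env) {
  const found = PROVIDER_CHAIN.find((p) => (env[p.envVar] ?? "").trim() !== "");
  return found ?? null;
}
```

In a real run the argument would be `process.env`. Passing the environment in explicitly keeps the function easy to test.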

Prerequisites

  • Node.js v18 or higher (node --version)
  • npm v9 or higher
  • At least one LLM API key (see table above — Groq has a free tier)
  • A GitHub Personal Access Token with repo scope (optional for dry-run on public repos)

Installation

# Clone the repo
git clone https://github.com/your-org/gitclaw.git
cd gitclaw

# Install dependencies
npm install

# Set up credentials
cp .env.example .env

Open .env and fill in:

GITHUB_TOKEN=
ANTHROPIC_API_KEY=

Configuration

agent.yaml

Controls the model, which skills are active, and when to escalate to a human:

model:
  preferred: claude-sonnet-4-5-20250929  # swap to claude-opus for deeper reviews

human_in_the_loop:
  enabled: true
  trigger: "when PR touches auth, secrets, DB migrations, or billing logic"

RULES.md

Hard behavioral constraints — the agent reads this on every run. Edit it to add project-specific rules (e.g. "always flag usage of our deprecated internal SDK").

SOUL.md

Defines the agent's tone and expertise. You can tune it to match your team's review culture.


Usage

Basic

# Review PR #42 in owner/repo
node index.js 42 owner/repo

Using environment variables

PR_NUMBER=42 GITHUB_REPO=owner/repo npm start

Dry-run mode (any public repo, no write access needed)

# Fetches the real diff, runs the full review, prints the comment instead of posting
node index.js 123 vercel/next.js --dry-run
node index.js 42 expressjs/express --dry-run
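
The three invocation styles above (positional arguments, environment variables, and the --dry-run flag) could be resolved roughly as follows. This is a sketch under stated assumptions: `resolveTarget` is a hypothetical name, and the choice that CLI arguments take precedence over environment variables is an assumption, since the precedence is not documented here.

```javascript
// Hypothetical resolver: positional args first, then PR_NUMBER / GITHUB_REPO.
function resolveTarget(argv, env) {
  const flags = argv.filter((arg) => arg.startsWith("--"));
  const positional = argv.filter((arg) => !arg.startsWith("--"));

  const prNumber = Number.parseInt(positional[0] ?? env.PR_NUMBER ?? "", 10);
  const repo = positional[1] ?? env.GITHUB_REPO ?? "";

  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error("PR number is required (first CLI arg or PR_NUMBER)");
  }
  if (!/^[^/\s]+\/[^/\s]+$/.test(repo)) {
    throw new Error("Repo must be in owner/repo format (second CLI arg or GITHUB_REPO)");
  }
  return { prNumber, repo, dryRun: flags.includes("--dry-run") };
}

// Typical call site: resolveTarget(process.argv.slice(2), process.env)
```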

npm scripts

npm start              # node index.js (reads PR_NUMBER + GITHUB_REPO from env)
npm test               # provider connectivity + review smoke test (no repo needed)
npm run test:quick     # ping all configured providers only
npm run test:anthropic # test Anthropic provider specifically
npm run test:groq      # test Groq specifically
npm run validate       # validates agent.yaml and skill manifests

What you'll see

▶ Running skill: review-pr
  🔧 Tool: github-pr({"action":"get_diff","repo":"owner/repo","pr_number":42})
  🔧 Tool: github-pr({"action":"post_comment","repo":"owner/repo","pr_number":42,...})
▶ Running skill: justify-decision
▶ Running skill: audit-log
  🔧 Tool: Write({".gitagent/audit.jsonl"})
🚨 Escalating to human — CRITICAL finding: hardcoded secret in src/auth.js:42
✅ Done. Audit written to .gitagent/audit.jsonl

Reading the audit log

# Pretty-print all entries
jq '.' .gitagent/audit.jsonl

# Show only blocked PRs
jq 'select(.verdict == "BLOCKED")' .gitagent/audit.jsonl

# Count reviews per day
jq -r '.timestamp[:10]' .gitagent/audit.jsonl | sort | uniq -c

CI/CD Integration

GitHub Actions

Add this workflow to trigger GitClaw automatically on every PR:

# .github/workflows/gitclaw-review.yml
name: GitClaw PR Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm install

      - name: Run GitClaw review
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_REPO: ${{ github.repository }}
        run: node index.js

Add ANTHROPIC_API_KEY to your repo's Actions secrets under Settings → Secrets and variables → Actions.

GitLab CI

gitclaw-review:
  image: node:20
  script:
    - npm install
    - node index.js $CI_MERGE_REQUEST_IID $CI_PROJECT_PATH
  variables:
    GITHUB_TOKEN: $GITHUB_TOKEN
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

Audit Dashboard

dashboard/index.html is a self-contained, zero-dependency dashboard that runs entirely in the browser — no server, no build step.

To open it:

open dashboard/index.html
# or just double-click it in Finder

To load data: drag and drop your .gitagent/audit.jsonl file onto the page, or click to browse.

You'll see:

  • Summary stat cards — total reviews, blocked PRs, escalations, critical findings, average risk score
  • Full review history table with verdict badges, risk scores, finding counts, escalation flags, and signature previews
  • Color-coded risk levels (🔴 ≥60, 🟡 ≥30, 🟢 <30)

All data is processed locally — nothing leaves your machine.
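
The stat cards can be computed from the JSONL file with a few array operations. The sketch below shows one way to do it. `summarizeAudit` is a hypothetical name, the field names follow the audit entry format documented in this README, and `risk_score` is assumed to be a number when present.

```javascript
// Hypothetical summary over the text content of .gitagent/audit.jsonl.
function summarizeAudit(jsonlText) {
  // One JSON object per non-empty line.
  const entries = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));

  const scores = entries
    .map((entry) => entry.risk_score)
    .filter((score) => typeof score === "number");

  return {
    totalReviews: entries.length,
    blocked: entries.filter((e) => e.verdict === "BLOCKED").length,
    escalations: entries.filter((e) => e.human_escalated === true).length,
    criticalFindings: entries.reduce((sum, e) => sum + (e.findings?.CRITICAL ?? 0), 0),
    // null when no entry carries a numeric risk_score
    averageRiskScore: scores.length
      ? scores.reduce((a, b) => a + b, 0) / scores.length
      : null,
  };
}
```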


Serverless Deployment

Deploy to Clawless for zero-infra, webhook-triggered reviews — no server needed:

# Deploy
npx clawless deploy

# View deployment status
npx clawless status

# Tail live logs
npx clawless logs --follow

After deploying, register the Clawless webhook URL in your GitHub repo under Settings → Webhooks. Select the Pull requests event. GitClaw will fire automatically on every PR open/update.

The clawless.config.js handles:

  • Webhook payload mapping (PR number, repo, commit SHA → env vars)
  • Secret injection from Clawless secrets store
  • Persistent volume for the audit log across invocations

Example Output

Posted directly as a GitHub PR comment:

## ReviewAgent Report 🤖

**PR:** #42 · **Files changed:** 3 · **Verdict:** 🚫 BLOCKED

---

### 🚫 Critical (must fix before merge)
- **[CRITICAL]** `src/auth.js:42` — Hardcoded API key detected. Move to environment variable.
  > Rule: OWASP A02:2021 Cryptographic Failures
  > Fix: `const key = process.env.API_KEY`
  > Ref: https://owasp.org/Top10/A02_2021-Cryptographic_Failures/

### ⚠️ High
- **[HIGH]** `src/db.js:17` — Raw SQL string concatenation. SQL injection risk.
  > Rule: OWASP A03:2021 Injection
  > Fix: Use parameterized queries — `db.query('SELECT * FROM users WHERE id = ?', [id])`
  > Ref: https://owasp.org/Top10/A03_2021-Injection/

### 🔶 Medium
- **[MEDIUM]** `src/api.js:88` — `console.log` with user data left in production path.
  > Fix: Remove or replace with a structured logger that respects log levels.

### 💡 Suggestions
- **[LOW]** `src/utils.js:12` — Unused import `lodash`. Remove to reduce bundle size.
- **[INFO]** `src/api.js:34` — Consider extracting this 40-line function for testability.

---
*Reviewed by ReviewAgent v1.0.0 · Audit entry written to `.gitagent/audit.jsonl`*

Severity Levels

| Level | Badge | Meaning | Blocks merge? |
|-------|-------|---------|---------------|
| CRITICAL | 🚫 | Security vulnerability, hardcoded secret, data exposure | Yes |
| HIGH | ⚠️ | Injection risk, broken auth, unsafe dependency | Yes |
| MEDIUM | 🔶 | Missing tests, debug code in prod, deprecated API | Recommended fix |
| LOW | 💬 | Style, naming, unused imports | No |
| INFO | 💡 | Minor refactor suggestions | No |

Risk Score

Every review computes a weighted risk score (0–100):

score = min(100, CRITICAL×40 + HIGH×15 + MEDIUM×5 + LOW×1)

| Score | Badge | Label |
|-------|-------|-------|
| 60–100 | 🔴 | CRITICAL RISK — human escalation triggered |
| 30–59 | 🟡 | ELEVATED RISK — changes requested |
| 0–29 | 🟢 | LOW RISK — likely approvable |
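
As a worked example, the formula and bands above translate directly into code. The function names `riskScore` and `riskBand` are illustrative and not taken from the GitClaw source. INFO findings carry no weight because the formula does not include them.

```javascript
// Weights copied from the formula above; INFO is intentionally absent.
const WEIGHTS = { CRITICAL: 40, HIGH: 15, MEDIUM: 5, LOW: 1 };

// findings: counts per severity, e.g. { CRITICAL: 1, HIGH: 1, MEDIUM: 1, LOW: 1, INFO: 0 }
function riskScore(findings) {
  const raw = Object.entries(WEIGHTS).reduce(
    (sum, [level, weight]) => sum + weight * (findings[level] ?? 0),
    0
  );
  return Math.min(100, raw); // capped at 100
}

function riskBand(score) {
  if (score >= 60) return "CRITICAL RISK";
  if (score >= 30) return "ELEVATED RISK";
  return "LOW RISK";
}

// One finding at each of CRITICAL, HIGH, MEDIUM, LOW: 40 + 15 + 5 + 1 = 61
console.log(riskScore({ CRITICAL: 1, HIGH: 1, MEDIUM: 1, LOW: 1, INFO: 0 })); // 61
```

Because a single CRITICAL finding contributes 40 points, it lands a PR in the elevated band on its own, and one CRITICAL plus two HIGH findings (70) is enough to reach the escalation threshold.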

The score is included in:

  • The PR comment header (visible to all reviewers)
  • The audit log entry (risk_score field)
  • The git commit message (audit: PR #42 — BLOCKED (risk: 61/100))
  • The dashboard stat cards

Audit Log

Every review appends a structured entry to .gitagent/audit.jsonl:

{
  "timestamp": "2025-09-15T14:32:00Z",
  "agent": "pr-review-agent",
  "version": "1.0.0",
  "event": "pr_reviewed",
  "pr_number": 42,
  "repo": "owner/repo",
  "verdict": "BLOCKED",
  "findings": {
    "CRITICAL": 1,
    "HIGH": 1,
    "MEDIUM": 1,
    "LOW": 1,
    "INFO": 0
  },
  "human_escalated": true,
  "skill_invoked": "review-pr",
  "commit_sha": "abc123def456",
  "reviewer": "ReviewAgent/claude-sonnet-4-5"
}

Properties:

| Field | Type | Description |
|-------|------|-------------|
| timestamp | ISO 8601 UTC | When the review ran |
| verdict | string | APPROVED, CHANGES_REQUESTED, or BLOCKED |
| findings | object | Count of findings per severity level |
| human_escalated | boolean | Whether a human reviewer was paged |
| commit_sha | string | Head commit of the PR at review time |
| reviewer | string | Agent + model that produced the review |

The log is append-only and version-controlled. It survives repo clones, is diff-able in git history, and can serve as supporting evidence in SOC 2 / ISO 27001 audits.


Cryptographic Signatures

Every audit entry is signed before it's written to disk using gitclaw identity sign (Ed25519). The signed entry includes two extra fields:

{
  "...": "...",
  "signature": "ed25519:base64encodedSignatureHere==",
  "public_key": "SHA256:fingerprint"
}

To verify an entry:

npx gitclaw identity verify --entry "$(tail -1 .gitagent/audit.jsonl)"

If gitclaw identity is unavailable (e.g. in a minimal CI environment), the agent falls back to a deterministic UNSIGNED:<hash> placeholder so the schema stays consistent and the field is always present. You can grep for UNSIGNED: to detect unverified entries.
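
To illustrate the idea of signing an entry and falling back to an UNSIGNED:<hash> placeholder, here is a minimal sketch built on Node's built-in crypto module. It is not the gitclaw identity implementation: the function names, the choice to sign the JSON.stringify output of the entry, and the use of SHA-256 for the placeholder are all assumptions, and the public_key field is omitted for brevity.

```javascript
const nodeCrypto = require("node:crypto");

// Hypothetical signer. With no private key it emits a deterministic
// UNSIGNED:<sha256> placeholder so the signature field is always present.
function signEntry(entry, privateKey) {
  const payload = JSON.stringify(entry);
  if (!privateKey) {
    const hash = nodeCrypto.createHash("sha256").update(payload).digest("hex");
    return { ...entry, signature: `UNSIGNED:${hash}` };
  }
  // Ed25519 takes no separate digest algorithm, hence the null argument.
  const sig = nodeCrypto.sign(null, Buffer.from(payload), privateKey);
  return { ...entry, signature: `ed25519:${sig.toString("base64")}` };
}

// Hypothetical verifier: strips the signature field and checks the rest.
function verifyEntry(signedEntry, publicKey) {
  const { signature, ...entry } = signedEntry;
  if (typeof signature !== "string" || !signature.startsWith("ed25519:")) {
    return false; // UNSIGNED: placeholders never verify
  }
  const sig = Buffer.from(signature.slice("ed25519:".length), "base64");
  return nodeCrypto.verify(null, Buffer.from(JSON.stringify(entry)), publicKey, sig);
}
```

A key pair for experimenting can be created with `nodeCrypto.generateKeyPairSync("ed25519")`. Note that this sketch depends on the entry's key order staying stable between signing and verification, which holds when the entry is written and re-read as JSON without reordering.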


Memory System

memory/patterns.json is a local learning file that grows with each PR review. It stores:

  • hot_paths — directories with the most frequent findings (top 10), e.g. src/auth, db/migrations
  • recurring_issues — issue patterns seen more than once, with counts and last-seen timestamps (top 20)
  • version — increments on every update so you can track drift

Before each review, the agent reads this file and injects the context into its task prompt:

"Recurring issues to watch: "sql injection in db.js" (seen 4x, HIGH); "hardcoded token" (seen 2x, CRITICAL)"

This means the agent gets progressively more focused on your codebase's specific weaknesses over time.
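
The prompt line shown above could be produced from memory/patterns.json along these lines. The entry shape (`pattern`, `count`, `severity`) and the function name `memoryContext` are assumptions for illustration. The README only documents that recurring issues carry counts and last-seen timestamps.

```javascript
// Hypothetical: builds the prompt context from the parsed patterns.json.
// Assumed entry shape: { pattern: string, count: number, severity: string }.
function memoryContext(patterns, limit = 5) {
  const issues = [...(patterns.recurring_issues ?? [])]
    .sort((a, b) => b.count - a.count) // most frequent first
    .slice(0, limit);
  if (issues.length === 0) return "";
  const parts = issues.map(
    (issue) => `"${issue.pattern}" (seen ${issue.count}x, ${issue.severity})`
  );
  return `Recurring issues to watch: ${parts.join("; ")}`;
}
```

Returning an empty string when nothing has been learned yet lets the caller skip the context line on a fresh repository or after a memory reset.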

To reset memory:

rm memory/patterns.json

To inspect it:

jq '.recurring_issues[:5]' memory/patterns.json

Human-in-the-Loop

GitClaw escalates automatically when a PR touches:

  • Authentication or session logic
  • Payment or billing code
  • Database migrations
  • Cryptographic primitives
  • Environment secrets or .env files

When escalated, the agent logs human_escalated: true in the audit entry and outputs a 🚨 line to the console (or triggers a Clawless notification if deployed). It never auto-merges — it only recommends.
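
A path-based pre-check for the categories above might look like the following sketch. The regular expressions and the names `ESCALATION_PATTERNS` and `escalationReasons` are hypothetical. In GitClaw the trigger is expressed as a natural-language rule in agent.yaml, so the real decision is likely made by the model rather than by fixed patterns.

```javascript
// Hypothetical file-path heuristics, one per escalation category.
const ESCALATION_PATTERNS = [
  { reason: "authentication or session logic", pattern: /(^|\/)(auth|session)/i },
  { reason: "payment or billing code", pattern: /(^|\/)(payments?|billing)/i },
  { reason: "database migration", pattern: /(^|\/)migrations?\//i },
  { reason: "cryptographic primitives", pattern: /(^|\/)crypto/i },
  { reason: "environment secrets", pattern: /(^|\/)\.env(\.|$)|secrets?/i },
];

// Returns the distinct reasons triggered by a list of changed file paths.
// An empty array means no escalation is needed on path grounds alone.
function escalationReasons(changedFiles) {
  const reasons = new Set();
  for (const file of changedFiles) {
    for (const { reason, pattern } of ESCALATION_PATTERNS) {
      if (pattern.test(file)) reasons.add(reason);
    }
  }
  return [...reasons];
}
```

A check like this only sees file names. A change that introduces auth logic inside an unrelated file would still depend on the review itself to flag it.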


Skills Reference

review-pr

The core skill. Fetches the PR diff via the github-pr tool, scans across five severity categories, and formats a structured Markdown comment. Runs on every review, immediately after narrate-diff.

justify-decision

Runs after review-pr for any HIGH or CRITICAL finding. Maps the finding to an authoritative source (OWASP Top 10, CVE database, NIST, CWE, ESLint docs) and appends a one-line citation, so each finding can be traced to a published standard.

audit-log

Runs once the review skills have finished; emit-log and observe-trace follow it. Appends a structured JSON entry to .gitagent/audit.jsonl. Never overwrites, always appends. The file is committed to the repo on each run so the trail is version-controlled.


Environment Variables

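| Variable | Required | Description |
|----------|----------|-------------|
| GITHUB_TOKEN | Yes* | GitHub PAT with repo scope (*optional for --dry-run on public repos) |
| ANTHROPIC_API_KEY | One provider key | Anthropic API key. Any one of ANTHROPIC_API_KEY, OPENAI_API_KEY, GROQ_API_KEY, NVIDIA_API_KEY, or GEMINI_API_KEY is sufficient |
| PR_NUMBER | Yes* | PR number to review (*or first CLI arg) |
| GITHUB_REPO | Yes* | owner/repo format (*or second CLI arg) |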

Troubleshooting

gitclaw: command not found / Cannot find package 'gitclaw'
Run npm install first. The gitclaw and clawless packages must be installed.

Error: GITHUB_TOKEN is not set
Copy .env.example to .env and fill in your token. Make sure it has the repo scope.

npm run validate fails
Check that agent.yaml references skill names that exactly match the name: field in each SKILL.md frontmatter.

Agent posts no comment on the PR
Verify your GITHUB_TOKEN has write access to the target repo. Tokens for forks won't have permission to post on the upstream repo by default.

Audit log not persisting between Clawless runs
Confirm the audit-trail volume is configured in clawless.config.js and that .gitagent is in the mountPath.


Built With

  • Claude Sonnet — core reasoning model
  • GitClaw — agent runtime and skill orchestration
  • Clawless — serverless deployment and webhook triggers

License

MIT
