Cadence

Detects AI-generated content in git repositories and websites.

Analyze suspicious commits via code patterns, velocity anomalies, and statistical markers. Scan websites for AI-generated text using pattern detection and optional OpenAI validation.

Status: Ready to use | Tests: 70+ passing | Go: 1.23.0

Quick Start

Install

Version information (version, commit hash, build time) is automatically injected during build from git tags.

Quick Start (all platforms)

git clone https://github.com/codemeapixel/cadence.git
cd cadence

make build

The Makefile automatically detects your OS and uses the appropriate build method (PowerShell on Windows, shell on Unix/Linux/macOS).

Alternative Methods

# Using scripts directly
./scripts/build.sh        # Linux/macOS
.\scripts\build.ps1       # Windows

# Direct Go (no version injection)
go build ./cmd/cadence

Version injection is automatic when using make build or the platform-specific scripts.

Analyze a Repository

# Generate default config
./cadence config > cadence.yml

# Scan a repo for AI-generated commits
./cadence analyze /path/to/repo -o report.txt --config cadence.yml

Output shows commits with unusual patterns, confidence scores, and reasons why each was flagged.

Usage

Analyze a Repository

# Generate default config
./cadence config > cadence.yml

# Scan a repo (auto-loads cadence.yml if in current directory)
./cadence analyze /path/to/repo -o report.txt

# Or specify config explicitly
./cadence analyze /path/to/repo -o report.txt --config cadence.yml

# With custom thresholds (overrides config)
./cadence analyze /path/to/repo \
  -o report.json \
  --suspicious-additions 500 \
  --max-additions-pm 100

# Analyze specific branch
./cadence analyze /path/to/repo \
  -o report.json \
  --branch main

# Exclude certain files (node_modules, lock files, etc)
./cadence analyze /path/to/repo \
  -o report.json \
  --exclude-files "*.min.js,package-lock.json"
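
The --exclude-files value is a comma-separated list of glob patterns. How Cadence matches them internally is not documented here, so the following is only a minimal sketch that assumes each pattern is tested against the file's base name with Go's standard path.Match. The function name isExcluded is hypothetical.

```go
package main

import (
	"path"
	"strings"
)

// isExcluded reports whether file matches any pattern in a comma-separated
// list such as "*.min.js,package-lock.json".
// Assumption: patterns are matched against the base name only.
func isExcluded(file, patternList string) bool {
	base := path.Base(file)
	for _, p := range strings.Split(patternList, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// A malformed pattern returns an error; it is treated as no match.
		if ok, err := path.Match(p, base); err == nil && ok {
			return true
		}
	}
	return false
}
```

Under this assumption, web/dist/app.min.js and package-lock.json are skipped, while cmd/cadence/main.go is still analyzed.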

Analyze Website Content for AI-Generated Text

# Detect AI-generated content on a website
./cadence web https://example.com

# Generate JSON report and save to file
./cadence web https://example.com --json --output report.json

# With AI expert analysis (requires CADENCE_AI_KEY)
./cadence web https://example.com --config cadence.yml --verbose

The cadence web command analyzes website content for common AI patterns:

Overused phrases: "in today's world", "furthermore", "in conclusion"
Generic language: "provide value", "various", "stakeholder", "utilize"
Excessive structure: Too many bullet points, numbered lists, perfect formatting
Perfect grammar: No contractions, no colloquialisms, suspiciously polished
Boilerplate text: "our mission", "award-winning", "industry-leading"
Repetitive patterns: Sentences starting with same words
Lack of nuance: Few specific examples, no citations, vague references
Over-explanation: Excessive transition phrases, explains obvious concepts
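
As a rough illustration of the "overused phrases" check, the sketch below counts case-insensitive occurrences of a phrase list and compares the total against a cutoff. The phrase list is copied from the examples above; the function names and the idea of a single integer threshold are assumptions, not Cadence's actual implementation.

```go
package main

import "strings"

// overusedPhrases is taken from the examples in this README; the real list
// used by Cadence may be longer or different.
var overusedPhrases = []string{"in today's world", "furthermore", "in conclusion"}

// countPhraseHits returns the per-phrase hit counts and the total number of
// hits in content, ignoring case.
func countPhraseHits(content string, phrases []string) (map[string]int, int) {
	lower := strings.ToLower(content)
	hits := make(map[string]int)
	total := 0
	for _, p := range phrases {
		if n := strings.Count(lower, strings.ToLower(p)); n > 0 {
			hits[p] = n
			total += n
		}
	}
	return hits, total
}

// isFlagged applies an assumed cutoff: flag when total hits reach threshold.
func isFlagged(total, threshold int) bool {
	return total >= threshold
}
```

A substring count like this is deliberately crude; it ignores word boundaries and text length, which a real detector would probably normalize for.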

Output Options:

Text format (default) - Human-readable with detailed pattern breakdown
JSON format (--json) - Machine-readable with full metadata
File output (--output <file>) - Save report to file instead of stdout

Report Features:

Confidence score (0-100%) with assessment
Detailed pattern breakdown with severity ratings
Specific examples of flagged content (up to 5 per pattern)
Context showing where patterns appear in the content
Content quality metrics (word count, headings, quality score)

Note: cadence.yml in the current directory is automatically loaded if no --config flag is specified.

Output Example (Text)

SUSPICIOUS COMMITS
Found 1 suspicious commit(s):

[1] Commit: a1b2c3d4
    Author:     John Doe <john@example.com>
    Date:       2024-01-27T10:30:00Z
    Confidence: 66.7%
    Additions:  1500 lines / 2000 total
    Deletions:  1200 lines / 1500 total
    Files:      45 files changed
    Time Delta: 0.50 minutes
    Velocity:   3000 additions/min | 2400 deletions/min
    
    Reasons:
    - Large commit: 1500 additions (threshold: 500)
    - Fast velocity: 3000 additions/min (threshold: 100)

Output Example (JSON)

{
  "suspicious_commits": [
    {
      "hash": "a1b2c3d4...",
      "author": "John Doe",
      "timestamp": "2024-01-27T10:30:00Z",
      "confidence_score": 0.667,
      "additions_filtered": 1500,
      "deletions_filtered": 1200,
      "addition_velocity_per_min": 3000.0,
      "reasons": [
        "Large commit: 1500 additions (threshold: 500)",
        "Fast velocity: 3000 additions/min (threshold: 100)"
      ]
    }
  ]
}
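
For scripts or CI steps that consume the JSON report, a small decoder is usually enough. The sketch below uses only the field names visible in the sample above; the Go type and function names are hypothetical, and the real report may carry more fields than shown.

```go
package main

import "encoding/json"

// SuspiciousCommit maps the fields shown in the sample JSON report.
type SuspiciousCommit struct {
	Hash             string   `json:"hash"`
	Author           string   `json:"author"`
	Timestamp        string   `json:"timestamp"`
	ConfidenceScore  float64  `json:"confidence_score"`
	Additions        int      `json:"additions_filtered"`
	Deletions        int      `json:"deletions_filtered"`
	AdditionVelocity float64  `json:"addition_velocity_per_min"`
	Reasons          []string `json:"reasons"`
}

// Report is the top-level object of the JSON report.
type Report struct {
	SuspiciousCommits []SuspiciousCommit `json:"suspicious_commits"`
}

// parseReport decodes raw JSON bytes into a Report.
func parseReport(data []byte) (Report, error) {
	var rep Report
	err := json.Unmarshal(data, &rep)
	return rep, err
}

// countAtOrAbove counts commits whose confidence is at least minConfidence.
func countAtOrAbove(rep Report, minConfidence float64) int {
	n := 0
	for _, c := range rep.SuspiciousCommits {
		if c.ConfidenceScore >= minConfidence {
			n++
		}
	}
	return n
}
```

In a pipeline, one option is to read report.json with os.ReadFile, pass the bytes to parseReport, and exit with a non-zero status when countAtOrAbove returns more than zero for a cutoff you choose.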

Detection Strategies

Cadence flags commits that are suspicious based on:

Strategy	What it looks for	Indicator
Velocity	Abnormally fast coding	>100 additions/min
Size	Huge commits	>500 additions
Timing	Rapid-fire commits	<60 sec apart
Additions Only	No deletions, all adds	>90% additions
Merge Pattern	Unusual merge behavior	Context-dependent

Confidence Score: Increases with each triggered strategy. Multiple signals = higher confidence.
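
To make the table concrete, here is a minimal sketch of three of the checks (size, velocity, additions only) using the documented default thresholds. The type names, the wording of the "Additions only" reason, and the confidence formula (triggered checks divided by checks run) are assumptions for illustration; Cadence's actual scoring may differ.

```go
package main

import "fmt"

// CommitStats holds the per-commit numbers the checks need (hypothetical type).
type CommitStats struct {
	Additions        int
	Deletions        int
	MinutesSincePrev float64 // time delta to the previous commit
}

// Thresholds mirrors the documented defaults; field names are illustrative.
type Thresholds struct {
	SuspiciousAdditions int     // size: more than 500 additions
	MaxAdditionsPerMin  float64 // velocity: more than 100 additions/min
	AdditionsOnlyRatio  float64 // additions only: more than 90% additions
}

// evaluate runs three checks and returns the reasons plus an assumed
// confidence: number of triggered checks divided by number of checks run.
func evaluate(c CommitStats, t Thresholds) ([]string, float64) {
	const checks = 3
	var reasons []string

	if c.Additions > t.SuspiciousAdditions {
		reasons = append(reasons, fmt.Sprintf(
			"Large commit: %d additions (threshold: %d)", c.Additions, t.SuspiciousAdditions))
	}
	if c.MinutesSincePrev > 0 {
		velocity := float64(c.Additions) / c.MinutesSincePrev
		if velocity > t.MaxAdditionsPerMin {
			reasons = append(reasons, fmt.Sprintf(
				"Fast velocity: %.0f additions/min (threshold: %.0f)", velocity, t.MaxAdditionsPerMin))
		}
	}
	if total := c.Additions + c.Deletions; total > 0 {
		ratio := float64(c.Additions) / float64(total)
		if ratio > t.AdditionsOnlyRatio {
			// Reason text below is invented for this sketch.
			reasons = append(reasons, fmt.Sprintf(
				"Additions only: %.0f%% of changed lines are additions", ratio*100))
		}
	}
	return reasons, float64(len(reasons)) / checks
}
```

With the sample commit from the output example (1500 additions, 1200 deletions, 0.5 minutes since the previous commit), the velocity works out to 1500 / 0.5 = 3000 additions/min, so the size and velocity checks trigger and the additions-only check does not.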

AI-Powered Analysis (Optional)

Cadence can leverage OpenAI's GPT models to analyze flagged commits for additional AI-generation indicators. This is optional and requires an OpenAI API key.

Why Use AI Analysis?

Second opinion: AI provides independent assessment of suspicious commits
Token efficient: Only analyzes already-flagged commits (not all commits)
Lightweight: Uses GPT-4o mini for cost efficiency
Complementary: Works alongside statistical detection, not instead of it

Setup

Get an OpenAI API key from https://platform.openai.com/api-keys
Enable in config or environment:

# Via config file (cadence.yaml)
ai:
  enabled: true
  provider: "openai"
  api_key: "sk-..."  # or use env var below
  model: "gpt-4o-mini"

# OR via environment variable
export CADENCE_AI_KEY="sk-..."

Run analysis as normal - AI kicks in automatically for suspicious commits

Output

AI analysis appears in both text and JSON reports:

Text Report:

    AI Analysis:     likely AI-generated

JSON Report:

"ai_analysis": "likely AI-generated"

Cost Estimation

Average suspicious commit: ~200 tokens
GPT-4o mini: ~$0.00015 per 1K input tokens
Cost per analysis: ~$0.00003 (3 cents per 1000 commits)
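
The arithmetic behind these figures is tokens per commit, divided by 1000, times the price per 1K tokens. A small helper makes it easy to re-run the estimate with your own numbers; the function name is hypothetical and the inputs are the approximate values quoted above, not live pricing.

```go
package main

// estimateCostUSD returns the approximate cost in USD of analyzing the given
// number of flagged commits, based on average tokens per commit and the
// price per 1K tokens.
func estimateCostUSD(commits int, tokensPerCommit, pricePer1K float64) float64 {
	totalTokens := float64(commits) * tokensPerCommit
	return totalTokens / 1000 * pricePer1K
}
```

For 1000 flagged commits at about 200 tokens each and $0.00015 per 1K tokens, this gives 200,000 tokens, or roughly $0.03.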

Configuration

Config File (YAML)

Create a cadence.yaml:

thresholds:
  # Commit size limits
  suspicious_additions: 500      # additions per commit
  suspicious_deletions: 1000     # deletions per commit
  
  # Velocity limits
  max_additions_per_min: 100     # additions per minute
  max_deletions_per_min: 500     # deletions per minute
  
  # Timing
  min_time_delta_seconds: 60     # seconds between commits

# Files to ignore
exclude_files:
  - "*.min.js"
  - "package-lock.json"
  - "yarn.lock"

Command Line Flags

./cadence analyze <repo> [flags]

Flags:
  -o, --output string              Output file (required) - .txt or .json
  --suspicious-additions int       Flag commits >N additions (default: 500)
  --suspicious-deletions int       Flag commits >N deletions (default: 1000)
  --max-additions-pm float         Max additions per minute (default: 100)
  --max-deletions-pm float         Max deletions per minute (default: 500)
  --min-time-delta int             Min seconds between commits (default: 60)
  --branch string                  Branch to analyze (default: all)
  --exclude-files strings          File patterns to exclude
  --config string                  Config file path

Environment Variables

# Set webhook server config
export CADENCE_WEBHOOK_PORT=3000
export CADENCE_WEBHOOK_SECRET="your-secret-key"
export CADENCE_WEBHOOK_MAX_WORKERS=4

Webhook Server

Start the Server

./cadence webhook --port 3000 --secret "webhook-secret-key"

Configure GitHub Webhook

Repository Settings → Webhooks → Add webhook
Payload URL: https://your-server:3000/webhooks/github
Content type: application/json
Secret: Use same value as --secret flag
Events: Select "Push events"
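
When a secret is configured, GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header as "sha256=" followed by the hex digest. The sketch below shows how a receiver can check that header. It illustrates the GitHub scheme in general; whether Cadence's handler is written this way is an assumption, and the function names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// signPayload computes the value GitHub would send in X-Hub-Signature-256
// for the given secret and raw request body.
func signPayload(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether header matches the expected signature.
// hmac.Equal compares in constant time to avoid leaking timing information.
func verifySignature(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, "sha256=") {
		return false
	}
	expected := signPayload(secret, body)
	return hmac.Equal([]byte(expected), []byte(header))
}
```

The signature must be computed over the exact bytes received, before any JSON decoding, otherwise re-serialization differences will make valid deliveries fail the check.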

API Endpoints

Receive webhook push event

POST /webhooks/github
POST /webhooks/gitlab

Returns:

{
  "job_id": "uuid",
  "status": "processing"
}

Check job status

GET /jobs/:id

Returns:

{
  "id": "job-uuid",
  "status": "completed|processing|pending|failed",
  "repo": "repo-name",
  "branch": "main",
  "timestamp": "2024-01-27T10:30:00Z",
  "result": {
    "suspicious_commits": [...]
  }
}

List recent jobs

GET /jobs?limit=50

Health check

GET /health

How It Works

GitHub sends push webhook → HTTP POST to /webhooks/github
Cadence returns immediately with a job ID
Analysis happens in background (non-blocking)
Poll /jobs/:id to check progress
Results available when status is completed
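
The polling step can be sketched as a small client. The Job fields and the status values come from the /jobs/:id response shown earlier; the function names, the retry limit, and the interval are assumptions for illustration. The fetch step is passed in as a function so the loop can be exercised without a running server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// Job holds the subset of the /jobs/:id response needed for polling.
type Job struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

// fetchJob requests baseURL/jobs/<id> and decodes the response.
func fetchJob(client *http.Client, baseURL, id string) (Job, error) {
	var job Job
	resp, err := client.Get(baseURL + "/jobs/" + id)
	if err != nil {
		return job, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return job, fmt.Errorf("unexpected status %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&job)
	return job, err
}

// waitForJob calls fetch until the job is completed or failed, sleeping for
// interval between attempts, and gives up after maxAttempts.
func waitForJob(fetch func() (Job, error), maxAttempts int, interval time.Duration) (Job, error) {
	for i := 0; i < maxAttempts; i++ {
		job, err := fetch()
		if err != nil {
			return job, err
		}
		if job.Status == "completed" || job.Status == "failed" {
			return job, nil
		}
		time.Sleep(interval)
	}
	return Job{}, errors.New("job did not finish in time")
}
```

A caller would wrap fetchJob in a closure, for example func() (Job, error) { return fetchJob(http.DefaultClient, "http://localhost:3000", jobID) }, and pass it to waitForJob with an interval of a few seconds.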

Common Questions

Q: Can I use this in CI/CD?
A: Yes. Run cadence analyze in your pipeline, parse the JSON output, and fail the build if suspicious commits are found.

Q: How accurate is it?
A: Depends on your thresholds. Aggressive settings catch more but have more false positives. Start with defaults and tune.

Q: What about non-AI code that looks suspicious?
A: The confidence score helps - legitimate fast commits might trigger one strategy but not multiple. Check the reasons.

Q: Does it work with GitHub/GitLab Enterprise?
A: Webhooks work with any Git host. Self-hosted instances need network access to your Cadence server.

Q: Can I extend it?
A: Yes. Detection strategies are pluggable interfaces in internal/detector/. Add custom logic easily.

Development

Build

# Using Makefile (all platforms)
make build

# Or direct Go (all platforms, no version injection)
go build ./cmd/cadence

# make build and the build scripts inject version info from git tags via go:generate

Available Make Targets

make build    # Build binary with version injection
make install  # Install to $GOPATH/bin
make test     # Run all tests
make cover    # Run tests with coverage
make fmt      # Format code
make tidy     # Tidy dependencies  
make lint     # Run linter
make vet      # Run go vet
make run      # Run application
make clean    # Clean build artifacts
make help     # Show all targets

Run Tests

go test ./...
go test -cover ./...  # With coverage

Project Structure

cmd/cadence/          - CLI commands (analyze, web, webhook, config)
internal/
  analyzer/           - Repository analyzer orchestrator
  detector/           - Detection strategies
  git/                - Git operations
  metrics/            - Statistics and velocity calculations
  reporter/           - Output formatting (text, JSON)
  config/             - Configuration loading
  webhook/            - Webhook server (GitHub, GitLab)
  web/                - Website content fetching and analysis
    patterns/         - Web pattern detection strategies
  errors/             - Error types
test/                 - Integration tests

Adding Custom Detection Strategies

For Git Commit Analysis:

Create a new strategy in internal/detector/:

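package detector

// Import paths are inferred from the project layout.
import (
    "github.com/codemeapixel/cadence/internal/git"
    "github.com/codemeapixel/cadence/internal/metrics"
)

// CustomStrategy is an example detection strategy.
type CustomStrategy struct{}

// Name returns the identifier reported for this strategy.
func (s *CustomStrategy) Name() string {
    return "custom_detection"
}

// Detect returns true and a reason when isCustomSuspicious (your own check) flags the pair.
func (s *CustomStrategy) Detect(pair *git.CommitPair, stats *metrics.RepositoryStats) (bool, string) {
    if isCustomSuspicious(pair) {
        return true, "Your reason here"
    }
    return false, ""
}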

Register it in internal/detector/detector.go and it will automatically be used.

For Web Content Analysis:

Create a custom pattern strategy:

import "github.com/codemeapixel/cadence/internal/web/patterns"

// Create custom strategy
customStrategy := patterns.NewCustomPatternStrategy(
    "marketing_speak",
    []string{"synergy", "innovative", "disruptive"},
    2, // threshold
)

// Register with analyzer
analyzer := patterns.NewTextSlopAnalyzer()
analyzer.RegisterStrategy(customStrategy)

Or implement the WebPatternStrategy interface for more complex logic:

type MyCustomStrategy struct{}

func (s *MyCustomStrategy) Name() string {
    return "my_custom_pattern"
}

func (s *MyCustomStrategy) Detect(content string, wordCount int) *patterns.DetectionResult {
    // Your detection logic here: set detected from your own check.
    detected := false
    if detected {
        return &patterns.DetectionResult{
            Detected:    true,
            Type:        s.Name(),
            Severity:    0.8,
            Description: "Custom pattern detected",
            Examples:    []string{"example1", "example2"},
        }
    }
    return nil
}
