Detects AI-generated content in git repositories and websites.
Analyze suspicious commits via code patterns, velocity anomalies, and statistical markers. Scan websites for AI-generated text using pattern detection and optional OpenAI validation.
Status: Ready to use | Tests: 70+ passing | Go: 1.23.0
Version information (version, commit hash, build time) is automatically injected during build from git tags.
Quick Start (all platforms)
git clone https://github.com/codemeapixel/cadence.git
cd cadence
make build
The Makefile automatically detects your OS and uses the appropriate build method (PowerShell on Windows, shell on Unix/Linux/macOS).
Alternative Methods
# Using scripts directly
./scripts/build.sh # Linux/macOS
.\scripts\build.ps1 # Windows
# Direct Go (no version injection)
go build ./cmd/cadence
Version injection is automatic when using make build or the platform-specific scripts.
# Generate default config
./cadence config > cadence.yaml
# Scan a repo for AI-generated commits
./cadence analyze /path/to/repo -o report.txt --config cadence.yaml
Output shows commits with unusual patterns, confidence scores, and the reasons each was flagged.
# Generate default config
./cadence config > cadence.yml
# Scan a repo (auto-loads cadence.yml if in current directory)
./cadence analyze /path/to/repo -o report.txt
# Or specify config explicitly
./cadence analyze /path/to/repo -o report.txt --config cadence.yml
# With custom thresholds (overrides config)
./cadence analyze /path/to/repo \
-o report.json \
--suspicious-additions 500 \
--max-additions-pm 100
# Analyze specific branch
./cadence analyze /path/to/repo \
-o report.json \
--branch main
# Exclude certain files (node_modules, lock files, etc)
./cadence analyze /path/to/repo \
-o report.json \
--exclude-files "*.min.js,package-lock.json"
# Detect AI-generated content on a website
./cadence web https://example.com
# Generate JSON report and save to file
./cadence web https://example.com --json --output report.json
# With AI expert analysis (requires CADENCE_AI_KEY)
./cadence web https://example.com --config cadence.yml --verbose
The cadence web command analyzes website content for common AI patterns:
- Overused phrases: "in today's world", "furthermore", "in conclusion"
- Generic language: "provide value", "various", "stakeholder", "utilize"
- Excessive structure: Too many bullet points, numbered lists, perfect formatting
- Perfect grammar: No contractions, no colloquialisms, suspiciously polished
- Boilerplate text: "our mission", "award-winning", "industry-leading"
- Repetitive patterns: Sentences starting with same words
- Lack of nuance: Few specific examples, no citations, vague references
- Over-explanation: Excessive transition phrases, explains obvious concepts
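The phrase-based checks in the list above can be pictured as simple case-insensitive counting. The following is a minimal sketch under that assumption, not Cadence's actual implementation; the function name countPhrases is hypothetical, and the phrases are taken from the "Overused phrases" bullet.

```go
package main

import (
	"fmt"
	"strings"
)

// countPhrases returns how many times each phrase occurs in text,
// ignoring case. Phrases that never occur are omitted from the result.
func countPhrases(text string, phrases []string) map[string]int {
	lower := strings.ToLower(text)
	counts := make(map[string]int)
	for _, p := range phrases {
		if n := strings.Count(lower, strings.ToLower(p)); n > 0 {
			counts[p] = n
		}
	}
	return counts
}

func main() {
	// Phrases from the "Overused phrases" bullet; the text is made up.
	phrases := []string{"in today's world", "furthermore", "in conclusion"}
	text := "In today's world, data matters. Furthermore, it grows. Furthermore, it never sleeps."
	for phrase, n := range countPhrases(text, phrases) {
		fmt.Printf("%q: %d\n", phrase, n)
	}
}
```

A real detector would probably normalize counts by word count, so that long pages are not penalized for a handful of matches.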
Output Options:
- Text format (default) - Human-readable with detailed pattern breakdown
- JSON format (--json) - Machine-readable with full metadata
- File output (--output <file>) - Save report to file instead of stdout
Report Features:
- Confidence score (0-100%) with assessment
- Detailed pattern breakdown with severity ratings
- Specific examples of flagged content (up to 5 per pattern)
- Context showing where patterns appear in the content
- Content quality metrics (word count, headings, quality score)
Note: cadence.yml in the current directory is automatically loaded if no --config flag is specified.
SUSPICIOUS COMMITS
Found 1 suspicious commit(s):
[1] Commit: a1b2c3d4
Author: John Doe <john@example.com>
Date: 2024-01-27T10:30:00Z
Confidence: 66.7%
Additions: 1500 lines / 2000 total
Deletions: 1200 lines / 1500 total
Files: 45 files changed
Time Delta: 0.50 minutes
Velocity: 3000 additions/min | 2400 deletions/min
Reasons:
- Large commit: 1500 additions (threshold: 500)
- Fast velocity: 3000 additions/min (threshold: 100)
{
"suspicious_commits": [
{
"hash": "a1b2c3d4...",
"author": "John Doe",
"timestamp": "2024-01-27T10:30:00Z",
"confidence_score": 0.667,
"additions_filtered": 1500,
"deletions_filtered": 1200,
"addition_velocity_per_min": 3000.0,
"reasons": [
"Large commit: 1500 additions (threshold: 500)",
"Fast velocity: 3000 additions/min (threshold: 100)"
]
}
]
}
Cadence flags commits that are suspicious based on:
| Strategy | What it looks for | Indicator |
|---|---|---|
| Velocity | Abnormally fast coding | >100 additions/min |
| Size | Huge commits | >500 additions |
| Timing | Rapid-fire commits | <60 sec apart |
| Additions Only | No deletions, all adds | >90% additions |
| Merge Pattern | Unusual merge behavior | Context-dependent |
Confidence Score: Increases with each triggered strategy. Multiple signals = higher confidence.
Cadence can leverage OpenAI's GPT models to analyze flagged commits for additional AI-generation indicators. This is optional and requires an OpenAI API key.
- Second opinion: AI provides independent assessment of suspicious commits
- Token efficient: Only analyzes already-flagged commits (not all commits)
- Lightweight: Uses GPT-4 Mini for cost efficiency
- Complementary: Works alongside statistical detection, not instead of it
- Get an OpenAI API key from https://platform.openai.com/api-keys
- Enable in config or environment:
# Via config file (cadence.yaml)
ai:
enabled: true
provider: "openai"
api_key: "sk-..." # or use env var below
model: "gpt-4-mini"
# OR via environment variable
export CADENCE_AI_KEY="sk-..."
- Run analysis as normal - AI kicks in automatically for suspicious commits
AI analysis appears in both text and JSON reports:
Text Report:
AI Analysis: likely AI-generated
JSON Report:
"ai_analysis": "likely AI-generated"
- Average suspicious commit: ~200 tokens
- GPT-4 Mini: ~$0.00015 per 1K tokens
- Cost per analysis: ~$0.00003 (3 cents per 1000 commits)
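The cost figures above follow from a single multiplication. This sketch reproduces the arithmetic using the approximate numbers quoted in the list; the function name analysisCost is hypothetical, and actual pricing may differ.

```go
package main

import "fmt"

// analysisCost estimates the dollar cost of analyzing a number of flagged
// commits, given the tokens used per commit and the price per 1K tokens.
func analysisCost(commits int, tokensPerCommit, pricePer1K float64) float64 {
	return float64(commits) * tokensPerCommit / 1000 * pricePer1K
}

func main() {
	// ~200 tokens per commit at ~$0.00015 per 1K tokens, as quoted above.
	fmt.Printf("per commit: $%.5f\n", analysisCost(1, 200, 0.00015))          // per commit: $0.00003
	fmt.Printf("per 1000 commits: $%.2f\n", analysisCost(1000, 200, 0.00015)) // per 1000 commits: $0.03
}
```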
Create a cadence.yaml:
thresholds:
# Commit size limits
suspicious_additions: 500 # additions per commit
suspicious_deletions: 1000 # deletions per commit
# Velocity limits
max_additions_per_min: 100 # additions per minute
max_deletions_per_min: 500 # deletions per minute
# Timing
min_time_delta_seconds: 60 # seconds between commits
# Files to ignore
exclude_files:
- "*.min.js"
- "package-lock.json"
- "yarn.lock"
./cadence analyze <repo> [flags]
Flags:
-o, --output string Output file (required) - .txt or .json
--suspicious-additions int Flag commits >N additions (default: 500)
--suspicious-deletions int Flag commits >N deletions (default: 1000)
--max-additions-pm float Max additions per minute (default: 100)
--max-deletions-pm float Max deletions per minute (default: 500)
--min-time-delta int Min seconds between commits (default: 60)
--branch string Branch to analyze (default: all)
--exclude-files strings File patterns to exclude
--config string Config file path
# Set webhook server config
export CADENCE_WEBHOOK_PORT=3000
export CADENCE_WEBHOOK_SECRET="your-secret-key"
export CADENCE_WEBHOOK_MAX_WORKERS=4
./cadence webhook --port 3000 --secret "webhook-secret-key"
- Repository Settings → Webhooks → Add webhook
- Payload URL: https://your-server:3000/webhooks/github
- Content type: application/json
- Secret: Use same value as --secret flag
- Events: Select "Push events"
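The shared secret matters because GitHub signs each delivery with it: the X-Hub-Signature-256 header carries "sha256=" followed by the hex-encoded HMAC-SHA256 of the request body. The sketch below shows how a receiver could verify that header. Whether Cadence's webhook server checks it in exactly this way is an assumption, and the function names sign and validSignature are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// sign computes the value GitHub would place in X-Hub-Signature-256
// for the given secret and request body.
func sign(secret string, body []byte) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// validSignature reports whether header matches the HMAC of body under
// secret. hmac.Equal compares in constant time to avoid timing leaks.
func validSignature(secret string, body []byte, header string) bool {
	if !strings.HasPrefix(header, "sha256=") {
		return false
	}
	return hmac.Equal([]byte(sign(secret, body)), []byte(header))
}

func main() {
	body := []byte(`{"ref":"refs/heads/main"}`)
	header := sign("webhook-secret-key", body)
	fmt.Println(validSignature("webhook-secret-key", body, header)) // true
	fmt.Println(validSignature("wrong-secret", body, header))       // false
}
```

The signature must be computed over the raw request bytes, before any JSON parsing, or it will not match.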
POST /webhooks/github
POST /webhooks/gitlab
Returns:
{
"job_id": "uuid",
"status": "processing"
}
GET /jobs/:id
Returns:
{
"id": "job-uuid",
"status": "completed|processing|pending|failed",
"repo": "repo-name",
"branch": "main",
"timestamp": "2024-01-27T10:30:00Z",
"result": {
"suspicious_commits": [...]
}
}
GET /jobs?limit=50
GET /health
- GitHub sends push webhook → HTTP POST to /webhooks/github
- Cadence returns immediately with a job ID
- Analysis happens in background (non-blocking)
- Poll /jobs/:id to check progress
- Results available when status is completed
Q: Can I use this in CI/CD?
A: Yes. Run cadence analyze in your pipeline, parse the JSON output, and fail the build if any suspicious commits are found.
Q: How accurate is it?
A: Depends on your thresholds. Aggressive settings catch more but have more false positives. Start with defaults and tune.
Q: What about non-AI code that looks suspicious?
A: The confidence score helps - legitimate fast commits might trigger one strategy but not multiple. Check the reasons.
Q: Does it work with GitHub/GitLab Enterprise?
A: Webhooks work with any Git host. Self-hosted instances need network access to your Cadence server.
Q: Can I extend it?
A: Yes. Detection strategies are pluggable interfaces in internal/detector/. Add custom logic easily.
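For the CI/CD question above, a pipeline step could parse the JSON report and fail the build through its exit code. This is a minimal sketch, assuming the report has the suspicious_commits and confidence_score fields shown in the sample JSON. The helper name countFlagged, the 0.5 cutoff, and the exit codes are illustrative choices, not part of Cadence.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// report covers the subset of the JSON report this check needs.
type report struct {
	SuspiciousCommits []struct {
		Hash            string   `json:"hash"`
		ConfidenceScore float64  `json:"confidence_score"`
		Reasons         []string `json:"reasons"`
	} `json:"suspicious_commits"`
}

// countFlagged parses a report and returns how many commits have a
// confidence score of at least minConfidence.
func countFlagged(data []byte, minConfidence float64) (int, error) {
	var r report
	if err := json.Unmarshal(data, &r); err != nil {
		return 0, err
	}
	n := 0
	for _, c := range r.SuspiciousCommits {
		if c.ConfidenceScore >= minConfidence {
			n++
		}
	}
	return n, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkreport <report.json>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read report:", err)
		os.Exit(2)
	}
	n, err := countFlagged(data, 0.5) // 0.5 is an arbitrary example cutoff
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse report:", err)
		os.Exit(2)
	}
	if n > 0 {
		fmt.Printf("%d suspicious commit(s) found\n", n)
		os.Exit(1) // a non-zero exit fails the pipeline step
	}
}
```

Running it after ./cadence analyze /path/to/repo -o report.json, with report.json as the argument, would make the step fail whenever a commit meets the cutoff.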
# Using Makefile (Linux/macOS)
make build
# Or direct Go (all platforms)
go build ./cmd/cadence
# Version info is automatically injected from git tags via go:generate
make build # Build binary with version injection
make install # Install to $GOPATH/bin
make test # Run all tests
make cover # Run tests with coverage
make fmt # Format code
make tidy # Tidy dependencies
make lint # Run linter
make vet # Run go vet
make run # Run application
make clean # Clean build artifacts
make help # Show all targets
go test ./...
go test -cover ./... # With coverage
cmd/cadence/ - CLI commands (analyze, webhook, config)
internal/
analyzer/ - Repository analyzer orchestrator
detector/ - Detection strategies
git/ - Git operations
metrics/ - Statistics and velocity calculations
reporter/ - Output formatting (text, JSON)
config/ - Configuration loading
webhook/ - Webhook server (GitHub, GitLab)
web/ - Website content fetching and analysis
patterns/ - Web pattern detection strategies
errors/ - Error types
test/ - Integration tests
For Git Commit Analysis:
Create a new strategy in internal/detector/:
type CustomStrategy struct{}
func (s *CustomStrategy) Name() string {
return "custom_detection"
}
func (s *CustomStrategy) Detect(pair *git.CommitPair, stats *metrics.RepositoryStats) (bool, string) {
if isCustomSuspicious(pair) {
return true, "Your reason here"
}
return false, ""
}
Register it in internal/detector/detector.go and it will automatically be used.
For Web Content Analysis:
Create a custom pattern strategy:
import "github.com/codemeapixel/cadence/internal/web/patterns"
// Create custom strategy
customStrategy := patterns.NewCustomPatternStrategy(
"marketing_speak",
[]string{"synergy", "innovative", "disruptive"},
2, // threshold
)
// Register with analyzer
analyzer := patterns.NewTextSlopAnalyzer()
analyzer.RegisterStrategy(customStrategy)
Or implement the WebPatternStrategy interface for more complex logic:
type MyCustomStrategy struct{}
func (s *MyCustomStrategy) Name() string {
return "my_custom_pattern"
}
func (s *MyCustomStrategy) Detect(content string, wordCount int) *patterns.DetectionResult {
// Your detection logic here; this placeholder never fires
detected := false
if detected {
return &patterns.DetectionResult{
Detected: true,
Type: s.Name(),
Severity: 0.8,
Description: "Custom pattern detected",
Examples: []string{"example1", "example2"},
}
}
return nil
}