ai-secrets-scan

The secret scanner built for the AI era.

Detect exposed API keys, tokens, and credentials across AI projects, MCP configurations, and LLM pipelines -- with 52 purpose-built patterns, entropy analysis, and zero dependencies.

Why This Exists

The AI tooling ecosystem has a credential sprawl problem, and traditional secret scanners were not designed for it.

81.5% year-over-year increase in secret exposure across public repositories (GitGuardian 2024 State of Secrets Sprawl)
24,000+ exposed AI API keys discovered in public codebases in a single year -- each one a direct path to billing abuse, data exfiltration, or model poisoning
36.7% of MCP server configurations are vulnerable to SSRF attacks, often through credentials embedded directly in config files
New key formats appear every month (OpenRouter, Groq, Fireworks, LangSmith, Perplexity, DeepSeek), and traditional scanners can lag behind by quarters

ai-secrets-scan was built to close this gap: a focused, dependency-free scanner that understands the AI ecosystem natively.

Quick Start

pip install ai-secrets-scan

ai-secrets-scan ./my-project

That's it. No configuration is required: the scanner walks your project, detects secrets, and reports each finding with a severity level. Add --fix for remediation suggestions.

Features

Detection

52 AI-specific secret patterns covering 20+ providers (OpenAI, Anthropic, Google, Groq, Mistral, and more)
Entropy-based detection using Shannon entropy analysis to catch novel or unknown key formats
Context-aware matching to reduce false positives -- low-specificity patterns only fire near AI/LLM-related code
MCP config scanning across Claude Desktop, Cursor, and other MCP client locations
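
The entropy-based detection mentioned above can be illustrated with a minimal sketch of Shannon entropy over characters. This is the generic textbook formulation, not necessarily how the package's own shannon_entropy is implemented; the looks_random helper, the 3.5 bits/char threshold, and the minimum length of 20 are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(token: str, threshold: float = 3.5, min_length: int = 20) -> bool:
    """Hypothetical helper: flag long tokens whose entropy exceeds a threshold."""
    return len(token) >= min_length and shannon_entropy(token) > threshold
```

Repetitive strings score near zero and random-looking tokens score high, which is why an entropy check can catch key formats that no explicit pattern covers yet.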

Workflow Integration

Pre-commit hook -- one command to block secrets before they reach your repository
GitHub Actions workflow generation with SARIF upload to the Security tab
GitLab CI config generation for merge request scanning
Baseline/allowlist management for incremental adoption in existing projects

Output

SARIF v2.1.0 output for GitHub Code Scanning, Azure DevOps, and VS Code
JSON output for programmatic consumption and CI pipelines
Color-coded terminal output with severity indicators and redacted previews

Design

Zero dependencies -- pure Python standard library, installs in seconds
Local-first -- your code never leaves your machine
Cross-platform -- Windows, macOS, Linux
.gitignore-aware -- respects your existing ignore rules automatically

Supported Providers

AI / LLM Providers

Provider	Patterns	Key Format
OpenAI	3	`sk-...`, `sk-proj-...`, `org-...`
Anthropic	1	`sk-ant-...`
Google AI / Vertex	3	`AIza...`, service account JSON, OAuth secrets
Mistral AI	1	`MISTRAL_API_KEY=...`
Groq	1	`gsk_...`
Together AI	1	`TOGETHER_API_KEY=...`
Fireworks AI	1	`fw_...`
Perplexity	1	`pplx-...`
OpenRouter	1	`sk-or-v1-...`
DeepSeek	1	`DEEPSEEK_API_KEY=sk-...`
Stability AI	1	Context-aware `sk-...` detection
ElevenLabs	1	`ELEVENLABS_API_KEY=...`
Cohere	1	Context-aware 40-char token
HuggingFace	1	`hf_...`
Replicate	1	`r8_...`

Cloud (AI Services)

Provider	Patterns	Key Format
AWS (Bedrock, SageMaker)	3	`AKIA...`, secret key, session token
Azure OpenAI	1	Context-aware 32-char hex key

Vector Databases

Provider	Patterns	Key Format
Pinecone	2	`pcsk_...`, legacy UUID format
Weaviate	1	`WEAVIATE_API_KEY=...`
Qdrant	1	`QDRANT_API_KEY=...`
Supabase (pgvector)	2	JWT service role key, anon key

ML Observability & Experiment Tracking

Provider	Patterns	Key Format
Weights & Biases	1	`WANDB_API_KEY=...`
LangSmith	1	`lsv2_...`
LangChain	1	`ls__...`
Neptune.ai	1	`NEPTUNE_API_TOKEN=...`
Arize	1	`ARIZE_API_KEY=...`

Communication

Provider	Patterns	Key Format
Slack	3	`xoxb-...`, `xoxp-...`, webhook URLs
Discord	2	Bot tokens, webhook URLs
Telegram	1	Bot tokens (context-aware)

Source Control

Provider	Patterns	Key Format
GitHub	4	`ghp_...`, `github_pat_...`, `gho_...`, `ghu_/ghs_...`
GitLab	1	`glpat-...`

Payments

Provider	Patterns	Key Format
Stripe	2	`sk_live_...`, `pk_live_...`

Observability

Provider	Patterns	Key Format
Datadog	1	`DD_API_KEY=...`
Sentry	1	DSN URLs

Generic

Pattern	Severity
Generic API Key Assignment	Medium
Bearer Token	High
Base64 Encoded Secret	Medium
Database Connection String	Critical
Private Key (RSA/EC/DSA/OpenSSH)	Critical
JWT Token	Medium
Generic Password Assignment	High

Usage Examples

Basic Scan

$ ai-secrets-scan ./my-project

🔍 AI Secrets Scanner v0.2.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Scanning: ./my-project (23 files)

🔴 CRITICAL  config/settings.py:14
   OpenAI API Key: sk-R****...Qx

🟠 HIGH  .env:7
   HuggingFace Token: hf_a****...2f

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Results: 2 secrets found
  🔴 Critical: 1  🟠 High: 1

Files scanned: 23 | Time: 0.1s
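
The redacted previews in the output above appear to keep the first four and last two characters of each match. A minimal sketch of that masking follows, inferred only from the sample output; the function name redact and the handling of short strings are assumptions.

```python
def redact(secret: str, keep_start: int = 4, keep_end: int = 2) -> str:
    """Mask the middle of a secret so reports never contain the full value."""
    if len(secret) <= keep_start + keep_end:
        # Too short to reveal any part safely; mask everything.
        return "*" * len(secret)
    # secret[-0:] would return the whole string, so handle keep_end == 0 explicitly.
    tail = secret[-keep_end:] if keep_end else ""
    return f"{secret[:keep_start]}****...{tail}"


print(redact("sk-RabcdefghijklmnopQx"))
```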

Scan MCP Configurations

Scan Claude Desktop, Cursor, and other MCP client configs for embedded secrets:

ai-secrets-scan --mcp

Combine with a project scan:

ai-secrets-scan ./my-project --mcp
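
MCP client configs such as Claude Desktop's claude_desktop_config.json typically declare servers under an mcpServers key, each with an optional env map, and that map is where API keys tend to get pasted. The sketch below walks such a structure and lists env entries whose names suggest a credential. It illustrates the idea only and is not the scanner's implementation; the function name and the name-based heuristic are assumptions.

```python
import json
import re

# Assumed heuristic: env var names that usually hold credentials.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


def find_mcp_env_secrets(config_text: str) -> list[tuple[str, str]]:
    """Return (server_name, env_var_name) pairs that look like embedded secrets."""
    config = json.loads(config_text)
    hits = []
    for server, spec in config.get("mcpServers", {}).items():
        for name, value in spec.get("env", {}).items():
            # Skip non-strings, empty values, and ${VAR}-style references.
            if not isinstance(value, str) or not value or value.startswith("${"):
                continue
            if SECRET_NAME.search(name):
                hits.append((server, name))
    return hits


example = """
{
  "mcpServers": {
    "search": {
      "command": "npx",
      "args": ["-y", "example-search-server"],
      "env": {"SEARCH_API_KEY": "not-a-real-key", "LOG_LEVEL": "debug"}
    }
  }
}
"""
print(find_mcp_env_secrets(example))
```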

Severity Filtering

Only report high and critical findings:

ai-secrets-scan ./my-project --min-severity high
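
Severity filtering amounts to an ordered comparison. A small sketch, assuming the four levels named in the configuration file (critical, high, medium, low) and a hypothetical meets_threshold helper:

```python
# Assumed ordering of the four severity levels, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def meets_threshold(severity: str, min_severity: str) -> bool:
    """True if `severity` is at or above `min_severity`."""
    return SEVERITY_RANK[severity.lower()] >= SEVERITY_RANK[min_severity.lower()]


findings = [("config/settings.py", "critical"), (".env", "high"), ("notes.md", "medium")]
kept = [f for f in findings if meets_threshold(f[1], "high")]
print(kept)
```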

JSON Output

ai-secrets-scan ./my-project --format json

{
  "version": "0.2.0",
  "scan_path": "./my-project",
  "files_scanned": 23,
  "total_findings": 2,
  "findings": [
    {
      "file": "config/settings.py",
      "line": 14,
      "pattern_name": "OpenAI API Key",
      "severity": "critical",
      "provider": "openai",
      "matched_text": "sk-R****...Qx",
      "suggestion": "Use os.environ.get(\"OPENAI_API_KEY\") instead.",
      "source": "pattern"
    }
  ]
}
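
In a CI script, the JSON report can be consumed with the standard library. The sketch below relies only on field names shown in the sample above; the helper name and the choice to single out critical findings are assumptions.

```python
import json


def critical_findings(report_text: str) -> list[dict]:
    """Parse a JSON report and return only the findings marked critical."""
    report = json.loads(report_text)
    return [f for f in report.get("findings", []) if f.get("severity") == "critical"]


# Stand-in for the scanner's JSON output, trimmed to the fields used here.
sample = json.dumps({
    "version": "0.2.0",
    "files_scanned": 23,
    "total_findings": 2,
    "findings": [
        {"file": "config/settings.py", "line": 14, "severity": "critical"},
        {"file": ".env", "line": 7, "severity": "high"},
    ],
})

for finding in critical_findings(sample):
    print(f"{finding['file']}:{finding['line']}")
```

A pipeline step could exit non-zero when this list is non-empty to fail the build.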

SARIF Output

Generate SARIF for GitHub Security tab integration:

ai-secrets-scan ./my-project --format sarif > results.sarif
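
For orientation, a SARIF 2.1.0 log is a JSON document with a runs array, where each run carries a tool description and a list of results. The sketch below maps one finding, using the field names from the JSON output, onto that shape. The severity-to-level mapping and the rule ID scheme are assumptions, not the tool's actual output.

```python
import json

# Assumed mapping from scanner severities to SARIF result levels.
LEVELS = {"critical": "error", "high": "error", "medium": "warning", "low": "note"}


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts."""
    results = []
    for f in findings:
        results.append({
            # Hypothetical rule ID: the pattern name, lowercased and hyphenated.
            "ruleId": f["pattern_name"].lower().replace(" ", "-"),
            "level": LEVELS[f["severity"]],
            "message": {"text": f"{f['pattern_name']} detected"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["file"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "ai-secrets-scan"}}, "results": results}],
    }


log = to_sarif([{
    "file": "config/settings.py", "line": 14,
    "pattern_name": "OpenAI API Key", "severity": "critical",
}])
print(json.dumps(log, indent=2))
```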

Baseline Workflow

Adopt the scanner in an existing project without drowning in known findings:

# Step 1: Save current findings as a baseline
ai-secrets-scan ./my-project --baseline-save .secrets-baseline.json

# Step 2: Subsequent scans only report NEW secrets
ai-secrets-scan ./my-project --baseline .secrets-baseline.json
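
Conceptually, a baseline is a set of fingerprints for known findings, and later scans report only findings whose fingerprint is absent. A minimal sketch of that idea follows. The fingerprint recipe (file, pattern name, and redacted match, hashed with SHA-256) is an assumption, as is every name in the block; the package's own baseline format may differ.

```python
import hashlib


def fingerprint(finding: dict) -> str:
    """Stable ID for a finding; omits the line number so unrelated edits don't churn it."""
    key = "|".join([finding["file"], finding["pattern_name"], finding["matched_text"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def build_baseline(findings: list[dict]) -> set[str]:
    """Collect fingerprints of all currently known findings."""
    return {fingerprint(f) for f in findings}


def new_findings(findings: list[dict], baseline: set[str]) -> list[dict]:
    """Keep only findings whose fingerprint is not in the baseline."""
    return [f for f in findings if fingerprint(f) not in baseline]


known = {"file": "config/settings.py", "line": 14,
         "pattern_name": "OpenAI API Key", "matched_text": "sk-R****...Qx"}
fresh = {"file": ".env", "line": 7,
         "pattern_name": "HuggingFace Token", "matched_text": "hf_a****...2f"}

baseline = build_baseline([known])
moved = dict(known, line=20)  # same secret, shifted down by unrelated edits
print(new_findings([moved, fresh], baseline))
```

Leaving the line number out of the fingerprint is a design choice in this sketch: it keeps a known finding suppressed when code above it changes.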

Pre-commit Hook

Prevent secrets from ever being committed:

# Install directly into .git/hooks/
ai-secrets-scan hook --install

# Or generate config for the pre-commit framework
ai-secrets-scan hook --generate

The hook scans staged files and blocks the commit if secrets are detected.
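
A hook of this kind needs the list of staged files. One common way to obtain it is git diff --cached; the sketch below illustrates that approach and is not the hook the tool installs. Both function names are hypothetical.

```python
import subprocess


def parse_name_list(raw: str) -> list[str]:
    """Split NUL-separated `git diff -z` output into paths."""
    return [p for p in raw.split("\0") if p]


def staged_files() -> list[str]:
    """Paths that are added, copied, or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_name_list(out)
```

A hook script would scan each returned path and exit non-zero on any finding, which makes git abort the commit.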

CI Integration

Generate ready-to-use CI pipeline configurations:

# GitHub Actions (with SARIF upload)
ai-secrets-scan ci --github
ai-secrets-scan ci --github -o .github/workflows/secrets-scan.yml

# GitLab CI
ai-secrets-scan ci --gitlab
ai-secrets-scan ci --gitlab -o .gitlab-ci-secrets.yml

Fix Suggestions

Get actionable remediation advice for each finding:

ai-secrets-scan ./my-project --fix

File Type Filtering

Scan only specific file types:

ai-secrets-scan ./my-project --types env,python,yaml

Supported types: env, mcp, python, yaml, notebook, json, toml, config, docker, terraform

Comparison with Other Tools

Feature	ai-secrets-scan	GitGuardian	TruffleHog	detect-secrets	Gitleaks
AI/LLM-specific patterns	52	~10	~5	~5	~5
MCP config awareness	Yes	No	No	No	No
Entropy-based detection	Yes	Yes	Yes	Yes	Yes
Baseline/allowlist	Yes	Paid	No	Yes	Yes
Pre-commit hooks	Yes	Yes	Yes	Yes	Yes
SARIF output	Yes	Yes	Yes	No	Yes
GitHub Actions generation	Yes	N/A	N/A	N/A	N/A
Fix suggestions	Yes	Paid	No	No	No
Zero dependencies	Yes	No	No	No	N/A (Go)
Local-only (no SaaS)	Yes	No	Yes	Yes	Yes
Free & open source	Yes	Freemium	Yes	Yes	Yes

Configuration

Create a config file to standardize settings across your team:

ai-secrets-scan init

This creates .ai-secrets-scan.yml:

# AI Secrets Scanner configuration
---

# Minimum severity to report: critical, high, medium, low
min_severity: low

# Directories to exclude (added to defaults: node_modules, .git, __pycache__, venv)
exclude:
  - vendor
  - third_party

# File types to scan (omit to scan all supported types)
# file_types:
#   - env
#   - python
#   - mcp
#   - yaml
#   - notebook
#   - json

# Custom patterns (in addition to built-in patterns)
# custom_patterns:
#   - name: "Internal Service Token"
#     regex: "svc_[a-zA-Z0-9]{32}"
#     severity: critical
#     provider: internal

The scanner auto-detects this file in the project root. Override with --config path/to/config.yml.
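
Before committing a custom pattern, it can help to check the regex against a sample value. The snippet below exercises the svc_ example from the commented config using Python's re module. The sample tokens are made up, and the snippet says nothing about how the scanner itself applies the pattern.

```python
import re

# Regex copied from the commented custom_patterns example.
pattern = re.compile(r"svc_[a-zA-Z0-9]{32}")

# Made-up values: one with 32 characters after the prefix, one with only 8.
sample_line = 'SERVICE_TOKEN = "svc_' + "A1b2C3d4" * 4 + '"'
too_short = 'SERVICE_TOKEN = "svc_A1b2C3d4"'

match = pattern.search(sample_line)
print(match.group(0) if match else "no match")
print(pattern.search(too_short))  # None: fewer than 32 characters after the prefix
```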

API Usage

Use ai-secrets-scan as a Python library for custom integrations:

from ai_secrets_scan import SecretScanner, Reporter

# Initialize the scanner
scanner = SecretScanner(
    min_severity="medium",
    enable_entropy=True,
)

# Scan a directory
findings = scanner.scan_path("./my-project")

# Scan MCP configurations
mcp_findings = scanner.scan_mcp_configs()

# Process findings programmatically
for finding in findings:
    print(f"[{finding.severity}] {finding.pattern_name}")
    print(f"  File: {finding.file}:{finding.line}")
    print(f"  Provider: {finding.provider}")
    print(f"  Source: {finding.source}")  # "pattern" or "entropy"

# Use the reporter for formatted output
reporter = Reporter(fmt="json", show_fix=True)
reporter.report(findings, files_scanned=scanner.files_scanned)

# Baseline management
from ai_secrets_scan import save_baseline, filter_new_findings

save_baseline(findings, ".secrets-baseline.json")
new_only = filter_new_findings(findings, ".secrets-baseline.json")

# Entropy analysis
from ai_secrets_scan import shannon_entropy

entropy = shannon_entropy("sk-proj-abc123def456ghi789")
print(f"Entropy: {entropy:.2f} bits/char")

Contributing

Contributions are welcome. Here's how to get started:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new patterns or features
5. Submit a pull request

Adding a New Secret Pattern

Add entries to ai_secrets_scan/patterns.py:

{
    "name": "NewProvider API Key",
    "regex": r"npk_[a-zA-Z0-9]{32,}",
    "severity": "critical",
    "provider": "newprovider",
},

For patterns that could match non-AI contexts, add "context_required": True to limit matches to lines near AI/LLM-related keywords.
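
One way to picture context_required: a low-specificity pattern only counts when an AI-related keyword appears within a few lines of the match. The sketch below illustrates that idea; the keyword list, the three-line window, and the function name are assumptions rather than the scanner's actual logic.

```python
import re

# Assumed keyword list and window size, for illustration only.
AI_KEYWORDS = re.compile(r"openai|anthropic|llm|cohere|embedding", re.IGNORECASE)
WINDOW = 3


def match_with_context(lines: list[str], regex: str, context_required: bool) -> list[int]:
    """Return 1-based line numbers where `regex` matches, honoring the context rule."""
    pattern = re.compile(regex)
    hits = []
    for i, line in enumerate(lines):
        if not pattern.search(line):
            continue
        if context_required:
            # Look at the matching line plus WINDOW lines on either side.
            nearby = lines[max(0, i - WINDOW): i + WINDOW + 1]
            if not any(AI_KEYWORDS.search(text) for text in nearby):
                continue
        hits.append(i + 1)
    return hits


source = [
    "import cohere",
    "client = cohere.Client(token)",
    'token = "' + "a1B2c3D4e5" * 4 + '"',
    "", "", "", "",
    'checksum = "' + "f6G7h8I9j0" * 4 + '"',
]
generic_40_chars = r'"[A-Za-z0-9]{40}"'
print(match_with_context(source, generic_40_chars, context_required=True))
print(match_with_context(source, generic_40_chars, context_required=False))
```

With the context rule on, only the token near the cohere import is reported; with it off, the unrelated 40-character checksum is reported as well.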

Running Tests

python -m pytest tests/ -v

License

MIT License. See LICENSE for details.

Built for developers building with AI. If this tool saved you from a credential leak, consider giving it a star.
