An offline-first Discord moderation bot that detects and blocks scams in text and images using on-premises LLMs and OCR.
- Offline Detection: Uses self-hosted LLMs and OCR for complete privacy
- Multi-layered Analysis: Rule-based, OCR, and LLM inference pipeline
- Smart Actions: Auto-delete, flag for review, or monitor based on confidence
- Moderator Dashboard: Web-based interface for reviewing flagged content
- Configurable Policies: Per-guild settings and thresholds
- Comprehensive Logging: Full audit trail with performance metrics
```
Discord Gateway ← Bot Service → Detection Pipeline → Actioner → Moderator Dashboard
                                        ↓
                              [Rules, OCR, LLM] → Database & Logs
```
- Discord Bot Service - Message event handling and Discord integration
- Detection Pipeline - Coordinates rule-based, OCR, and LLM analysis
- OCR Service - Tesseract-based text extraction from images
- LLM Service - Offline quantized model inference with structured prompts
- Actioner - Policy enforcement and moderator notifications
- Dashboard - FastAPI-based web interface for moderators
- Database - PostgreSQL for flagged messages, actions, and configuration
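Concretely, the pipeline's score-combining step might look like the following sketch. The `DetectionResult` shape and `combine_scores` function are illustrative, not the project's actual API:

```python
from dataclasses import dataclass

@dataclass
class DetectionResult:
    """Outcome of one detection stage (illustrative structure)."""
    source: str        # e.g. "rules", "ocr+rules", or "llm"
    confidence: float  # 0.0 (benign) .. 1.0 (certain scam)

def combine_scores(results: list[DetectionResult]) -> float:
    """Combine stage confidences; taking the maximum means any single
    high-confidence stage is enough to trigger an action."""
    return max((r.confidence for r in results), default=0.0)

# Example: the rules are unsure, but the LLM is confident.
results = [
    DetectionResult("rules", 0.4),
    DetectionResult("llm", 0.92),
]
print(combine_scores(results))  # 0.92
```

Taking the max is the simplest policy; a weighted combination is another reasonable choice when stages disagree often.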
- Docker and Docker Compose
- Discord Bot Token
- Quantized LLM model (GGUF format)
- Clone and configure:

```bash
git clone <repository>
cd AntiScam
cp .env.example .env
# Edit .env with your Discord token and settings
```

- Download a quantized model:

```bash
# Example: Download a quantized Llama model
mkdir -p models
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.q4_0.gguf -O models/quantized_model.gguf
```

- Start services:

```bash
docker-compose up -d
```

- Invite the bot to Discord:
  - Go to the Discord Developer Portal
  - Create an application and bot
  - Generate an invite link with these permissions: Manage Messages, Read Message History, Send Messages, Add Reactions
Access the dashboard at http://localhost:8080 to configure:
- Detection thresholds: Auto-delete and flag confidence levels
- Feature toggles: Enable/disable OCR, LLM, or rule-based detection
- Channels: Set moderator notification and log channels
- Retention: Configure data retention policies
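The two confidence thresholds form a simple ladder. A sketch of how a confidence score could map to an action (the function name is hypothetical; the default values match the examples in this README):

```python
def choose_action(confidence: float,
                  auto_delete_confidence: float = 0.9,
                  flag_threshold: float = 0.5) -> str:
    """Map a detection confidence to one of the bot's actions.
    Threshold names mirror the !scamconfig keys."""
    if confidence >= auto_delete_confidence:
        return "auto_delete"
    if confidence >= flag_threshold:
        return "flag_for_review"
    return "monitor"

print(choose_action(0.95))  # auto_delete
print(choose_action(0.6))   # flag_for_review
print(choose_action(0.1))   # monitor
```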
```
!scamconfig set auto_delete_confidence 0.9
!scamconfig set flag_threshold 0.5
!scamconfig set mod_channel #moderators
!scamconfig show
!scamstats
```
```
AntiScam/
├── core/                    # Core detection logic
│   ├── preprocessor.py      # Text normalization and feature extraction
│   ├── rule_detector.py     # Rule-based scam detection
│   ├── detector_pipeline.py # Main detection coordinator
│   ├── actioner.py          # Policy enforcement
│   └── logging_system.py    # Comprehensive logging
├── database/                # Database models and management
│   ├── models.py            # SQLAlchemy models
│   ├── database.py          # Database management
│   └── __init__.py
├── services/                # Microservices
│   ├── bot/                 # Discord bot service
│   ├── ocr/                 # OCR processing service
│   ├── llm/                 # LLM inference service
│   └── dashboard/           # Web dashboard
├── docker-compose.yml       # Service orchestration
├── requirements.txt         # Python dependencies
└── init.sql                 # Database initialization
```
- Install dependencies:

```bash
pip install -r requirements.txt
```

- Set up the database:

```bash
# Start PostgreSQL and Redis
docker-compose up postgres redis -d

# Initialize database
python -c "
from database import init_database
import asyncio

async def init():
    db = init_database('postgresql://antiscam:password@localhost/antiscam_db')
    await db.create_tables()

asyncio.run(init())
"
```

- Run services individually:

```bash
# Bot service
python -m services.bot.main

# OCR service
python -m services.ocr.main

# LLM service
python -m services.llm.main

# Dashboard
python -m services.dashboard.main
```

- Run the tests:

```bash
# Run basic tests
python -m pytest tests/

# Test specific components
python -m pytest tests/test_detector.py
python -m pytest tests/test_ocr.py
```

| Variable | Description | Default |
|---|---|---|
| `DISCORD_TOKEN` | Discord bot token | Required |
| `DATABASE_URL` | PostgreSQL connection string | `postgresql://antiscam:password@localhost/antiscam_db` |
| `REDIS_URL` | Redis connection string | `redis://localhost:6379/0` |
| `LLM_MODEL_PATH` | Path to GGUF model file | `./models/quantized_model.gguf` |
| `LLM_THREADS` | CPU threads for LLM inference | `4` |
| `TESSERACT_CMD` | Tesseract executable path | `/usr/bin/tesseract` |
| `LOG_LEVEL` | Logging level | `INFO` |
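A loader for these variables might look like the following sketch (`load_settings` is illustrative, not the project's actual code; defaults are the ones documented above):

```python
import os

def load_settings(env=os.environ) -> dict:
    """Read configuration, falling back to the documented defaults.
    DISCORD_TOKEN has no default, so a missing token fails early."""
    return {
        "discord_token": env["DISCORD_TOKEN"],  # required
        "database_url": env.get(
            "DATABASE_URL",
            "postgresql://antiscam:password@localhost/antiscam_db"),
        "redis_url": env.get("REDIS_URL", "redis://localhost:6379/0"),
        "llm_model_path": env.get("LLM_MODEL_PATH", "./models/quantized_model.gguf"),
        "llm_threads": int(env.get("LLM_THREADS", "4")),
        "tesseract_cmd": env.get("TESSERACT_CMD", "/usr/bin/tesseract"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }

settings = load_settings({"DISCORD_TOKEN": "example-token"})
print(settings["llm_threads"])  # 4
```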
The system includes built-in rules for:
- Payment scams: Venmo, CashApp, PayPal requests
- Impersonation: Fake admin/moderator messages
- Phishing: Account suspension/verification scams
- Giveaways: Fake contests and crypto airdrops
- Social engineering: Remote access and urgency tactics
Custom rules can be added via the dashboard or API.
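A stripped-down version of rule matching for these categories could look like this sketch. The patterns here are illustrative toys; the real rule set is richer and configurable:

```python
import re

# Toy patterns for a few of the built-in categories (illustrative only).
RULES = {
    "payment_scam": re.compile(r"\b(send|pay)\b.*\b(venmo|cash\s?app|paypal)\b", re.I),
    "impersonation": re.compile(r"\bi am an? (admin|moderator)\b", re.I),
    "phishing": re.compile(r"\baccount (suspended|will be suspended)\b.*\bverify\b", re.I),
    "giveaway": re.compile(r"\bfree\b.*\b(nitro|airdrop|giveaway)\b", re.I),
}

def match_rules(text: str) -> list[str]:
    """Return the names of all rules the message triggers."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]

print(match_rules("please send $20 to my CashApp"))  # ['payment_scam']
```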
- Rule-based detection: <50ms average
- OCR processing: 200-1000ms per image
- LLM inference: 300-800ms for classification
- Memory usage: ~2GB with 7B quantized model
Scale by:
- Adding more LLM worker replicas
- Using GPU acceleration for LLM inference
- Implementing Redis caching for domain lookups
- Horizontal scaling of OCR workers
Base URL: `http://localhost:8080/api`

```
GET /guilds/{guild_id}/flagged-messages?status=pending&limit=50
```

```
POST /flagged-messages/{message_id}/action?moderator_id={user_id}
Content-Type: application/json

{
  "action": "approve|delete_ban|warn",
  "reason": "Optional reason"
}
```

```
PUT /guilds/{guild_id}/config?moderator_id={user_id}
Content-Type: application/json

{
  "auto_delete_confidence": 0.9,
  "flag_threshold": 0.5,
  "enable_ocr": true,
  "enable_llm": true
}
```

```
GET /guilds/{guild_id}/stats?days=30
```
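Client-side, an action payload can be sanity-checked against the documented schema before sending. This helper is an illustrative sketch, not part of the API:

```python
ALLOWED_ACTIONS = {"approve", "delete_ban", "warn"}

def validate_action_payload(payload: dict) -> list[str]:
    """Check a moderation-action request body; an empty list means valid."""
    errors = []
    if payload.get("action") not in ALLOWED_ACTIONS:
        errors.append("action must be one of: " + ", ".join(sorted(ALLOWED_ACTIONS)))
    reason = payload.get("reason")
    if reason is not None and not isinstance(reason, str):
        errors.append("reason must be a string when present")
    return errors

print(validate_action_payload({"action": "warn", "reason": "spam link"}))  # []
```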
- Data Privacy: All processing happens on-premises
- Access Control: Dashboard requires Discord role verification
- Rate Limiting: Built-in protection against spam/abuse
- Data Retention: Configurable cleanup of old records
- Audit Trail: Complete logging of all actions
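Retention cleanup boils down to computing a cutoff timestamp and deleting records flagged before it; a minimal sketch (the function name is hypothetical):

```python
from datetime import datetime, timedelta, timezone

def retention_cutoff(retention_days: int, now=None) -> datetime:
    """Records flagged before this instant are eligible for cleanup."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retention_days)

fixed_now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(retention_cutoff(30, fixed_now))  # 2024-01-01 00:00:00+00:00
```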
- Bot not responding:
  - Check the Discord token in `.env`
  - Verify bot permissions in the Discord server
  - Check bot service logs: `docker-compose logs bot`
- LLM inference failing:
  - Ensure the model file exists and is a valid GGUF file
  - Check available system memory (>4GB recommended)
  - Review LLM service logs: `docker-compose logs llm-service`
- OCR not working:
  - Verify the Tesseract installation in the container
  - Check image format support (PNG, JPG, GIF)
  - Review OCR service logs: `docker-compose logs ocr-service`
- Dashboard not accessible:
  - Verify port 8080 is not blocked
  - Check database connectivity
  - Review dashboard logs: `docker-compose logs dashboard`

Check service health at:
- Dashboard: `http://localhost:8080/api/health`
- Bot statistics: the `!scamstats` command in Discord
- System logs: Dashboard → System Logs tab
- Fork the repository
- Create feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit pull request
MIT License - see LICENSE file for details.
For issues and questions:
- GitHub Issues: Report bugs and feature requests
- Documentation: Check this README and code comments
- Discord: Join our support server [invite link]