Check before you trust. A production-grade, AI-powered scam prevention platform built for Malaysia.
ScamGuards is a community-driven fraud detection system that allows users to check identifiers (phone numbers, emails, bank accounts) against a crowdsourced database of scam reports. The platform uses AI to analyze patterns, detect duplicates, and provide confidence-based risk assessments.
🇲🇾 Malaysia-First: Localized for Malaysian phone formats, banks, e-wallets, and common local scam types.
┌─────────────────────────────────────────────────────────────────────────────┐
│                                CLIENT LAYER                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │   Search    │  │   Report    │  │   Dispute   │  │   Admin Dashboard   │ │
│  │    Page     │  │ Submission  │  │    Form     │  │    (Email Auth)     │ │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘ │
└─────────┼────────────────┼────────────────┼────────────────────┼────────────┘
          │                │                │                    │
          ▼                ▼                ▼                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              MIDDLEWARE LAYER                               │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                   Rate Limiting & Abuse Prevention                    │  │
│  │  • IP-based cooldowns (60s between reports)                           │  │
│  │  • Auto-ban after threshold (20 submissions → 24hr ban)               │  │
│  │  • In-memory store for Edge Runtime compatibility                     │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                  API LAYER                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────────────────┐ │
│  │  /search   │  │  /submit   │  │  /dispute  │  │    /analyze-report     │ │
│  │            │  │            │  │            │  │   (Multi-Scammer AI)   │ │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘  └───────────┬────────────┘ │
│        │               │               │                     │              │
│        │               ▼               │                     │              │
│        │      ┌──────────────────┐     │                     │              │
│        │      │ Duplicate Check  │     │                     │              │
│        │      │  & Smart Merge   │     │                     │              │
│        │      └────────┬─────────┘     │                     │              │
└────────┼───────────────┼───────────────┼─────────────────────┼──────────────┘
         │               │               │                     │
         ▼               ▼               ▼                     ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                            DATA LAYER (Supabase)                            │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                           PostgreSQL + RLS                            │  │
│  │  ┌─────────┐  ┌─────────────┐  ┌─────────┐  ┌───────────────────────┐ │  │
│  │  │ reports │  │ data_points │  │disputes │  │  reporter_reputation  │ │  │
│  │  └────┬────┘  └──────┬──────┘  └────┬────┘  └───────────┬───────────┘ │  │
│  │       │              │              │                   │             │  │
│  │       └──────────────┴──────────────┼───────────────────┘             │  │
│  │                                     │                                 │  │
│  │  ┌──────────────────────────────────┴──────────────────────────────┐  │  │
│  │  │                Materialized Views (Pre-computed)                │  │  │
│  │  │  • platform_stats    • scam_type_stats    • daily_stats         │  │  │
│  │  │  • scammer_search_stats (confidence + heat level)               │  │  │
│  │  └─────────────────────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                      Supabase Storage (evidence)                      │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                        AI LAYER (Qwen via DashScope)                        │
│  ┌──────────────────────────┐  ┌──────────────────────────────────────────┐ │
│  │     Search Detective     │  │              Report Analyst              │ │
│  │ • Data point extraction  │  │ • Multi-scammer detection                │ │
│  │ • Smart Paste for search │  │ • Grouped preview with user confirm      │ │
│  │ • Type classification    │  │ • Risk scoring & scam type inference     │ │
│  └──────────────────────────┘  └──────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
This project demonstrates iterative architectural improvement, evolving from a basic MVP to a production-grade system.
Goal: Functional prototype with core search/report capabilities.
| Component | Implementation | Status |
|---|---|---|
| Database | Basic tables (reports, data_points, disputes) | Done |
| Search | Exact match only | Done |
| AI | Single prompt for risk scoring | Done |
| Security | None | Not implemented |
| Admin | None | Not implemented |
Goal: Add search intelligence, security, and abuse prevention.
| Component | Improvement | Impact |
|---|---|---|
| Search | Fuzzy matching via `pg_trgm` + full-text search | 3x more matches |
| Security | Row Level Security (RLS) on all tables | Data isolation |
| Analytics | Materialized views for platform stats | 100x faster queries |
| Abuse Prevention | IP-based rate limiting in middleware | Spam blocked |
| Admin | Secure email/password auth with whitelist | Controlled access |
| Functions | `SECURITY DEFINER` with `SET search_path = ''` | Search-path hijacking prevention |
Goal: AI-powered features and unified scammer profiling.
| Component | Innovation | Impact |
|---|---|---|
| Smart Paste | AI extracts data points from pasted paragraphs | 80% faster input |
| Multi-Scammer Detection | AI identifies multiple scammers in single narrative | Batch processing |
| Duplicate Detection | Smart merge with report count tracking | Data deduplication |
| Confidence Scoring | `confidence = 50 + (report_count * 10)` | Trust signals |
| Heat Levels | CRITICAL/HIGH/MEDIUM/LOW based on reports | Priority triage |
| Scammer Profiles | Unified view aggregating all data points | Entity resolution |
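
The duplicate detection row above can be pictured as an upsert keyed on the normalized value. The following is a minimal in-memory sketch, not the project's actual implementation: the `TrackedDataPoint` shape and the `mergeDataPoint` helper are hypothetical names, and a `Map` stands in for the `data_points` table.

```typescript
// Hypothetical in-memory shape; field names loosely mirror data_points columns.
interface TrackedDataPoint {
  type: string;
  normalizedValue: string;
  reportCount: number;
  firstReportedAt: Date;
  lastReportedAt: Date;
}

// Merge a new sighting into the store instead of inserting a duplicate row.
// The key combines type and normalized value (an assumption of this sketch).
function mergeDataPoint(
  store: Map<string, TrackedDataPoint>,
  type: string,
  normalizedValue: string,
  now: Date,
): TrackedDataPoint {
  const key = `${type}:${normalizedValue}`;
  const existing = store.get(key);
  const merged: TrackedDataPoint = {
    type,
    normalizedValue,
    // First sighting starts at 1; later sightings increment the counter.
    reportCount: (existing ? existing.reportCount : 0) + 1,
    // Keep the original first-seen timestamp, refresh the last-seen one.
    firstReportedAt: existing ? existing.firstReportedAt : now,
    lastReportedAt: now,
  };
  store.set(key, merged);
  return merged;
}
```

Reporting the same phone number twice would leave one entry with `reportCount` of 2 rather than two separate rows.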
-- Basic normalized structure
reports (id, scam_type, description, platform, evidence_url)
data_points (report_id, type, value, normalized_value)
disputes (report_id, reason, contact_email, status)
audit_logs (action, ip_hash, metadata)

-- Added for performance & security
+ reports.reporter_hash           -- Anonymous tracking
+ reports.amount_lost             -- Financial impact
+ reports.description_tsv         -- Full-text search vector
+ rate_limits                     -- Abuse prevention
+ moderation_queue                -- Auto-flagging
+ reporter_reputation             -- Trust scoring
+ Materialized Views              -- Pre-computed analytics

-- Added for duplicate detection & profiling
+ data_points.report_count        -- How many times reported
+ data_points.first_reported_at   -- Temporal tracking
+ data_points.last_reported_at    -- Recent activity
+ data_points.confidence_score    -- Calculated trust
+ report_submissions              -- Per-datapoint rate limiting
+ scammer_profiles (VIEW)         -- Aggregated entity view
+ scammer_search_stats (MATVIEW)  -- Pre-computed search enhancement

Confidence Score = min(100, 50 + (unique_reports × 10))
Heat Level:
CRITICAL = 10+ reports (100% confidence)
HIGH = 5-9 reports (90-99% confidence)
MEDIUM = 3-4 reports (70-89% confidence)
LOW = 1-2 reports (50-69% confidence)
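
As a rough sketch of how the score and heat level could be computed from a report count, assuming the thresholds listed above: the function names `confidenceScore` and `heatLevel` are illustrative, and treating zero reports as LOW is an assumption of this sketch.

```typescript
type HeatLevel = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// Confidence Score = min(100, 50 + (unique_reports × 10))
function confidenceScore(uniqueReports: number): number {
  return Math.min(100, 50 + uniqueReports * 10);
}

// Thresholds follow the report-count bands listed in this README.
function heatLevel(uniqueReports: number): HeatLevel {
  if (uniqueReports >= 10) return "CRITICAL";
  if (uniqueReports >= 5) return "HIGH";
  if (uniqueReports >= 3) return "MEDIUM";
  return "LOW"; // 1-2 reports (0 also lands here in this sketch)
}
```

Under this formula the score already reaches the cap of 100 at five reports, so the percentage ranges shown beside each heat level are best read as approximate bands rather than exact outputs of the formula.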
Layer 1: Middleware (Edge)
├── IP-based rate limiting
├── Submission cooldowns (60s)
├── Auto-ban thresholds (20 → 24hr ban)
└── Request validation

Layer 2: API Routes
├── Input sanitization
├── Type validation (Zod)
└── Error boundary handling

Layer 3: Database (Supabase)
├── Row Level Security (RLS)
├── Function search_path hardening
├── Prepared statements (no SQL injection)
└── Audit logging

Layer 4: Admin Access
├── Supabase Auth (email/password)
├── Environment-based whitelist
└── Session management
| Concern | Decision | Rationale |
|---|---|---|
| Authentication | Public submit, admin-only verify | Balance accessibility with control |
| Rate Limiting | In-memory (Edge compatible) | Vercel Edge Runtime constraint |
| IP Tracking | SHA-256 hash, not raw IP | PDPA compliance |
| Admin Auth | Email whitelist + Supabase Auth | Simple, secure, auditable |
| Search Path Hijacking | `SET search_path = ''` on all functions | Supabase linter compliance |
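
The in-memory cooldown and auto-ban rules can be sketched roughly as follows. This is not the project's middleware: `SubmissionLimiter` and its `check` method are hypothetical, the clock is passed in so the logic is easy to test, and two details are assumptions of the sketch (only accepted submissions count toward the threshold, and the counter resets when a ban is issued).

```typescript
const COOLDOWN_MS = 60_000;          // 60s between reports
const BAN_THRESHOLD = 20;            // submissions before an auto-ban
const BAN_MS = 24 * 60 * 60 * 1000;  // 24hr ban

interface IpState {
  lastSubmitAt: number;
  count: number;
  bannedUntil: number;
}

type Decision = "allowed" | "cooldown" | "banned";

class SubmissionLimiter {
  // Keyed by the hashed IP, never the raw address.
  private readonly state = new Map<string, IpState>();

  check(ipHash: string, now: number): Decision {
    const s: IpState =
      this.state.get(ipHash) ?? { lastSubmitAt: -Infinity, count: 0, bannedUntil: 0 };
    if (now < s.bannedUntil) return "banned";
    if (now - s.lastSubmitAt < COOLDOWN_MS) return "cooldown";
    // Accept the submission and record it.
    s.lastSubmitAt = now;
    s.count += 1;
    if (s.count >= BAN_THRESHOLD) {
      // The threshold-hitting submission is accepted; later ones are banned.
      s.bannedUntil = now + BAN_MS;
      s.count = 0;
    }
    this.state.set(ipHash, s);
    return "allowed";
  }
}
```

Because the `Map` lives in process memory, each runtime instance keeps its own counters, so limits enforced this way are per instance rather than global.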
The system uses two specialized AI personas optimized for different tasks:
Input: "got scammed by john at 0123456789 on telegram @scammer123"
Output: [
{ type: "name", value: "john", confidence: 85 },
{ type: "phone", value: "0123456789", confidence: 95 },
{ type: "telegram", value: "@scammer123", confidence: 90 }
]
Input: Paragraph describing scam with multiple perpetrators
Output: {
isMultiple: true,
scammers: [
{ name: "Scammer A", dataPoints: [...], riskScore: 85 },
{ name: "Scammer B", dataPoints: [...], riskScore: 78 }
]
}
User pastes scam story
          │
          ▼
┌───────────────────┐
│    AI Analysis    │
│  (Qwen qwen-max)  │
└─────────┬─────────┘
          │
     ┌────┴────┐
     ▼         ▼
  Single    Multiple
  Scammer   Scammers
     │         │
     ▼         ▼
 Standard   Grouped
   Form     Preview
     │         │
     ▼         ▼
  Submit    Select &
            Confirm
               │
               ▼
          Batch Submit
           (N reports)
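
The branching in this flow could be expressed as a small routing step over the analysis result. The types below follow the example output shown earlier (`isMultiple`, `scammers`, `dataPoints`, `riskScore`); the `nextStep` and `buildBatch` helpers and the `kind` labels are hypothetical names for this sketch.

```typescript
interface ExtractedPoint {
  type: string;
  value: string;
}

interface Scammer {
  name: string;
  dataPoints: ExtractedPoint[];
  riskScore: number;
}

interface AnalysisResult {
  isMultiple: boolean;
  scammers: Scammer[];
}

type NextStep =
  | { kind: "standard-form"; scammer: Scammer | undefined }
  | { kind: "grouped-preview"; scammers: Scammer[] };

// Decide which UI path to take after the AI analysis returns.
function nextStep(result: AnalysisResult): NextStep {
  if (result.isMultiple && result.scammers.length > 1) {
    return { kind: "grouped-preview", scammers: result.scammers };
  }
  return { kind: "standard-form", scammer: result.scammers[0] };
}

// Keep only the scammers the user ticked in the grouped preview;
// each remaining entry would become one report in the batch submit.
function buildBatch(scammers: Scammer[], selectedNames: Set<string>): Scammer[] {
  return scammers.filter((s) => selectedNames.has(s.name));
}
```

Keeping the user confirmation between analysis and batch submit means the AI's grouping is a suggestion that a person approves, not an automatic write.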
| Optimization | Implementation | Improvement |
|---|---|---|
| Fuzzy Search | `pg_trgm` GIN indexes | Sub-100ms on 100K records |
| Full-Text Search | `tsvector` with GIN | Semantic matching |
| Pre-computed Stats | Materialized views | 100x faster dashboard |
| Composite Indexes | `(status, created_at DESC)` | Optimized common queries |
| Connection Pooling | Supabase built-in | Handles concurrent load |
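
To give a feel for what trigram-based fuzzy matching measures, here is a simplified TypeScript illustration of the idea behind `pg_trgm` similarity: split each word into padded three-character chunks and compare the two sets. It is an approximation for explanation only (ASCII letters and digits, hypothetical function names), not the extension's exact algorithm and not code from this project.

```typescript
// Collect the set of trigrams for a string. Each word is lowercased and
// padded with two leading spaces and one trailing space before slicing.
function trigrams(text: string): Set<string> {
  const result = new Set<string>();
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  for (const word of words) {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      result.add(padded.slice(i, i + 3));
    }
  }
  return result;
}

// Similarity = shared trigrams / total distinct trigrams (0 to 1).
function similarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  if (ta.size === 0 && tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared += 1;
  });
  return shared / (ta.size + tb.size - shared);
}
```

Identical strings score 1, unrelated strings score 0, and near-misses such as a transposed letter fall in between, which is what lets a fuzzy search surface misspelled names.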
- Phone Validation: `01X-XXXXXXX` format with carrier detection
- Banks: Maybank, CIMB, Public Bank, RHB, Hong Leong, etc.
- E-Wallets: Touch 'n Go, GrabPay, Boost, ShopeePay
- Scam Types: Macau, Love, Parcel, Job, Investment, Loan, Collectibles (TCG)
- Currency: MYR with RM formatting
- Languages: English + Bahasa Malaysia with browser auto-translate hints
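
Phone normalization for the `01X-XXXXXXX` format might look roughly like the sketch below. The helper name `normalizeMalaysianMobile` is hypothetical, and accepting 10 or 11 digits after normalization (to allow for longer `011` numbers) is an assumption of this sketch. Carrier detection is left out because the prefix table is not given here.

```typescript
// Normalize a Malaysian mobile number to local digits-only form,
// e.g. "+60 12-345 6789" becomes "0123456789". Returns null if the
// result does not look like a mobile number starting with 01.
function normalizeMalaysianMobile(input: string): string | null {
  let digits = input.replace(/\D/g, "");
  // Convert the +60 country code to the local leading 0.
  if (digits.slice(0, 2) === "60") {
    digits = "0" + digits.slice(2);
  }
  // Assumption: 01 followed by 8 or 9 more digits.
  return /^01\d{8,9}$/.test(digits) ? digits : null;
}
```

Normalizing before lookup means `012-345 6789` and `+60 12-345 6789` resolve to the same `normalized_value`, so reports about one number are not split across formats.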
| Layer | Technology | Why |
|---|---|---|
| Framework | Next.js 14 (App Router) | Server components, edge-ready |
| Language | TypeScript | Type safety, better DX |
| Styling | Tailwind CSS + shadcn/ui | Rapid, consistent UI |
| Database | Supabase (PostgreSQL) | RLS, real-time, storage |
| AI | Qwen via DashScope | Cost-effective, fast inference |
| Deployment | Vercel | Edge functions, auto-scaling |
| Auth | Supabase Auth | Built-in, secure |
scamguard/
├── app/
│   ├── api/
│   │   ├── search/            # Fuzzy + exact + full-text search
│   │   ├── submit/            # Report submission with duplicate detection
│   │   ├── dispute/           # Challenge reports
│   │   ├── extract/           # AI data point extraction
│   │   ├── analyze-report/    # Multi-scammer AI analysis
│   │   └── admin/             # Protected admin endpoints
│   ├── admin/
│   │   ├── login/             # Email/password auth
│   │   └── dashboard/         # Report management
│   ├── search/                # Search interface
│   ├── submit/                # Smart Report paste
│   └── results/               # Search results display
├── components/
│   ├── ui/                    # shadcn/ui components
│   ├── search/                # SmartSearchPaste
│   └── submit/                # SmartReportPaste (multi-scammer)
├── lib/
│   ├── ai/
│   │   ├── scam-analyzer.ts   # Search extraction
│   │   └── report-analyzer.ts # Multi-scammer detection
│   ├── supabase/              # Client (browser + server)
│   └── utils/                 # Normalization, validation
├── middleware.ts              # Rate limiting, abuse prevention
└── supabase/
    └── migrations/
        ├── 001_initial_schema.sql
        ├── 002_production_upgrade.sql
        ├── 003_production_10_of_10.sql
        └── 004_duplicate_detection.sql  # Latest
- Node.js 18+
- Supabase account
- DashScope API key (Alibaba Cloud)
# Clone
git clone https://github.com/nicuk/scamguards.git
cd scamguards
# Install
npm install
# Configure
cp .env.example .env.local
# Edit .env.local with your keys
# Database setup (in Supabase SQL Editor)
# Run: supabase/FULL_SCHEMA.sql
# Then: supabase/migrations/004_duplicate_detection.sql
# Create storage bucket: "evidence" (public)
# Run
npm run dev

| Variable | Description |
|---|---|
| `NEXT_PUBLIC_SUPABASE_URL` | Supabase project URL |
| `NEXT_PUBLIC_SUPABASE_ANON_KEY` | Supabase anon key |
| `DASHSCOPE_API_KEY` | Alibaba Cloud DashScope key |
| `ADMIN_EMAILS` | Comma-separated admin emails |
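
A comma-separated `ADMIN_EMAILS` value could be turned into a whitelist check along these lines. The helpers `parseAdminEmails` and `isAdmin` are hypothetical, and the trimming and case-insensitive comparison are assumptions of this sketch rather than documented behavior.

```typescript
// Parse the raw ADMIN_EMAILS string into a normalized set.
// An undefined or empty value yields an empty whitelist.
function parseAdminEmails(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((e) => e.trim().toLowerCase())
      .filter((e) => e.length > 0),
  );
}

// True only when the email appears in the whitelist.
function isAdmin(email: string, raw: string | undefined): boolean {
  return parseAdminEmails(raw).has(email.trim().toLowerCase());
}
```

In a server route this might be called as `isAdmin(user.email, process.env.ADMIN_EMAILS)` after Supabase Auth has verified the session, so an unset variable denies everyone by default.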
- Real-time notifications for new reports matching saved searches
- Batch report verification for admins
- Public API for third-party integrations
- Mobile app (React Native)
- ML-based scam pattern prediction
Elastic License 2.0: Free to use, modify, and self-host. Commercial SaaS requires a separate license.
Built with modern best practices for security, performance, and user experience. Contributions welcome.
Protecting Malaysians from scams, one check at a time.