DefenSight AI is an AI-powered security log analysis platform that transforms raw firewall, IDS, and network logs into actionable intelligence using Retrieval-Augmented Generation (RAG) and Large Language Models.

A secure onboarding flow where new users create an account. Passwords are hashed using bcrypt and never stored in plaintext.

Multi-user authentication system with session-based login protection and bcrypt hashing.

Upload raw security logs (CSV, JSON, XML, LOG, TXT). Files are normalized into a unified schema and indexed automatically.
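As a rough illustration of what normalizing into a unified schema can involve, the sketch below maps CSV and JSON records onto a small set of common fields. The field names, the alias table, the function names, and the sample inputs are all hypothetical; they are not taken from format_con.py.

```python
import csv
import io
import json

# Hypothetical unified schema; the field names are chosen for illustration only.
UNIFIED_FIELDS = ("timestamp", "severity", "src_ip", "dst_ip", "protocol", "attack_category")

# Hypothetical aliases that different log sources might use for the same field.
ALIASES = {
    "timestamp": ("timestamp", "time", "date"),
    "severity": ("severity", "level", "priority"),
    "src_ip": ("src_ip", "source_ip", "src"),
    "dst_ip": ("dst_ip", "destination_ip", "dst"),
    "protocol": ("protocol", "proto"),
    "attack_category": ("attack_category", "category", "attack_cat"),
}


def normalize_record(raw):
    """Map one raw record (a dict) onto the unified schema; missing fields become None."""
    lowered = {str(key).strip().lower(): value for key, value in raw.items()}
    record = {}
    for field in UNIFIED_FIELDS:
        # Take the first alias that is present in the raw record, if any.
        record[field] = next((lowered[a] for a in ALIASES[field] if a in lowered), None)
    return record


def normalize_text(text, fmt):
    """Parse CSV or JSON text and return a list of normalized records."""
    if fmt == "csv":
        rows = list(csv.DictReader(io.StringIO(text)))
    elif fmt == "json":
        data = json.loads(text)
        rows = data if isinstance(data, list) else [data]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return [normalize_record(row) for row in rows]


csv_sample = (
    "time,level,src,dst,proto,category\n"
    "2024-01-01T00:00:00Z,high,10.0.0.5,10.0.0.9,TCP,DoS\n"
)
json_sample = '{"timestamp": "2024-01-01T00:01:00Z", "severity": "low", "source_ip": "10.0.0.7"}'

records = normalize_text(csv_sample, "csv") + normalize_text(json_sample, "json")
print(json.dumps(records, indent=2))
```

Both samples end up with the same keys, so downstream indexing code only has to handle one record shape. XML, LOG, and TXT inputs would need their own parsers feeding the same `normalize_record` step.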

Displays total parsed records, detected log types, and key metadata extracted from uploaded files.

Tabular view of the structured JSON output, preserving fields such as timestamps, severity, source/destination IPs, protocol, and attack category.

Deep-dive technical summary generated using Retrieval-Augmented Generation over normalized log data.

High-level business-focused security insights written for leadership and non-technical stakeholders.

Ask security-focused questions in natural language and receive context-aware answers grounded in your logs.

Generate reports and email them securely with one click via integrated SMTP support.
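For readers curious how emailing a report over SMTP might look, here is a minimal sketch using only the Python standard library. The function names, addresses, default file name, and the choice of STARTTLS are assumptions made for illustration; the project's actual mail code may differ.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_report_email(sender, recipient, pdf_bytes, filename="security_report.pdf"):
    """Build an email with the PDF report attached (all names are hypothetical)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Security assessment report"
    msg.set_content("The security assessment report is attached.")
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf", filename=filename)
    return msg


def send_report(msg, host, port, username, password):
    """Send the message over SMTP, upgrading the connection with STARTTLS first."""
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as server:
        server.starttls(context=context)
        server.login(username, password)
        server.send_message(msg)


# Build (but do not send) a message with placeholder PDF bytes.
message = build_report_email("soc@example.com", "ciso@example.com", b"%PDF-1.4 placeholder")
attachments = [part.get_filename() for part in message.iter_attachments()]
print(message["Subject"], attachments)
```

Only `build_report_email` runs here; calling `send_report` requires a reachable SMTP server and real credentials, which are best read from environment variables rather than hard-coded.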

Export fully formatted security assessment reports (technical + executive) for audit and documentation.
- Multi-format log ingestion: CSV, XML, JSON, LOG, TXT
- Intelligent normalization: Automatic type detection and field extraction
- Semantic search: 768-dimensional vector embeddings for contextual retrieval
- RAG-powered insights: Grounded AI responses using actual log evidence
- Groq LLM integration: Llama 3.3 70B with 128k context window
- Conversational interface: Ask questions in natural language
- Automated reporting: Technical and executive summaries
- Threat correlation: Identify attack patterns across disparate logs
- Vector database: ChromaDB with HNSW indexing for fast similarity search
- Batch processing: Efficient embedding generation (64 docs/batch)
- Session management: Append or start fresh analysis workflows
- Real-time ingestion: Watch folder for automatic log processing
- Multi-user system: Flask-Login with secure session management
- Password hashing: Bcrypt with salt for credential storage
- SQLite database: Lightweight user management
- Protected routes: Login required for all analysis features
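The batch size of 64 documents mentioned in the list above can be illustrated with a small, library-free sketch. The helper names are hypothetical, and `fake_embed` is only a stand-in for a real embedding model; with SentenceTransformers, a similar effect is usually obtained by passing `batch_size=64` to `model.encode`.

```python
BATCH_SIZE = 64  # batch size quoted in the feature list


def batched(items, size=BATCH_SIZE):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(docs, embed_batch, size=BATCH_SIZE):
    """Embed `docs` batch by batch; `embed_batch` returns one vector per input text."""
    vectors = []
    for batch in batched(docs, size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed(batch):
    """Stand-in embedder (an assumption): a one-dimensional vector per text."""
    return [[float(len(text))] for text in batch]


docs = [f"log line {i}" for i in range(150)]
sizes = [len(batch) for batch in batched(docs)]
vectors = embed_all(docs, fake_embed)
print(sizes, len(vectors))  # [64, 64, 22] 150
```

Batching keeps memory use bounded and lets the embedding model process many log lines per forward pass instead of one at a time.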
┌──────────────────────────────────────────┐
│              User Interface              │
│      Flask Web App + HTML Templates      │
└───────────────┬──────────────────────────┘
                │
                │ User uploads logs /
                │ asks questions
                ▼
┌──────────────────────────────────────────┐
│           Normalization Engine           │
│             (format_con.py)              │
└───────────────┬──────────────────────────┘
                │
                │ Converts logs into
                │ structured JSON
                ▼
┌──────────────────────────────────────────┐
│         Vectorization + Indexing         │
│         (live_ingest.py RAG DB)          │
└───────────────┬──────────────────────────┘
                │
                │ Create embeddings
                ▼
┌──────────────────────────────────────────┐
│             Vector Database              │
│                 ChromaDB                 │
│                                          │
│  • Stores 768-dim sentence embeddings    │
│  • Supports semantic similarity search   │
│  • Uses HNSW indexing for fast recall    │
└───────────────┬──────────────────────────┘
                │
                │ Retrieve Top-K Relevant Chunks
                ▼
┌──────────────────────────────────────────┐
│                RAG Engine                │
│             (rag_engine.py)              │
└───────────────┬──────────────────────────┘
                │
                │ Build prompt with retrieved context
                ▼
┌──────────────────────────────────────────┐
│                 Groq API                 │
│            (LLaMA 3.3 model)             │
└───────────────┬──────────────────────────┘
                │
                │ AI response /
                │ report generation
                ▼
┌──────────────────────────────────────────┐
│            Final Output Layer            │
│  • Technical Summary                     │
│  • Executive Summary                     │
│  • PDF Export                            │
│  • Chat Assistant                        │
└──────────────────────────────────────────┘
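The retrieve-then-prompt steps in the diagram can be sketched in a few lines. This is a toy illustration under stated assumptions: the three-dimensional vectors, sample log lines, function names, and prompt wording are invented here, and a brute-force dot product stands in for ChromaDB's HNSW search. Dot-product scoring is used because the embedding model named in this README (multi-qa-mpnet-base-dot-v1) is tuned for it.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, indexed, k=2):
    """Return the k (text, vector) pairs whose vectors score highest against the query."""
    return heapq.nlargest(k, indexed, key=lambda item: dot(query_vec, item[1]))


def build_prompt(question, chunks):
    """Assemble a grounded prompt from the retrieved chunks (wording is hypothetical)."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer using only the log evidence below.\n\n"
        f"Log evidence:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Toy index: real embeddings would be 768-dimensional.
index = [
    ("Blocked SSH brute force from 203.0.113.4", [0.9, 0.1, 0.0]),
    ("DNS query to internal resolver", [0.0, 0.2, 0.9]),
    ("Repeated failed SSH logins for root", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]

hits = top_k(query_vec, index, k=2)
prompt = build_prompt("Was there an SSH attack?", [text for text, _ in hits])
print(prompt)
```

The resulting prompt string is what would be sent to the LLM; restricting the model to the retrieved evidence is what keeps the answers grounded in the uploaded logs.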
Tech Stack:
- Backend: Python 3.8+, Flask 3.0
- Vector DB: ChromaDB 0.4.22
- Embeddings: SentenceTransformers (multi-qa-mpnet-base-dot-v1)
- LLM: Groq API (Llama 3.3 70B Versatile)
- Auth: Flask-Login, bcrypt
- PDF: ReportLab
- Frontend: Bootstrap 5, Vanilla JS
Prerequisites:
- Python 3.8 or higher
- Groq API key
- 4GB+ RAM (8GB recommended)
- 10GB+ free disk space
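A small preflight check can confirm the first two requirements before the app starts. The sketch below is illustrative only: the function name is hypothetical, and the environment variable name `GROQ_API_KEY` is an assumption (check .env.example for the name the project actually reads).

```python
import os
import sys


def preflight(env=None, version=None):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < (3, 8):
        problems.append(f"Python 3.8+ required, found {version[0]}.{version[1]}")
    # GROQ_API_KEY is an assumed variable name, not confirmed by this README.
    if not env.get("GROQ_API_KEY"):
        problems.append("GROQ_API_KEY is not set")
    return problems


ok = preflight(env={"GROQ_API_KEY": "example-key"}, version=(3, 10))
bad = preflight(env={}, version=(3, 7))
print(ok)
print(bad)
```

Calling `preflight()` with no arguments checks the running interpreter and the real environment, which makes it easy to run once at application startup.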
git clone https://github.com/SudoXploit7/DefenSight-AI.git
cd DefenSight-AI
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
pip install -r requirements.txt
cp .env.example .env
python init_db.py init
python init_db.py create-admin
python gui_app.py
Visit:
http://localhost:5000
Default credentials:
- Username: admin
- Password: admin123
Change these immediately in production.
DefenSight AI/
│
├── DefenSight AI_db/ # ChromaDB vector database (auto-generated)
│
├── incoming_logs/ # (Optional) Live-ingest watch folder
│
├── instance/ # Flask instance folder
│
├── normalized/ # Normalized JSON output files
│
├── project_screenshots/ # Screenshots used in README
│
├── raw_data/ # Uploaded raw logs (CSV/XML/JSON/LOG/TXT)
│
├── static/ # CSS & JavaScript assets
│ ├── styles.css
│ └── assistant.js
│
├── templates/ # HTML UI Pages
│ ├── base.html
│ ├── login.html
│ ├── register.html
│ ├── upload.html
│ ├── normalized_list.html
│ ├── normalize.html
│ └── index.html
│
├── test_data/ # Sample logs for demo/testing
│
├── .env.example # Environment variable template
├── .gitignore # Git ignore rules
│
├── auth.py # Authentication logic (Flask-Login + bcrypt)
├── chat.py # CLI chat utility (optional)
├── format_con.py # Log normalization engine
├── gui_app.py # Main Flask Web Application
├── LICENSE # MIT License
├── live_ingest.py # Real-time log ingestion pipeline
├── rag_engine.py # RAG pipeline + Groq API integration
├── README.md # Project documentation
└── requirements.txt # Python dependencies
Soumyadipta Birabar
- GitHub: @SudoXploit7
- LinkedIn: Soumyadipta Birabar
Built with ❤️ for the cybersecurity community
DefenSight AI • Transforming Security Data into Actionable Intelligence