kushagra486/cybersecurity-threat-analysis

Cybersecurity: Suspicious Web Threat Interactions


Project Overview

This repository provides a production-ready pipeline for detecting and analyzing suspicious web traffic collected from AWS CloudWatch. It bundles data, notebooks, trained models, deployment tooling (FastAPI, Streamlit), CI configuration, Docker support, and documentation, so the project can be reproduced and deployed from a single checkout.

Dataset: data/CloudWatch_Traffic_Web_Attack.csv (path relative to the repository root)

Note: The dataset file is committed to the repository. If you publish the repository and the file is large or sensitive, consider tracking it with Git LFS, or upload it to GitHub Releases and add a download script.
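
If the dataset is moved out of the repository, a download script along these lines could restore it. This is a minimal sketch, not part of the project: the URL and the checksum are hypothetical placeholders, and only the target path comes from this README.

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical placeholders: replace with the real release asset URL and digest.
DATASET_URL = "https://example.com/CloudWatch_Traffic_Web_Attack.csv"
EXPECTED_SHA256 = None  # set to the published hex digest to enable verification
TARGET = Path("data/CloudWatch_Traffic_Web_Attack.csv")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large dataset never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_dataset(url=DATASET_URL, target=TARGET, expected_sha256=EXPECTED_SHA256):
    """Download the dataset unless it is already present, then verify it."""
    target = Path(target)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, target)
    if expected_sha256 is not None and sha256_of(target) != expected_sha256:
        raise ValueError(f"Checksum mismatch for {target}")
    return target
```

Calling `fetch_dataset()` before running the notebook would leave an existing file untouched and only hit the network when the file is missing.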


What’s included

  • data/ — original dataset and processed outputs.
  • notebooks/ — end-to-end Jupyter notebook with EDA, feature engineering, and modeling.
  • src/ — source code: data loaders, training scripts, streamlit_app.py, fastapi_app.py.
  • models/ — serialized model artifacts (IsolationForest, RandomForest, scaler) and model metadata.
  • reports/ — academic PDF, model documentation, architecture diagram, slides.
  • assets_banner.png — project banner for README or GitHub repo header.
  • Dockerfile, requirements.txt, .github/workflows/ci.yml, RUN_SERVICES.md, tests, and deployment helpers.
  • LICENSE, CONTRIBUTING.md, CODE_OF_CONDUCT.md, .gitignore.

Models & Artifacts

The project ships with pre-trained model artifacts saved under models/:

  • isolation_forest_model.pkl — unsupervised anomaly detector (IsolationForest).
  • random_forest_classifier.pkl — supervised classifier for suspicious traffic.
  • scaler.pkl — StandardScaler used to preprocess inputs for the classifier.

Model card (summary): see reports/Model_Documentation.pdf for input schema, expected performance notes, bias/limitations, and deployment guidance.
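
The artifacts listed above could be loaded and applied roughly as follows. This is a sketch under assumptions: it presumes the files were written with Python's `pickle` module (if they were saved with joblib, use `joblib.load` instead), that the scaler is applied only before the classifier as described above, and that the IsolationForest receives unscaled rows, which this README does not state. The helper names `load_artifacts` and `score_rows` are illustrative and do not come from the repository; check reports/Model_Documentation.pdf for the actual input schema.

```python
import pickle
from pathlib import Path

# File names as listed in this README; the helper names below are illustrative.
ARTIFACT_FILES = {
    "anomaly": "isolation_forest_model.pkl",
    "classifier": "random_forest_classifier.pkl",
    "scaler": "scaler.pkl",
}


def load_artifacts(models_dir="models"):
    """Unpickle the three artifacts into a dict keyed by role."""
    artifacts = {}
    for role, filename in ARTIFACT_FILES.items():
        with open(Path(models_dir) / filename, "rb") as fh:
            artifacts[role] = pickle.load(fh)
    return artifacts


def score_rows(rows, artifacts):
    """Return (anomaly_flags, class_labels) for a 2-D array-like of features.

    Assumption: the scaler feeds the classifier only; the anomaly detector
    sees the raw rows. Adjust if the training code did otherwise.
    """
    scaled = artifacts["scaler"].transform(rows)
    labels = list(artifacts["classifier"].predict(scaled))
    # IsolationForest.predict returns -1 for anomalies and 1 for inliers.
    flags = [int(p) == -1 for p in artifacts["anomaly"].predict(rows)]
    return flags, labels
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.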


Quickstart (local)

  1. Clone the repository:
     git clone https://github.com/kushagra486/cybersecurity-threat-analysis.git
     cd cybersecurity-threat-analysis
  2. Create a virtualenv and install dependencies:
     python -m venv venv
     source venv/bin/activate
     pip install -r requirements.txt
  3. Run the Streamlit dashboard:
     streamlit run src/streamlit_app.py
  4. Run the FastAPI prediction API:
     uvicorn src.fastapi_app:app --reload --host 0.0.0.0 --port 8000
  5. Run tests:
     pytest -q
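
Once the API is running, one way to check it and discover its endpoints is to read the OpenAPI document that FastAPI serves at `/openapi.json` by default (unless the app disables it). The sketch below assumes the host and port from the uvicorn command above; `list_routes` and `fetch_openapi` are illustrative helpers, not part of the repository.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_routes(openapi_spec):
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI document."""
    routes = []
    for path, operations in openapi_spec.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_openapi(base_url="http://localhost:8000"):
    """Download and parse the schema from a running FastAPI service."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


# Usage, with the server from the previous step running:
#   for method, path in list_routes(fetch_openapi()):
#       print(method, path)
```

The interactive documentation at http://localhost:8000/docs shows the same information in a browser.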

Production & Deployment Tips

  • Use the included Dockerfile to create a container for the FastAPI service. Use an orchestration platform (Kubernetes / ECS) for scaling and high availability.
  • Secure the API: add authentication (API keys, OAuth), TLS termination, rate limiting, and logging/monitoring.
  • Model updates: store model versions in models/ and use a retraining pipeline; update reports/Model_Documentation.pdf with each release.
  • Sensitive data: remove or anonymize IPs if publishing publicly.
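
For the last tip, IP addresses can be masked before the data leaves your environment. The following is a minimal sketch of two common options, truncation and keyed hashing, using only the standard library. The function names and the /24 and /48 prefix lengths are illustrative choices, not something the project prescribes, and the IP column names in the dataset are not stated in this README.

```python
import hashlib
import hmac
import ipaddress


def truncate_ip(address):
    """Zero the host bits: keep a /24 prefix for IPv4 and a /48 prefix for IPv6."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def pseudonymize_ip(address, secret_key):
    """Replace an IP with a keyed hash so equal IPs still map to equal tokens.

    secret_key must be bytes and must stay private; without it the tokens
    cannot be recomputed from candidate addresses.
    """
    normalized = str(ipaddress.ip_address(address)).encode("ascii")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()[:16]
```

Truncation keeps coarse network locality for analysis, while keyed hashing keeps per-address grouping without exposing the address itself. Which one fits depends on what the published data is meant to support.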

Contributing

See CONTRIBUTING.md for contribution guidelines, coding style, and testing instructions.


License & Attribution

This project is MIT licensed. See LICENSE for full details.

