Skip to content

isadfrn/network-threat-classifier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Network threat classifier

Languages used Repository size Last commit

About

This repository contains Python scripts for analyzing network traffic data to classify IPs as compromised or non-compromised using a Support Vector Machine (SVM) model and to predict network traffic flows using a Random Forest Regressor model.

This was a project for a Machine Learning class during my Master's degree. The presentation is written in Portuguese and can be accessed here.

Project Structure

network-threat-classifier/
├── src/                    # Python source code
│   ├── preprocessor.py     # Data preprocessing
│   ├── svn.py              # SVM classification
│   └── predictor.py        # Random Forest prediction
├── data/                   # Input data
│   └── data.csv
├── output/                 # Generated files (ignored by git)
│   ├── processed_data.csv
│   ├── roc_curve.png
│   └── prediction_comparison.png
├── docs/                   # Documentation
│   └── slides.pdf
├── run.py                  # Main runner script
├── Makefile                # Makefile for automation
├── requirements.txt        # Python dependencies
└── README.md

Run

Run the complete pipeline with a single command. The script will automatically create a virtual environment, install dependencies, and execute all steps.

Using Python script (works on all platforms):

python run.py

Using Makefile (Linux/Mac, or Windows with Make installed):

make

Both commands will automatically:

  • Create a virtual environment if it doesn't exist
  • Install all dependencies from requirements.txt
  • Run the complete pipeline (preprocessing → classification → prediction)

Additional Makefile commands:

make preprocess  # Data preprocessing only
make classify    # Preprocessing + SVM classification
make predict     # Complete pipeline
make clean       # Remove generated files
make clean-venv  # Remove virtual environment
make help        # Show all available commands

Contributing

This repository is using Gitflow Workflow and Conventional Commits, so if you want to contribute:

  • create a branch from develop branch;
  • make your contributions;
  • open a Pull Request to develop branch;
  • wait for discussion and future approval;

I thank you in advance for any contribution.

Status

Finished

License

MIT

About

This repository contains Python scripts for analyzing network traffic data to classify IPs as compromised or non-compromised using a Support Vector Machine (SVM) model and to predict network traffic flows using a Random Forest Regressor model.

Topics

Resources

License

Stars

Watchers

Forks

Contributors