This project classifies toxic and sarcastic comments with a fine-tuned BERT model, reaching over 95% validation accuracy. It combines labeled data from the Jigsaw Toxic Comment Classification Challenge with pseudo-labeled unlabeled data, and uses a custom API wrapper named `isomina` for training orchestration and evaluation.
- Base: `bert-base-uncased`, fine-tuned on the labeled and pseudo-labeled Jigsaw dataset
- Multi-label classification (`toxic`, `severe_toxic`, `obscene`, `threat`, `insult`, `identity_hate`)
- Trained with the custom `isomina` API wrapper for streamlined experimentation and deployment
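The multi-label setup above can be sketched with the Hugging Face `transformers` library (the `isomina` wrapper is not public, so this shows the equivalent plain-transformers calls; the model here is randomly initialized so the sketch runs offline, whereas the project fine-tunes weights loaded from `bert-base-uncased`):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# problem_type switches the model to a BCE-with-logits loss, one sigmoid per
# label, instead of a softmax over classes.
config = BertConfig(
    num_labels=len(LABELS),
    problem_type="multi_label_classification",
)
model = BertForSequenceClassification(config)  # real runs use .from_pretrained(...)

input_ids = torch.randint(0, config.vocab_size, (1, 16))  # stand-in token ids
with torch.no_grad():
    logits = model(input_ids=input_ids).logits            # shape: (1, 6)
probs = torch.sigmoid(logits)                             # independent per-label probabilities
```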
Visualization of Predictions:

- In `output2.png`, we see that a sarcastic message was correctly identified.
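Predictions like the ones in the output images come from thresholding per-label sigmoid probabilities. A minimal illustration (the logit values below are made up; only the label names come from the Jigsaw task):

```python
import torch

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

logits = torch.tensor([[2.1, -3.0, 1.4, -4.2, 0.9, -3.8]])  # fabricated example logits
probs = torch.sigmoid(logits)[0]                            # per-label probabilities

# A label is predicted when its probability clears the 0.5 threshold.
predicted = [name for name, p in zip(LABELS, probs) if p > 0.5]
print(predicted)  # here: toxic, obscene, insult
```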
- 95% validation accuracy
- Semi-supervised training using pseudo-labeling
- BERT fine-tuning pipeline
- Easy experimentation with the `isomina` API
- Robust evaluation with ROC AUC, F1, and accuracy metrics
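The pseudo-labeling step can be sketched as follows: predict on unlabeled text, keep only examples where the model is confident about every label, and add those as hard labels to the training set. `model` and `tokenize` are stand-ins for the project's own components (assumed names):

```python
import torch

def pseudo_label(model, tokenize, unlabeled_texts, threshold=0.9):
    """Return (text, hard_labels) pairs the model is confident about."""
    model.eval()
    kept = []
    with torch.no_grad():
        for text in unlabeled_texts:
            probs = torch.sigmoid(model(**tokenize(text)).logits)[0]
            # Confident means each label is pushed near 0 or near 1.
            confident = (probs > threshold) | (probs < 1 - threshold)
            if bool(confident.all()):
                kept.append((text, (probs > 0.5).float()))  # binarize to hard labels
    return kept
```

The confident pairs are then mixed into the labeled Jigsaw data for further fine-tuning; the 0.9 threshold here is an illustrative choice, not the project's documented value.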
- Data:
  - Labeled: Jigsaw Toxic Comment Classification Challenge
  - Unlabeled: Jigsaw Unintended Bias dataset
- API: integrated with the custom `isomina` API for:
  - Data preprocessing
  - Model orchestration
  - Metrics visualization
  - Inference handling
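The ROC AUC, F1, and accuracy evaluation mentioned above can be reproduced with scikit-learn for multi-label outputs. The `y_true`/`y_prob` arrays below are toy values for illustration, not results from the actual model:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score, accuracy_score

# Toy ground truth and predicted probabilities for 4 comments x 3 labels.
y_true = np.array([[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.8], [0.1, 0.3, 0.2], [0.7, 0.6, 0.4], [0.2, 0.1, 0.9]])
y_pred = (y_prob > 0.5).astype(int)

roc_auc = roc_auc_score(y_true, y_prob, average="macro")  # per-label AUC, averaged
f1 = f1_score(y_true, y_pred, average="macro")            # per-label F1, averaged
subset_acc = accuracy_score(y_true, y_pred)               # exact-match (subset) accuracy
```

Note that for multi-label problems `accuracy_score` computes subset accuracy (every label must match), which is stricter than per-label accuracy.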
```bash
git clone https://github.com/yourusername/toxic-comment-bert.git
cd toxic-comment-bert
```

- transformers
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- tqdm
- isomina # Custom or private package, ensure it's accessible
- torch>=1.10.0