Toxic Comments

Project overview

https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

For most online communities, social interaction and discussion are the core mechanisms through which users communicate, share information, and exchange opinions on diverse topics. However, the anonymity afforded by such communities has led to an increase in misbehavior, such as abuse and harassment, the spread of propaganda, and hate speech. Such misbehavior negatively influences users' online experience and harms the health of the online environment. Using 159,571 human-labelled comments from Wikipedia discussion pages, this project aims to address the problem by providing a technical tool that detects toxic social interactions accurately and effectively. Using enriched word embedding feature sets and several machine learning techniques, the proposed model achieves both a high overall accuracy and a high F1 score in distinguishing toxic comments from ordinary ones.
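The overview reports both accuracy and F1. A minimal sketch of why the two are worth reading together is given below. It assumes binary labels where 1 marks a toxic comment and that toxic comments are a minority of the data. The function name `accuracy_and_f1` and the toy labels are made up for illustration and are not taken from the project notebooks.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels where 1 marks a toxic comment."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels (hypothetical): 2 toxic comments out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_clean = [0] * 10                          # never flags anything
one_caught = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]      # finds one of the two toxic comments

print(accuracy_and_f1(y_true, always_clean))
print(accuracy_and_f1(y_true, one_caught))
```

On these toy labels, a classifier that never flags a comment still scores 0.8 accuracy, but its F1 is 0 because it finds no toxic comments. This is why a high accuracy alone would say little about detection quality on skewed data.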

Results:

Presentation slides: https://github.com/LittleRabbitHole/ToxicCommentsClassification/blob/master/Results.pdf

Detection with word level features: DataExplorationPrediction_Features.ipynb

Detection with word embedding features: DataExplorationPrediction_w2v.ipynb

Word embedding: w2v_train.py
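One common way to turn word embeddings into comment-level features is to average the vectors of the words in a comment. Whether the notebooks above do exactly this is not stated here, so the sketch below is an illustration under that assumption. The helper `comment_vector` and the two-dimensional `toy_embeddings` are hypothetical and do not come from `w2v_train.py`.

```python
def comment_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    # Component-wise mean over all tokens found in the embedding table.
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy 2-dimensional embeddings (made up for illustration).
toy_embeddings = {
    "you": [0.0, 1.0],
    "are": [1.0, 0.0],
    "great": [1.0, 1.0],
}

features = comment_vector("you are great".split(), toy_embeddings, dim=2)
unknown = comment_vector("zzz qqq".split(), toy_embeddings, dim=2)
print(features, unknown)
```

The resulting fixed-length vector can then be passed to any standard classifier. Out-of-vocabulary words are skipped here, which is one simple choice among several.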

Repository contents:

.ipynb_checkpoints
cabin-sketch-v1.02
test
word_cloud_pngs
CleanCommentGenerator.py
CommentsFeaturesGenerator.py
DataExplorationPrediction_Features.ipynb
DataExplorationPrediction_w2v.ipynb
LSTMSession-tn8.ipynb
README.md
Results.pdf
ToxicCommentAnalysis.ipynb
Word2Vec_merge.py
match_controls_fullfeatures.ipynb
w2v_train.py
