
📣 Toxicity Detector v0.1.0



🎉 We're excited to announce the first official release of Toxicity Detector, an LLM-based pipeline for detecting toxic speech in text.

🌟 What is Toxicity Detector?

Toxicity Detector is a configurable, multi-stage pipeline that uses Large Language Models to analyze text and identify toxic content. It supports two toxicity types out of the box:

Personalized toxicity: Direct attacks on individuals (insults, threats, harassment)
Hate speech: Group-based toxicity targeting protected characteristics

✨ Key Features

🔧 Highly configurable via YAML configuration files (see the configuration sketch after this list)
🎯 Multi-stage analysis with preparatory questions and configurable indicators
🖥️ Multiple interfaces: CLI, Python API, and interactive Gradio web UI
💾 Built-in result serialization for auditing and analysis
🔌 Flexible model support: Compatible with OpenAI, Hugging Face, and other LangChain-supported providers
📊 Transparent reasoning: Get detailed explanations alongside toxicity verdicts
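
To make the configuration and model-flexibility points above concrete, here is a minimal sketch of how a pipeline configuration could be expressed and loaded. The keys, indicator names, and overall schema are illustrative assumptions, not the project's documented format; the README has the actual configuration guide.

```python
# Illustrative configuration sketch -- all keys and values below are
# assumptions for demonstration, not the documented schema.
import yaml

config_text = """
toxicity_type: hate_speech          # or: personalized
model:
  provider: openai                  # any LangChain-supported provider
  name: gpt-4o-mini
preparatory_questions:
  - "Who or what is the target of the statement?"
  - "Does the statement refer to a protected characteristic?"
indicators:
  - dehumanizing_language
  - incitement_to_violence
"""

config = yaml.safe_load(config_text)
print(config["model"]["provider"])  # -> "openai"
```

In a setup like this, switching between OpenAI, Hugging Face, or another LangChain-supported provider would only mean editing the model block; the supported option names are documented in the configuration guide.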
🚀 Quick Start
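
As a rough sketch, assuming the package is installed, programmatic use might look like the following. The import path, class name, and method names are hypothetical placeholders for the Python API described above; the README documents the actual interface and installation steps.

```python
# Sketch only: the import path, class, and methods below are hypothetical
# placeholders for the Python API -- consult the README for the real interface.
from toxicity_detector import ToxicityDetector

detector = ToxicityDetector.from_config("configs/hate_speech.yaml")  # hypothetical loader
result = detector.analyze("Example input text to screen for toxicity.")

print(result.verdict)    # toxic / non-toxic decision
print(result.reasoning)  # the detailed explanation that accompanies the verdict
result.save("results/run_001.json")  # built-in result serialization (hypothetical call)
```

The same pipeline is also available from the command line and through the interactive Gradio web UI.
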
📚 Documentation

See the README for detailed installation instructions, configuration guides, and usage examples.

🙏 Acknowledgements

This project was developed as part of the KIdeKu project, funded by the German Federal Ministry of Education, Family Affairs, Senior Citizens, Women and Youth (BMBFSFJ).