# 📣 Toxicity Detector v0.1.0
🎉 We're excited to announce the first official release of Toxicity Detector, an LLM-based pipeline for detecting toxic speech in text.
## 🌟 What is Toxicity Detector?
Toxicity Detector is a configurable, multi-stage pipeline that uses Large Language Models to analyze text and identify toxic content. It supports two toxicity types out of the box:
- **Personalized toxicity**: Direct attacks on individuals (insults, threats, harassment)
- **Hate speech**: Group-based toxicity targeting protected characteristics
## ✨ Key Features
- 🔧 **Highly configurable** via YAML configuration files (see the config sketch after this list)
- 🎯 **Multi-stage analysis** with preparatory questions and configurable indicators
- 🖥️ **Multiple interfaces**: CLI, Python API, and interactive Gradio web UI
- 💾 **Built-in result serialization** for auditing and analysis
- 🔌 **Flexible model support**: compatible with OpenAI, Hugging Face, and other LangChain-supported providers
- 📊 **Transparent reasoning**: get detailed explanations alongside toxicity verdicts
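
To give a feel for the YAML-driven setup, here is a hypothetical configuration sketch. Every key and value below is an illustrative assumption, not the shipped schema; the README documents the actual configuration format.

```yaml
# Hypothetical config sketch -- key names and values are illustrative
# assumptions, not the actual schema; see the README for the real format.
toxicity_type: hate_speech          # or: personalized_toxicity
model:
  provider: openai                  # any LangChain-supported provider
  name: gpt-4o-mini
pipeline:
  preparatory_questions:            # asked before the final verdict
    - "Does the text refer to a person or group?"
    - "Does the text contain slurs or degrading language?"
  indicators:                       # configurable signals of toxicity
    - dehumanizing_language
    - incitement_to_violence
output:
  serialize_results: true           # persist verdicts and reasoning for auditing
```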
## 🚀 Quick Start
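A minimal end-to-end sketch of the Python API. The package, class, and method names (`toxicity-detector`, `ToxicityDetector`, `from_config`, `analyze`) are assumptions based on the feature list above, not the published interface; consult the README for the actual API.

```python
# Minimal usage sketch -- all names here are hypothetical assumptions
# inferred from the feature list, not the published API.
#
# Assumed installation (hypothetical package name):
#   pip install toxicity-detector

from toxicity_detector import ToxicityDetector  # hypothetical import

# Load a pipeline from a YAML config (see the sketch above).
detector = ToxicityDetector.from_config("configs/hate_speech.yaml")

# Analyze a piece of text; the pipeline returns a verdict plus reasoning.
result = detector.analyze("Example input text to screen for toxicity.")

print(result.verdict)    # e.g. True / False
print(result.reasoning)  # detailed explanation behind the verdict
```

The CLI and the Gradio web UI expose the same pipeline without writing any Python; see the README for the exact commands.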
## 📚 Documentation
See the README for detailed installation instructions, configuration guides, and usage examples.
## 🙏 Acknowledgements
This project was developed as part of the KIdeKu project, funded by the German Federal Ministry of Education, Family Affairs, Senior Citizens, Women and Youth (BMBFSFJ).