
📣 Toxicity Detector v0.1.0



🎉 We're excited to announce the first official release of Toxicity Detector, an LLM-based pipeline for detecting toxic speech in text.

🌟 What is Toxicity Detector?

Toxicity Detector is a configurable, multi-stage pipeline that uses Large Language Models to analyze text and identify toxic content. It supports two toxicity types out of the box:

Personalized toxicity: Direct attacks on individuals (insults, threats, harassment)
Hate speech: Group-based toxicity targeting protected characteristics

✨ Key Features

🔧 Highly configurable via YAML configuration files (see the configuration sketch after this list)
🎯 Multi-stage analysis with preparatory questions and configurable indicators
🖥️ Multiple interfaces: CLI, Python API, and interactive Gradio web UI
💾 Built-in result serialization for auditing and analysis
🔌 Flexible model support: Compatible with OpenAI, Hugging Face, and other LangChain-supported providers
📊 Transparent reasoning: Get detailed explanations alongside toxicity verdicts
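
To make the configuration and model-flexibility points above concrete, here is a minimal sketch of how a pipeline configuration could be expressed and loaded. The keys, indicator names, and overall schema are illustrative assumptions, not the project's documented format; the README has the actual configuration guide.

```python
# Illustrative configuration sketch -- all keys and values below are
# assumptions for demonstration, not the documented schema.
import yaml

config_text = """
toxicity_type: hate_speech          # or: personalized
model:
  provider: openai                  # any LangChain-supported provider
  name: gpt-4o-mini
preparatory_questions:
  - "Who or what is the target of the statement?"
  - "Does the statement refer to a protected characteristic?"
indicators:
  - dehumanizing_language
  - incitement_to_violence
"""

config = yaml.safe_load(config_text)
print(config["model"]["provider"])  # -> "openai"
```

In a setup like this, switching between OpenAI, Hugging Face, or another LangChain-supported provider would only mean editing the model block; the supported option names are documented in the configuration guide.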
🚀 Quick Start
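
As a rough sketch, assuming the package is installed, programmatic use might look like the following. The import path, class name, and method names are hypothetical placeholders for the Python API described above; the README documents the actual interface and installation steps.

```python
# Sketch only: the import path, class, and methods below are hypothetical
# placeholders for the Python API -- consult the README for the real interface.
from toxicity_detector import ToxicityDetector

detector = ToxicityDetector.from_config("configs/hate_speech.yaml")  # hypothetical loader
result = detector.analyze("Example input text to screen for toxicity.")

print(result.verdict)    # toxic / non-toxic decision
print(result.reasoning)  # the detailed explanation that accompanies the verdict
result.save("results/run_001.json")  # built-in result serialization (hypothetical call)
```

The same pipeline is also available from the command line and through the interactive Gradio web UI.
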
📚 Documentation

See the README for detailed installation instructions, configuration guides, and usage examples.

🙏 Acknowledgements

This project was developed as part of the KIdeKu project, funded by the German Federal Ministry of Education, Family Affairs, Senior Citizens, Women and Youth (BMBFSFJ).