Johannes-Schaefer/ihs
iHS

iHS (short for illegal hate speech) is a dataset of German Twitter messages annotated for potentially illegal hate speech and mentions of identity terms. This repository contains the anonymized dataset as well as the scripts and logs of the experiments. I am happy to provide the complete data on request; please contact me and briefly explain your intended use.

Further information is provided in the article: Johannes Schäfer. 2023. Bias Mitigation for Capturing Potentially Illegal Hate Speech. Datenbank Spektrum. https://doi.org/10.1007/s13222-023-00439-0

We describe the annotation guidelines in our previous work (Bibtex) and provide them online (in German).

This repository provides the anonymized dataset and supplementary material in the following structure:

data/iHS-corpus_anonymized.xml  -- the iHS dataset (anonymized by providing only the Tweet ID for each instance and by masking username mentions)
keywords  -- list of keywords used to query Twitter data
src/ -- source code for the experiments described in the article
logs/ -- log files of the conducted experiments 
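
This README does not document the element or attribute names used in the XML file, so a reasonable first step is to inspect the file before writing a loader. The following is a minimal sketch using only the Python standard library. The function name `summarize_xml` is hypothetical and not part of this repository, and the sketch assumes nothing about the schema: it only counts the tags and attribute names it finds.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def summarize_xml(data):
    """Count element tags and attribute names in an XML document.

    `data` is the XML content as bytes or str. No element or attribute
    names are assumed; everything reported is read from the input.
    """
    root = ET.fromstring(data)
    tag_counts = Counter()
    attr_counts = Counter()
    # root.iter() walks the root element and all of its descendants.
    for elem in root.iter():
        tag_counts[elem.tag] += 1
        attr_counts.update(elem.attrib.keys())
    return tag_counts, attr_counts


if __name__ == "__main__":
    # Path taken from the repository structure listed above.
    path = Path("data/iHS-corpus_anonymized.xml")
    if path.exists():
        tags, attrs = summarize_xml(path.read_bytes())
        print("Element tags:", dict(tags))
        print("Attribute names:", dict(attrs))
    else:
        print(f"{path} not found; run this from the repository root.")
```

Once the actual tag and attribute names are known, the same `root.iter()` loop can be narrowed to the elements that hold the instances and their annotations.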

Citation

If you use any of the material from this repository, please cite the article: Johannes Schäfer. 2023. Bias Mitigation for Capturing Potentially Illegal Hate Speech. Datenbank Spektrum. https://doi.org/10.1007/s13222-023-00439-0

Bibtex:

@article{schaefer2023bias,
  title={Bias Mitigation for Capturing Potentially Illegal Hate Speech},
  author={Sch{\"a}fer, Johannes},
  journal={Datenbank-Spektrum},
  doi={10.1007/s13222-023-00439-0},
  year={2023}
}
