This is a Twitter corpus built with the aim of representing and analyzing hate speech against some minority groups in Italy: immigrants in particular, but also Muslims and Roma.
Like the corpus released by Waseem and Hovy (2016), this corpus contains only tweet IDs and their annotations. The text of each tweet can be retrieved by querying the Twitter API with the corresponding ID.
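The retrieval step can be sketched as follows. This is a minimal illustration, not part of the released resource: it assumes the Twitter API v2 tweet lookup endpoint (which accepts up to 100 comma-separated IDs per request), a valid bearer token with sufficient access, and that the IDs have already been read from the corpus file into a list of strings. The function names are hypothetical. Tweets that have been deleted or made private are simply absent from the result.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Twitter API v2 tweet lookup endpoint.
API_URL = "https://api.twitter.com/2/tweets"
# The lookup endpoint accepts at most 100 IDs per request.
MAX_IDS_PER_REQUEST = 100


def batches(ids, size=MAX_IDS_PER_REQUEST):
    """Yield successive lists of at most `size` IDs."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_lookup_url(id_batch):
    """Build the lookup URL for one batch of tweet IDs (strings)."""
    query = urllib.parse.urlencode({"ids": ",".join(id_batch)})
    return f"{API_URL}?{query}"


def fetch_texts(ids, bearer_token):
    """Return {tweet_id: text} for the tweets that are still available."""
    texts = {}
    for id_batch in batches(ids):
        request = urllib.request.Request(
            build_lookup_url(id_batch),
            headers={"Authorization": f"Bearer {bearer_token}"},
        )
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        # Unavailable tweets are not listed under "data".
        for tweet in payload.get("data", []):
            texts[tweet["id"]] = tweet["text"]
    return texts
```

A call such as `fetch_texts(tweet_ids, token)` returns a dictionary that can be joined back to the annotations by tweet ID. Rate limits and error handling are left out of this sketch.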
The corpus development forms part of the Hate Speech Monitoring program coordinated by the Computer Science Department of the University of Turin (Italy).
If you use the resource, please cite:
@InProceedings{SanguinettiEtAlLREC2018,
author = {Manuela Sanguinetti and Fabio Poletto and Cristina Bosco and Viviana Patti and Marco Stranisci},
title = {An Italian Twitter Corpus of Hate Speech against Immigrants},
booktitle = {Proceedings of the 11th Conference on Language Resources and Evaluation (LREC2018)},
month = {May},
year = {2018},
address = {Miyazaki, Japan},
publisher = {},
pages = {2798--2895},
url = {}
}
Poletto F., Stranisci M., Sanguinetti M., Patti V., Bosco C. (2017) Hate speech annotation: Analysis of an Italian Twitter corpus. In: Proceedings of the 4th Italian Conference on Computational Linguistics (CLiC-it 2017), Rome, Italy.
The work is funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, project S1618_L2_BOSC_01) and by Fondazione CRT (Hate Speech and Social Media, project n. 2016.0688).