The embeddings can be found at: https://drive.google.com/drive/folders/1_ZeoYMyBTb2sROeKmxS0quVrbvJBrct2?usp=sharing
Note that ver1 is trained on only 0.3 million tweets, while ver2 is trained on 4.7 million tweets.
label_definitions.txt contains the label mapping for both tasks (i.e., the coarse-grained and fine-grained labels).
@inproceedings{rizwan2020hate,
title={Hate-speech and offensive language detection in roman Urdu},
author={Rizwan, Hammad and Shakeel, Muhammad Haroon and Karim, Asim},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
pages={2512--2522},
year={2020}
}