🕹️ CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech [Accepted at IJCAI 2022: AI for Good (Special Track)]
Punyajoy Saha, Kanishk Singh, Adarsh Kumar, Binny Mathew and Animesh Mukherjee : "CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech"
Recently, many studies have tried to create generation models to assist counter speakers by providing counterspeech suggestions for combating the explosive proliferation of online hate. However, since these suggestions are from a vanilla generation model, they might not include the appropriate properties required to counter a particular hate speech instance. In this paper, we propose CounterGeDi - an ensemble of generative discriminators (GeDi) to guide the generation of a DialoGPT model toward more polite, detoxified, and emotionally laden counterspeech. We generate counterspeech using three datasets and observe significant improvement across different attribute scores. The politeness and detoxification scores increased by around 15% and 6% respectively, while the emotion in the counterspeech increased by at least 10% across all the datasets. We also experiment with triple-attribute control and observe significant improvement over single attribute results when combining complementing attributes, e.g., politeness, joyfulness and detoxification. In all these experiments, the relevancy of the generated text does not deteriorate due to the application of these controls.
WARNING: The repository contains content that is offensive and/or hateful in nature.
Please cite our paper in any published work that uses any of these resources.
@misc{https://doi.org/10.48550/arxiv.2205.04304,
doi = {10.48550/ARXIV.2205.04304},
url = {https://arxiv.org/abs/2205.04304},
author = {Saha, Punyajoy and Singh, Kanishk and Kumar, Adarsh and Mathew, Binny and Mukherjee, Animesh},
keywords = {Computation and Language (cs.CL), Computers and Society (cs.CY), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
./Discriminator --> Contains the code for the discriminators used in the GeDi model
./Generation --> Contains the code for generating results with our proposed model
./Utils --> Contains utility functions such as preprocessing and data loading
To train the base model for counterspeech generation, run Generation_training.py after updating the task name and the save-related parameters as required (see the comments in the file for details on the path variables to update).
To generate results, run Generation_gedi.py.
To produce the required result file, adjust the entries of the params dictionary in the Python file as needed. For example:
# To generate sentences controlled for emotion joy + Politeness:
params = {
...
...
'disc_weight':[0.5, 0.5],
...
...
'task_name':[('Emotion', 'joy'), ('Politeness', 'polite')],
...
}
Similarly, you can tweak other parameters to change the results as required.
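Before launching a long generation run, it can help to sanity-check the control settings. The sketch below is not part of the repository: check_control_params is a hypothetical helper, and it assumes that each entry in task_name is paired with the weight at the same position in disc_weight, which the example above suggests but the repository does not state explicitly.

```python
def check_control_params(params):
    """Hypothetical helper (not in the repository).

    Assumption: task_name[i] is controlled with weight disc_weight[i].
    Returns a mapping from each (task, attribute) pair to its weight.
    """
    tasks = params["task_name"]
    weights = params["disc_weight"]

    # One weight per controlled attribute is assumed here.
    if len(tasks) != len(weights):
        raise ValueError(
            f"{len(tasks)} tasks but {len(weights)} discriminator weights"
        )

    # Each task is expected to be a (task, attribute) tuple, as in the example.
    for task in tasks:
        if not (isinstance(task, tuple) and len(task) == 2):
            raise ValueError(f"expected a (task, attribute) tuple, got {task!r}")

    return dict(zip(tasks, weights))


# Only the two keys shown in the example above are included here.
params = {
    "disc_weight": [0.5, 0.5],
    "task_name": [("Emotion", "joy"), ("Politeness", "polite")],
}
pairs = check_control_params(params)
for (task, attribute), weight in pairs.items():
    print(f"{task}/{attribute}: weight {weight}")
```

The check only covers the two keys shown in the example; the remaining entries of params are left to the comments in Generation_gedi.py.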
For Generation Metrics:
- We evaluate the generated responses on a variety of metrics, including BLEU, METEOR, diversity, and novelty.
- The methods to compute these scores are described in Evaluation notebook.ipynb
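To give a feel for what a diversity score measures, here is a minimal sketch of distinct-n, a commonly used diversity metric (the ratio of unique n-grams to total n-grams across all responses). This is an illustration only: the function name distinct_n and the whitespace tokenization are assumptions, and the notebook may compute diversity differently.

```python
def distinct_n(sentences, n=2):
    """Ratio of unique n-grams to total n-grams over all sentences.

    Illustrative only: uses lowercased whitespace tokens, which may
    differ from the tokenization used in the evaluation notebook.
    """
    total = 0
    unique = set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    # Guard against empty input or sentences shorter than n tokens.
    return len(unique) / total if total else 0.0


# Made-up responses, used only to exercise the function.
responses = [
    "hate has no place here",
    "hate has no place online",
]
print(distinct_n(responses, n=1))  # 6 unique unigrams out of 10 -> 0.6
print(distinct_n(responses, n=2))  # 5 unique bigrams out of 8 -> 0.625
```

Higher values mean the generated responses repeat fewer n-grams; a value near 0 indicates near-identical outputs.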
For Emotions Evaluation:
- Run git clone https://github.com/monologg/GoEmotions-pytorch
- Then move Evaluation notebook-Emotion to the GoEmotions-pytorch folder and set the file paths accordingly to run the evaluation
For Toxicity Evaluation:
- Toxicity is calculated using the HateXplain model
- The Colab notebook can be accessed here - CounterGedi_detox_eval.ipynb
For Grammatical Coherence Evaluation:
- To evaluate whether the responses are grammatically correct, we use a pretrained model trained on the Corpus of Linguistic Acceptability (CoLA) and report CoLA scores.
- The Colab notebook can be accessed here - CounterGedi_COLA_eval.ipynb
- Add arxiv paper link.
- Add link to Proceedings paper.
- Usage Instruction General
- Add Evaluation Instruction
- Remove Redundant Files
- Add generated result files