This is a list of papers on ethical issues in NLP. The list is still being updated; suggestions and issues are welcome.
- 💾 marks a paper that releases a new dataset.
- Survey and Framework
- Semantic Bias
- Bias in Language Generation
- Downstream Task
- Hate Speech
- Analysis
- Benchmarks
- Mitigating Gender Bias in Natural Language Processing: Literature Review (ACL 2019)
- Language (Technology) is Power: A Critical Survey of “Bias” in NLP (ACL 2020)
- Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview (ACL 2020)
- On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 (FAccT 2021)
- Case Study: Deontological Ethics in NLP (NAACL 2021)
- Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings (NIPS 2016)
- Semantics derived automatically from language corpora contain human-like biases (Science 2017)
- Word embeddings quantify 100 years of gender and ethnic stereotypes (PNAS 2018)
- Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them (NAACL 2019)
- Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings (NAACL 2019)
- Simple dynamic word embeddings for mapping perceptions in the public sphere (NAACL 2019 Workshop)
- It’s All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution (EMNLP 2019)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation (ACL 2020)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer (ACL 2020) - 💾
- Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation (EMNLP 2020)
- Unequal Representations: Analyzing Intersectional Biases in Word Embeddings Using Representational Similarity Analysis (COLING 2020)
- Intrinsic Bias Metrics Do Not Correlate with Application Bias (arXiv)
- On Measuring Social Biases in Sentence Encoders (NAACL 2019)
- Gender Bias in Contextualized Word Embeddings (NAACL 2019)
- Measuring Bias in Contextualized Word Representations (ACL 2019 Workshop)
- Assessing Social and Intersectional Biases in Contextualized Word Representations (NeurIPS 2019)
- Towards Debiasing Sentence Representations (ACL 2020)
- Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings (ACL 2020)
- Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations (COLING 2020)
- Unmasking Contextual Stereotypes: Measuring and Mitigating BERT’s Gender Bias (COLING 2020)
- Debiasing Pre-trained Contextualised Embeddings (EACL 2021)
- FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders (ICLR 2021)
- On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning (NAACL 2021)
- The Woman Worked as a Babysitter: On Biases in Language Generation (EMNLP 2019)
- Plug and Play Language Models: A Simple Approach to Controlled Text Generation (Section 4.4) (ICLR 2020)
- Social Bias Frames: Reasoning about Social and Power Implications of Language (ACL 2020) - 💾
- Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning (EMNLP 2020)
- Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation (EMNLP 2020)
- Towards Controllable Biases in Language Generation (EMNLP 2020 Findings)
- Does Gender Matter? Towards Fairness in Dialogue Systems (COLING 2020)
- Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP (arXiv)
- Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns (TACL 2018) - 💾
- Gender Bias in Neural Natural Language Processing (arXiv)
- Gender Bias in Coreference Resolution (NAACL 2018) - 💾
- Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods (NAACL 2018) - 💾
- Toward Gender-Inclusive Coreference Resolution (ACL 2020) - 💾
- Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models (EACL 2021)
- Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints (EMNLP 2017)
- Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting (FAT* 2019)
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection (ACL 2020)
- The Risk of Racial Bias in Hate Speech Detection (ACL 2019)
- Social Biases in NLP Models as Barriers for Persons with Disabilities (ACL 2020)
- Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection (NAACL 2021)
- Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds (ACL 2020)
- Investigating Gender Bias in Language Models Using Causal Mediation Analysis (NeurIPS 2020)
- Diverse Adversaries for Mitigating Bias in Training (EACL 2021)
- StereoSet: Measuring stereotypical bias in pretrained language models (arXiv 2020) - 💾
- CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models (EMNLP 2020) - 💾
- BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation (FAccT 2021) - 💾
- HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection (AAAI 2021) - 💾
- What Will it Take to Fix Benchmarking in Natural Language Understanding? (NAACL 2021)