A curated list of awesome cultural NLP resources, inspired by awesome-computer-vision.
Table Of Contents
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art | Arxiv 2024 | 2406.03930 | ||
Towards Measuring and Modeling “Culture” in LLMs: A Survey | Arxiv 2024 | 2403.15412 | Github | Cool paper! |
Challenges and Strategies in Cross-Cultural NLP | ACL 2022 | 2203.10020 | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Vision-Language Models under Cultural and Inclusive Considerations | Arxiv 2024 | 2407.06177 | ||
Beyond Aesthetics: Cultural Competence in Text-to-Image Models | Arxiv 2024 | 2407.06863 | Data | Data |
M5 -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks | Arxiv 2024 | 2407.03791 | ||
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art | Arxiv 2024 | 2406.03930 | ||
NORMAD: A Benchmark for Measuring the Cultural Adaptability of Large Language Models | Arxiv 2024 | 2404.12464 | Data | Data |
An image speaks a thousand words, but can everyone listen? On image transcreation for cultural relevance | Arxiv 2024 | 2404.01247 | Code and Data | Data + Application |
No Culture Left Behind: Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking on 1000+ Sub-Country Regions and 2000+ Ethnolinguistic Groups | Arxiv 2024 | 2402.09369v1 | Data | |
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models | Arxiv 2024 (under review) | 2404.16019 | Repository | Code and Data |
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis | NAACL 2024 | 2308.16705 | Data+Code | |
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence | LREC-COLING '24 | https://arxiv.org/pdf/2403.06412 | Data | |
Bridging Cultural Nuances in Dialogue Agents through Cultural Value Surveys | EACL Findings 2024 | 2401.10352 | Dataset | |
Culturally Aware Natural Language Inference | EMNLP 2023 (Findings) | 2023.findings-emnlp.509 | Data | |
Global Voices, Local Biases: Socio-Cultural Prejudices across Languages | EMNLP 2023 | 2310.17586 | Data | Data+Analysis |
NORMSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly | EMNLP 2023 | 2210.08604 | Code and Data | NormsKB |
GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition | NeurIPS 2023 | 2301.02560 | Code and Data | |
SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models | ACL 2023 | 2305.11840 | Code | |
FORK: A Bite-Sized Test Set for Probing Culinary Cultural Biases in Commonsense Reasoning Models | ACL Findings 2023 | 2023.findings-acl.631 | Dataset | |
Multi-lingual and Multi-cultural Figurative Language Understanding | ACL Findings 2023 | 2305.16171 | Code | |
EnCBP: A New Benchmark Dataset for Finer-Grained Cultural Background Prediction in English | ACL Findings 2022 | 2203.14498 | ||
Re-contextualizing Fairness in NLP: The Case of India | AACL 2022 | 2209.12226 | Data | Data+Analysis |
Visually Grounded Reasoning across Languages and Cultures | EMNLP 2021 | 2109.13238 | Website | EMNLP 2021 Best Paper |
Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences | ACL 2020 | 2020.acl-main.477/ | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
CIC: A framework for Culturally-aware Image Captioning | IJCAI 2024 | 2402.05374 | Webpage | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
GIVL: Improving Geographical Inclusivity of Vision-Language Models With Pre-Training Methods | CVPR 2023 | 2301.01893 | Code (not released yet) | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting | Arxiv 2024 | 2406.11661 | ||
Extrinsic Evaluation of Cultural Competence in Large Language Models | Arxiv 2024 | 2406.11565 | ||
CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs’ (Lack of) Multicultural Knowledge | Arxiv 2024 | 2404.06664 | ||
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models | ACL 2024 | 2305.14456 | Code | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention | Arxiv 2024 | 2407.00377v1 | ||
On the Cultural Gap in Text-to-Image Generation | Arxiv 2023 | 2307.02971 | Code | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models | Arxiv 2024 | 2407.00263 | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation | ACL 2024 | 2401.06310 | ||
DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity | ICLR 2024 | 2308.06198 | Code | |
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis | JAIR 2023 | 2209.08891 | Code | |
Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models | Arxiv 2023 | 2310.01929 | Code (not released yet) | |
Inspecting the Geographical Representativeness of Images from Text-to-Image Models | ICCV 2023 | 2305.11080 | ||
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale | FAccT '23 | 2211.03759 | ||
Multilingual Conceptual Coverage in Text-to-Image Models | ACL 2023 | 2306.01735 | Code | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
The Echoes of Multilinguality: Tracing Cultural Value Shifts during LM Fine-tuning | ACL 2024 | 2405.12744 | ||
Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs | Arxiv 2024 | 2406.13993 | ||
CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting | Arxiv 2024 | 2404.10199v1 | Code | |
Knowledge of cultural moral norms in large language models | ACL 2023 | 2306.01857 | ||
Multilingual Language Models are not Multicultural: A Case Study in Emotion | WASSA: ACL 2023 | 2307.01370 | ||
Social Commonsense for Explanation and Cultural Bias Discovery | EACL 2023 | 2023.eacl-main.271.pdf | ||
DOSA: A Dataset of Social Artifacts from Different Indian Geographical Subcultures | LREC-COLING 2024 | 2403.14651 | Code | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Multilingual Diversity Improves Vision-Language Representations | Arxiv 2024 | 2405.16915 | ||
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision–Language Models | Arxiv 2024 | 2405.13777 | ||
Computer Vision Datasets and Models Exhibit Cultural and Linguistic Diversity in Perception | Arxiv 2024 | 2310.14356 | ||
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing | Arxiv 2024 | 2402.06015 | ||
‘Person’ == Light-skinned, Western Man, and Sexualization of Women of Color: Stereotypes in Stable Diffusion | EMNLP 2023 Findings | 2310.19981 | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Cross-Cultural Analysis of Human Values, Morals, and Biases in Folk Tales | EMNLP 2023 | 2023.emnlp-main.311 | ||
Social Commonsense for Explanation and Cultural Bias Discovery | EACL 2023 | 2023.eacl-main.271.pdf | ||
Cross-cultural variation of speech-accompanying gesture: A review | Language and Cognitive Processes: Volume 24, Issue 2, 2009 | 10.1080/01690960802586188 | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Investigating Cultural Alignment of Large Language Models | Arxiv 2024 | 2402.13231 | ||
Unintended Impacts of LLM Alignment on Global Representation | Arxiv 2024 | 2402.15018 | ||
Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study | C3NLP: EACL 2023 | 2303.17466 | Analysis | |
Probing Pre-Trained Language Models for Cross-Cultural Differences in Values | C3NLP: EACL 2023 | 2203.13722 | Analysis | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
NLPositionality: Characterizing Design Biases of Datasets and Models | ACL 2023 (Outstanding Paper) | 2023.acl-long.505.pdf | Website | |
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Cultural Concept Adaptation on Multimodal Reasoning | EMNLP 2023 | EMNLP Main 18 | ||
Title | Conference / Journal | Paper | Code | Remarks |
---|---|---|---|---|
Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning of Pragmatically Motivated Tasks | EACL 2021 | 2006.09336 | Sentiment Analysis | |
Please feel free to send a pull request or email me (khanuja.simran7@gmail.com) to add links.
License
To the extent possible under law, Simran Khanuja has waived all copyright and related or neighboring rights to this work.