Skip to content

amazon-science/machine-translation-gender-eval

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MT-GenEval

This repository contains the data and code for the MT-GenEval benchmark, which evaluates gender translation accuracy on English -> {Arabic, French, German, Hindi, Italian, Portuguese, Russian, Spanish}. The MT-GenEval benchmark was released in the EMNLP 2022 paper MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation by Anna Currey, Maria Nadejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, and Georgiana Dinu.

Citing

@inproceedings{currey-etal-2022-mtgeneval,
    title = "{MT-GenEval}: {A} Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation",
    author = "Currey, Anna  and
      Nadejde, Maria  and
      Pappagari, Raghavendra  and
      Mayer, Mia  and
      Lauly, Stanislas,  and
      Niu, Xing  and
      Hsu, Benjamin  and
      Dinu, Georgiana",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = ""https://arxiv.org/pdf/2211.01355.pdf,
}

Data

The data is originally sourced from Wikipedia. We include sentence-level development and test segments in data/sentences/ and inter-sentence test segments in data/context/.

Compute accuracy

To compute accuracy, use accuracy_metric.py script. Example usage for English-Russian contextual test dataset is as follows

python3 accuracy_metric.py \
        --target_lang ru \
        --dataset contextual \
        --data_split test \
        --hyp PATH_FOR_YOUR_SYSTEM_TRANSLATIONS

Example usage for English-Russian counterfactual test dataset is as follows

python3 accuracy_metric.py \
        --target_lang ru \
        --dataset counterfactual \
        --data_split test \
        --hyp_masculine PATH_FOR_YOUR_SYSTEM_TRANSLATIONS_FOR_MASCULINE_SEGMENTS \
        --hyp_feminine PATH_FOR_YOUR_SYSTEM_TRANSLATIONS_FOR_FEMININE_SEGMENTS

Security

See CONTRIBUTING for more information.

License

The data and code are released under the CC-BY-SA-3.0 License. See LICENSE for details.

About

Data and code for the MT-GenEval benchmark

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •