
Publicly available data for Metrics:

Direct Assessments:

Every year, the organizers of the WMT News Translation task collect thousands of quality annotations in the form of Direct Assessments (DA). Most COMET models use that data either as z-scores or as relative ranks.

The table below links to that data for each year.

| Year | DA | Relative ranks | Paper |
| --- | --- | --- | --- |
| 2017 | 🔗 | 🔗 | Results of the WMT17 Metrics Shared Task |
| 2018 | 🔗 | 🔗 | Results of the WMT18 Metrics Shared Task |
| 2019 | 🔗 | 🔗 | Results of the WMT19 Metrics Shared Task |
| 2020 | 🔗 | 🔗 | Results of the WMT20 Metrics Shared Task |
| 2021 | 🔗 | 🔗 | Results of the WMT21 Metrics Shared Task |
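
As a rough illustration of the two forms mentioned above, the sketch below standardizes raw DA scores per annotator (z-scores) and derives relative-rank pairs from translations of the same segment. It is not the code used to build the linked files: the field names (`annotator`, `segment`, `system`, `raw`), the use of the population standard deviation, and the 25-point minimum gap are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def z_normalize(annotations):
    """Add a per-annotator z-score to each annotation.

    `annotations` is a list of dicts with the hypothetical keys
    'annotator', 'segment', 'system' and 'raw' (a 0-100 DA score).
    """
    scores_by_annotator = defaultdict(list)
    for ann in annotations:
        scores_by_annotator[ann["annotator"]].append(ann["raw"])

    # Mean and (population) standard deviation of each annotator's scores.
    stats = {
        name: (mean(scores), pstdev(scores))
        for name, scores in scores_by_annotator.items()
    }

    normalized = []
    for ann in annotations:
        mu, sigma = stats[ann["annotator"]]
        # Guard against annotators who gave the same score every time.
        z = (ann["raw"] - mu) / sigma if sigma > 0 else 0.0
        normalized.append({**ann, "z": z})
    return normalized


def relative_ranks(annotations, min_gap=25.0):
    """Turn DA scores into (segment, better_system, worse_system) tuples.

    Only pairs for the same segment whose raw scores differ by more than
    `min_gap` points are kept; the threshold is an assumption.
    """
    by_segment = defaultdict(list)
    for ann in annotations:
        by_segment[ann["segment"]].append(ann)

    pairs = []
    for segment, items in by_segment.items():
        for x, y in combinations(items, 2):
            if abs(x["raw"] - y["raw"]) > min_gap:
                better, worse = (x, y) if x["raw"] > y["raw"] else (y, x)
                pairs.append((segment, better["system"], worse["system"]))
    return pairs
```

Z-scores remove differences in how strictly individual annotators use the scale, while relative ranks discard the scale entirely and keep only which of two translations was judged clearly better.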

Multidimensional Quality Metrics

In recent editions of the WMT Metrics shared task, the organizers evaluated MT using Multidimensional Quality Metrics (MQM), following findings that crowd-sourced Direct Assessments are noisy and do not correlate well with annotations done by experts [Freitag et al., 2021].

| Year | MQM | Paper |
| --- | --- | --- |
| 2020 | 🔗 | A Large-Scale Study of Human Evaluation for Machine Translation |
| 2021 | 🔗 | Results of the WMT21 Metrics Shared Task |