- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn fvr glg ita lad lad_Latn lij lld_Latn lmo mwl oci osp_Latn pms por roh ron scn spa vec wln
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-06-28.zip
- test set translations: opus-2020-06-28.test.txt
- test set scores: opus-2020-06-28.eval.txt
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-arg.eng.arg | 2.2 | 0.147 |
Tatoeba-test.eng-ast.eng.ast | 17.2 | 0.415 |
Tatoeba-test.eng-cat.eng.cat | 47.7 | 0.669 |
Tatoeba-test.eng-cos.eng.cos | 3.2 | 0.262 |
Tatoeba-test.eng-egl.eng.egl | 0.4 | 0.119 |
Tatoeba-test.eng-ext.eng.ext | 5.5 | 0.304 |
Tatoeba-test.eng-fra.eng.fra | 45.8 | 0.641 |
Tatoeba-test.eng-frm.eng.frm | 0.9 | 0.212 |
Tatoeba-test.eng-fvr.eng.fvr | 2.6 | 0.260 |
Tatoeba-test.eng-glg.eng.glg | 45.8 | 0.655 |
Tatoeba-test.eng-ita.eng.ita | 45.9 | 0.678 |
Tatoeba-test.eng-lad.eng.lad | 8.9 | 0.324 |
Tatoeba-test.eng-lij.eng.lij | 1.8 | 0.191 |
Tatoeba-test.eng-lld.eng.lld | 0.5 | 0.215 |
Tatoeba-test.eng-lmo.eng.lmo | 0.9 | 0.203 |
Tatoeba-test.eng.multi | 44.1 | 0.645 |
Tatoeba-test.eng-mwl.eng.mwl | 4.1 | 0.331 |
Tatoeba-test.eng-oci.eng.oci | 7.8 | 0.289 |
Tatoeba-test.eng-osp.eng.osp | 10.8 | 0.382 |
Tatoeba-test.eng-pms.eng.pms | 1.8 | 0.197 |
Tatoeba-test.eng-por.eng.por | 41.7 | 0.637 |
Tatoeba-test.eng-roh.eng.roh | 2.8 | 0.257 |
Tatoeba-test.eng-ron.eng.ron | 41.8 | 0.640 |
Tatoeba-test.eng-scn.eng.scn | 1.8 | 0.175 |
Tatoeba-test.eng-spa.eng.spa | 50.3 | 0.691 |
Tatoeba-test.eng-vec.eng.vec | 3.2 | 0.251 |
Tatoeba-test.eng-wln.eng.wln | 6.6 | 0.236 |
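
The language-token requirement listed above can be sketched in a few lines. This is a minimal illustration only: the helper name `add_language_token` and the small `VALID_TARGETS` subset are assumptions made for the example, and the single space between the token and the sentence is an assumed convention. The code only prepares the input string; it does not load or run the model.

```python
# Hypothetical subset of target language IDs, hand-copied from the
# "target language(s)" list above; a real setup would use the full list.
VALID_TARGETS = {"cat", "fra", "glg", "ita", "por", "ron", "spa"}


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token for the target language."""
    if target_id not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    # Assumption: token and sentence are separated by one space.
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_language_token("How are you?", "fra"))  # prints: >>fra<< How are you?
```

Rejecting unknown IDs early is a design choice for the sketch: a multilingual model given an unrecognized token may still produce output, just not in the intended language.
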
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-14.zip
- test set translations: opus-2020-07-14.test.txt
- test set scores: opus-2020-07-14.eval.txt
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-arg.eng.arg | 1.7 | 0.133 |
Tatoeba-test.eng-ast.eng.ast | 17.2 | 0.415 |
Tatoeba-test.eng-cat.eng.cat | 47.5 | 0.668 |
Tatoeba-test.eng-cos.eng.cos | 1.8 | 0.215 |
Tatoeba-test.eng-egl.eng.egl | 0.4 | 0.087 |
Tatoeba-test.eng-ext.eng.ext | 13.7 | 0.353 |
Tatoeba-test.eng-fra.eng.fra | 44.1 | 0.629 |
Tatoeba-test.eng-frm.eng.frm | 0.6 | 0.196 |
Tatoeba-test.eng-gcf.eng.gcf | 0.9 | 0.116 |
Tatoeba-test.eng-glg.eng.glg | 43.7 | 0.640 |
Tatoeba-test.eng-hat.eng.hat | 30.1 | 0.529 |
Tatoeba-test.eng-ita.eng.ita | 44.8 | 0.668 |
Tatoeba-test.eng-lad.eng.lad | 7.5 | 0.301 |
Tatoeba-test.eng-lij.eng.lij | 1.5 | 0.187 |
Tatoeba-test.eng-lld.eng.lld | 0.8 | 0.199 |
Tatoeba-test.eng-lmo.eng.lmo | 0.8 | 0.177 |
Tatoeba-test.eng-mfe.eng.mfe | 91.9 | 0.956 |
Tatoeba-test.eng.multi | 42.3 | 0.631 |
Tatoeba-test.eng-mwl.eng.mwl | 2.7 | 0.252 |
Tatoeba-test.eng-oci.eng.oci | 7.3 | 0.290 |
Tatoeba-test.eng-pap.eng.pap | 43.7 | 0.627 |
Tatoeba-test.eng-pms.eng.pms | 2.4 | 0.194 |
Tatoeba-test.eng-por.eng.por | 40.7 | 0.632 |
Tatoeba-test.eng-roh.eng.roh | 3.5 | 0.258 |
Tatoeba-test.eng-ron.eng.ron | 40.0 | 0.628 |
Tatoeba-test.eng-scn.eng.scn | 1.6 | 0.100 |
Tatoeba-test.eng-spa.eng.spa | 48.7 | 0.680 |
Tatoeba-test.eng-vec.eng.vec | 1.9 | 0.166 |
Tatoeba-test.eng-wln.eng.wln | 8.1 | 0.226 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-20.zip
- test set translations: opus-2020-07-20.test.txt
- test set scores: opus-2020-07-20.eval.txt
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-arg.eng.arg | 1.5 | 0.132 |
Tatoeba-test.eng-ast.eng.ast | 15.4 | 0.413 |
Tatoeba-test.eng-cat.eng.cat | 47.8 | 0.671 |
Tatoeba-test.eng-cos.eng.cos | 3.3 | 0.293 |
Tatoeba-test.eng-egl.eng.egl | 0.2 | 0.085 |
Tatoeba-test.eng-ext.eng.ext | 11.7 | 0.311 |
Tatoeba-test.eng-fra.eng.fra | 44.8 | 0.633 |
Tatoeba-test.eng-frm.eng.frm | 1.0 | 0.213 |
Tatoeba-test.eng-gcf.eng.gcf | 0.8 | 0.119 |
Tatoeba-test.eng-glg.eng.glg | 44.5 | 0.646 |
Tatoeba-test.eng-hat.eng.hat | 25.5 | 0.494 |
Tatoeba-test.eng-ita.eng.ita | 45.1 | 0.673 |
Tatoeba-test.eng-lad.eng.lad | 8.0 | 0.305 |
Tatoeba-test.eng-lij.eng.lij | 1.5 | 0.178 |
Tatoeba-test.eng-lld.eng.lld | 0.4 | 0.171 |
Tatoeba-test.eng-lmo.eng.lmo | 1.5 | 0.191 |
Tatoeba-test.eng-mfe.eng.mfe | 91.9 | 0.956 |
Tatoeba-test.eng-msa.eng.msa | 31.2 | 0.548 |
Tatoeba-test.eng.multi | 42.6 | 0.632 |
Tatoeba-test.eng-mwl.eng.mwl | 3.3 | 0.288 |
Tatoeba-test.eng-oci.eng.oci | 7.5 | 0.287 |
Tatoeba-test.eng-pap.eng.pap | 44.8 | 0.630 |
Tatoeba-test.eng-pms.eng.pms | 2.7 | 0.198 |
Tatoeba-test.eng-por.eng.por | 41.3 | 0.635 |
Tatoeba-test.eng-roh.eng.roh | 4.3 | 0.271 |
Tatoeba-test.eng-ron.eng.ron | 40.6 | 0.631 |
Tatoeba-test.eng-scn.eng.scn | 1.4 | 0.173 |
Tatoeba-test.eng-spa.eng.spa | 49.2 | 0.684 |
Tatoeba-test.eng-vec.eng.vec | 4.8 | 0.240 |
Tatoeba-test.eng-wln.eng.wln | 5.4 | 0.233 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-27.zip
- test set translations: opus-2020-07-27.test.txt
- test set scores: opus-2020-07-27.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2016-enro-engron.eng.ron | 27.3 | 0.565 |
newsdiscussdev2015-enfr-engfra.eng.fra | 29.9 | 0.573 |
newsdiscusstest2015-enfr-engfra.eng.fra | 35.2 | 0.609 |
newssyscomb2009-engfra.eng.fra | 27.8 | 0.569 |
newssyscomb2009-engita.eng.ita | 29.0 | 0.590 |
newssyscomb2009-engspa.eng.spa | 29.5 | 0.567 |
news-test2008-engfra.eng.fra | 25.1 | 0.538 |
news-test2008-engspa.eng.spa | 27.2 | 0.547 |
newstest2009-engfra.eng.fra | 26.6 | 0.557 |
newstest2009-engita.eng.ita | 28.6 | 0.582 |
newstest2009-engspa.eng.spa | 28.7 | 0.565 |
newstest2010-engfra.eng.fra | 29.2 | 0.573 |
newstest2010-engspa.eng.spa | 33.6 | 0.598 |
newstest2011-engfra.eng.fra | 31.2 | 0.591 |
newstest2011-engspa.eng.spa | 34.8 | 0.599 |
newstest2012-engfra.eng.fra | 29.2 | 0.574 |
newstest2012-engspa.eng.spa | 35.1 | 0.601 |
newstest2013-engfra.eng.fra | 29.7 | 0.565 |
newstest2013-engspa.eng.spa | 31.7 | 0.576 |
newstest2016-enro-engron.eng.ron | 25.9 | 0.548 |
Tatoeba-test.eng-arg.eng.arg | 1.7 | 0.131 |
Tatoeba-test.eng-ast.eng.ast | 16.6 | 0.417 |
Tatoeba-test.eng-cat.eng.cat | 47.6 | 0.670 |
Tatoeba-test.eng-cos.eng.cos | 3.3 | 0.284 |
Tatoeba-test.eng-egl.eng.egl | 0.9 | 0.118 |
Tatoeba-test.eng-ext.eng.ext | 8.7 | 0.301 |
Tatoeba-test.eng-fra.eng.fra | 44.8 | 0.633 |
Tatoeba-test.eng-frm.eng.frm | 0.8 | 0.201 |
Tatoeba-test.eng-gcf.eng.gcf | 0.8 | 0.117 |
Tatoeba-test.eng-glg.eng.glg | 44.0 | 0.642 |
Tatoeba-test.eng-hat.eng.hat | 28.8 | 0.510 |
Tatoeba-test.eng-ita.eng.ita | 45.3 | 0.674 |
Tatoeba-test.eng-lad.eng.lad | 8.4 | 0.310 |
Tatoeba-test.eng-lij.eng.lij | 1.4 | 0.178 |
Tatoeba-test.eng-lld.eng.lld | 0.8 | 0.220 |
Tatoeba-test.eng-lmo.eng.lmo | 0.9 | 0.189 |
Tatoeba-test.eng-mfe.eng.mfe | 82.4 | 0.915 |
Tatoeba-test.eng-msa.eng.msa | 31.3 | 0.549 |
Tatoeba-test.eng.multi | 42.6 | 0.633 |
Tatoeba-test.eng-mwl.eng.mwl | 2.9 | 0.311 |
Tatoeba-test.eng-oci.eng.oci | 7.9 | 0.292 |
Tatoeba-test.eng-pap.eng.pap | 47.4 | 0.661 |
Tatoeba-test.eng-pms.eng.pms | 2.5 | 0.198 |
Tatoeba-test.eng-por.eng.por | 41.4 | 0.636 |
Tatoeba-test.eng-roh.eng.roh | 3.2 | 0.259 |
Tatoeba-test.eng-ron.eng.ron | 40.8 | 0.632 |
Tatoeba-test.eng-scn.eng.scn | 1.8 | 0.191 |
Tatoeba-test.eng-spa.eng.spa | 49.4 | 0.685 |
Tatoeba-test.eng-vec.eng.vec | 5.1 | 0.253 |
Tatoeba-test.eng-wln.eng.wln | 7.1 | 0.235 |
- dataset: opus2m
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus2m-2020-08-01.zip
- test set translations: opus2m-2020-08-01.test.txt
- test set scores: opus2m-2020-08-01.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2016-enro-engron.eng.ron | 27.6 | 0.567 |
newsdiscussdev2015-enfr-engfra.eng.fra | 30.2 | 0.575 |
newsdiscusstest2015-enfr-engfra.eng.fra | 35.5 | 0.612 |
newssyscomb2009-engfra.eng.fra | 27.9 | 0.570 |
newssyscomb2009-engita.eng.ita | 29.3 | 0.590 |
newssyscomb2009-engspa.eng.spa | 29.6 | 0.570 |
news-test2008-engfra.eng.fra | 25.2 | 0.538 |
news-test2008-engspa.eng.spa | 27.3 | 0.548 |
newstest2009-engfra.eng.fra | 26.9 | 0.560 |
newstest2009-engita.eng.ita | 28.7 | 0.583 |
newstest2009-engspa.eng.spa | 29.0 | 0.568 |
newstest2010-engfra.eng.fra | 29.3 | 0.574 |
newstest2010-engspa.eng.spa | 34.2 | 0.601 |
newstest2011-engfra.eng.fra | 31.4 | 0.592 |
newstest2011-engspa.eng.spa | 35.0 | 0.599 |
newstest2012-engfra.eng.fra | 29.5 | 0.576 |
newstest2012-engspa.eng.spa | 35.5 | 0.603 |
newstest2013-engfra.eng.fra | 29.9 | 0.567 |
newstest2013-engspa.eng.spa | 32.1 | 0.578 |
newstest2016-enro-engron.eng.ron | 26.1 | 0.551 |
Tatoeba-test.eng-arg.eng.arg | 1.4 | 0.125 |
Tatoeba-test.eng-ast.eng.ast | 17.8 | 0.406 |
Tatoeba-test.eng-cat.eng.cat | 48.3 | 0.676 |
Tatoeba-test.eng-cos.eng.cos | 3.2 | 0.275 |
Tatoeba-test.eng-egl.eng.egl | 0.2 | 0.084 |
Tatoeba-test.eng-ext.eng.ext | 11.2 | 0.344 |
Tatoeba-test.eng-fra.eng.fra | 45.3 | 0.637 |
Tatoeba-test.eng-frm.eng.frm | 1.1 | 0.221 |
Tatoeba-test.eng-gcf.eng.gcf | 0.6 | 0.118 |
Tatoeba-test.eng-glg.eng.glg | 44.2 | 0.645 |
Tatoeba-test.eng-hat.eng.hat | 28.0 | 0.502 |
Tatoeba-test.eng-ita.eng.ita | 45.6 | 0.674 |
Tatoeba-test.eng-lad.eng.lad | 8.2 | 0.322 |
Tatoeba-test.eng-lij.eng.lij | 1.4 | 0.182 |
Tatoeba-test.eng-lld.eng.lld | 0.8 | 0.217 |
Tatoeba-test.eng-lmo.eng.lmo | 0.7 | 0.190 |
Tatoeba-test.eng-mfe.eng.mfe | 91.9 | 0.956 |
Tatoeba-test.eng-msa.eng.msa | 31.1 | 0.548 |
Tatoeba-test.eng.multi | 42.9 | 0.636 |
Tatoeba-test.eng-mwl.eng.mwl | 2.1 | 0.234 |
Tatoeba-test.eng-oci.eng.oci | 7.9 | 0.297 |
Tatoeba-test.eng-pap.eng.pap | 44.1 | 0.648 |
Tatoeba-test.eng-pms.eng.pms | 2.1 | 0.190 |
Tatoeba-test.eng-por.eng.por | 41.8 | 0.639 |
Tatoeba-test.eng-roh.eng.roh | 3.5 | 0.261 |
Tatoeba-test.eng-ron.eng.ron | 41.0 | 0.635 |
Tatoeba-test.eng-scn.eng.scn | 1.7 | 0.184 |
Tatoeba-test.eng-spa.eng.spa | 50.1 | 0.689 |
Tatoeba-test.eng-vec.eng.vec | 3.2 | 0.248 |
Tatoeba-test.eng-wln.eng.wln | 7.2 | 0.220 |
- dataset: opus1m+bt
- model: transformer-align
- source language(s): arg eng
- target language(s): ast cat cbk cos egl eng ext fra frm gcf glg hat ind ita jak lad lij lld lmo max mfe min mol msa mwl oci osp pap pms pob por roh ron scn spa tmw vec wln zlm zsm
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: >>fra<< >>spa<< >>roh<< >>zlm_Latn<< >>cos<< >>ext<< >>mfe<< >>scn<< >>lad<< >>mwl<< >>ast<< >>hat<< >>pob<< >>pap<< >>lmo<< >>vec<< >>pms<< >>glg<< >>cat<< >>msa_Latn<< >>wln<< >>ind<< >>ron<< >>por<< >>ita<< >>oci<< >>lij<< >>jak_Latn<< >>eng<< >>min<< >>zlm<< >>mol<< >>cbk_Latn<<
- download: opus1m+bt-2021-03-23.zip
- test set translations: opus1m+bt-2021-03-23.test.txt
- test set scores: opus1m+bt-2021-03-23.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newsdiscussdev2015-enfr.eng-fra | 28.3 | 0.560 | 1500 | 27986 | 1.000 |
newsdiscusstest2015-enfr.eng-fra | 33.1 | 0.594 | 1500 | 28027 | 0.992 |
newssyscomb2009.eng-fra | 26.4 | 0.558 | 502 | 12334 | 0.997 |
news-test2008.eng-fra | 23.9 | 0.529 | 2051 | 52685 | 0.992 |
newstest2009.eng-fra | 25.6 | 0.548 | 2525 | 69278 | 0.978 |
newstest2010.eng-fra | 27.8 | 0.563 | 2489 | 66043 | 0.985 |
Tatoeba-test.eng-arg | 9.3 | 0.311 | 105 | 405 | 1.000 |
Tatoeba-test.eng-ast | 26.3 | 0.492 | 99 | 720 | 0.986 |
Tatoeba-test.eng-cat | 45.4 | 0.654 | 1631 | 12342 | 0.989 |
Tatoeba-test.eng-cbk | 4.8 | 0.262 | 1498 | 10591 | 0.993 |
Tatoeba-test.eng-cos | 36.2 | 0.616 | 5 | 45 | 0.907 |
Tatoeba-test.eng-egl | 0.4 | 0.127 | 84 | 438 | 0.963 |
Tatoeba-test.eng-ext | 4.8 | 0.337 | 69 | 353 | 1.000 |
Tatoeba-test.eng-fra | 40.8 | 0.613 | 10000 | 80759 | 0.973 |
Tatoeba-test.eng-frm | 1.0 | 0.209 | 18 | 211 | 1.000 |
Tatoeba-test.eng-gcf | 0.8 | 0.121 | 99 | 560 | 0.922 |
Tatoeba-test.eng-glg | 41.9 | 0.632 | 1008 | 7828 | 0.978 |
Tatoeba-test.eng-hat | 33.7 | 0.529 | 64 | 416 | 0.951 |
Tatoeba-test.eng-ita | 43.1 | 0.656 | 10000 | 65498 | 0.952 |
Tatoeba-test.eng-lad | 10.6 | 0.324 | 629 | 3354 | 1.000 |
Tatoeba-test.eng-lad_Latn | 11.3 | 0.354 | 582 | 3097 | 1.000 |
Tatoeba-test.eng-lij | 4.6 | 0.289 | 94 | 711 | 0.973 |
Tatoeba-test.eng-lld | 0.8 | 0.214 | 21 | 228 | 0.937 |
Tatoeba-test.eng-lmo | 10.5 | 0.314 | 17 | 124 | 1.000 |
Tatoeba-test.eng-mfe | 83.6 | 0.898 | 7 | 36 | 1.000 |
Tatoeba-test.eng-multi | 39.7 | 0.609 | 10000 | 73684 | 0.968 |
Tatoeba-test.eng-mwl | 19.5 | 0.576 | 4 | 21 | 1.000 |
Tatoeba-test.eng-oci | 10.0 | 0.332 | 841 | 5219 | 0.914 |
Tatoeba-test.eng-osp | 10.8 | 0.365 | 3 | 20 | 1.000 |
Tatoeba-test.eng-pap | 52.0 | 0.699 | 70 | 376 | 1.000 |
Tatoeba-test.eng-pms | 12.6 | 0.338 | 268 | 2244 | 0.945 |
Tatoeba-test.eng-por | 42.2 | 0.643 | 10000 | 75353 | 0.969 |
Tatoeba-test.eng-roh | 20.4 | 0.456 | 16 | 198 | 1.000 |
Tatoeba-test.eng-ron | 34.4 | 0.590 | 5000 | 36833 | 0.971 |
Tatoeba-test.eng-scn | 42.5 | 0.531 | 4 | 42 | 1.000 |
Tatoeba-test.eng-spa | 46.5 | 0.664 | 10000 | 77291 | 0.973 |
Tatoeba-test.eng-vec | 14.1 | 0.325 | 19 | 127 | 0.839 |
Tatoeba-test.eng-wln | 15.3 | 0.328 | 89 | 520 | 0.957 |
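
The BP column in the table above is presumably the BLEU brevity penalty. As a rough sketch, assuming the standard corpus-level BLEU definition (the evaluation script behind these scores is not shown here, so the function name `brevity_penalty` and its arguments are illustrative rather than the tool's actual API), it can be computed from the total hypothesis and reference lengths:

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty from total hypothesis and reference lengths."""
    if hyp_len == 0:
        # No output at all: treat the penalty as maximal.
        return 0.0
    if hyp_len >= ref_len:
        # Output at least as long as the reference: no penalty.
        return 1.0
    # Output shorter than the reference: exponential penalty.
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    print(round(brevity_penalty(100, 100), 3))
    print(round(brevity_penalty(90, 100), 3))
```

Under this definition, a BP of 1.000 means the system output was at least as long as the reference, while values below 1 indicate shorter output.
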
- dataset: opus1m+bt
- model: transformer-align
- source language(s): eng
- target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ind ita jak lad lij lld lmo max mfe min mol msa mwl oci osp pap pms pob por roh ron scn spa tmw vec wln zlm zsm
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ind<< >>ist<< >>ita<< >>itk<< >>jak_Latn<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>max_Latn<< >>mcm<< >>mfe<< >>min<< >>mol<< >>msa_Latn<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>src<< >>srd<< >>srm<< >>sro<< >>tmg<< >>tmw_Latn<< >>tvy<< >>vec<< >>vkp<< >>wln<< >>xmm<< >>zlm<< >>zlm_Latn<< >>zsm_Latn<<
- download: opus1m+bt-2021-03-24.zip
- test set translations: opus1m+bt-2021-03-24.test.txt
- test set scores: opus1m+bt-2021-03-24.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newsdev2016-enro.eng-ron | 21.7 | 0.526 | 1999 | 51566 | 0.970 |
newsdiscussdev2015-enfr.eng-fra | 27.8 | 0.556 | 1500 | 27986 | 1.000 |
newsdiscusstest2015-enfr.eng-fra | 32.7 | 0.590 | 1500 | 28027 | 0.997 |
newssyscomb2009.eng-fra | 26.1 | 0.556 | 502 | 12334 | 0.996 |
newssyscomb2009.eng-ita | 27.8 | 0.580 | 502 | 11551 | 1.000 |
newssyscomb2009.eng-spa | 29.0 | 0.566 | 502 | 12506 | 0.981 |
news-test2008.eng-fra | 24.1 | 0.528 | 2051 | 52685 | 0.993 |
news-test2008.eng-spa | 26.4 | 0.541 | 2051 | 52596 | 0.995 |
newstest2009.eng-fra | 25.1 | 0.545 | 2525 | 69278 | 0.976 |
newstest2009.eng-ita | 27.1 | 0.571 | 2525 | 63474 | 1.000 |
newstest2009.eng-spa | 27.9 | 0.561 | 2525 | 68114 | 0.998 |
newstest2010.eng-fra | 27.3 | 0.560 | 2489 | 66043 | 0.986 |
newstest2010.eng-spa | 32.5 | 0.590 | 2489 | 65522 | 0.993 |
newstest2011.eng-fra | 29.3 | 0.576 | 3003 | 80626 | 0.966 |
newstest2011.eng-spa | 33.7 | 0.592 | 3003 | 79476 | 0.978 |
newstest2012.eng-fra | 27.4 | 0.560 | 3003 | 78011 | 0.981 |
newstest2012.eng-spa | 33.7 | 0.591 | 3003 | 79006 | 0.960 |
newstest2013.eng-fra | 28.0 | 0.551 | 3000 | 70037 | 0.968 |
newstest2013.eng-spa | 30.2 | 0.566 | 3000 | 70528 | 0.948 |
newstest2016-enro.eng-ron | 20.8 | 0.511 | 1999 | 49094 | 0.984 |
Tatoeba-test.eng-arg | 15.7 | 0.352 | 105 | 405 | 1.000 |
Tatoeba-test.eng-ast | 25.8 | 0.490 | 99 | 720 | 0.990 |
Tatoeba-test.eng-cat | 44.7 | 0.647 | 1631 | 12342 | 0.983 |
Tatoeba-test.eng-cbk | 4.7 | 0.268 | 1498 | 10591 | 0.911 |
Tatoeba-test.eng-cos | 45.1 | 0.697 | 5 | 45 | 0.931 |
Tatoeba-test.eng-egl | 0.4 | 0.070 | 84 | 438 | 0.858 |
Tatoeba-test.eng-ext | 5.0 | 0.333 | 69 | 353 | 1.000 |
Tatoeba-test.eng-fra | 39.9 | 0.605 | 10000 | 80759 | 0.971 |
Tatoeba-test.eng-frm | 0.9 | 0.210 | 18 | 211 | 1.000 |
Tatoeba-test.eng-gcf | 0.7 | 0.107 | 99 | 560 | 0.986 |
Tatoeba-test.eng-glg | 42.3 | 0.630 | 1008 | 7828 | 0.981 |
Tatoeba-test.eng-hat | 34.4 | 0.561 | 64 | 416 | 0.968 |
Tatoeba-test.eng-ind | 33.9 | 0.598 | 4289 | 28294 | 0.956 |
Tatoeba-test.eng-ita | 42.3 | 0.650 | 10000 | 65498 | 0.951 |
Tatoeba-test.eng-lad | 10.0 | 0.311 | 629 | 3354 | 1.000 |
Tatoeba-test.eng-lad_Latn | 10.7 | 0.340 | 582 | 3097 | 1.000 |
Tatoeba-test.eng-lij | 4.9 | 0.292 | 94 | 711 | 0.973 |
Tatoeba-test.eng-lld | 0.5 | 0.204 | 21 | 228 | 0.927 |
Tatoeba-test.eng-lmo | 13.3 | 0.363 | 17 | 124 | 1.000 |
Tatoeba-test.eng-max_Latn | 3.1 | 0.124 | 127 | 917 | 0.906 |
Tatoeba-test.eng-mfe | 83.6 | 0.909 | 7 | 36 | 1.000 |
Tatoeba-test.eng-min | 5.5 | 0.253 | 19 | 147 | 0.930 |
Tatoeba-test.eng-msa | 28.9 | 0.528 | 5000 | 33629 | 0.974 |
Tatoeba-test.eng-multi | 39.0 | 0.607 | 10000 | 73122 | 0.967 |
Tatoeba-test.eng-mwl | 26.9 | 0.730 | 4 | 21 | 1.000 |
Tatoeba-test.eng-oci | 10.2 | 0.335 | 841 | 5219 | 0.914 |
Tatoeba-test.eng-osp | 14.6 | 0.479 | 3 | 20 | 1.000 |
Tatoeba-test.eng-pap | 46.2 | 0.645 | 70 | 376 | 1.000 |
Tatoeba-test.eng-pms | 12.8 | 0.347 | 268 | 2244 | 0.942 |
Tatoeba-test.eng-por | 41.6 | 0.640 | 10000 | 75353 | 0.972 |
Tatoeba-test.eng-roh | 18.1 | 0.454 | 16 | 198 | 1.000 |
Tatoeba-test.eng-ron | 33.8 | 0.584 | 5000 | 36833 | 0.971 |
Tatoeba-test.eng-scn | 37.2 | 0.482 | 4 | 42 | 1.000 |
Tatoeba-test.eng-spa | 45.9 | 0.661 | 10000 | 77291 | 0.974 |
Tatoeba-test.eng-tmw_Latn | 5.8 | 0.130 | 5 | 23 | 1.000 |
Tatoeba-test.eng-vec | 17.7 | 0.326 | 19 | 127 | 0.918 |
Tatoeba-test.eng-wln | 13.9 | 0.300 | 89 | 520 | 0.949 |
Tatoeba-test.eng-zlm_Latn | 3.0 | 0.329 | 24 | 163 | 0.975 |
Tatoeba-test.eng-zsm_Latn | 3.1 | 0.129 | 536 | 4085 | 1.000 |
tico19-test.eng-fra | 33.6 | 0.590 | 2100 | 64655 | 0.983 |
tico19-test.eng-pob | 41.1 | 0.685 | 2100 | 62729 | 0.943 |
tico19-test.eng-por | 40.8 | 0.684 | 2100 | 62729 | 0.967 |
tico19-test.eng-spa | 42.5 | 0.682 | 2100 | 66591 | 0.949 |
- dataset: opus1m+bt
- model: transformer-align
- source language(s): eng
- target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ita lad lij lld lmo mfe mol mwl oci osp pap pms pob por roh ron scn spa vec wln
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ist<< >>ita<< >>itk<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>mcm<< >>mfe<< >>mol<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>src<< >>srd<< >>sro<< >>tmg<< >>tvy<< >>vec<< >>vkp<< >>wln<<
- download: opus1m+bt-2021-04-10.zip
- test set translations: opus1m+bt-2021-04-10.test.txt
- test set scores: opus1m+bt-2021-04-10.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newsdev2016-enro.eng-ron | 22.4 | 0.531 | 1999 | 51566 | 0.971 |
newsdiscussdev2015-enfr.eng-fra | 28.4 | 0.561 | 1500 | 27986 | 1.000 |
newsdiscusstest2015-enfr.eng-fra | 33.3 | 0.596 | 1500 | 28027 | 0.993 |
newssyscomb2009.eng-fra | 26.6 | 0.561 | 502 | 12334 | 0.997 |
newssyscomb2009.eng-ita | 28.2 | 0.580 | 502 | 11551 | 1.000 |
newssyscomb2009.eng-spa | 28.5 | 0.563 | 502 | 12506 | 0.983 |
news-test2008.eng-fra | 24.0 | 0.530 | 2051 | 52685 | 0.996 |
news-test2008.eng-spa | 26.6 | 0.544 | 2051 | 52596 | 0.998 |
newstest2009.eng-fra | 25.7 | 0.550 | 2525 | 69278 | 0.980 |
newstest2009.eng-ita | 27.6 | 0.575 | 2525 | 63474 | 1.000 |
newstest2009.eng-spa | 28.2 | 0.562 | 2525 | 68114 | 0.999 |
newstest2010.eng-fra | 27.6 | 0.563 | 2489 | 66043 | 0.983 |
newstest2010.eng-spa | 32.8 | 0.593 | 2489 | 65522 | 0.993 |
newstest2011.eng-fra | 29.9 | 0.583 | 3003 | 80626 | 0.970 |
newstest2011.eng-spa | 34.2 | 0.594 | 3003 | 79476 | 0.979 |
newstest2012.eng-fra | 28.0 | 0.565 | 3003 | 78011 | 0.981 |
newstest2012.eng-spa | 34.1 | 0.594 | 3003 | 79006 | 0.962 |
newstest2013.eng-fra | 28.3 | 0.553 | 3000 | 70037 | 0.970 |
newstest2013.eng-spa | 30.8 | 0.569 | 3000 | 70528 | 0.950 |
newstest2016-enro.eng-ron | 21.4 | 0.516 | 1999 | 49094 | 0.986 |
Tatoeba-test.eng-arg | 11.0 | 0.327 | 105 | 405 | 1.000 |
Tatoeba-test.eng-ast | 24.4 | 0.488 | 99 | 720 | 0.993 |
Tatoeba-test.eng-cat | 46.1 | 0.659 | 1631 | 12342 | 0.989 |
Tatoeba-test.eng-cbk | 4.7 | 0.265 | 1498 | 10591 | 0.876 |
Tatoeba-test.eng-cos | 39.1 | 0.619 | 5 | 45 | 1.000 |
Tatoeba-test.eng-egl | 1.1 | 0.124 | 84 | 438 | 0.993 |
Tatoeba-test.eng-ext | 5.9 | 0.315 | 69 | 353 | 1.000 |
Tatoeba-test.eng-fra | 40.9 | 0.613 | 10000 | 80759 | 0.973 |
Tatoeba-test.eng-frm | 1.0 | 0.212 | 18 | 211 | 1.000 |
Tatoeba-test.eng-gcf | 0.8 | 0.121 | 99 | 560 | 0.936 |
Tatoeba-test.eng-glg | 43.5 | 0.636 | 1008 | 7828 | 0.983 |
Tatoeba-test.eng-hat | 35.0 | 0.570 | 64 | 416 | 0.963 |
Tatoeba-test.eng-ita | 43.2 | 0.657 | 10000 | 65498 | 0.954 |
Tatoeba-test.eng-lad | 11.5 | 0.343 | 629 | 3354 | 1.000 |
Tatoeba-test.eng-lad_Latn | 12.4 | 0.375 | 582 | 3097 | 1.000 |
Tatoeba-test.eng-lij | 5.1 | 0.265 | 94 | 711 | 0.941 |
Tatoeba-test.eng-lld | 1.0 | 0.215 | 21 | 228 | 0.932 |
Tatoeba-test.eng-lmo | 6.9 | 0.283 | 17 | 124 | 1.000 |
Tatoeba-test.eng-mfe | 83.6 | 0.909 | 7 | 36 | 1.000 |
Tatoeba-test.eng-multi | 41.6 | 0.623 | 10000 | 74573 | 0.970 |
Tatoeba-test.eng-mwl | 25.4 | 0.685 | 4 | 21 | 1.000 |
Tatoeba-test.eng-oci | 9.7 | 0.330 | 841 | 5219 | 0.913 |
Tatoeba-test.eng-osp | 15.2 | 0.358 | 3 | 20 | 1.000 |
Tatoeba-test.eng-pap | 45.0 | 0.655 | 70 | 376 | 1.000 |
Tatoeba-test.eng-pms | 12.4 | 0.345 | 268 | 2244 | 0.963 |
Tatoeba-test.eng-por | 42.4 | 0.643 | 10000 | 75353 | 0.971 |
Tatoeba-test.eng-roh | 18.4 | 0.438 | 16 | 198 | 0.995 |
Tatoeba-test.eng-ron | 34.8 | 0.589 | 5000 | 36833 | 0.971 |
Tatoeba-test.eng-scn | 35.7 | 0.470 | 4 | 42 | 1.000 |
Tatoeba-test.eng-spa | 47.0 | 0.666 | 10000 | 77291 | 0.975 |
Tatoeba-test.eng-vec | 5.2 | 0.307 | 19 | 127 | 0.960 |
Tatoeba-test.eng-wln | 15.7 | 0.318 | 89 | 520 | 0.973 |
tico19-test.eng-fra | 34.6 | 0.597 | 2100 | 64655 | 0.988 |
tico19-test.eng-pob | 42.5 | 0.691 | 2100 | 62729 | 0.948 |
tico19-test.eng-por | 41.6 | 0.687 | 2100 | 62729 | 0.962 |
tico19-test.eng-spa | 43.1 | 0.685 | 2100 | 66591 | 0.952 |