eng-roa

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): arg ast cat cos egl ext fra frm_Latn fvr glg ita lad lad_Latn lij lld_Latn lmo mwl oci osp_Latn pms por roh ron scn spa vec wln
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
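
Because the model expects that sentence-initial token, callers typically prepend it before tokenization. A minimal sketch (a hypothetical helper, not part of the released package; TARGET_IDS mirrors the target language list for this release):

```python
# Hypothetical helper: prepend the sentence-initial >>id<< target token
# this multilingual model requires, validating against the release's
# listed target language IDs.
TARGET_IDS = {
    "arg", "ast", "cat", "cos", "egl", "ext", "fra", "frm_Latn", "fvr",
    "glg", "ita", "lad", "lad_Latn", "lij", "lld_Latn", "lmo", "mwl",
    "oci", "osp_Latn", "pms", "por", "roh", "ron", "scn", "spa", "vec",
    "wln",
}

def add_lang_token(sentence: str, target: str) -> str:
    """Return the sentence with the required >>id<< token prepended."""
    if target not in TARGET_IDS:
        raise ValueError(f"unknown target language ID: {target!r}")
    return f">>{target}<< {sentence}"

print(add_lang_token("How are you?", "spa"))  # >>spa<< How are you?
```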

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-arg.eng.arg 2.2 0.147
Tatoeba-test.eng-ast.eng.ast 17.2 0.415
Tatoeba-test.eng-cat.eng.cat 47.7 0.669
Tatoeba-test.eng-cos.eng.cos 3.2 0.262
Tatoeba-test.eng-egl.eng.egl 0.4 0.119
Tatoeba-test.eng-ext.eng.ext 5.5 0.304
Tatoeba-test.eng-fra.eng.fra 45.8 0.641
Tatoeba-test.eng-frm.eng.frm 0.9 0.212
Tatoeba-test.eng-fvr.eng.fvr 2.6 0.260
Tatoeba-test.eng-glg.eng.glg 45.8 0.655
Tatoeba-test.eng-ita.eng.ita 45.9 0.678
Tatoeba-test.eng-lad.eng.lad 8.9 0.324
Tatoeba-test.eng-lij.eng.lij 1.8 0.191
Tatoeba-test.eng-lld.eng.lld 0.5 0.215
Tatoeba-test.eng-lmo.eng.lmo 0.9 0.203
Tatoeba-test.eng.multi 44.1 0.645
Tatoeba-test.eng-mwl.eng.mwl 4.1 0.331
Tatoeba-test.eng-oci.eng.oci 7.8 0.289
Tatoeba-test.eng-osp.eng.osp 10.8 0.382
Tatoeba-test.eng-pms.eng.pms 1.8 0.197
Tatoeba-test.eng-por.eng.por 41.7 0.637
Tatoeba-test.eng-roh.eng.roh 2.8 0.257
Tatoeba-test.eng-ron.eng.ron 41.8 0.640
Tatoeba-test.eng-scn.eng.scn 1.8 0.175
Tatoeba-test.eng-spa.eng.spa 50.3 0.691
Tatoeba-test.eng-vec.eng.vec 3.2 0.251
Tatoeba-test.eng-wln.eng.wln 6.6 0.236
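
The benchmark listings in this file are whitespace-separated rows. A small sketch (hypothetical, assuming the three-column testset/BLEU/chr-F layout used above) for loading them into records, e.g. to compare scores across releases:

```python
# Parse whitespace-separated benchmark rows ("testset BLEU chr-F")
# into a dict mapping testset name to (BLEU, chr-F).
def parse_benchmarks(text: str) -> dict:
    scores = {}
    for line in text.strip().splitlines():
        name, bleu, chrf = line.split()
        scores[name] = (float(bleu), float(chrf))
    return scores

table = """
Tatoeba-test.eng-spa.eng.spa 50.3 0.691
Tatoeba-test.eng-por.eng.por 41.7 0.637
"""
print(parse_benchmarks(table)["Tatoeba-test.eng-spa.eng.spa"])  # (50.3, 0.691)
```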

opus-2020-07-14.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-14.zip
  • test set translations: opus-2020-07-14.test.txt
  • test set scores: opus-2020-07-14.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-arg.eng.arg 1.7 0.133
Tatoeba-test.eng-ast.eng.ast 17.2 0.415
Tatoeba-test.eng-cat.eng.cat 47.5 0.668
Tatoeba-test.eng-cos.eng.cos 1.8 0.215
Tatoeba-test.eng-egl.eng.egl 0.4 0.087
Tatoeba-test.eng-ext.eng.ext 13.7 0.353
Tatoeba-test.eng-fra.eng.fra 44.1 0.629
Tatoeba-test.eng-frm.eng.frm 0.6 0.196
Tatoeba-test.eng-gcf.eng.gcf 0.9 0.116
Tatoeba-test.eng-glg.eng.glg 43.7 0.640
Tatoeba-test.eng-hat.eng.hat 30.1 0.529
Tatoeba-test.eng-ita.eng.ita 44.8 0.668
Tatoeba-test.eng-lad.eng.lad 7.5 0.301
Tatoeba-test.eng-lij.eng.lij 1.5 0.187
Tatoeba-test.eng-lld.eng.lld 0.8 0.199
Tatoeba-test.eng-lmo.eng.lmo 0.8 0.177
Tatoeba-test.eng-mfe.eng.mfe 91.9 0.956
Tatoeba-test.eng.multi 42.3 0.631
Tatoeba-test.eng-mwl.eng.mwl 2.7 0.252
Tatoeba-test.eng-oci.eng.oci 7.3 0.290
Tatoeba-test.eng-pap.eng.pap 43.7 0.627
Tatoeba-test.eng-pms.eng.pms 2.4 0.194
Tatoeba-test.eng-por.eng.por 40.7 0.632
Tatoeba-test.eng-roh.eng.roh 3.5 0.258
Tatoeba-test.eng-ron.eng.ron 40.0 0.628
Tatoeba-test.eng-scn.eng.scn 1.6 0.100
Tatoeba-test.eng-spa.eng.spa 48.7 0.680
Tatoeba-test.eng-vec.eng.vec 1.9 0.166
Tatoeba-test.eng-wln.eng.wln 8.1 0.226

opus-2020-07-20.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-20.zip
  • test set translations: opus-2020-07-20.test.txt
  • test set scores: opus-2020-07-20.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-arg.eng.arg 1.5 0.132
Tatoeba-test.eng-ast.eng.ast 15.4 0.413
Tatoeba-test.eng-cat.eng.cat 47.8 0.671
Tatoeba-test.eng-cos.eng.cos 3.3 0.293
Tatoeba-test.eng-egl.eng.egl 0.2 0.085
Tatoeba-test.eng-ext.eng.ext 11.7 0.311
Tatoeba-test.eng-fra.eng.fra 44.8 0.633
Tatoeba-test.eng-frm.eng.frm 1.0 0.213
Tatoeba-test.eng-gcf.eng.gcf 0.8 0.119
Tatoeba-test.eng-glg.eng.glg 44.5 0.646
Tatoeba-test.eng-hat.eng.hat 25.5 0.494
Tatoeba-test.eng-ita.eng.ita 45.1 0.673
Tatoeba-test.eng-lad.eng.lad 8.0 0.305
Tatoeba-test.eng-lij.eng.lij 1.5 0.178
Tatoeba-test.eng-lld.eng.lld 0.4 0.171
Tatoeba-test.eng-lmo.eng.lmo 1.5 0.191
Tatoeba-test.eng-mfe.eng.mfe 91.9 0.956
Tatoeba-test.eng-msa.eng.msa 31.2 0.548
Tatoeba-test.eng.multi 42.6 0.632
Tatoeba-test.eng-mwl.eng.mwl 3.3 0.288
Tatoeba-test.eng-oci.eng.oci 7.5 0.287
Tatoeba-test.eng-pap.eng.pap 44.8 0.630
Tatoeba-test.eng-pms.eng.pms 2.7 0.198
Tatoeba-test.eng-por.eng.por 41.3 0.635
Tatoeba-test.eng-roh.eng.roh 4.3 0.271
Tatoeba-test.eng-ron.eng.ron 40.6 0.631
Tatoeba-test.eng-scn.eng.scn 1.4 0.173
Tatoeba-test.eng-spa.eng.spa 49.2 0.684
Tatoeba-test.eng-vec.eng.vec 4.8 0.240
Tatoeba-test.eng-wln.eng.wln 5.4 0.233

opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2016-enro-engron.eng.ron 27.3 0.565
newsdiscussdev2015-enfr-engfra.eng.fra 29.9 0.573
newsdiscusstest2015-enfr-engfra.eng.fra 35.2 0.609
newssyscomb2009-engfra.eng.fra 27.8 0.569
newssyscomb2009-engita.eng.ita 29.0 0.590
newssyscomb2009-engspa.eng.spa 29.5 0.567
news-test2008-engfra.eng.fra 25.1 0.538
news-test2008-engspa.eng.spa 27.2 0.547
newstest2009-engfra.eng.fra 26.6 0.557
newstest2009-engita.eng.ita 28.6 0.582
newstest2009-engspa.eng.spa 28.7 0.565
newstest2010-engfra.eng.fra 29.2 0.573
newstest2010-engspa.eng.spa 33.6 0.598
newstest2011-engfra.eng.fra 31.2 0.591
newstest2011-engspa.eng.spa 34.8 0.599
newstest2012-engfra.eng.fra 29.2 0.574
newstest2012-engspa.eng.spa 35.1 0.601
newstest2013-engfra.eng.fra 29.7 0.565
newstest2013-engspa.eng.spa 31.7 0.576
newstest2016-enro-engron.eng.ron 25.9 0.548
Tatoeba-test.eng-arg.eng.arg 1.7 0.131
Tatoeba-test.eng-ast.eng.ast 16.6 0.417
Tatoeba-test.eng-cat.eng.cat 47.6 0.670
Tatoeba-test.eng-cos.eng.cos 3.3 0.284
Tatoeba-test.eng-egl.eng.egl 0.9 0.118
Tatoeba-test.eng-ext.eng.ext 8.7 0.301
Tatoeba-test.eng-fra.eng.fra 44.8 0.633
Tatoeba-test.eng-frm.eng.frm 0.8 0.201
Tatoeba-test.eng-gcf.eng.gcf 0.8 0.117
Tatoeba-test.eng-glg.eng.glg 44.0 0.642
Tatoeba-test.eng-hat.eng.hat 28.8 0.510
Tatoeba-test.eng-ita.eng.ita 45.3 0.674
Tatoeba-test.eng-lad.eng.lad 8.4 0.310
Tatoeba-test.eng-lij.eng.lij 1.4 0.178
Tatoeba-test.eng-lld.eng.lld 0.8 0.220
Tatoeba-test.eng-lmo.eng.lmo 0.9 0.189
Tatoeba-test.eng-mfe.eng.mfe 82.4 0.915
Tatoeba-test.eng-msa.eng.msa 31.3 0.549
Tatoeba-test.eng.multi 42.6 0.633
Tatoeba-test.eng-mwl.eng.mwl 2.9 0.311
Tatoeba-test.eng-oci.eng.oci 7.9 0.292
Tatoeba-test.eng-pap.eng.pap 47.4 0.661
Tatoeba-test.eng-pms.eng.pms 2.5 0.198
Tatoeba-test.eng-por.eng.por 41.4 0.636
Tatoeba-test.eng-roh.eng.roh 3.2 0.259
Tatoeba-test.eng-ron.eng.ron 40.8 0.632
Tatoeba-test.eng-scn.eng.scn 1.8 0.191
Tatoeba-test.eng-spa.eng.spa 49.4 0.685
Tatoeba-test.eng-vec.eng.vec 5.1 0.253
Tatoeba-test.eng-wln.eng.wln 7.1 0.235

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2016-enro-engron.eng.ron 27.6 0.567
newsdiscussdev2015-enfr-engfra.eng.fra 30.2 0.575
newsdiscusstest2015-enfr-engfra.eng.fra 35.5 0.612
newssyscomb2009-engfra.eng.fra 27.9 0.570
newssyscomb2009-engita.eng.ita 29.3 0.590
newssyscomb2009-engspa.eng.spa 29.6 0.570
news-test2008-engfra.eng.fra 25.2 0.538
news-test2008-engspa.eng.spa 27.3 0.548
newstest2009-engfra.eng.fra 26.9 0.560
newstest2009-engita.eng.ita 28.7 0.583
newstest2009-engspa.eng.spa 29.0 0.568
newstest2010-engfra.eng.fra 29.3 0.574
newstest2010-engspa.eng.spa 34.2 0.601
newstest2011-engfra.eng.fra 31.4 0.592
newstest2011-engspa.eng.spa 35.0 0.599
newstest2012-engfra.eng.fra 29.5 0.576
newstest2012-engspa.eng.spa 35.5 0.603
newstest2013-engfra.eng.fra 29.9 0.567
newstest2013-engspa.eng.spa 32.1 0.578
newstest2016-enro-engron.eng.ron 26.1 0.551
Tatoeba-test.eng-arg.eng.arg 1.4 0.125
Tatoeba-test.eng-ast.eng.ast 17.8 0.406
Tatoeba-test.eng-cat.eng.cat 48.3 0.676
Tatoeba-test.eng-cos.eng.cos 3.2 0.275
Tatoeba-test.eng-egl.eng.egl 0.2 0.084
Tatoeba-test.eng-ext.eng.ext 11.2 0.344
Tatoeba-test.eng-fra.eng.fra 45.3 0.637
Tatoeba-test.eng-frm.eng.frm 1.1 0.221
Tatoeba-test.eng-gcf.eng.gcf 0.6 0.118
Tatoeba-test.eng-glg.eng.glg 44.2 0.645
Tatoeba-test.eng-hat.eng.hat 28.0 0.502
Tatoeba-test.eng-ita.eng.ita 45.6 0.674
Tatoeba-test.eng-lad.eng.lad 8.2 0.322
Tatoeba-test.eng-lij.eng.lij 1.4 0.182
Tatoeba-test.eng-lld.eng.lld 0.8 0.217
Tatoeba-test.eng-lmo.eng.lmo 0.7 0.190
Tatoeba-test.eng-mfe.eng.mfe 91.9 0.956
Tatoeba-test.eng-msa.eng.msa 31.1 0.548
Tatoeba-test.eng.multi 42.9 0.636
Tatoeba-test.eng-mwl.eng.mwl 2.1 0.234
Tatoeba-test.eng-oci.eng.oci 7.9 0.297
Tatoeba-test.eng-pap.eng.pap 44.1 0.648
Tatoeba-test.eng-pms.eng.pms 2.1 0.190
Tatoeba-test.eng-por.eng.por 41.8 0.639
Tatoeba-test.eng-roh.eng.roh 3.5 0.261
Tatoeba-test.eng-ron.eng.ron 41.0 0.635
Tatoeba-test.eng-scn.eng.scn 1.7 0.184
Tatoeba-test.eng-spa.eng.spa 50.1 0.689
Tatoeba-test.eng-vec.eng.vec 3.2 0.248
Tatoeba-test.eng-wln.eng.wln 7.2 0.220

opus1m+bt-2021-03-23.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): arg eng
  • target language(s): ast cat cbk cos egl eng ext fra frm gcf glg hat ind ita jak lad lij lld lmo max mfe min mol msa mwl oci osp pap pms pob por roh ron scn spa tmw vec wln zlm zsm
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • valid language labels: >>fra<< >>spa<< >>roh<< >>zlm_Latn<< >>cos<< >>ext<< >>mfe<< >>scn<< >>lad<< >>mwl<< >>ast<< >>hat<< >>pob<< >>pap<< >>lmo<< >>vec<< >>pms<< >>glg<< >>cat<< >>msa_Latn<< >>wln<< >>ind<< >>ron<< >>por<< >>ita<< >>oci<< >>lij<< >>jak_Latn<< >>eng<< >>min<< >>zlm<< >>mol<< >>cbk_Latn<<
  • download: opus1m+bt-2021-03-23.zip
  • test set translations: opus1m+bt-2021-03-23.test.txt
  • test set scores: opus1m+bt-2021-03-23.eval.txt
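
The valid-labels line above packs the usable IDs into >>id<< markers; an illustrative regex sketch (the labels_line shown is a shortened excerpt) for recovering the plain IDs:

```python
import re

# Shortened excerpt of the "valid language labels" line above.
labels_line = ">>fra<< >>spa<< >>roh<< >>zlm_Latn<< >>cos<<"

# Capture everything between >> and << for each label.
labels = re.findall(r">>([^<>]+)<<", labels_line)
print(labels)  # ['fra', 'spa', 'roh', 'zlm_Latn', 'cos']
```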

Benchmarks

testset BLEU chr-F #sent #words BP
newsdiscussdev2015-enfr.eng-fra 28.3 0.560 1500 27986 1.000
newsdiscusstest2015-enfr.eng-fra 33.1 0.594 1500 28027 0.992
newssyscomb2009.eng-fra 26.4 0.558 502 12334 0.997
news-test2008.eng-fra 23.9 0.529 2051 52685 0.992
newstest2009.eng-fra 25.6 0.548 2525 69278 0.978
newstest2010.eng-fra 27.8 0.563 2489 66043 0.985
Tatoeba-test.eng-arg 9.3 0.311 105 405 1.000
Tatoeba-test.eng-ast 26.3 0.492 99 720 0.986
Tatoeba-test.eng-cat 45.4 0.654 1631 12342 0.989
Tatoeba-test.eng-cbk 4.8 0.262 1498 10591 0.993
Tatoeba-test.eng-cos 36.2 0.616 5 45 0.907
Tatoeba-test.eng-egl 0.4 0.127 84 438 0.963
Tatoeba-test.eng-ext 4.8 0.337 69 353 1.000
Tatoeba-test.eng-fra 40.8 0.613 10000 80759 0.973
Tatoeba-test.eng-frm 1.0 0.209 18 211 1.000
Tatoeba-test.eng-gcf 0.8 0.121 99 560 0.922
Tatoeba-test.eng-glg 41.9 0.632 1008 7828 0.978
Tatoeba-test.eng-hat 33.7 0.529 64 416 0.951
Tatoeba-test.eng-ita 43.1 0.656 10000 65498 0.952
Tatoeba-test.eng-lad 10.6 0.324 629 3354 1.000
Tatoeba-test.eng-lad_Latn 11.3 0.354 582 3097 1.000
Tatoeba-test.eng-lij 4.6 0.289 94 711 0.973
Tatoeba-test.eng-lld 0.8 0.214 21 228 0.937
Tatoeba-test.eng-lmo 10.5 0.314 17 124 1.000
Tatoeba-test.eng-mfe 83.6 0.898 7 36 1.000
Tatoeba-test.eng-multi 39.7 0.609 10000 73684 0.968
Tatoeba-test.eng-mwl 19.5 0.576 4 21 1.000
Tatoeba-test.eng-oci 10.0 0.332 841 5219 0.914
Tatoeba-test.eng-osp 10.8 0.365 3 20 1.000
Tatoeba-test.eng-pap 52.0 0.699 70 376 1.000
Tatoeba-test.eng-pms 12.6 0.338 268 2244 0.945
Tatoeba-test.eng-por 42.2 0.643 10000 75353 0.969
Tatoeba-test.eng-roh 20.4 0.456 16 198 1.000
Tatoeba-test.eng-ron 34.4 0.590 5000 36833 0.971
Tatoeba-test.eng-scn 42.5 0.531 4 42 1.000
Tatoeba-test.eng-spa 46.5 0.664 10000 77291 0.973
Tatoeba-test.eng-vec 14.1 0.325 19 127 0.839
Tatoeba-test.eng-wln 15.3 0.328 89 520 0.957
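
The BP column in the 2021 tables is BLEU's brevity penalty, which is 1 when the hypothesis is at least as long as the reference and exp(1 - r/c) otherwise (c = hypothesis length, r = reference length). A worked sketch of that standard definition:

```python
import math

def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: penalizes hypotheses
    shorter than the reference; never exceeds 1."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

# A hypothesis slightly shorter than its reference gets BP just below 1.
print(brevity_penalty(97, 100))
```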

opus1m+bt-2021-03-24.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): eng
  • target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ind ita jak lad lij lld lmo max mfe min mol msa mwl oci osp pap pms pob por roh ron scn spa tmw vec wln zlm zsm
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ind<< >>ist<< >>ita<< >>itk<< >>jak_Latn<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>max_Latn<< >>mcm<< >>mfe<< >>min<< >>mol<< >>msa_Latn<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>src<< >>srd<< >>srm<< >>sro<< >>tmg<< >>tmw_Latn<< >>tvy<< >>vec<< >>vkp<< >>wln<< >>xmm<< >>zlm<< >>zlm_Latn<< >>zsm_Latn<<
  • download: opus1m+bt-2021-03-24.zip
  • test set translations: opus1m+bt-2021-03-24.test.txt
  • test set scores: opus1m+bt-2021-03-24.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
newsdev2016-enro.eng-ron 21.7 0.526 1999 51566 0.970
newsdiscussdev2015-enfr.eng-fra 27.8 0.556 1500 27986 1.000
newsdiscusstest2015-enfr.eng-fra 32.7 0.590 1500 28027 0.997
newssyscomb2009.eng-fra 26.1 0.556 502 12334 0.996
newssyscomb2009.eng-ita 27.8 0.580 502 11551 1.000
newssyscomb2009.eng-spa 29.0 0.566 502 12506 0.981
news-test2008.eng-fra 24.1 0.528 2051 52685 0.993
news-test2008.eng-spa 26.4 0.541 2051 52596 0.995
newstest2009.eng-fra 25.1 0.545 2525 69278 0.976
newstest2009.eng-ita 27.1 0.571 2525 63474 1.000
newstest2009.eng-spa 27.9 0.561 2525 68114 0.998
newstest2010.eng-fra 27.3 0.560 2489 66043 0.986
newstest2010.eng-spa 32.5 0.590 2489 65522 0.993
newstest2011.eng-fra 29.3 0.576 3003 80626 0.966
newstest2011.eng-spa 33.7 0.592 3003 79476 0.978
newstest2012.eng-fra 27.4 0.560 3003 78011 0.981
newstest2012.eng-spa 33.7 0.591 3003 79006 0.960
newstest2013.eng-fra 28.0 0.551 3000 70037 0.968
newstest2013.eng-spa 30.2 0.566 3000 70528 0.948
newstest2016-enro.eng-ron 20.8 0.511 1999 49094 0.984
Tatoeba-test.eng-arg 15.7 0.352 105 405 1.000
Tatoeba-test.eng-ast 25.8 0.490 99 720 0.990
Tatoeba-test.eng-cat 44.7 0.647 1631 12342 0.983
Tatoeba-test.eng-cbk 4.7 0.268 1498 10591 0.911
Tatoeba-test.eng-cos 45.1 0.697 5 45 0.931
Tatoeba-test.eng-egl 0.4 0.070 84 438 0.858
Tatoeba-test.eng-ext 5.0 0.333 69 353 1.000
Tatoeba-test.eng-fra 39.9 0.605 10000 80759 0.971
Tatoeba-test.eng-frm 0.9 0.210 18 211 1.000
Tatoeba-test.eng-gcf 0.7 0.107 99 560 0.986
Tatoeba-test.eng-glg 42.3 0.630 1008 7828 0.981
Tatoeba-test.eng-hat 34.4 0.561 64 416 0.968
Tatoeba-test.eng-ind 33.9 0.598 4289 28294 0.956
Tatoeba-test.eng-ita 42.3 0.650 10000 65498 0.951
Tatoeba-test.eng-lad 10.0 0.311 629 3354 1.000
Tatoeba-test.eng-lad_Latn 10.7 0.340 582 3097 1.000
Tatoeba-test.eng-lij 4.9 0.292 94 711 0.973
Tatoeba-test.eng-lld 0.5 0.204 21 228 0.927
Tatoeba-test.eng-lmo 13.3 0.363 17 124 1.000
Tatoeba-test.eng-max_Latn 3.1 0.124 127 917 0.906
Tatoeba-test.eng-mfe 83.6 0.909 7 36 1.000
Tatoeba-test.eng-min 5.5 0.253 19 147 0.930
Tatoeba-test.eng-msa 28.9 0.528 5000 33629 0.974
Tatoeba-test.eng-multi 39.0 0.607 10000 73122 0.967
Tatoeba-test.eng-mwl 26.9 0.730 4 21 1.000
Tatoeba-test.eng-oci 10.2 0.335 841 5219 0.914
Tatoeba-test.eng-osp 14.6 0.479 3 20 1.000
Tatoeba-test.eng-pap 46.2 0.645 70 376 1.000
Tatoeba-test.eng-pms 12.8 0.347 268 2244 0.942
Tatoeba-test.eng-por 41.6 0.640 10000 75353 0.972
Tatoeba-test.eng-roh 18.1 0.454 16 198 1.000
Tatoeba-test.eng-ron 33.8 0.584 5000 36833 0.971
Tatoeba-test.eng-scn 37.2 0.482 4 42 1.000
Tatoeba-test.eng-spa 45.9 0.661 10000 77291 0.974
Tatoeba-test.eng-tmw_Latn 5.8 0.130 5 23 1.000
Tatoeba-test.eng-vec 17.7 0.326 19 127 0.918
Tatoeba-test.eng-wln 13.9 0.300 89 520 0.949
Tatoeba-test.eng-zlm_Latn 3.0 0.329 24 163 0.975
Tatoeba-test.eng-zsm_Latn 3.1 0.129 536 4085 1.000
tico19-test.eng-fra 33.6 0.590 2100 64655 0.983
tico19-test.eng-pob 41.1 0.685 2100 62729 0.943
tico19-test.eng-por 40.8 0.684 2100 62729 0.967
tico19-test.eng-spa 42.5 0.682 2100 66591 0.949

opus1m+bt-2021-04-10.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): eng
  • target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ita lad lij lld lmo mfe mol mwl oci osp pap pms pob por roh ron scn spa vec wln
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ist<< >>ita<< >>itk<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>mcm<< >>mfe<< >>mol<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>src<< >>srd<< >>sro<< >>tmg<< >>tvy<< >>vec<< >>vkp<< >>wln<<
  • download: opus1m+bt-2021-04-10.zip
  • test set translations: opus1m+bt-2021-04-10.test.txt
  • test set scores: opus1m+bt-2021-04-10.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
newsdev2016-enro.eng-ron 22.4 0.531 1999 51566 0.971
newsdiscussdev2015-enfr.eng-fra 28.4 0.561 1500 27986 1.000
newsdiscusstest2015-enfr.eng-fra 33.3 0.596 1500 28027 0.993
newssyscomb2009.eng-fra 26.6 0.561 502 12334 0.997
newssyscomb2009.eng-ita 28.2 0.580 502 11551 1.000
newssyscomb2009.eng-spa 28.5 0.563 502 12506 0.983
news-test2008.eng-fra 24.0 0.530 2051 52685 0.996
news-test2008.eng-spa 26.6 0.544 2051 52596 0.998
newstest2009.eng-fra 25.7 0.550 2525 69278 0.980
newstest2009.eng-ita 27.6 0.575 2525 63474 1.000
newstest2009.eng-spa 28.2 0.562 2525 68114 0.999
newstest2010.eng-fra 27.6 0.563 2489 66043 0.983
newstest2010.eng-spa 32.8 0.593 2489 65522 0.993
newstest2011.eng-fra 29.9 0.583 3003 80626 0.970
newstest2011.eng-spa 34.2 0.594 3003 79476 0.979
newstest2012.eng-fra 28.0 0.565 3003 78011 0.981
newstest2012.eng-spa 34.1 0.594 3003 79006 0.962
newstest2013.eng-fra 28.3 0.553 3000 70037 0.970
newstest2013.eng-spa 30.8 0.569 3000 70528 0.950
newstest2016-enro.eng-ron 21.4 0.516 1999 49094 0.986
Tatoeba-test.eng-arg 11.0 0.327 105 405 1.000
Tatoeba-test.eng-ast 24.4 0.488 99 720 0.993
Tatoeba-test.eng-cat 46.1 0.659 1631 12342 0.989
Tatoeba-test.eng-cbk 4.7 0.265 1498 10591 0.876
Tatoeba-test.eng-cos 39.1 0.619 5 45 1.000
Tatoeba-test.eng-egl 1.1 0.124 84 438 0.993
Tatoeba-test.eng-ext 5.9 0.315 69 353 1.000
Tatoeba-test.eng-fra 40.9 0.613 10000 80759 0.973
Tatoeba-test.eng-frm 1.0 0.212 18 211 1.000
Tatoeba-test.eng-gcf 0.8 0.121 99 560 0.936
Tatoeba-test.eng-glg 43.5 0.636 1008 7828 0.983
Tatoeba-test.eng-hat 35.0 0.570 64 416 0.963
Tatoeba-test.eng-ita 43.2 0.657 10000 65498 0.954
Tatoeba-test.eng-lad 11.5 0.343 629 3354 1.000
Tatoeba-test.eng-lad_Latn 12.4 0.375 582 3097 1.000
Tatoeba-test.eng-lij 5.1 0.265 94 711 0.941
Tatoeba-test.eng-lld 1.0 0.215 21 228 0.932
Tatoeba-test.eng-lmo 6.9 0.283 17 124 1.000
Tatoeba-test.eng-mfe 83.6 0.909 7 36 1.000
Tatoeba-test.eng-multi 41.6 0.623 10000 74573 0.970
Tatoeba-test.eng-mwl 25.4 0.685 4 21 1.000
Tatoeba-test.eng-oci 9.7 0.330 841 5219 0.913
Tatoeba-test.eng-osp 15.2 0.358 3 20 1.000
Tatoeba-test.eng-pap 45.0 0.655 70 376 1.000
Tatoeba-test.eng-pms 12.4 0.345 268 2244 0.963
Tatoeba-test.eng-por 42.4 0.643 10000 75353 0.971
Tatoeba-test.eng-roh 18.4 0.438 16 198 0.995
Tatoeba-test.eng-ron 34.8 0.589 5000 36833 0.971
Tatoeba-test.eng-scn 35.7 0.470 4 42 1.000
Tatoeba-test.eng-spa 47.0 0.666 10000 77291 0.975
Tatoeba-test.eng-vec 5.2 0.307 19 127 0.960
Tatoeba-test.eng-wln 15.7 0.318 89 520 0.973
tico19-test.eng-fra 34.6 0.597 2100 64655 0.988
tico19-test.eng-pob 42.5 0.691 2100 62729 0.948
tico19-test.eng-por 41.6 0.687 2100 62729 0.962
tico19-test.eng-spa 43.1 0.685 2100 66591 0.952