inc-eng (Indic languages to English)

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.asm-eng.asm.eng 17.3 0.357
Tatoeba-test.awa-eng.awa.eng 6.9 0.224
Tatoeba-test.ben-eng.ben.eng 46.3 0.606
Tatoeba-test.bho-eng.bho.eng 30.6 0.456
Tatoeba-test.guj-eng.guj.eng 19.0 0.367
Tatoeba-test.hif-eng.hif.eng 4.2 0.240
Tatoeba-test.hin-eng.hin.eng 38.9 0.568
Tatoeba-test.kok-eng.kok.eng 4.8 0.238
Tatoeba-test.lah-eng.lah.eng 17.6 0.284
Tatoeba-test.mai-eng.mai.eng 47.6 0.699
Tatoeba-test.mar-eng.mar.eng 23.0 0.475
Tatoeba-test.multi.eng 27.6 0.490
Tatoeba-test.nep-eng.nep.eng 1.4 0.189
Tatoeba-test.ori-eng.ori.eng 2.0 0.207
Tatoeba-test.pan-eng.pan.eng 15.5 0.349
Tatoeba-test.rom-eng.rom.eng 3.2 0.174
Tatoeba-test.sin-eng.sin.eng 30.5 0.526
Tatoeba-test.snd-eng.snd.eng 10.0 0.330
Tatoeba-test.urd-eng.urd.eng 28.0 0.476
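The chr-F column reports a character n-gram F-score. The released scores come from the OPUS-MT evaluation pipeline; the metric itself can be sketched in a few lines of plain Python (a simplified single-sentence version with default-style parameters, not the exact scorer used here):

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with whitespace removed, as chrF does
    text = "".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    # Average character n-gram precision and recall over n = 1..max_n,
    # then combine them into an F-score weighted toward recall (beta=2)
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())
        if hyp and ref:
            precisions.append(overlap / sum(hyp.values()))
            recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)
```

Identical strings score 1.0 and fully disjoint strings score 0.0; real corpus-level scorers aggregate n-gram counts over all sentences rather than averaging per sentence.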

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2014-hineng.hin.eng 8.7 0.335
newsdev2019-engu-gujeng.guj.eng 8.3 0.308
newstest2014-hien-hineng.hin.eng 12.7 0.389
newstest2019-guen-gujeng.guj.eng 5.9 0.280
Tatoeba-test.asm-eng.asm.eng 18.0 0.360
Tatoeba-test.awa-eng.awa.eng 6.8 0.217
Tatoeba-test.ben-eng.ben.eng 44.6 0.594
Tatoeba-test.bho-eng.bho.eng 28.1 0.462
Tatoeba-test.guj-eng.guj.eng 16.6 0.362
Tatoeba-test.hif-eng.hif.eng 4.4 0.235
Tatoeba-test.hin-eng.hin.eng 38.0 0.556
Tatoeba-test.kok-eng.kok.eng 1.4 0.153
Tatoeba-test.lah-eng.lah.eng 15.3 0.266
Tatoeba-test.mai-eng.mai.eng 51.8 0.661
Tatoeba-test.mar-eng.mar.eng 22.6 0.470
Tatoeba-test.multi.eng 26.8 0.484
Tatoeba-test.nep-eng.nep.eng 2.8 0.180
Tatoeba-test.ori-eng.ori.eng 3.4 0.219
Tatoeba-test.pan-eng.pan.eng 15.2 0.373
Tatoeba-test.rom-eng.rom.eng 1.3 0.166
Tatoeba-test.san-eng.san.eng 3.1 0.167
Tatoeba-test.sin-eng.sin.eng 28.2 0.507
Tatoeba-test.snd-eng.snd.eng 38.5 0.500
Tatoeba-test.urd-eng.urd.eng 25.2 0.451

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2014-hineng.hin.eng 8.9 0.341
newsdev2019-engu-gujeng.guj.eng 8.7 0.321
newstest2014-hien-hineng.hin.eng 13.1 0.396
newstest2019-guen-gujeng.guj.eng 6.5 0.290
Tatoeba-test.asm-eng.asm.eng 18.1 0.363
Tatoeba-test.awa-eng.awa.eng 6.2 0.222
Tatoeba-test.ben-eng.ben.eng 44.7 0.595
Tatoeba-test.bho-eng.bho.eng 29.4 0.458
Tatoeba-test.guj-eng.guj.eng 19.3 0.383
Tatoeba-test.hif-eng.hif.eng 3.7 0.220
Tatoeba-test.hin-eng.hin.eng 38.6 0.564
Tatoeba-test.kok-eng.kok.eng 6.6 0.287
Tatoeba-test.lah-eng.lah.eng 16.0 0.272
Tatoeba-test.mai-eng.mai.eng 75.6 0.796
Tatoeba-test.mar-eng.mar.eng 25.9 0.497
Tatoeba-test.multi.eng 29.0 0.502
Tatoeba-test.nep-eng.nep.eng 4.5 0.198
Tatoeba-test.ori-eng.ori.eng 5.0 0.226
Tatoeba-test.pan-eng.pan.eng 17.4 0.375
Tatoeba-test.rom-eng.rom.eng 1.7 0.174
Tatoeba-test.san-eng.san.eng 5.0 0.173
Tatoeba-test.sin-eng.sin.eng 31.2 0.511
Tatoeba-test.snd-eng.snd.eng 45.7 0.670
Tatoeba-test.urd-eng.urd.eng 25.6 0.456

opus4m-2020-08-12.zip

  • dataset: opus4m
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus4m-2020-08-12.zip
  • test set translations: opus4m-2020-08-12.test.txt
  • test set scores: opus4m-2020-08-12.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2014-hineng.hin.eng 9.2 0.350
newsdev2019-engu-gujeng.guj.eng 10.1 0.339
newstest2014-hien-hineng.hin.eng 13.8 0.410
newstest2019-guen-gujeng.guj.eng 6.9 0.297
Tatoeba-test.asm-eng.asm.eng 19.8 0.382
Tatoeba-test.awa-eng.awa.eng 8.8 0.234
Tatoeba-test.ben-eng.ben.eng 45.1 0.601
Tatoeba-test.bho-eng.bho.eng 25.7 0.411
Tatoeba-test.guj-eng.guj.eng 21.8 0.386
Tatoeba-test.hif-eng.hif.eng 9.0 0.288
Tatoeba-test.hin-eng.hin.eng 39.2 0.570
Tatoeba-test.kok-eng.kok.eng 1.8 0.147
Tatoeba-test.lah-eng.lah.eng 17.5 0.315
Tatoeba-test.mai-eng.mai.eng 53.2 0.713
Tatoeba-test.mar-eng.mar.eng 26.6 0.504
Tatoeba-test.multi.eng 30.0 0.510
Tatoeba-test.nep-eng.nep.eng 3.8 0.206
Tatoeba-test.ori-eng.ori.eng 5.8 0.229
Tatoeba-test.pan-eng.pan.eng 17.3 0.370
Tatoeba-test.rom-eng.rom.eng 1.8 0.172
Tatoeba-test.san-eng.san.eng 4.8 0.173
Tatoeba-test.sin-eng.sin.eng 32.0 0.525
Tatoeba-test.snd-eng.snd.eng 38.5 0.500
Tatoeba-test.urd-eng.urd.eng 26.6 0.468
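Across the four releases above, the combined Tatoeba-test.multi.eng score generally improves with the larger training setups (with a small dip in the 2020-07-26 release). A quick comparison over the numbers copied from the tables (values hard-coded, not recomputed):

```python
# Tatoeba-test.multi.eng (BLEU, chr-F) per release, copied from the
# benchmark tables above
releases = {
    "opus-2020-06-28":   (27.6, 0.490),
    "opus-2020-07-26":   (26.8, 0.484),
    "opus2m-2020-08-01": (29.0, 0.502),
    "opus4m-2020-08-12": (30.0, 0.510),
}

# Best release by BLEU on the multilingual test set
best = max(releases, key=lambda name: releases[name][0])
print(best)  # opus4m-2020-08-12
```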

opus1m+bt-2021-05-01.zip

Benchmarks

testset BLEU chr-F #sent #words BP
newsdev2014.hin-eng 11.6 0.403 520 10406 0.934
newsdev2019-engu.guj-eng 13.4 0.394 1998 41862 1.000
newstest2014-hien.hin-eng 17.6 0.469 2507 55571 0.998
newstest2019-guen.guj-eng 8.6 0.339 1016 17778 1.000
Tatoeba-test.asm-eng 19.2 0.381 117 706 1.000
Tatoeba-test.awa-eng 14.8 0.299 279 1335 1.000
Tatoeba-test.ben-eng 47.2 0.619 2500 13978 0.988
Tatoeba-test.bho-eng 26.6 0.458 42 283 1.000
Tatoeba-test.gbm-eng 17.1 0.312 39 156 1.000
Tatoeba-test.guj-eng 21.4 0.389 154 962 1.000
Tatoeba-test.hif-eng 4.1 0.285 36 241 0.962
Tatoeba-test.hin-eng 42.4 0.601 5000 33943 0.972
Tatoeba-test.kok-eng 4.2 0.254 1 7 1.000
Tatoeba-test.lah-eng 14.4 0.291 32 196 1.000
Tatoeba-test.mai-eng 41.0 0.650 8 26 0.920
Tatoeba-test.mar-eng 45.0 0.640 10000 64825 1.000
Tatoeba-test.multi-eng 40.2 0.582 10000 64508 1.000
Tatoeba-test.nep-eng 24.7 0.430 115 508 1.000
Tatoeba-test.ori-eng 0.3 0.138 33 238 1.000
Tatoeba-test.pan-eng 18.1 0.378 87 616 1.000
Tatoeba-test.rom-eng 5.8 0.229 671 4457 1.000
Tatoeba-test.san-eng 2.7 0.184 144 657 1.000
Tatoeba-test.sin-eng 30.6 0.515 45 260 0.981
Tatoeba-test.snd-eng 28.1 0.456 4 19 1.000
Tatoeba-test.urd-eng 27.7 0.478 1663 12027 0.990
tico19-test.ben-eng 20.7 0.480 2100 56848 0.957
tico19-test.hin-eng 27.9 0.547 2100 56347 0.978
tico19-test.mar-eng 20.4 0.502 2100 56339 1.000
tico19-test.nep-eng 24.6 0.527 2100 56848 0.973
tico19-test.urd-eng 16.5 0.425 2100 56339 0.992
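The BP column in the table above is BLEU's brevity penalty, which discounts hypotheses shorter than the reference; a value of 1.000 means the system's output was at least as long as the reference. Under the standard BLEU definition it can be sketched as:

```python
import math

def brevity_penalty(hyp_len, ref_len):
    # Standard BLEU brevity penalty: no penalty when the hypothesis is
    # at least as long as the reference, exponential decay otherwise
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)
```

For example, the newsdev2014 BP of 0.934 indicates output roughly 6-7% shorter than the reference overall.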