---
layout: page
title: OpenNMT-py models
---

This page lists pretrained models for OpenNMT-py.

* TOC
{:toc}

## Translation

{:.pretrained}
| New! NLLB 200 3.3B - Transformer | (download) |
| --- | --- |
| New! NLLB 200 1.3B - Transformer | (download) |
| New! NLLB 200 1.3B distilled - Transformer | (download) |
| New! NLLB 200 600M - Transformer | (download) |
| Configuration | YAML file example to run inference: inference config. Please change the source and target languages in the YAML. |
| SentencePiece model | SP Model (see the sketch below) |
| Results | cf. Forum |
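
The NLLB checkpoints above ship with a SentencePiece model for subword segmentation. If you want to apply or inspect it outside of the inference YAML, here is a minimal sketch using the `sentencepiece` Python package (an assumption of this example; the model filename is hypothetical, substitute the path of the SP Model you downloaded):

```python
import sentencepiece as spm

# Hypothetical filename: replace with the path of the downloaded SP Model.
sp = spm.SentencePieceProcessor(model_file="nllb200_sentencepiece.model")

# Segment a source sentence into subword pieces before translation.
pieces = sp.encode("Hello, how are you?", out_type=str)
print(pieces)

# Decoding the pieces reverses the segmentation.
print(sp.decode(pieces))
```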

{:.pretrained}
| New! v3 English-German - Transformer Large | (download) |
| --- | --- |
| BPE model | BPE |
| BPE options | '{"mode": "aggressive", "joiner_annotate": True, "preserve_placeholders": True, "case_markup": True, "soft_case_regions": True, "preserve_segmented_tokens": True, "segment_case": True, "segment_numbers": True, "segment_alphabet_change": True}' (see the sketch below) |
| BLEU | newstest2014 = 31.2 |
| | newstest2016 = 40.7 |
| | newstest2017 = 32.9 |
| | newstest2018 = 49.1 |
| | newstest2019 = 45.9 |
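
The BPE options above appear to be OpenNMT Tokenizer options. A minimal sketch of applying them with the `pyonmttok` package (an assumption of this example; the BPE model path is hypothetical, point it at the BPE file linked above):

```python
import pyonmttok

# Tokenization options listed for this model.
opts = {
    "mode": "aggressive",
    "joiner_annotate": True,
    "preserve_placeholders": True,
    "case_markup": True,
    "soft_case_regions": True,
    "preserve_segmented_tokens": True,
    "segment_case": True,
    "segment_numbers": True,
    "segment_alphabet_change": True,
}

# Hypothetical path: replace with the downloaded BPE model file.
tokenizer = pyonmttok.Tokenizer(bpe_model_path="ende.bpe", **opts)

# Tokenize a source sentence into BPE subwords (features are unused here).
tokens, _ = tokenizer.tokenize("The quick brown fox jumps over the lazy dog.")
print(tokens)

# Detokenization reverses the segmentation and the joiner/case markup.
print(tokenizer.detokenize(tokens))
```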

{:.pretrained}
| English-German - v2 format model - Transformer | (download) |
| --- | --- |
| Configuration | Base Transformer configuration with standard training options |
| Data | WMT with shared SentencePiece model |
| Original | Paper replication |
| BLEU | newstest2014 = 26.89 |
| | newstest2017 = 28.09 |

{:.pretrained}
| German-English - 2-layer BiLSTM | (download) |
| --- | --- |
| Configuration | 2-layer BiLSTM with hidden size 500 trained for 20 epochs |
| Data | IWSLT '14 DE-EN |
| BLEU | 30.33 |

## Summarization

### English

{:.pretrained}
| 2-layer LSTM | (download) |
| --- | --- |
| Configuration | 2-layer LSTM with hidden size 500 trained for 20 epochs |
| Data | Gigaword standard |
| Gigaword F-Score | R1 = 33.60 |
| | R2 = 16.29 |
| | RL = 31.45 |

{:.pretrained}
| 2-layer LSTM with copy attention | (download) |
| --- | --- |
| Configuration | 2-layer LSTM with hidden size 500 and copy attention trained for 20 epochs |
| Data | Gigaword standard |
| Gigaword F-Score | R1 = 35.51 |
| | R2 = 17.35 |
| | RL = 33.17 |

{:.pretrained}
| Transformer | (download) |
| --- | --- |
| Configuration | See the OpenNMT-py summarization example |
| Data | CNN/Daily Mail |

{:.pretrained}
| 1-layer BiLSTM | (download) |
| --- | --- |
| Configuration | See the OpenNMT-py summarization example |
| Data | CNN/Daily Mail |
| F-Score | R1 = 39.12 |
| | R2 = 17.35 |
| | RL = 36.12 |

### Chinese

{:.pretrained}
| 1-layer BiLSTM | (download) |
| --- | --- |
| Author | playma |
| Configuration | Preprocessing options: src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100. Training options: 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed, AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15, 30 epochs |
| Data | LCSTS |
| F-Score | R1 = 35.67 |
| | R2 = 23.06 |
| | RL = 33.14 |

## Dialog

{:.pretrained}
| 2-layer LSTM | (download) |
| --- | --- |
| Configuration | 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7, 13 epochs |
| Data | OpenSubtitles |