[ported model] FSMT (FairSeq MachineTranslation) #6940
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #6940      +/-   ##
==========================================
+ Coverage   79.62%   82.10%    +2.47%
==========================================
  Files         168      171        +3
  Lines       32284    33044      +760
==========================================
+ Hits        25706    27130     +1424
+ Misses       6578     5914      -664
Continue to review full report at Codecov.
Here is a little paraphrase script to amuse you:

    from transformers.tokenization_fsmt import FSMTTokenizer
    from transformers.modeling_fsmt import FSMTForConditionalGeneration

    text = "Every morning when I wake up, I experience an exquisite joy - the joy of being Salvador Dalí - and I ask myself in rapture: What wonderful things is this Salvador Dalí going to accomplish today?"

    def translate(src_lang, tgt_lang, text):
        mname = f"facebook/wmt19-{src_lang}-{tgt_lang}"
        tokenizer = FSMTTokenizer.from_pretrained(mname)
        model = FSMTForConditionalGeneration.from_pretrained(mname)
        input_ids = tokenizer.encode(text, return_tensors='pt')
        outputs = model.generate(input_ids, num_beams=5, early_stopping=True)
        decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
        return decoded

    def paraphrase(src_lang, tgt_lang, text):
        # round-trip translate: src -> tgt -> src
        return translate(tgt_lang, src_lang, translate(src_lang, tgt_lang, text))

    print(f"original:\n{text}")
    print(f"paraphrased en-ru-en:\n{paraphrase('en', 'ru', text)}")
    print(f"paraphrased en-de-en:\n{paraphrase('en', 'de', text)}")
Dalí would have been proud! :)
Hi, @stas00
Have not read modeling.py yet, but left some other nitpicks.
More importantly, I couldn't replicate run_eval.py results from this branch.
Also the integration test fails in my torch 1.5.1 environment: https://gist.github.com/sshleifer/4ba0386e06d2b348c809f80c19f283fd
Super excited about this!
The first step is just to make things work and get similar BLEU performance; at a later stage we can work on more goals. The plan is to polish this PR, have it merged, and then post to the forums - then you guys can experiment, report problems, ask for things, etc. How does that sound?
Thank you very much, @sshleifer - I will address those later today.
I know why. I uploaded an experimental version of the models last night; I thought I had forced the caching off, since the models were re-downloaded, but just now while re-running run_eval I suddenly got a re-download, and BLEU is 0.1. So the experimental model didn't work. :( I still need to sort out the caching issue: #6916

I'm reverting the models - it takes a while to upload 5GB. I will update once this is complete, and then you can re-eval.

I'm also thinking we need an actual run_eval quality test, which can be run as part of the test suite - perhaps on a small sample, maybe 100 inputs instead of 2000, and with a smallish beam size? Then it can be slow, but not too slow? See the sketch below.

Also, as I mentioned earlier, there is no way to override [...]. So you were running it with [...]. Here are the results that I get for [...]
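On the quality-test idea, a minimal sketch of what such a slow test could look like - assuming sacrebleu is installed, reusing the translate() helper from the paraphrase script above, and with hypothetical src_texts/gold_texts sample data and a made-up BLEU floor:

    import sacrebleu  # the test should be skipped when this isn't installed

    def calculate_bleu(hypotheses, references):
        # corpus-level BLEU; references is a flat list aligned with hypotheses
        return sacrebleu.corpus_bleu(hypotheses, [references]).score

    def check_bleu_en_ru(src_texts, gold_texts, min_bleu=20.0):
        # hypothetical slow check: translate ~100 sample sentences (instead
        # of the full 2000) and require BLEU above a model-specific floor;
        # translate() is the helper from the paraphrase script above
        hypos = [translate('en', 'ru', t) for t in src_texts[:100]]
        assert calculate_bleu(hypos, gold_texts[:100]) >= min_bleu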
I will rebase once #6948 is merged - thank you!
edit: the CDN has been updated, so you're good to go to eval the model.

So the models have been updated, but I can't figure out how to bypass caching, so I'm still getting the old versions - might have to wait 24h :( See this issue: #6916 (comment)

Until this caching issue is sorted out (or the 24h have passed), please don't waste your time trying to eval this model. It won't work.
FYI, I moved all the data-prep-convert/eval/card writing scripts into their own place: #7155 so the convert script got much shorter.
This is great, great job @stas00!! I've added a few comments regarding documentation and logging. Please ping me once you've solved this and I'll merge this PR!
I think you can do "Add suggestion to batch" to prevent the issue where you can't find the comments anymore from happening! Otherwise, feel free to just do the modifications yourself - I don't need to be a co-author!
I think we freaked out GitHub - it stopped reporting checks. @LysandreJik - this is good to go; thanks a lot for your feedback.
Oh, that was a super-helpful hint - I wish I'd known about it 2 days ago. Thanks a lot!
It's teamwork ;) Thank you for your contribution, @LysandreJik!
Great, thanks a lot @stas00!
* ready for PR
* cleanup
* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
* fix
* perfectionism
* revert change from another PR
* odd, already committed this one
* non-interactive upload workaround
* backup the failed experiment
* store langs in config
* workaround for localizing model path
* doc clean up as in #6956
* style
* back out debug mode
* document: run_eval.py --num_beams 10
* remove unneeded constant
* typo
* re-use bart's Attention
* re-use EncoderLayer, DecoderLayer from bart
* refactor
* send to cuda and fp16
* cleanup
* revert (moved to another PR)
* better error message
* document run_eval --num_beams
* solve the problem of tokenizer finding the right files when model is local
* polish, remove hardcoded config
* add a note that the file is autogenerated to avoid losing changes
* prep for org change, remove unneeded code
* switch to model4.pt, update scores
* s/python/bash/
* missing init (but doesn't impact the finetuned model)
* cleanup
* major refactor (reuse-bart)
* new model, new expected weights
* cleanup
* cleanup
* full link
* fix model type
* merge porting notes
* style
* cleanup
* have to create a DecoderConfig object to handle vocab_size properly
* doc fix
* add note (not a public class)
* parametrize
* add bleu scores integration tests
* skip test if sacrebleu is not installed
* cache heavy models/tokenizers
* some tweaks
* remove tokens that aren't used
* more purging
* simplify code
* switch to using decoder_start_token_id
* add doc
* Revert "major refactor (reuse-bart)" - this reverts commit 226dad1.
* decouple from bart
* remove unused code #1
* remove unused code #2
* remove unused code #3
* update instructions
* clean up
* move bleu eval to examples
* check import only once
* move data+gen script into files
* reuse via import
* take less space
* add prepare_seq2seq_batch (auto-tested)
* cleanup
* recode test to use json instead of yaml
* ignore keys not needed
* use the new -y in transformers-cli upload
* [xlm tok] config dict: fix str into int to match definition (#7034)
* [s2s] --eval_max_generate_length (#7018)
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models
* sync with changes
* s/fsmt-wmt/wmt/ in model names
* s/fsmt-wmt/wmt/ in model names (p2)
* s/fsmt-wmt/wmt/ in model names (p3)
* switch to a better checkpoint
* typo
* make non-optional args such - adjust tests where possible or skip when there is no other choice
* consistency
* style
* adjust header
* cards moved (model rename)
* use best custom hparams
* update info
* remove old cards
* cleanup
* s/stas/facebook/
* update scores
* s/allen_nlp/allenai/
* url maps aren't needed
* typo
* move all the doc / build / eval generators to their own scripts
* cleanup
* Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix indent
* duplicated line
* style
* use the correct add_start_docstrings
* oops
* resizing can't be done with the core approach, due to 2 dicts
* check that the arg is a list
* style
* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
This PR implements the spec laid out in #5419.
The new model is FSMT (aka FairSeqMachineTranslation): FSMTForConditionalGeneration, which comes with 4 models: facebook/wmt19-en-ru, facebook/wmt19-ru-en, facebook/wmt19-en-de, and facebook/wmt19-de-en. This is a ported version of the fairseq wmt19 transformer, which covers 3 languages and 4 pairs.
For more details on the original, please see Facebook FAIR's WMT19 News Translation Task Submission.
Huge, huge thanks to @sshleifer, who has been incredibly supportive of this very difficult, yet fun, learning experience! Thank you, Sam!
And many thanks to all those who wrote the existing transformers code, so that I only needed to tweak a few things here and there rather than write from scratch. And, last but not least, to the fairseq developers, who did the heavy lifting with the original training, finetuning, and coding.
The tokenizer is a tweaked XLM tokenizer and the model is a tweaked Bart model. There were too many differences for me to simply subclass either of the two, the main cause being 2 unmerged dictionaries of different sizes. But there were quite a few other nuances; please see the porting notes in the code.
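To make the two-dictionary point concrete, a quick sketch - assuming the ported config exposes the two vocab sizes under the names src_vocab_size and tgt_vocab_size:

    from transformers.configuration_fsmt import FSMTConfig

    config = FSMTConfig.from_pretrained("facebook/wmt19-en-ru")
    # unlike Bart/XLM, FSMT carries two separate vocabularies
    # (attribute names here are assumed)
    print(config.src_vocab_size)  # encoder-side (source language) vocab
    print(config.tgt_vocab_size)  # decoder-side (target language) vocab

This is also why embedding-resizing logic that assumes a single vocab_size can't be reused as-is.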
There are a few more things to complete. In particular, we currently don't have support for model ensembles, which fairseq uses - they run eval on an ensemble of 4 model checkpoints; this implementation currently uses only the first checkpoint.
More work on matching fairseq outputs is also needed - no beam match is perfect, and with beam search there are some small differences - but I was encouraged to release the model and continue working on improving it.
I'm still a few points behind on the BLEU score - most likely due to not having the ensemble - but since I am not able to reproduce fairseq's reported scores, I'm not sure how to evaluate against a single model. See the issue. I added the current and the expected scores in the model cards. If one of you has already started working on ensemble support, please let me know.
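For anyone picking up the ensemble work: fairseq ensembles by averaging the models' probabilities at each decoding step, which would need support inside generate(). A much cruder, not equivalent, starting point is plain checkpoint weight averaging - a sketch, assuming fairseq-style checkpoints that store the weights under a 'model' key:

    import torch

    def average_checkpoints(paths):
        # element-wise average of checkpoint weights; NOT the same as
        # fairseq's decode-time probability ensembling, just a rough baseline
        avg = None
        for path in paths:
            state = torch.load(path, map_location="cpu")["model"]
            if avg is None:
                avg = {k: v.float().clone() for k, v in state.items()}
            else:
                for k, v in state.items():
                    avg[k] += v.float()
        return {k: v / len(paths) for k, v in avg.items()}

    # e.g.: average_checkpoints([f"model{i}.pt" for i in range(1, 5)])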
You will find 'Porting Notes' in modeling_fsmt.py and tokenization_fsmt.py covering what has been done, the nuances, and what still needs to be done. The 4 models are up on S3 and can already be used.
Usage:
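A minimal sketch, mirroring the paraphrase script above (the ru-en pair here is just an example):

    from transformers.tokenization_fsmt import FSMTTokenizer
    from transformers.modeling_fsmt import FSMTForConditionalGeneration

    mname = "facebook/wmt19-ru-en"
    tokenizer = FSMTTokenizer.from_pretrained(mname)
    model = FSMTForConditionalGeneration.from_pretrained(mname)

    src_text = "Машинное обучение - это здорово!"  # "Machine learning is great!"
    input_ids = tokenizer.encode(src_text, return_tensors="pt")
    outputs = model.generate(input_ids, num_beams=5)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))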
edit: we have 5 more wmt models en/de from https://github.com/jungokasai/deep-shallow/ ready to be added as well, once this is merged.
@sshleifer