
[ported model] FSMT (FairSeq MachineTranslation) #6940

Merged: 118 commits into huggingface:master on Sep 17, 2020

Conversation

stas00
Contributor

@stas00 stas00 commented Sep 4, 2020

This PR implements the spec laid out in #5419.

The new model is FSMT (aka FairSeqMachineTranslation): FSMTForConditionalGeneration, which comes with 4 models:

  • "facebook/wmt19-ru-en"
  • "facebook/wmt19-en-ru"
  • "facebook/wmt19-de-en"
  • "facebook/wmt19-en-de"

This is a port of the fairseq wmt19 transformer, which covers 3 languages and 4 translation pairs.

For more details on the original, please see Facebook FAIR's WMT19 News Translation Task Submission.

Huge, huge thanks to @sshleifer, who has been incredibly supportive of this very difficult, yet, fun learning experience! Thank you, Sam!

And many thanks to all those who wrote the existing transformers code, so that I only needed to tweak a few things here and there rather than write from scratch. And, last but not least, to the fairseq developers, who did the heavy lifting with the initial training, finetuning, and coding.

The tokenizer is a tweaked XLM tokenizer, and the model is a tweaked Bart model. There were too many differences for me to simply subclass either of the two, the main cause being two unmerged dictionaries of different sizes (see the sketch below). There were quite a few other nuances as well - please see the porting notes in the code.
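
To make the dictionary point concrete, a minimal sketch (attribute names as in this port's FSMTConfig):

from transformers.configuration_fsmt import FSMTConfig

config = FSMTConfig.from_pretrained("facebook/wmt19-en-ru")
# FSMT keeps separate source and target vocabularies, so the encoder and
# decoder embeddings have different sizes - unlike Bart's single shared vocab
print(config.src_vocab_size, config.tgt_vocab_size)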

There are a few more things to complete. In particular, we currently don't support model ensembles, which fairseq uses - they run eval on an ensemble of 4 model checkpoints, while this implementation uses only the first checkpoint. More work on matching fairseq outputs is also needed: no beam is perfect, and with beam search there are some small differences. I was encouraged to release the model and continue working on improving it.
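
For the curious, here is a purely hypothetical sketch of what checkpoint ensembling could look like - greedy decoding that averages per-step log-probs across several converted checkpoints. The checkpoint paths are made up, and fairseq's real ensemble interacts with beam search, which this skips:

import torch
from transformers.tokenization_fsmt import FSMTTokenizer
from transformers.modeling_fsmt import FSMTForConditionalGeneration

# hypothetical local dirs, one per converted fairseq checkpoint (model1.pt..model4.pt)
checkpoint_dirs = ["./en-ru-1", "./en-ru-2", "./en-ru-3", "./en-ru-4"]
tokenizer = FSMTTokenizer.from_pretrained(checkpoint_dirs[0])
models = [FSMTForConditionalGeneration.from_pretrained(d).eval() for d in checkpoint_dirs]

def ensemble_greedy(text, max_len=100):
    input_ids = tokenizer.encode(text, return_tensors="pt")
    decoder_ids = torch.full((1, 1), models[0].config.decoder_start_token_id, dtype=torch.long)
    for _ in range(max_len):
        with torch.no_grad():
            # average the checkpoints' log-probs for the next token
            logprobs = torch.stack(
                [m(input_ids, decoder_input_ids=decoder_ids)[0][:, -1].log_softmax(-1)
                 for m in models]
            ).mean(dim=0)
        next_id = logprobs.argmax(dim=-1, keepdim=True)
        decoder_ids = torch.cat([decoder_ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(decoder_ids[0], skip_special_tokens=True)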

I'm still a few points behind on the BLEU score - most likely due to not having the ensemble - but since I am not able to reproduce fairseq's reported scores, I'm not sure how to evaluate against a single model. See the issue. I added the current and the expected scores in the model cards. If one of you has already started working on ensemble support, please let me know.

You will find 'Porting Notes' in modeling_fsmt.py and tokenization_fsmt.py describing what has been done, the nuances involved, and what still needs to be done.

The 4 models are up on s3 and can be used already.

Usage:

from transformers.tokenization_fsmt import FSMTTokenizer
from transformers.modeling_fsmt import FSMTForConditionalGeneration

mname = "facebook/wmt19-en-ru"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

text = "Machine learning is great, isn't it?"
input_ids = tokenizer.encode(text, return_tensors="pt")
outputs = model.generate(input_ids)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)  # Машинное обучение - это здорово, не так ли?

edit: we have 5 more wmt models en/de from https://github.com/jungokasai/deep-shallow/ ready to be added as well, once this is merged.

@sshleifer

@julien-c added the "model card" label (Related to pretrained model cards) on Sep 4, 2020
@codecov

codecov bot commented Sep 4, 2020

Codecov Report

Merging #6940 into master will increase coverage by 2.47%.
The diff coverage is 94.60%.


@@            Coverage Diff             @@
##           master    #6940      +/-   ##
==========================================
+ Coverage   79.62%   82.10%   +2.47%     
==========================================
  Files         168      171       +3     
  Lines       32284    33044     +760     
==========================================
+ Hits        25706    27130    +1424     
+ Misses       6578     5914     -664     
Impacted Files Coverage Δ
src/transformers/modeling_fsmt.py 93.58% <93.58%> (ø)
src/transformers/tokenization_fsmt.py 95.23% <95.23%> (ø)
src/transformers/__init__.py 99.35% <100.00%> (+0.01%) ⬆️
src/transformers/configuration_auto.py 96.15% <100.00%> (+0.04%) ⬆️
src/transformers/configuration_fsmt.py 100.00% <100.00%> (ø)
src/transformers/modeling_auto.py 82.38% <100.00%> (+0.08%) ⬆️
src/transformers/tokenization_auto.py 91.93% <100.00%> (+0.13%) ⬆️
src/transformers/modeling_tf_distilbert.py 64.55% <0.00%> (-34.28%) ⬇️
src/transformers/modeling_tf_gpt2.py 71.84% <0.00%> (-23.17%) ⬇️
src/transformers/tokenization_albert.py 70.19% <0.00%> (-23.08%) ⬇️
... and 15 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 90cde2e...1be40e3.

@stas00
Contributor Author

stas00 commented Sep 4, 2020

Here is a little paraphrase script to amuse you:

from transformers.tokenization_fsmt import FSMTTokenizer
from transformers.modeling_fsmt import FSMTForConditionalGeneration

text = "Every morning when I wake up, I experience an exquisite joy - the joy of being Salvador Dalí - and I ask myself in rapture: What wonderful things is this Salvador Dalí going to accomplish today?"

def translate(src_lang, tgt_lang, text):
    mname = f"facebook/wmt19-{src_lang}-{tgt_lang}"
    tokenizer = FSMTTokenizer.from_pretrained(mname)
    model = FSMTForConditionalGeneration.from_pretrained(mname)

    input_ids = tokenizer.encode(text, return_tensors='pt')
    outputs = model.generate(input_ids, num_beams=5, early_stopping=True)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return decoded

def paraphrase(src_lang, tgt_lang, text):
    # round-trip translation: src -> tgt -> src yields a paraphrase
    return translate(tgt_lang, src_lang, translate(src_lang, tgt_lang, text))

print(f"original:\n{text}")
print(f"paraphrased en-ru-en:\n{paraphrase('en', 'ru', text)}")
print(f"paraphrased en-de-en:\n{paraphrase('en', 'de', text)}")
  • original:

    Every morning when I wake up, I experience an exquisite joy - the joy of being Salvador Dalí - and I ask myself in rapture: What wonderful things is this Salvador Dalí going to accomplish today?

  • paraphrased en-ru-en:

    Every morning when I wake up, I have an amazing joy - the joy of being Salvador Dali - and I ask myself in awe: what wonderful things is this Salvador Dali going to do today?

  • paraphrased en-de-en:

    Every morning when I wake up, I experience an exquisite joy - the joy of being Salvador Dalí - and I ask myself in ecstasy: what wonderful things will this Salvador Dalí do today?

Dali would have been proud! :)

@kalyangvs

Hi, @stas00
Can the models be torchscripted or quantized?
I understand they are from fairseq and are pre-trained. What about optimizations for training a seq2seq model in transformers?

Contributor

@sshleifer sshleifer left a comment


Have not read modeling.py yet, but left some other nitpicks.
More importantly, I couldn't replicate run_eval.py results from this branch.

Review comments were left on:

  • model_cards/stas/fsmt-wmt19-de-en/README.md
  • model_cards/stas/fsmt-wmt19-en-de/README.md (3 comments)
  • src/transformers/__init__.py
  • src/transformers/generation_utils.py
  • src/transformers/modeling_fsmt.py
  • tests/test_modeling_fsmt.py (3 comments)
@sshleifer sshleifer linked an issue Sep 4, 2020 that may be closed by this pull request
@sshleifer
Contributor

Also the integration test fails in my torch 1.5.1 environment: https://gist.github.com/sshleifer/4ba0386e06d2b348c809f80c19f283fd

@sshleifer
Contributor

Super excited about this!

@stas00
Contributor Author

stas00 commented Sep 4, 2020

Hi, @stas00
Can the models be torchscripted or quantized?
I understand they are from fairseq and are pre-trained. What about optimizations for training a seq2seq model in transformers?

The first step is just to make things work and reach similar BLEU performance. At a later stage we can work on further goals. The plan is to polish this PR, get it merged, and then post to the forums so you all can experiment, report problems, ask for things, etc. How does that sound?
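
For anyone who wants to experiment in the meantime, the standard PyTorch dynamic-quantization recipe should at least run - though I haven't verified what it does to FSMT's BLEU, so treat this as an untested sketch:

import torch
from transformers.modeling_fsmt import FSMTForConditionalGeneration

model = FSMTForConditionalGeneration.from_pretrained("facebook/wmt19-en-ru")
# quantize all nn.Linear weights to int8 for CPU inference
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)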

@stas00
Contributor Author

stas00 commented Sep 4, 2020

Have not read modeling.py yet, but left some other nitpicks.

Thank you very much, @sshleifer - I will address those later today.

More importantly, I couldn't replicate run_eval.py results from this branch.

I know why. I uploaded an experimental version of the models last night and thought I had forced caching off, since the models were re-downloaded. But just now, while re-running run_eval, I suddenly got a re-download, and BLEU is 0.1. So the experimental model didn't work. :(

So I still need to sort out the caching issue: #6916

I'm reverting the models - takes a while to upload 5GB. I will update once this is complete and then you can re-eval.


I'm also thinking we need an actual run_eval quality test that can be run as part of the test suite - perhaps on a small sample, maybe 100 inputs instead of 2000, and a smallish beam size? Then it would be slow, but not too slow. A sketch of what that could look like follows.
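
A sketch only: load_sample_pairs is a hypothetical helper returning ~100 source/reference pairs, and the BLEU threshold would need calibrating (the test would be marked slow in the suite):

import sacrebleu
from transformers.tokenization_fsmt import FSMTTokenizer
from transformers.modeling_fsmt import FSMTForConditionalGeneration

def test_bleu_en_ru_small_sample():
    mname = "facebook/wmt19-en-ru"
    tokenizer = FSMTTokenizer.from_pretrained(mname)
    model = FSMTForConditionalGeneration.from_pretrained(mname)
    srcs, refs = load_sample_pairs(n=100)  # hypothetical data helper
    hyps = []
    for src in srcs:
        input_ids = tokenizer.encode(src, return_tensors="pt")
        outputs = model.generate(input_ids, num_beams=5)
        hyps.append(tokenizer.decode(outputs[0], skip_special_tokens=True))
    # corpus-level BLEU against the references
    score = sacrebleu.corpus_bleu(hyps, [refs]).score
    assert score > 25  # threshold to be calibrated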


Also, as I mentioned earlier there is no way to override num_beams in run_eval, so one has to manually change it in configuration_fsmt.py.

So you were running it with num_beams=8.
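
As a stopgap, the standard generate kwarg should also let you override the beam size per call, without touching the config:

# num_beams passed to generate() takes precedence over config.num_beams for this call
outputs = model.generate(input_ids, num_beams=8)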

Here are the results that I get for PAIR=en-ru:

# num_beams=15:
# {'bleu': 31.2512, 'n_obs': 1997, 'runtime': 521, 'seconds_per_sample': 0.2609}
# num_beams=50:
# {'bleu': 31.2695, 'n_obs': 1997, 'runtime': 1692, 'seconds_per_sample': 0.8473}

I will rebase, once this is merged #6948 - thank you!

@stas00
Contributor Author

stas00 commented Sep 4, 2020

edit: CDN has been updated, so you're good to go ahead and eval the model.

So the models have been updated, but I can't figure out how to bypass caching, so we're still getting the old versions - might have to wait 24h :( See this issue: #6916 (comment)

So until this caching issue is sorted out (or 24h have passed), please don't waste your time trying to eval this model. It won't work.

@stas00
Contributor Author

stas00 commented Sep 15, 2020

FYI, I moved all the data-prep/convert/eval/card-writing scripts into their own place (#7155), so the convert script got much shorter.

Member

@LysandreJik LysandreJik left a comment


This is great, great job @stas00!! I've added a few comments regarding documentation and logging. Please ping me once you've resolved these and I'll merge this PR!

I think you can use "Add suggestion to batch" to prevent the issue where you can't find the comments anymore! Otherwise, feel free to just do the modifications yourself - I don't need to be co-author!

Review comments were left on:

  • docs/source/model_doc/fsmt.rst
  • src/transformers/configuration_fsmt.py (2 comments)
  • src/transformers/modeling_fsmt.py (3 comments)
  • src/transformers/tokenization_fsmt.py
stas00 and others added 5 commits September 16, 2020 09:14
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
@stas00
Contributor Author

stas00 commented Sep 16, 2020

I think we freaked out GitHub - it stopped reporting checks.

@LysandreJik - this is good to go - thanks a lot for your feedback.

@stas00
Contributor Author

stas00 commented Sep 16, 2020

I think you can use "Add suggestion to batch" to prevent the issue where you can't find the comments anymore!

Oh, that was a super-helpful hint. I wish I had known about it 2 days ago. Thanks a lot!

Otherwise, feel free to just to the modifications yourself, I don't need to be co-author!

It's teamwork ;) Thank you for your contribution, @LysandreJik!

Member

@LysandreJik LysandreJik left a comment


Great, thanks a lot @stas00!

@LysandreJik LysandreJik merged commit 1eeb206 into huggingface:master Sep 17, 2020
sshleifer added a commit to sshleifer/transformers_fork that referenced this pull request Sep 17, 2020
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config

* workaround for localizing model path

* doc clean up as in huggingface#6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams

* solve the problem of tokenizer finding the right files when model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* - add bleu scores integration tests

* skip test if sacrebleu is not installed

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad1.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code huggingface#3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y in transformers-cli upload -y

* [xlm tok] config dict: fix str into int to match definition (huggingface#7034)

* [s2s] --eval_max_generate_length (huggingface#7018)

* Fix CI with change of name of nlp (huggingface#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make non-optional args such - adjust tests where possible or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, due to 2 dicts

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
mfuntowicz pushed a commit that referenced this pull request Sep 18, 2020
Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020
fabiocapsouza pushed a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020
fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020
Labels: model card (Related to pretrained model cards)
Projects: None yet
Development: Successfully merging this pull request may close these issues: High Quality EN-DE/EN-FR Translators
7 participants