[BART] BART agent #2781

klshuster · 2020-06-19T23:46:10Z

Patch description

BART Agent: https://arxiv.org/abs/1910.13461.

Relies on the model implemented in fairseq. As such, I provide an open-sourced version of the conversion script @stephenroller et al. used for BlenderBot.
The BART agent can be instantiated simply with -m bart; however, it is recommended to specify --init-model zoo:bart/bart_large/model or -mf zoo:bart/bart_large/model so that a dictionary is not needlessly built.
Also replaced our gelu implementation with the one in torch.nn.functional. This provides a speed-up of about 10% (roughly 1543 s vs. 1727 s for one epoch of ConvAI2); see details in the Logs section.
Added a test to check the pre-trained model's ppl when trained on an integration test task.
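
For reference, a minimal sketch of the function being swapped, written with only the standard library. It assumes the hand-written version was the erf-based form of GELU (the thread does not show it); `torch.nn.functional.gelu` computes the same exact GELU by default, as a single native op rather than several element-wise tensor ops. The name `gelu_erf` is illustrative only.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Spot-check a few inputs; GELU is ~0 for large negatives and ~x for large positives.
for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"gelu({value:+.1f}) = {gelu_erf(value):+.6f}")
```

Since both versions compute the same function, validation metrics are expected to match up to run-to-run noise, and only wall-clock time should differ.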

cc @jaseweston as I remember one of the interns might have wanted BART?

Testing steps

Add nightly bart test to see whether the model can be fine-tuned on an integration task.

Most of the time is actually just loading the model.

$ python -m pytest tests/nightly/gpu/test_bart.py
.
.
.
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.6.9, pytest-5.3.2, py-1.8.1, pluggy-0.13.1
rootdir: /private/home/kshuster/ParlAI, inifile: pytest.ini
plugins: requests-mock-1.7.0
collected 1 item

tests/nightly/gpu/test_bart.py .                                                                                                                                                  [100%]

=============================================================================== slowest 10 test durations ===============================================================================
76.42s call     tests/nightly/gpu/test_bart.py::TestBartModel::test_bart

(0.00 durations hidden.  Use -vv to show these durations.)
======================================================================= 1 passed, 73 warnings in 77.97s (0:01:17) =======================================================================

Logs

Gelu trained with SGD

New gelu

$ parlai train_model -m bart -mf /tmp/bart_c2_new_gelu -t convai2 -bs 24 --fp16 true -eps 1 --dict-file zoo:bart/bart_large/model.dict
.
.
.
num_epochs completed:1.0 time elapsed:1543.1565182209015s
.
.
.
eval completed in 442.08s
valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss  lr   ppl  token_acc  total_train_updates   tpb
    .0001282  .01081 7801 .2066    .3493 2.449   1 11.58      .4617                 5478 321.3

Old gelu

$ parlai train_model -m bart -mf /tmp/bart_c2_old_gelu -t convai2 -bs 24 --fp16 true -eps 1
.
.
.
num_epochs completed:1.0 time elapsed:1727.4218549728394s
.
.
.
valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss  lr   ppl  token_acc  total_train_updates   tpb
    .0002564  .01066 7801 .2027    .3847 2.443   1 11.51      .4637                 5478 321.3
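
The "about 10%" figure can be recomputed from the two "time elapsed" values in the logs above. This is plain arithmetic on a single run per setting, so treat it as a rough estimate rather than a benchmark; note that the two commands also differ in the --dict-file flag. Variable names are illustrative.

```python
# "time elapsed" values copied from the two SGD training logs above (seconds).
new_gelu_seconds = 1543.1565182209015
old_gelu_seconds = 1727.4218549728394

# Fraction of wall-clock time saved by the new gelu, relative to the old run.
relative_reduction = (old_gelu_seconds - new_gelu_seconds) / old_gelu_seconds

# Equivalent throughput ratio (old time divided by new time).
speedup_ratio = old_gelu_seconds / new_gelu_seconds

print(f"time saved: {relative_reduction:.1%}")
print(f"speed-up:   {speedup_ratio:.3f}x")
```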

Actual training with Adam on 1 epoch of ConvAI2

$ parlai train_model -m bart -mf /tmp/bart_c2_new_gelu_adam -t convai2 -bs 24 --fp16 true -eps 1 -lr 1e-5 --optimizer adam
.
.
.

valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss    lr   ppl  token_acc  total_train_updates   tpb
    .0001282  .01229 7801 .2035    .6361 2.386 1e-05 10.87      .4741                 5478 321.3
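
As a sanity check on the validation tables, the ppl column should equal exp(loss), assuming loss is the mean per-token cross-entropy in nats. A small sketch using the numbers reported above; the run labels are just for display.

```python
import math

# (loss, reported ppl) pairs copied from the three validation tables above.
runs = {
    "new gelu, SGD": (2.449, 11.58),
    "old gelu, SGD": (2.443, 11.51),
    "new gelu, Adam": (2.386, 10.87),
}


def perplexity(loss: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy loss (in nats)."""
    return math.exp(loss)


for name, (loss, reported) in runs.items():
    print(f"{name}: exp({loss}) = {perplexity(loss):.2f} (reported {reported})")
```

Small differences in the last digit are expected because the loss column is itself rounded.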

emilydinan · 2020-06-24T13:58:45Z

Re: not building dictionary. You can do a hack like this:

ParlAI/parlai/agents/hugging_face/gpt2.py

Line 229 in ce0caf6

dict_maxexs=0, # skip building dictionary

(Just set dict_maxexs to 0 so the dictionary is not built.)
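
A stand-alone sketch of that pattern using only the standard library's argparse: the agent overrides an argument default so that the dictionary-building pass reads zero examples. The `--dict-maxexs` flag and its -1 default are declared by hand here as assumptions standing in for ParlAI's own parser, and `build_parser` and `add_agent_defaults` are hypothetical helper names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the framework parser; flag name and default are assumptions.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dict-maxexs",
        type=int,
        default=-1,
        help="max number of examples used to build the dictionary",
    )
    return parser


def add_agent_defaults(parser: argparse.ArgumentParser) -> None:
    # Hypothetical agent hook: override the default so no examples are read.
    parser.set_defaults(dict_maxexs=0)


parser = build_parser()
add_agent_defaults(parser)

# With no flag given, the agent's override applies.
opt = vars(parser.parse_args([]))

# An explicit command-line value still wins over the overridden default.
explicit = vars(parser.parse_args(["--dict-maxexs", "100"]))

print(opt["dict_maxexs"], explicit["dict_maxexs"])
```

Overriding the default rather than hard-coding the value keeps the option user-configurable.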

parlai/agents/bart/README.md

parlai/agents/bart/bart.py

parlai/scripts/convert_fairseq_to_parlai.py

parlai/zoo/model_list.py

stephenroller · 2020-06-24T22:24:26Z

tests/nightly/gpu/test_bart.py::TestBartModel::test_bart SKIPPED [ 33%]
You need to add fairseq into the nightly installs. Same place we install nltk i think.

parlai/core/torch_generator_agent.py

parlai/scripts/convert_fairseq_to_parlai.py

klshuster · 2020-06-25T15:56:19Z

i moved the script to parlai/agents/bart/ for now

klshuster · 2020-06-25T22:20:33Z

typing is failing on files i did not touch...

parlai/agents/bart/bart.py

stephenroller

lgtm

bart agent

128a90b

facebook-github-bot added the CLA Signed label Jun 19, 2020

klshuster added 5 commits June 22, 2020 14:30

Merge branch 'master' into bart_agent

fd69861

bart agent - it works!

d4d315a

black among others

6e97dd8

add test

d3a8101

modle list update

5d68d23

klshuster requested review from stephenroller, emilydinan and jaseweston June 24, 2020 00:17

klshuster marked this pull request as ready for review June 24, 2020 00:18

klshuster changed the title ~~[WIP] [BART] BART agent~~ [BART] BART agent Jun 24, 2020

README

82b1da7

emilydinan reviewed Jun 24, 2020

parlai/agents/bart/README.md

parlai/agents/bart/bart.py

parlai/agents/bart/bart.py

parlai/agents/bart/bart.py Outdated

parlai/agents/bart/bart.py

typing & formatting & address emily comments

e1e8515

stephenroller reviewed Jun 24, 2020

parlai/agents/bart/bart.py Outdated

stephenroller reviewed Jun 24, 2020

parlai/scripts/convert_fairseq_to_parlai.py Outdated

stephenroller reviewed Jun 24, 2020

parlai/scripts/convert_fairseq_to_parlai.py Outdated

stephenroller reviewed Jun 24, 2020

parlai/zoo/model_list.py Outdated

Merge branch 'master' into bart_agent

d307aa2

emilydinan reviewed Jun 25, 2020

address stephen & emily comments

741a7e8

klshuster added 3 commits June 25, 2020 17:09

test caught a bug lol

805dc53

black

908fb6c

typing

a799d92

stephenroller reviewed Jun 26, 2020

parlai/agents/bart/bart.py Outdated

klshuster mentioned this pull request Jun 26, 2020

Refactor out forward_embedding from decoder. #2793

Merged

klshuster added 3 commits June 26, 2020 13:21

Merge branch 'master' into bart_agent

d41bf50

lower bsz for test

eefb965

retrying new test config for bart

801fa59

stephenroller approved these changes Jun 29, 2020

remove unused import

5c9956e

klshuster merged commit aed650b into facebookresearch:master Jun 29, 2020

klshuster deleted the bart_agent branch June 29, 2020 23:46
