This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[BART] BART agent #2781

Merged
merged 17 commits into from
Jun 29, 2020
@klshuster klshuster commented Jun 19, 2020

Patch description

BART Agent: https://arxiv.org/abs/1910.13461.

  • Relies on the model implemented in fairseq. As such, I provide an open-sourced version of the conversion script @stephenroller et al. used for BlenderBot.
  • The BART agent can be instantiated with just -m bart. However, it is recommended to specify --init-model zoo:bart/bart_large/model or -mf zoo:bart/bart_large/model so that a dictionary is not needlessly built.
  • Also replaced our gelu implementation with the one in torch.nn.functional. This provides a speed-up of about 10%; see details in the Logs section.
  • Added a test to check the pre-trained model's ppl when trained on an integration test task.
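
For readers unfamiliar with the activation being swapped, here is a minimal pure-Python sketch of the exact GELU formula, GELU(x) = x · Φ(x), where Φ is the standard normal CDF. The function name `gelu_exact` is illustrative only, and it is an assumption that the replaced helper used this erf-based form; the PR text does not show the old code. The point is that changing the implementation should leave the math, and therefore model quality, essentially unchanged, with any gain coming from the faster library kernel.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF.

    Phi(x) is written via the error function:
    Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # GELU is ~0 for very negative inputs, exactly 0 at 0,
    # and approaches the identity for large positive inputs.
    for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"gelu({value:+.1f}) = {gelu_exact(value):+.6f}")
```

A tensor version of the same formula and `torch.nn.functional.gelu` should agree up to floating-point error, so the validation metrics in the logs are expected to stay close between the two runs.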

cc @jaseweston as I remember one of the interns might have wanted BART?

Testing steps

Add nightly bart test to see whether the model can be fine-tuned on an integration task.

Most of the time is actually just loading the model.

$ python -m pytest tests/nightly/gpu/test_bart.py
.
.
.
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.6.9, pytest-5.3.2, py-1.8.1, pluggy-0.13.1
rootdir: /private/home/kshuster/ParlAI, inifile: pytest.ini
plugins: requests-mock-1.7.0
collected 1 item

tests/nightly/gpu/test_bart.py .                                                                                                                                                  [100%]

=============================================================================== slowest 10 test durations ===============================================================================
76.42s call     tests/nightly/gpu/test_bart.py::TestBartModel::test_bart

(0.00 durations hidden.  Use -vv to show these durations.)
======================================================================= 1 passed, 73 warnings in 77.97s (0:01:17) =======================================================================

Logs

Gelu trained with SGD

New gelu

$ parlai train_model -m bart -mf /tmp/bart_c2_new_gelu -t convai2 -bs 24 --fp16 true -eps 1 --dict-file zoo:bart/bart_large/model.dict
.
.
.
num_epochs completed:1.0 time elapsed:1543.1565182209015s
.
.
.
eval completed in 442.08s
valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss  lr   ppl  token_acc  total_train_updates   tpb
    .0001282  .01081 7801 .2066    .3493 2.449   1 11.58      .4617                 5478 321.3

Old gelu

$ parlai train_model -m bart -mf /tmp/bart_c2_old_gelu -t convai2 -bs 24 --fp16 true -eps 1
.
.
.
num_epochs completed:1.0 time elapsed:1727.4218549728394s
.
.
.
valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss  lr   ppl  token_acc  total_train_updates   tpb
    .0002564  .01066 7801 .2027    .3847 2.443   1 11.51      .4637                 5478 321.3
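
As a quick check of the "about 10%" figure, the arithmetic on the two `time elapsed` values above can be sketched as follows. The variable names are illustrative. Each configuration was run once, and the two commands also differ in `--dict-file`, so the result is a rough estimate rather than a controlled benchmark.

```python
# Wall-clock seconds for one ConvAI2 epoch, copied from the two logs above.
new_gelu_seconds = 1543.1565182209015
old_gelu_seconds = 1727.4218549728394

# Absolute and relative savings of the new (torch.nn.functional) gelu run.
seconds_saved = old_gelu_seconds - new_gelu_seconds
fraction_saved = seconds_saved / old_gelu_seconds

# Equivalent view: how many times faster the new run finished the epoch.
speedup_ratio = old_gelu_seconds / new_gelu_seconds

print(f"seconds saved per epoch: {seconds_saved:.1f}")
print(f"reduction in wall-clock time: {fraction_saved:.1%}")
print(f"speed-up ratio: {speedup_ratio:.3f}x")
```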

Actual training with adam on 1 epoch of Convai2

$ parlai train_model -m bart -mf /tmp/bart_c2_new_gelu_adam -t convai2 -bs 24 --fp16 true -eps 1 -lr 1e-5 --optimizer adam
.
.
.

valid:
    accuracy  bleu-4  exs    f1  gpu_mem  loss    lr   ppl  token_acc  total_train_updates   tpb
    .0001282  .01229 7801 .2035    .6361 2.386 1e-05 10.87      .4741                 5478 321.3
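
The `ppl` column in the three validation reports is consistent with perplexity being the exponential of the reported per-token `loss`. A small sketch of that check, using only the rounded numbers printed above (the run labels are illustrative, and the tolerance allows for the rounding in the logs):

```python
import math

# (loss, reported ppl) pairs copied from the validation reports above.
reported = {
    "new gelu, sgd": (2.449, 11.58),
    "old gelu, sgd": (2.443, 11.51),
    "new gelu, adam": (2.386, 10.87),
}

# Perplexity is exp(average per-token negative log-likelihood).
recomputed = {name: math.exp(loss) for name, (loss, _) in reported.items()}

for name, (loss, ppl) in reported.items():
    print(f"{name}: exp({loss}) = {recomputed[name]:.2f} (reported {ppl})")
```

Read this way, the Adam run's lower loss (2.386 vs. 2.449) corresponds to the drop in perplexity from 11.58 to 10.87.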

@klshuster klshuster marked this pull request as ready for review June 24, 2020 00:18
@klshuster klshuster changed the title [WIP] [BART] BART agent [BART] BART agent Jun 24, 2020
@emilydinan

Re: not building dictionary. You can do a hack like this:

dict_maxexs=0, # skip building dictionary

(Just set max_exs to 0 so it's not built)

@stephenroller

tests/nightly/gpu/test_bart.py::TestBartModel::test_bart SKIPPED [ 33%]
You need to add fairseq into the nightly installs. Same place we install nltk i think.

@klshuster

i moved the script to parlai/agents/bart/ for now

@klshuster

typing is failing on files i did not touch...

@stephenroller stephenroller left a comment

lgtm

@klshuster klshuster merged commit aed650b into facebookresearch:master Jun 29, 2020
@klshuster klshuster deleted the bart_agent branch June 29, 2020 23:46
4 participants