
[lightning_base] fix s2s logging, only make train_loader once #6404

Merged: 7 commits merged into huggingface:master on Aug 17, 2020

Conversation

@sshleifer (Contributor) commented on Aug 11, 2020

setup is called many times (including twice by trainer.test), creating a dataloader each time. Will creating the train_loader only on the first call cause bad side effects that I don't understand, @nateraw @williamFalcon?

I read the docs (https://pytorch-lightning.readthedocs.io/en/latest/introduction_guide.html), so I think I'm fine, but I'm not sure.

cc @stas00
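The guard being asked about can be sketched as follows. This is a hedged illustration of the pattern, not the actual lightning_base.py code: `BaseTransformer`, `get_dataloader`, and the batch size are stand-in names and values.

```python
# Sketch of the "build the train_loader only once" guard (illustrative names).

class BaseTransformer:
    def __init__(self):
        self.train_loader = None  # created lazily, exactly once

    def get_dataloader(self, type_path, batch_size):
        # Stand-in for the real dataset/DataLoader construction,
        # which is the expensive part worth doing only once.
        return list(range(batch_size))

    def setup(self, stage):
        # Lightning calls setup() multiple times (e.g. trainer.test
        # triggers it again), so guard on the cached loader.
        if stage == "fit" and self.train_loader is None:
            self.train_loader = self.get_dataloader("train", batch_size=4)

    def train_dataloader(self):
        return self.train_loader


model = BaseTransformer()
model.setup("fit")
first = model.train_dataloader()
model.setup("fit")  # repeated call: no new loader is built
assert model.train_dataloader() is first
```

The identity assertion at the end is the whole point: a second `setup("fit")` call returns without touching the cached loader.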

Also:

  • add a fast test for run_ner
  • [pl] centralize the data_dir argument into add_generic_args (rule of 3)
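The "rule of 3" item above (the same --data_dir flag repeated across several example scripts) amounts to moving the flag into the shared helper. A minimal sketch; the exact help text and `required` setting are assumptions, not copied from the PR:

```python
import argparse

# Hedged sketch of centralizing --data_dir into a shared add_generic_args
# helper instead of repeating it in each script's add_model_specific_args.

def add_generic_args(parser):
    parser.add_argument(
        "--data_dir",
        type=str,
        required=True,
        help="The input data directory, shared by all example scripts.",
    )
    return parser


parser = argparse.ArgumentParser()
add_generic_args(parser)
args = parser.parse_args(["--data_dir", "/tmp/some_dataset"])
print(args.data_dir)  # -> /tmp/some_dataset
```

Each script then calls the shared helper once instead of declaring its own copy of the flag.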

Checks:

  • verified xsum distillation trains well, has good LR logs. (warmup+linear decay are honored.)

@@ -31,3 +34,24 @@ def test_run_ner(self):
        with patch.object(sys, "argv", ["run.py"] + testargs):
            result = run_ner.main()
            self.assertLess(result["eval_loss"], 1.5)

    def test_run_ner_pl(self):
@sshleifer (author) commented:

this doesn't test lightning and should be in a separate PR.

Review thread on examples/lightning_base.py (outdated, resolved)
@sshleifer changed the title from "lightning_base: delete setup function that makes dataloader many times" to "lightning_base: only make train_loader once, dont make lr_scheduler twice" on Aug 11, 2020
@@ -161,13 +161,6 @@ def add_model_specific_args(parser, root_dir):
            type=int,
            help="The number of GPUs allocated for this, it is by default 0 meaning none",
        )
        parser.add_argument(
@sshleifer (author) commented:

centralize data_dir argument to add_generic_args cause rule of 3

codecov bot commented on Aug 11, 2020

Codecov Report

Merging #6404 into master will decrease coverage by 0.11%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #6404      +/-   ##
==========================================
- Coverage   80.38%   80.26%   -0.12%     
==========================================
  Files         156      156              
  Lines       28058    28058              
==========================================
- Hits        22554    22521      -33     
- Misses       5504     5537      +33     
Impacted Files Coverage Δ
src/transformers/tokenization_roberta.py 76.71% <0.00%> (-21.92%) ⬇️
src/transformers/tokenization_utils_base.py 86.58% <0.00%> (-7.19%) ⬇️
src/transformers/tokenization_transfo_xl.py 38.73% <0.00%> (-3.76%) ⬇️
src/transformers/tokenization_auto.py 95.55% <0.00%> (-2.23%) ⬇️
src/transformers/tokenization_utils_fast.py 92.14% <0.00%> (-2.15%) ⬇️
src/transformers/tokenization_openai.py 82.57% <0.00%> (-1.52%) ⬇️
src/transformers/configuration_utils.py 95.91% <0.00%> (-0.69%) ⬇️
src/transformers/tokenization_bert.py 91.07% <0.00%> (-0.45%) ⬇️
src/transformers/tokenization_utils.py 90.00% <0.00%> (-0.41%) ⬇️
src/transformers/tokenization_albert.py 87.50% <0.00%> (+58.65%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 72add6c...e43061d.

@sshleifer changed the title from "lightning_base: only make train_loader once, dont make lr_scheduler twice" to "[lightning_base] fix s2s logging, only make train_loader once" on Aug 17, 2020
@sshleifer sshleifer merged commit 84c265f into huggingface:master Aug 17, 2020
@sshleifer sshleifer deleted the delete-setup branch August 17, 2020 02:49
Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020
fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020