DataCollatorForLanguageModeling modifies input_ids via labels variable #8619

Closed
sveitser opened this issue Nov 18, 2020 · 1 comment · Fixed by #8621
Comments

@sveitser

The cloning step was removed in #8308 at https://github.com/huggingface/transformers/pull/8308/files#diff-046566f2b40a246c7d533457cd7f6f07830516da845b904086f36b3cfe0d5965L201, so the code that sets padded labels to -100 now operates on the input_ids tensor directly.

I suspect the forward pass then fails when trying to look up the embedding for -100.
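
For illustration (not part of the original report), a minimal PyTorch sketch of the aliasing problem; pad_token_id = 0 is just a placeholder here:

    import torch

    # Simulated padded batch (0 stands in for pad_token_id in this toy example)
    input_ids = torch.tensor([[5, 6, 7, 0, 0]])

    # Buggy: without a clone, labels is the same tensor as input_ids,
    # so masking the padded label positions also rewrites input_ids.
    labels = input_ids
    labels[input_ids == 0] = -100
    print(input_ids)  # tensor([[   5,    6,    7, -100, -100]])

    # Fixed: clone first, then mask; input_ids stays intact.
    input_ids = torch.tensor([[5, 6, 7, 0, 0]])
    labels = input_ids.clone()
    labels[input_ids == 0] = -100
    print(input_ids)  # tensor([[5, 6, 7, 0, 0]])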

cc @sgugger

Environment info

  • transformers version: 3.5.1
  • Platform: Linux-5.4.72-x86_64-with
  • Python version: 3.8.6
  • PyTorch version (GPU?): 1.7.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help

Information

Model I am using (Bert, XLNet ...):

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Use DataCollatorForLanguageModeling with Trainer and a tokenizer that has a pad_token set. Training then fails with the following traceback:
  File "/home/lulu/r/buganart/dialog/.build/pip_packages/bin/finetune", line 33, in <module>
    sys.exit(load_entry_point('dialog', 'console_scripts', 'finetune')())
  File "/home/lulu/r/buganart/dialog/dialog/finetune.py", line 139, in main
    trainer.train()
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/transformers/trainer.py", line 775, in train
    tr_loss += self.training_step(model, inputs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/transformers/trainer.py", line 1112, in training_step
    loss = self.compute_loss(model, inputs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/transformers/trainer.py", line 1136, in compute_loss
    outputs = model(**inputs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/transformers/modeling_gpt2.py", line 774, in forward
    transformer_outputs = self.transformer(
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/transformers/modeling_gpt2.py", line 612, in forward
    inputs_embeds = self.wte(input_ids)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 124, in forward
    return F.embedding(
  File "/nix/store/0jdyxgmg88y6sbjm3xkqdn06f493ahf2-python3-3.8.6-env/lib/python3.8/site-packages/torch/nn/functional.py", line 1852, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self 

My script is here: https://github.com/buganart/dialog/blob/master/dialog/finetune.py
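
Not from the original report, but a minimal sketch that should surface the problem on transformers 3.5.1 without running a full Trainer, by calling the collator directly (using the GPT-2 tokenizer with eos_token reused as pad_token is an assumption mirroring common fine-tuning setups):

    from transformers import AutoTokenizer, DataCollatorForLanguageModeling

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

    # Causal LM collation (mlm=False): pads the batch, then masks pad positions
    # in labels with -100
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    # Two examples of different lengths so the shorter one gets padded
    examples = [
        tokenizer("hello world", return_tensors="pt")["input_ids"].squeeze(0),
        tokenizer("a noticeably longer example sentence", return_tensors="pt")["input_ids"].squeeze(0),
    ]
    batch = collator(examples)

    # On the affected version the padded positions of input_ids are -100 as well,
    # which later makes the embedding lookup fail with "index out of range".
    print((batch["input_ids"] == -100).any())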

Expected behavior

sgugger (Collaborator) commented Nov 18, 2020

Ah yes, only the detach was supposed to be removed but I guess I went a bit too far with my mouse, sorry about that. Will fix right now, thanks for flagging!
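
For context (variable names are illustrative, sketched from this comment rather than the actual diff), the change amounts to:

    # before #8308: clone + detach
    labels = batch.clone().detach()
    # intended by #8308: drop only the detach
    labels = batch.clone()
    # what #8308 actually left: the clone was dropped too, so labels aliases input_ids
    labels = batch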
