
apply_chunking_to_forward should only be the same in the chunking dimension #8349

Closed
2 of 4 tasks
pedrocolon93 opened this issue Nov 6, 2020 · 2 comments · Fixed by #8391
Comments

@pedrocolon93
Contributor

Environment info

  • transformers version: 3.4.0
  • Platform: All
  • Python version:
  • PyTorch version (GPU?):
  • Tensorflow version (GPU?):
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help

Information

Model I am using (Bert, XLNet ...): XLNet

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The tasks I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Pass two tensors to apply_chunking_to_forward that have the same batch size and sequence length but a different last dimension; the call raises an exception.

Expected behavior

Chunking should only require the input tensors to match in the chunk dimension. The current check is:

    assert len(input_tensors) > 0, "{} has to be a tuple/list of tensors".format(input_tensors)
    tensor_shape = input_tensors[0].shape
    assert all(
        input_tensor.shape == tensor_shape for input_tensor in input_tensors
    ), "All input tensors have to be of the same shape"

It should be:

    assert len(input_tensors) > 0, "{} has to be a tuple/list of tensors".format(input_tensors)
    tensor_shape = input_tensors[0].shape[chunk_dim]
    assert all(
        input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
    ), "All input tensors have to be of the same shape in the chunk dimension"

For example, given two input tensors with shapes [512, 2, 768] and [512, 2, 300], the method currently throws an exception, even though it only needs the tensors to agree in the chunk dimension (here, the dimension of size 2).
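To make the difference between the two checks concrete, here is a minimal standalone sketch of the relaxed check proposed above. The function name check_chunkable is hypothetical (not part of transformers), and shape tuples stand in for tensor .shape attributes so the example runs without PyTorch:

    def check_chunkable(input_shapes, chunk_dim):
        """Relaxed shape check: tensors only need to agree in the
        chunk dimension, not in every dimension.

        input_shapes: list of shape tuples (stand-ins for tensor.shape)
        chunk_dim: the dimension along which chunking is applied
        """
        assert len(input_shapes) > 0, "input_shapes has to be a non-empty tuple/list"
        chunk_dim_size = input_shapes[0][chunk_dim]
        # Only the chunk dimension has to match across all inputs.
        return all(shape[chunk_dim] == chunk_dim_size for shape in input_shapes)

    # Shapes from the example above: equal in dimension 1 (size 2),
    # different in the last dimension -- the relaxed check passes.
    print(check_chunkable([(512, 2, 768), (512, 2, 300)], chunk_dim=1))  # True
    # A genuine mismatch in the chunk dimension still fails.
    print(check_chunkable([(512, 2, 768), (512, 3, 300)], chunk_dim=1))  # False

With the strict check quoted above, the first call would already raise, since (512, 2, 768) != (512, 2, 300) even though the chunk dimension matches.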

@patrickvonplaten
Contributor

Great catch @pedrocolon93 ! Do you feel like opening a PR to fix it? :-) Feel free to tag me and I'll help you!

@pedrocolon93
Contributor Author

Sounds good! I'll get it in and link it later today!
