Add warning message if model uses input_ids that include padding tokens, but no attention_mask is provided. #16136
Comments
Models usually don't know the right pad token ID, as pointed out in the issue (I'm also not sure that community-contributed models, or models not as heavily used as BERT, have the right pad token ID in their configs), so I'm not in favor of this. Plus, checking the inputs at each forward pass would slow down performance. I agree that it's a common error, and it would make a very nice addition to the troubleshooting guide IMO, but I'm not sure we can add anything in the library to properly warn users without hurting performance or raising a lot of false alarms.
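(For context, the user-side fix that such a troubleshooting entry would recommend is simply to forward the `attention_mask` the tokenizer already returns; a minimal sketch using standard transformers APIs:)

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Padding a batch produces both input_ids and an attention_mask; forwarding
# the mask along avoids the silent error discussed in this thread.
batch = tokenizer(["hello this is a long string", "hello"], padding=True, return_tensors="pt")
outputs = model(**batch)  # same as model(input_ids=..., attention_mask=...)
```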
Hmm, I think we can be pretty confident that
Yay, more content for the troubleshooting guide! I'll work on a PR for this 👍
Hey @patrickvonplaten, can I work on this issue?
Sure, that'd be great. Just to make sure we don't do duplicate work here: @ydshieh, you haven't started on this one yet, no?
Hi @Pawank06 @patrickvonplaten. Not really. In Sep 2022, I rebased the branch @patrickvonplaten created, add_important_warning_padding_attention_mask, but then turned my focus to other tasks. @Pawank06, maybe you can pull that branch, rebase it on the latest main, and continue what @patrickvonplaten has done? Don't hesitate to ask if you need any help ❤️
@ydshieh @patrickvonplaten OK, can you assign me this issue, and could you please share the file path?
@anruijian Let's wait a bit for @Pawank06's response :-) Thank you for expressing interest 💯
I see. @anruijian, you can take a look at this comment, and let me know if you have any questions before working on it. Thank you!
@ydshieh I have checked add_important_warning_padding_attention_mask and would like to confirm my understanding of the current status and next steps before proceeding with my work. As of now, the task has been completed for the Torch version. The next steps involve adding an equivalent warning function to the TensorFlow and Flax versions, more specifically in FlaxPreTrainedModel / modeling_flax_bert.py and TFPreTrainedModel / modeling_tf_bert.py. Thank you!
Hi @anruijian. No, the torch part is not finished yet. @patrickvonplaten added a method; the goal is to have the same change made in the code paths that contain:

```python
elif input_ids is not None:
    input_shape = input_ids.size()
```
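For the TF (and analogously Flax) side discussed above, a hedged sketch of what an equivalent check might look like; the function name, placement, and warning text are assumptions, not code from the branch:

```python
import tensorflow as tf

# Hypothetical TF analogue of the torch-side check; a Flax version would do
# the same with jax.numpy, e.g. jnp.any(input_ids == pad_token_id).
def tf_warn_if_padding_and_no_attention_mask(input_ids, attention_mask, pad_token_id):
    if attention_mask is None and pad_token_id is not None:
        # reduce_any returns a scalar bool tensor; bool() evaluates it eagerly
        if bool(tf.reduce_any(input_ids == pad_token_id)):
            tf.get_logger().warning(
                "input_ids contain padding tokens but no attention_mask was provided."
            )

# Example: the pad token 0 is present and no mask is given, so the warning fires.
tf_warn_if_padding_and_no_attention_mask(
    tf.constant([[101, 102, 0]]), attention_mask=None, pad_token_id=0
)
```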
@patrickvonplaten @ydshieh It looks like none of the pull requests has been merged yet; I'd like to take a stab at this issue if that's OK. Thanks.
Good first issue
A common error is that a user forwards a batched tensor of `input_ids` that includes padding tokens, e.g.

```python
input_ids = torch.tensor([["hello", "this", "is", "a", "long", "string"], ["hello", "<pad>", "<pad>", "<pad>", "<pad>", "<pad>"]])
```

In this case, the `attention_mask` should be provided as well; otherwise the output hidden_states will be computed incorrectly. This is quite a common silent error IMO.

With @LysandreJik @sgugger, we have decided not to automatically create the `attention_mask` that masks out the padding tokens in this case, for the reasons explained here: #15479 (comment). However, as pointed out in #15479, we should IMO at least display a warning, since this error happens a lot.

As a first good issue, one could add such a warning to BertModel in a first case, which would go something like the sketch below.
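A minimal sketch of the kind of warning being proposed, written as a standalone function for illustration (the helper name, placement, and message wording are assumptions, not the actual snippet from the issue):

```python
import logging

import torch

logger = logging.getLogger(__name__)

# Hypothetical helper illustrating the proposed warning; in the model it would
# run near the top of forward(), before the embeddings are computed.
def warn_if_padding_and_no_attention_mask(input_ids, attention_mask, pad_token_id):
    if attention_mask is not None or pad_token_id is None:
        return  # a mask was given, or there is no reliable pad token to check against
    if (input_ids == pad_token_id).any():
        logger.warning(
            "input_ids contain padding tokens but no attention_mask was passed. "
            "The padding tokens will be attended to, which usually gives wrong "
            "results; please pass the attention_mask returned by the tokenizer."
        )

# Example: pad token 0 appears in the second row and no mask is given, so the warning fires.
warn_if_padding_and_no_attention_mask(
    torch.tensor([[101, 7592, 102], [101, 102, 0]]), attention_mask=None, pad_token_id=0
)
```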
What do you think @sgugger @LysandreJik ?