TFEncoderDecoder not handling labels correctly #14357
Comments
Hi, I think all the inputs should be unpacked as keyword arguments before being passed in. Is there any reason that you want to pass a dict directly?
The inputs are not unpacked in the model's train_step(), which is what runs when you train the model using fit(). See TFPreTrainedModel.train_step (line 802):
Could you maybe find some time to look into this? :-)
@Rocketknight1, @NielsRogge, @ydshieh - I think we can solve this issue with the new design now, no?
I didn't follow this issue until now. I can try to look at this if @Rocketknight1 is OK.
@ydshieh Sure, yes! I'm sorry I've been slow with it.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
activate :-)
Environment info
Google Colab
transformers version: master branch. The problem cannot be reproduced with the latest release (4.12.3), because that release fails earlier on a different issue that has already been fixed in master (support for cross-attention in TF GPT2).
Who can help
Tagging @patrickvonplaten, as he has done the latest merges on TFEncoderDecoder.
Information
In TFEncoderDecoder, when the input is passed as a dict, the encoder's input_processing function unpacks it, including the labels if they are present. The labels then end up being passed to the encoder call, which shouldn't happen: the labels are only needed by the decoder, and passing them causes the encoder call to fail.
The consequence is that trying to fit a TFEncoderDecoder using .fit() with a tf.data.Dataset results in this error.
To reproduce
Expected behavior
This should handle labels correctly, as they are needed in order to fit the model.
A workaround is to add this bit to the call:
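For readers who want the general idea in isolation, here is a minimal pure-Python sketch. It is not the code from this issue and not the transformers implementation: `encoder_call` and `split_encoder_kwargs` are hypothetical stand-ins, and the `TypeError` is only an analogy for the real failure. It assumes the fix amounts to filtering decoder-only keys such as `labels` out of the dict before it is unpacked into the encoder.

```python
# Hypothetical stand-ins for illustration; none of these names come from transformers.

def encoder_call(input_ids, attention_mask=None):
    """Stand-in for an encoder whose call signature has no `labels` argument."""
    return {"last_hidden_state": [len(input_ids)]}


def split_encoder_kwargs(inputs, decoder_only_keys=("labels",)):
    """Split one input dict into (encoder kwargs, decoder-only kwargs)."""
    encoder_kwargs = {k: v for k, v in inputs.items() if k not in decoder_only_keys}
    decoder_only = {k: v for k, v in inputs.items() if k in decoder_only_keys}
    return encoder_kwargs, decoder_only


# A batch shaped like what a dataset might yield when labels live in the same dict.
batch = {
    "input_ids": [101, 7592, 102],
    "attention_mask": [1, 1, 1],
    "labels": [101, 2088, 102],
}

# Unpacking the whole dict hands `labels` to the encoder, which rejects it.
try:
    encoder_call(**batch)
    failed = False
except TypeError:
    failed = True

# Filtering out decoder-only keys first lets the encoder call go through,
# while the labels stay available for the decoder and the loss.
encoder_kwargs, decoder_only = split_encoder_kwargs(batch)
encoder_outputs = encoder_call(**encoder_kwargs)
```

Keeping the filtered keys in a separate dict, rather than discarding them, matters here because the labels are still needed downstream by the decoder.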