
[ALBERT] : ValueError: Layer #1 (named "predictions") expects 11 weight(s), but the saved weights have 10 element(s). #2024

Closed
gradient-school opened this issue Dec 2, 2019 · 9 comments

@gradient-school

🐛 Bug

Model I am using (Bert, XLNet....): ALBERT

Language I am using the model on (English, Chinese....): English

The problem arises when using:

  • the official example scripts: (give details)
  • my own modified scripts: (give details)

import tensorflow as tf
from transformers import *
# Download the ALBERT masked-LM model
model = TFAlbertForMaskedLM.from_pretrained('albert-large-v2')

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details) Initial validation

To Reproduce

Steps to reproduce the behavior:
import tensorflow as tf
from transformers import *
# Download the ALBERT masked-LM model
model = TFAlbertForMaskedLM.from_pretrained('albert-large-v2')

The code throws the following error:

100%|██████████| 484/484 [00:00<00:00, 271069.99B/s]
100%|██████████| 87059544/87059544 [00:03<00:00, 28448930.07B/s]

ValueError Traceback (most recent call last)
in ()
----> 1 model = TFAlbertForMaskedLM.from_pretrained('albert-large-v2')

3 frames
/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
287 # 'by_name' allow us to do transfer learning by skipping/adding layers
288 # see https://github.com/tensorflow/tensorflow/blob/00fad90125b18b80fe054de1055770cfb8fe4ba3/tensorflow/python/keras/engine/network.py#L1339-L1357
--> 289 model.load_weights(resolved_archive_file, by_name=True)
290
291 ret = model(model.dummy_inputs, training=False) # Make sure restore ops are run

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py in load_weights(self, filepath, by_name)
179 raise ValueError('Load weights is not yet supported with TPUStrategy '
180 'with steps_per_run greater than 1.')
--> 181 return super(Model, self).load_weights(filepath, by_name)
182
183 @trackable.no_automatic_dependency_tracking

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/network.py in load_weights(self, filepath, by_name)
1173 f = f['model_weights']
1174 if by_name:
-> 1175 saving.load_weights_from_hdf5_group_by_name(f, self.layers)
1176 else:
1177 saving.load_weights_from_hdf5_group(f, self.layers)

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py in load_weights_from_hdf5_group_by_name(f, layers)
749 '") expects ' + str(len(symbolic_weights)) +
750 ' weight(s), but the saved weights' + ' have ' +
--> 751 str(len(weight_values)) + ' element(s).')
752 # Set values.
753 for i in range(len(weight_values)):

ValueError: Layer #1 (named "predictions") expects 11 weight(s), but the saved weights have 10 element(s).

Expected behavior

TFAlbertForMaskedLM should load from the pre-trained 'albert-large-v2' weights without error.

Environment

  • OS: Linux (Colab)
  • Python version: 3.6
  • TensorFlow version: 2.0
  • Transformers version (or branch):
  • Using GPU? Yes
  • Distributed or parallel setup?
  • Any other relevant information:

Additional context

@thomwolf
Member

thomwolf commented Dec 5, 2019

cc @LysandreJik

@LysandreJik
Member

It should be fixed now, thanks for raising an issue.

@gradient-school
Author

Thanks @LysandreJik for your prompt response. The issue mentioned above is resolved, but I am now getting an error when converting predicted IDs back to tokens using AlbertTokenizer. Here is the error I am seeing (the pred_index value below is 29324). Please advise, or let me know if I should open another issue since the original one has been resolved.

TypeError Traceback (most recent call last)
in ()
----> 1 pred_token = tokenizer.convert_ids_to_tokens([pred_index])[0]
2 print('Predicted token:', pred_token)

2 frames
/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py in convert_ids_to_tokens(self, ids, skip_special_tokens)
1034 tokens.append(self.added_tokens_decoder[index])
1035 else:
-> 1036 tokens.append(self._convert_id_to_token(index))
1037 return tokens
1038

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_albert.py in _convert_id_to_token(self, index, return_unicode)
172 def _convert_id_to_token(self, index, return_unicode=True):
173 """Converts an index (integer) in a token (string/unicode) using the vocab."""
--> 174 token = self.sp_model.IdToPiece(index)
175 if six.PY2 and return_unicode and isinstance(token, str):
176 token = token.decode('utf-8')

/usr/local/lib/python3.6/dist-packages/sentencepiece.py in IdToPiece(self, id)
185
186 def IdToPiece(self, id):
--> 187 return _sentencepiece.SentencePieceProcessor_IdToPiece(self, id)
188
189 def GetScore(self, id):

TypeError: in method 'SentencePieceProcessor_IdToPiece', argument 2 of type 'int'

@LysandreJik
Member

Hmm, I have no issues running this code snippet:

from transformers import AlbertTokenizer

tokenizer = AlbertTokenizer.from_pretrained("albert-large-v2")

print(tokenizer.convert_ids_to_tokens(29324))
# or
print(tokenizer.convert_ids_to_tokens([29324]))

Is there a way you could give us a short code sample that reproduces the problem, so that we may debug what's happening? Thank you.

@gradient-school
Author

@LysandreJik thanks for your response. I figured out the issue. Below is code that reproduces it. In it, 'pred_index' comes out as numpy.int64, and when passed to the 'convert_ids_to_tokens' method it throws the error mentioned above. If I first convert it to an int, it works fine.

Here is the example code to reproduce the issue:

import tensorflow as tf
from transformers import AlbertTokenizer, TFAlbertForMaskedLM

# Get tokenizer
tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')

# Encode a text input
text = "What is the fastest car in the world."
tokenized_text = tokenizer.tokenize(text)

# Let's mask 'world' and check if the model can predict it
tokenized_text[7] = '[MASK]'

# Convert tokenized text to indexes
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Download the ALBERT masked-LM model
model = TFAlbertForMaskedLM.from_pretrained('albert-base-v2')

# Prediction
inputs = tf.constant(indexed_tokens)[None, :]
outputs = model(inputs)

# Check the prediction at index 7 (in place of [MASK])
pred_index = tf.argmax(outputs[0][0, 7]).numpy()  # numpy.int64 — triggers the TypeError
pred_token = tokenizer.convert_ids_to_tokens([pred_index])[0]
print('Predicted token:', pred_token)
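
For reference, the workaround described above is a one-line change: cast the numpy scalar to a built-in int before handing it to the tokenizer.

# Workaround: SentencePiece's IdToPiece rejects numpy integer types,
# so cast the argmax result to a plain Python int first.
pred_index = int(tf.argmax(outputs[0][0, 7]).numpy())
pred_token = tokenizer.convert_ids_to_tokens([pred_index])[0]
print('Predicted token:', pred_token)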

@gradient-school
Author

Please note that the above code works as-is for BERT (but throws an error for ALBERT).

@LysandreJik
Member

This is probably the exact same problem as #945.

If I understand correctly, SentencePiece doesn't like numpy integers and crashes. Should we cast it to an int, @thomwolf?

@thomwolf
Member

Yes, I think so. We can probably just add an int(idx) in the base tokenizer class PreTrainedTokenizer before the call to _convert_id_to_token, so we could even accept tensors in addition to numpy arrays.
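
A simplified sketch of where that cast could go, based on the convert_ids_to_tokens logic visible in the traceback above (skip_special_tokens handling omitted); this is an illustration of the proposal, not the actual patch:

# Sketch (not the actual patch): coerce each index to a built-in int
# inside PreTrainedTokenizer.convert_ids_to_tokens before dispatching
# to _convert_id_to_token, so numpy scalars and 0-d tensors that
# support __int__ are accepted.
def convert_ids_to_tokens(self, ids, skip_special_tokens=False):
    if isinstance(ids, int):
        return self._convert_id_to_token(ids)
    tokens = []
    for index in ids:
        index = int(index)  # accept numpy.int64 and friends
        if index in self.added_tokens_decoder:
            tokens.append(self.added_tokens_decoder[index])
        else:
            tokens.append(self._convert_id_to_token(index))
    return tokens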

@stale

stale bot commented Feb 12, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 12, 2020
@stale stale bot closed this as completed Feb 19, 2020