🐛 Bug
Model I am using: BertForMultipleChoice

Language I am using the model on: English
The problem arises when using:

- the official example scripts: the last code block of the Quick Tour, https://github.com/huggingface/transformers#quick-tour
- my own modified scripts: only trivially modified to supply save directories (see my code below)

The task I am working on is:

- neither an official GLUE/SQuAD task nor my own dataset; just the Quick Tour example
To Reproduce
Steps to reproduce the behavior:
- Run the sample code from the last loop of the Quick Tour.
- Supply directory names for saving/re-loading; see my code (below) for how I did that.
- When BertForMultipleChoice runs the line `reshaped_logits = logits.view(-1, num_choices)` in modeling_bert.py, we get a runtime error: `RuntimeError: shape '[-1, 16]' is invalid for input of size 1`
```python
import os
import torch
from transformers import (BertModel, BertForPreTraining, BertForMaskedLM,
                          BertForNextSentencePrediction, BertForSequenceClassification,
                          BertForMultipleChoice, BertForTokenClassification,
                          BertForQuestionAnswering, BertTokenizer)

# Each architecture is provided with several classes for fine-tuning on down-stream tasks, e.g.
BERT_MODEL_CLASSES = [BertModel, BertForPreTraining, BertForMaskedLM, BertForNextSentencePrediction,
                      BertForSequenceClassification, BertForMultipleChoice, BertForTokenClassification,
                      BertForQuestionAnswering]

# All the classes for an architecture can be initiated from pretrained weights for this architecture
# Note that additional weights added for fine-tuning are only initialized
# and need to be trained on the down-stream task
pretrained_weights = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(pretrained_weights)
for model_class in BERT_MODEL_CLASSES:
    print("Processing", model_class.__name__, "...")

    # Store class name as target directory
    model_dir_name = model_class.__name__ + "/"

    # Load pretrained model/tokenizer
    model = model_class.from_pretrained(pretrained_weights)

    # Models can return full list of hidden-states & attentions weights at each layer
    model = model_class.from_pretrained(pretrained_weights,
                                        output_hidden_states=True,
                                        output_attentions=True)
    input_ids = torch.tensor([tokenizer.encode("Let's see all hidden-states and attentions on this text")])
    all_hidden_states, all_attentions = model(input_ids)[-2:]

    # Models are compatible with Torchscript
    model = model_class.from_pretrained(pretrained_weights, torchscript=True)
    traced_model = torch.jit.trace(model, (input_ids,))

    save_directory = 'BERT_test/' + model_dir_name
    if not os.path.isdir(save_directory):
        os.mkdir(save_directory)

    # Simple serialization for models and tokenizers
    model.save_pretrained(save_directory)  # save
    model = model_class.from_pretrained(save_directory)  # re-load
    tokenizer.save_pretrained(save_directory)  # save
    tokenizer = BertTokenizer.from_pretrained(save_directory)  # re-load

# SOTA examples for GLUE, SQuAD, text generation...
```
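For what it's worth, here is my reading of where the shape goes wrong. Judging from the traceback, BertForMultipleChoice seems to derive num_choices from the second dimension of input_ids (I'm assuming `num_choices = input_ids.shape[1]` from the surrounding code in modeling_bert.py), so a single (1, 16) sequence is treated as 16 choices while the classifier emits only one logit. A minimal sketch of that arithmetic, reproducing the exact error:

```python
import torch

# Minimal sketch of the failure, assuming BertForMultipleChoice takes
# num_choices = input_ids.shape[1] (my reading of modeling_bert.py):
input_ids = torch.tensor([[0] * 16])  # shape (1, 16): one sequence of 16 tokens
num_choices = input_ids.shape[1]      # 16 -- but only one "choice" was passed
logits = torch.zeros(1, 1)            # the classifier emits one logit per example
logits.view(-1, num_choices)          # RuntimeError: shape '[-1, 16]' is invalid for input of size 1
```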
The error:
```
I1111 20:47:30.383128 21676 modeling_utils.py:383] loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at C:\Users\User\.cache\torch\transformers\aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
I1111 20:47:33.363116 21676 modeling_utils.py:453] Weights of BertForMultipleChoice not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
I1111 20:47:33.365122 21676 modeling_utils.py:456] Weights from pretrained model not used in BertForMultipleChoice: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-da6597bd484d> in <module>()
     19                                         output_attentions=True)
     20 input_ids = torch.tensor([tokenizer.encode("Let's see all hidden-states and attentions on this text")])
---> 21 all_hidden_states, all_attentions = model(input_ids)[-2:]
     22
     23 # Models are compatible with Torchscript

G:\Anaconda3\envs\pytorch1\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

g:\deeplearning\huggingface\transformers\transformers\modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels)
   1096         pooled_output = self.dropout(pooled_output)
   1097         logits = self.classifier(pooled_output)
-> 1098         reshaped_logits = logits.view(-1, num_choices)
   1099
   1100         outputs = (reshaped_logits,) + outputs[2:]  # add hidden states and attention if they are here

RuntimeError: shape '[-1, 16]' is invalid for input of size 1
```
Expected behavior
The Quick Tour loop should run without error for every model class, including BertForMultipleChoice.
Environment
- OS: Windows 10
- Python version: 3.6.6
- PyTorch version: 1.3.0
- PyTorch Transformers version (or branch): 2.1.1
- Using GPU? Yes
- Distributed or parallel setup? No
- Any other relevant information: none
Additional context
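If my reading above is right, the Quick Tour loop only feeds a 2D (batch_size, sequence_length) tensor, whereas BertForMultipleChoice seems to expect a 3D (batch_size, num_choices, sequence_length) tensor. A hedged workaround sketch (the prompt and choices here are made up for illustration; no attention_mask is passed, for brevity):

```python
import torch
from transformers import BertTokenizer, BertForMultipleChoice

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMultipleChoice.from_pretrained('bert-base-uncased')

# One sequence per choice, padded to a common length
prompt = "The sky is"                  # made-up example
choices = ["blue", "a vegetable"]
encoded = [tokenizer.encode(prompt, choice) for choice in choices]
max_len = max(len(ids) for ids in encoded)
encoded = [ids + [0] * (max_len - len(ids)) for ids in encoded]  # 0 is BERT's pad id

input_ids = torch.tensor(encoded).unsqueeze(0)  # shape (1, 2, seq_len)
logits = model(input_ids)[0]                    # shape (1, 2): one score per choice
```

With this shape, num_choices would come out as 2 and the reshape should go through, so the Quick Tour loop may need a special case (or a differently shaped input) for BertForMultipleChoice.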