
BertForMultipleChoice QuickTour issue with weights? #1789

@ChrisPalmerNZ

Description


🐛 Bug

Model I am using: BertForMultipleChoice

Language I am using the model on: English

The problem arises when using: the Quick Tour sample code (slightly modified to save each model to its own directory, see below)

The task I am working on is: none in particular, just running the Quick Tour sample code (not an official GLUE/SQuAD task or my own dataset)

To Reproduce

Steps to reproduce the behavior:

  1. Run the sample code from the last loop of the Quick Tour
  2. You need to supply directory names; see my code below for how I did that
  3. When BertForMultipleChoice reaches the line reshaped_logits = logits.view(-1, num_choices) in modeling_bert.py, it fails with RuntimeError: shape '[-1, 16]' is invalid for input of size 1
import os
import torch
from transformers import (BertModel, BertForPreTraining, BertForMaskedLM,
                          BertForNextSentencePrediction, BertForSequenceClassification,
                          BertForMultipleChoice, BertForTokenClassification,
                          BertForQuestionAnswering, BertTokenizer)

# Each architecture is provided with several classes for fine-tuning on down-stream tasks, e.g.
BERT_MODEL_CLASSES = [BertModel, BertForPreTraining, BertForMaskedLM, BertForNextSentencePrediction,
                      BertForSequenceClassification, BertForMultipleChoice, BertForTokenClassification,
                      BertForQuestionAnswering]
# All the classes for an architecture can be initiated from pretrained weights for this architecture
# Note that additional weights added for fine-tuning are only initialized
# and need to be trained on the down-stream task
pretrained_weights = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(pretrained_weights)
for model_class in BERT_MODEL_CLASSES:
    
    print("Processing", model_class.__name__, "...")
    
    # Store class name as target directory
    model_dir_name = model_class.__name__+"/"
    
    # Load pretrained model/tokenizer
    model = model_class.from_pretrained(pretrained_weights)

    # Models can return full list of hidden-states & attentions weights at each layer
    model = model_class.from_pretrained(pretrained_weights,
                                        output_hidden_states=True,
                                        output_attentions=True)
    input_ids = torch.tensor([tokenizer.encode("Let's see all hidden-states and attentions on this text")])
    all_hidden_states, all_attentions = model(input_ids)[-2:]

    # Models are compatible with Torchscript
    model = model_class.from_pretrained(pretrained_weights, torchscript=True)
    traced_model = torch.jit.trace(model, (input_ids,))
    
    save_directory = 'BERT_test/' + model_dir_name
    if not os.path.isdir(save_directory):
        os.makedirs(save_directory)  # create BERT_test/ and the per-class subdirectory
        
    # Simple serialization for models and tokenizers
    model.save_pretrained(save_directory)  # save
    model = model_class.from_pretrained(save_directory)  # re-load
    tokenizer.save_pretrained(save_directory)  # save
    tokenizer = BertTokenizer.from_pretrained(save_directory)  # re-load

    # SOTA examples for GLUE, SQUAD, text generation...

The error:

I1111 20:47:30.383128 21676 modeling_utils.py:383] loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at C:\Users\User\.cache\torch\transformers\aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
I1111 20:47:33.363116 21676 modeling_utils.py:453] Weights of BertForMultipleChoice not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
I1111 20:47:33.365122 21676 modeling_utils.py:456] Weights from pretrained model not used in BertForMultipleChoice: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-da6597bd484d> in <module>()
     19                                         output_attentions=True)
     20     input_ids = torch.tensor([tokenizer.encode("Let's see all hidden-states and attentions on this text")])
---> 21     all_hidden_states, all_attentions = model(input_ids)[-2:]
     22 
     23     # Models are compatible with Torchscript

G:\Anaconda3\envs\pytorch1\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

g:\deeplearning\huggingface\transformers\transformers\modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels)
   1096         pooled_output = self.dropout(pooled_output)
   1097         logits = self.classifier(pooled_output)
-> 1098         reshaped_logits = logits.view(-1, num_choices)
   1099 
   1100         outputs = (reshaped_logits,) + outputs[2:]  # add hidden states and attention if they are here

RuntimeError: shape '[-1, 16]' is invalid for input of size 1
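
If I read modeling_bert.py correctly, the failure comes from the shape of the input rather than from the uninitialized classifier weights: the Quick Tour feeds every class the same 2D input_ids of shape (batch_size, sequence_length), but BertForMultipleChoice treats the second dimension as num_choices. A rough sketch of the shape arithmetic as I understand it (hypothetical tensors, only to illustrate why the view fails):

import torch

input_ids = torch.zeros(1, 16, dtype=torch.long)  # (batch_size=1, seq_len=16), like the encoded Quick Tour sentence
num_choices = input_ids.shape[1]                  # 16, but this is really the sequence length
logits = torch.zeros(1, 1)                        # the multiple-choice classifier emits one logit per sequence
reshaped_logits = logits.view(-1, num_choices)    # RuntimeError: shape '[-1, 16]' is invalid for input of size 1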

Expected behavior

The loop should complete without error for every model class in BERT_MODEL_CLASSES, including BertForMultipleChoice.

Environment

  • OS: Windows 10
  • Python version: 3.6.6
  • PyTorch version: 1.3.0
  • PyTorch Transformers version (or branch): 2.1.1
  • Using GPU? Yes
  • Distributed or parallel setup? No
  • Any other relevant information:

Additional context
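
My understanding from the model docstrings is that BertForMultipleChoice expects input_ids of shape (batch_size, num_choices, sequence_length), one encoded sequence per choice, rather than the 2D tensor the Quick Tour loop passes. A minimal sketch of a call I would expect to work; the prompt and choice strings here are made up for illustration:

import torch
from transformers import BertTokenizer, BertForMultipleChoice

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMultipleChoice.from_pretrained('bert-base-uncased')

prompt = "The cat sat on the"
choices = ["mat.", "sky."]

# Encode each (prompt, choice) pair and pad to a common length (0 is [PAD] for bert-base-uncased)
encoded = [tokenizer.encode(prompt, choice) for choice in choices]
max_len = max(len(ids) for ids in encoded)
encoded = [ids + [0] * (max_len - len(ids)) for ids in encoded]

input_ids = torch.tensor(encoded).unsqueeze(0)  # (batch_size=1, num_choices=2, seq_len)
reshaped_logits = model(input_ids)[0]           # (1, num_choices)

If that is the intended input format, the Quick Tour loop would need to special-case BertForMultipleChoice (for example by adding a num_choices dimension to input_ids) instead of feeding it the same 2D tensor as the other classes.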
