
model loading the checkpoint error #20

Closed · TIANRENK opened this issue Nov 14, 2018 · 18 comments
@TIANRENK

```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```

@TIANRENK (Author)

But when I print `model.embeddings.token_type_embeddings`, it is `Embedding(16, 768)`.

@thomwolf (Member)

Which model are you loading?

@TIANRENK (Author)

> Which model are you loading?

The pre-trained model chinese_L-12_H-768_A-12.

@TIANRENK (Author)

My code:

```python
import torch
from pytorch_pretrained_bert import BertConfig, BertModel  # imports assumed from the pytorch-pretrained-BERT package of the time

bert_config = BertConfig.from_json_file('bert_config.json')
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```

The error:

```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```

@thomwolf (Member)

I'm testing the Chinese model.
Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

@TIANRENK (Author)

> I'm testing the Chinese model.
> Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

In the config.json of chinese_L-12_H-768_A-12, type_vocab_size is 2. But even when I change config.type_vocab_size to 16, it still errors.

@TIANRENK (Author)

> I'm testing the Chinese model.
> Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

```json
{
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128
}
```

I changed my code:

```python
import torch
from pytorch_pretrained_bert import BertConfig, BertModel  # imports assumed from the pytorch-pretrained-BERT package of the time

bert_config = BertConfig.from_json_file('bert_config.json')
bert_config.type_vocab_size = 16
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```

It still errors.

@TIANRENK (Author)

> I see you have "type_vocab_size": 2 in your config file, how is that?

Yes, but I changed it in my code.

@TIANRENK (Author)

> Is your pytorch_model.bin the correctly converted model of the Chinese one (and not of an English one)?

I think it's good.

@thomwolf (Member)

OK, I have the models. I think `type_vocab_size` should be 2 for Chinese as well. I am wondering why it is 16 in your pytorch_model.bin.
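
For readers hitting the same mismatch, a quick diagnostic sketch (the key name is taken from the error message above) to check what the converted file actually contains, without building a model:

```python
import torch

# Load only the saved tensors, not a model.
state_dict = torch.load('pytorch_model.bin', map_location='cpu')

# A correctly converted chinese_L-12_H-768_A-12 should report torch.Size([2, 768]).
print(state_dict['embeddings.token_type_embeddings.weight'].shape)
```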

@TIANRENK (Author)

I have no idea. Did my conversion of the model go wrong?

@thomwolf (Member)

I am testing that right now. I haven't played with the multi-lingual models yet.

@TIANRENK (Author)

> I am testing that right now. I haven't played with the multi-lingual models yet.

I am also using it for the first time. I am looking forward to your test results.

@TIANRENK (Author)

> I am testing that right now. I haven't played with the multi-lingual models yet.

When I was converting the model, I got this error:

```
Traceback (most recent call last):
  File "convert_tf_checkpoint_to_pytorch.py", line 95, in <module>
    convert()
  File "convert_tf_checkpoint_to_pytorch.py", line 85, in convert
    assert pointer.shape == array.shape
AssertionError: (torch.Size([16, 768]), (2, 768))
```

@thomwolf (Member)

Are you supplying a config file with `"type_vocab_size": 2` to the conversion script?

@TIANRENK (Author)

> Are you supplying a config file with `"type_vocab_size": 2` to the conversion script?

I used the bert_config.json of chinese_L-12_H-768_A-12 when I was converting.
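
For reference, a sketch of a typical conversion invocation; the flag names follow the `convert_tf_checkpoint_to_pytorch.py` script in the pytorch-pretrained-BERT repository of the time (check your copy), and the paths are placeholders:

```bash
export BERT_BASE_DIR=/path/to/chinese_L-12_H-768_A-12

python convert_tf_checkpoint_to_pytorch.py \
  --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \
  --bert_config_file $BERT_BASE_DIR/bert_config.json \
  --pytorch_dump_path $BERT_BASE_DIR/pytorch_model.bin
```

`--bert_config_file` must point at the config that shipped with the TF checkpoint; if that config is not actually parsed (see the next comment), the model is built with the default `type_vocab_size` of 16 and the conversion assert fires.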

@thomwolf (Member)

OK, I think I found the issue: your BertConfig is not built from the configuration file for some reason and thus uses the default value of `type_vocab_size` in BertConfig, which is 16.

This error happens on my system when I use `config = BertConfig('bert_config.json')` instead of `config = BertConfig.from_json_file('bert_config.json')`.

I will make sure these two ways of initializing the configuration (from parameters or from a json file) cannot be mixed up.
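
A minimal sketch of the failure mode described here, assuming the BertConfig constructor of the time treated its first positional argument as the vocabulary size rather than a path:

```python
from pytorch_pretrained_bert import BertConfig

# Wrong: the path string is swallowed by the first positional parameter,
# so every other field keeps its default, including type_vocab_size = 16.
bad_config = BertConfig('bert_config.json')
print(bad_config.type_vocab_size)   # 16 -> shape mismatch against the checkpoint

# Right: parse the JSON file explicitly.
good_config = BertConfig.from_json_file('bert_config.json')
print(good_config.type_vocab_size)  # 2 for chinese_L-12_H-768_A-12
```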

@imxiaomin

> RuntimeError: Error(s) in loading state_dict for BertModel:
> size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.

I have the same problem as you. Did you solve it?
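
For anyone landing here later: the fix discussed above is to build the config with `BertConfig.from_json_file` rather than `BertConfig(path)`, or to let the library fetch a matching checkpoint and config together. A sketch, assuming the `bert-base-chinese` shortcut shipped with the library:

```python
from pytorch_pretrained_bert import BertModel

# Downloads a correctly converted checkpoint together with its matching
# config, so type_vocab_size cannot get out of sync.
model = BertModel.from_pretrained('bert-base-chinese')
```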
