
model loading the checkpoint error #20

Closed · TIANRENK opened this issue Nov 14, 2018 · 18 comments
@TIANRENK

```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```

@TIANRENK (Author)

But when I print `model.embeddings.token_type_embeddings`, it is `Embedding(16, 768)`.

@thomwolf (Member)

Which model are you loading?

@TIANRENK (Author)

> Which model are you loading?

The pre-trained model chinese_L-12_H-768_A-12.

@TIANRENK (Author)

My code:

```python
import torch
from pytorch_pretrained_bert import BertConfig, BertModel  # imports assumed from the pytorch-pretrained-BERT package of the time

bert_config = BertConfig.from_json_file('bert_config.json')
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```

The error:

```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```

@thomwolf (Member)

I'm testing the Chinese model.
Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

@TIANRENK (Author)

> I'm testing the Chinese model.
> Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

In the config.json of chinese_L-12_H-768_A-12, type_vocab_size is 2. But even when I change config.type_vocab_size to 16, it still errors.

@TIANRENK (Author)

> I'm testing the Chinese model.
> Do you use the config.json of chinese_L-12_H-768_A-12? Can you send the content of your config.json?

```json
{
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128
}
```

I changed my code:

```python
import torch
from pytorch_pretrained_bert import BertConfig, BertModel  # imports assumed from the pytorch-pretrained-BERT package of the time

bert_config = BertConfig.from_json_file('bert_config.json')
bert_config.type_vocab_size = 16
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```

It still errors.

@TIANRENK (Author)

> I see you have "type_vocab_size": 2 in your config file, how is that?

Yes, but I changed it in my code.

@TIANRENK (Author)

> Is your pytorch_model.bin the correctly converted model of the Chinese one (and not of an English one)?

I think it's good.

@thomwolf (Member)

OK, I have the models. I think `type_vocab_size` should be 2 for Chinese as well. I am wondering why it is 16 in your pytorch_model.bin.
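
For readers hitting the same mismatch, a quick diagnostic sketch (the key name is taken from the error message above) to check what the converted file actually contains, without building a model:

```python
import torch

# Load only the saved tensors, not a model.
state_dict = torch.load('pytorch_model.bin', map_location='cpu')

# A correctly converted chinese_L-12_H-768_A-12 should report torch.Size([2, 768]).
print(state_dict['embeddings.token_type_embeddings.weight'].shape)
```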

@TIANRENK (Author)

I have no idea. Did my conversion of the model go wrong?

@thomwolf (Member)

I am testing that right now. I haven't played with the multi-lingual models yet.

@TIANRENK (Author)

> I am testing that right now. I haven't played with the multi-lingual models yet.

I am also using it for the first time. I am looking forward to your test results.

@TIANRENK (Author)

> I am testing that right now. I haven't played with the multi-lingual models yet.

When I was converting the model, I got this error:

```
Traceback (most recent call last):
  File "convert_tf_checkpoint_to_pytorch.py", line 95, in <module>
    convert()
  File "convert_tf_checkpoint_to_pytorch.py", line 85, in convert
    assert pointer.shape == array.shape
AssertionError: (torch.Size([16, 768]), (2, 768))
```

@thomwolf (Member)

Are you supplying a config file with `"type_vocab_size": 2` to the conversion script?

@TIANRENK (Author)

> Are you supplying a config file with `"type_vocab_size": 2` to the conversion script?

I used the bert_config.json of chinese_L-12_H-768_A-12 when I was converting.
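
For reference, a sketch of a typical conversion invocation; the flag names follow the `convert_tf_checkpoint_to_pytorch.py` script in the pytorch-pretrained-BERT repository of the time (check your copy), and the paths are placeholders:

```bash
export BERT_BASE_DIR=/path/to/chinese_L-12_H-768_A-12

python convert_tf_checkpoint_to_pytorch.py \
  --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \
  --bert_config_file $BERT_BASE_DIR/bert_config.json \
  --pytorch_dump_path $BERT_BASE_DIR/pytorch_model.bin
```

`--bert_config_file` must point at the config that shipped with the TF checkpoint; if that config is not actually parsed (see the next comment), the model is built with the default `type_vocab_size` of 16 and the conversion assert fires.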

@thomwolf (Member)

OK, I think I found the issue: your BertConfig is not built from the configuration file for some reason and thus uses the default value of `type_vocab_size` in BertConfig, which is 16.

This error happens on my system when I use `config = BertConfig('bert_config.json')` instead of `config = BertConfig.from_json_file('bert_config.json')`.

I will make sure these two ways of initializing the configuration (from parameters or from a json file) cannot be mixed up.
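
A minimal sketch of the failure mode described here, assuming the BertConfig constructor of the time treated its first positional argument as the vocabulary size rather than a path:

```python
from pytorch_pretrained_bert import BertConfig

# Wrong: the path string is swallowed by the first positional parameter,
# so every other field keeps its default, including type_vocab_size = 16.
bad_config = BertConfig('bert_config.json')
print(bad_config.type_vocab_size)   # 16 -> shape mismatch against the checkpoint

# Right: parse the JSON file explicitly.
good_config = BertConfig.from_json_file('bert_config.json')
print(good_config.type_vocab_size)  # 2 for chinese_L-12_H-768_A-12
```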

@imxiaomin

> RuntimeError: Error(s) in loading state_dict for BertModel:
> size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.

I have the same problem as you. Did you solve it?
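
For anyone landing here later: the fix discussed above is to build the config with `BertConfig.from_json_file` rather than `BertConfig(path)`, or to let the library fetch a matching checkpoint and config together. A sketch, assuming the `bert-base-chinese` shortcut shipped with the library:

```python
from pytorch_pretrained_bert import BertModel

# Downloads a correctly converted checkpoint together with its matching
# config, so type_vocab_size cannot get out of sync.
model = BertModel.from_pretrained('bert-base-chinese')
```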
