RuntimeError: Error(s) in loading state_dict for BertModel #55

Open

SUFEHeisenberg opened this issue Sep 29, 2020 · 1 comment

@SUFEHeisenberg
Hello!
Recently I have loaded many pretrained models with your PyTorch framework and fine-tuned them on my own tasks, all successfully, but none of the ALBERT-family models work.
The error is as follows:

$ python run.py --model albert_base_bright
Loading data...
401it [00:04, 96.21it/s]
140it [00:01, 101.19it/s]
135it [00:01, 86.25it/s]
Time usage: 0:00:07
Traceback (most recent call last):
  File "run.py", line 39, in <module>
    model = x.Model(config).to(config.device)
  File "F:\PycharmProjects\Bert-Chinese-Text-Classification-Pytorch-master\models\albert_base_bright.py", line 40, in __init__
    self.bert = BertModel.from_pretrained(config.bert_path,config=model_config)
  File "D:\anaconda3\lib\site-packages\pytorch_transformers\modeling_utils.py", line 594, in from_pretrained
    model.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertModel:
        size mismatch for bert.embeddings.word_embeddings.weight: copying a param with shape torch.Size([21128, 128]) from checkpoint, the shape in current model is torch.Size([21128, 768]).

The config.json of albert_base_bright is as follows:

{
  "attention_probs_dropout_prob": 0.0,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 768,
  "embedding_size": 128,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128,
  "ln_type": "postln"
}

Among the ALBERT-family models, only albert_xxlarge_zh loads successfully; its config.json is as follows:

{
  "attention_probs_dropout_prob": 0,
  "hidden_act": "relu",
  "hidden_dropout_prob": 0,
  "embedding_size": 128,
  "hidden_size": 4096,
  "initializer_range": 0.01,
  "intermediate_size": 16384,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 12,
  "num_hidden_groups": 1,
  "net_structure_type": 0,
  "layers_to_keep": [],
  "gap_size": 0,
  "num_memory_blocks": 0,
  "inner_group_num": 1,
  "down_scale_factor": 1,
  "type_vocab_size": 2,
  "vocab_size": 21128
}

I found this issue on GitHub, so I used HuggingFace's pytorch_transformers to load the model:

import os
import torch.nn as nn
from pytorch_transformers import BertModel, BertConfig, BertTokenizer


class Model(nn.Module):

    def __init__(self, config):
        super(Model, self).__init__()
        model_config = BertConfig.from_json_file(os.path.join(config.bert_path, 'config.json'))
        self.bert = BertModel.from_pretrained(config.bert_path, config=model_config)
        for param in self.bert.parameters():
            param.requires_grad = True
        self.fc = nn.Linear(config.hidden_size, config.num_classes)

    def forward(self, x):
        context = x[0]  # input sentence token ids
        mask = x[2]  # mask over the padding, same size as the sentence, 0 at padding positions, e.g. [1, 1, 1, 1, 0, 0]
        _, pooled = self.bert(context, attention_mask=mask, output_all_encoded_layers=False)
        out = self.fc(pooled)
        return out

Have you ever run into this error? What causes it? Do I need to convert the weights first with the convert_to_pytorch scripts?
I hope you can find time to reply when you get a chance. Thank you!
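For reference, the size mismatch in the traceback matches ALBERT's factorized embeddings: the checkpoint stores bert.embeddings.word_embeddings.weight as [21128, 128] (vocab_size × embedding_size), while BertModel allocates [21128, 768] (vocab_size × hidden_size), so the weights cannot be copied into a plain BertModel. One possible route is to load the weights through the ALBERT classes of the newer transformers library, which model the 128→768 embedding projection explicitly. A minimal sketch, assuming a checkpoint already converted to the Hugging Face format (voidful/albert_chinese_base is one such conversion of brightmart's albert_zh; treat the model name as illustrative):

import torch
from transformers import AlbertModel, BertTokenizer

# albert_zh checkpoints use a BERT-style Chinese vocab, so BertTokenizer is used here
tokenizer = BertTokenizer.from_pretrained("voidful/albert_chinese_base")
model = AlbertModel.from_pretrained("voidful/albert_chinese_base")
model.eval()

inputs = tokenizer("这是一个测试句子", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

pooled = outputs[1]  # pooled [CLS] representation, shape [1, hidden_size] = [1, 768]
print(pooled.shape)

The pooled tensor can then feed the same nn.Linear(hidden_size, num_classes) head as in the code above.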

@zhoumo580691212

zhoumo580691212 commented Sep 29, 2022

Hi, I ran into the same situation today when trying to run with ALBERT. Changing self.hidden_size in models/albert.py to 312 makes it match albert-tiny, and then it runs through fine.
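For reference, a sketch of where that change would land, assuming models/albert.py mirrors the Config class of models/bert.py in this repo (attribute names and example values below are illustrative):

import torch


class Config(object):
    """Hypothetical excerpt of the albert Config, following the models/bert.py pattern."""

    def __init__(self, dataset):
        self.model_name = 'albert'
        self.bert_path = './albert_tiny'   # illustrative path to the albert-tiny checkpoint
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.num_classes = 10              # example value; depends on the dataset
        # albert-tiny produces 312-dimensional hidden states, so the classifier
        # head nn.Linear(hidden_size, num_classes) must use 312 instead of BERT-base's 768.
        self.hidden_size = 312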
