Why are the weights not initialized? #339

lemo2012 · 2019-03-03T06:14:33Z

03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForMultiLabelSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForMultiLabelSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']

jplehmann · 2019-03-05T18:11:10Z

I was wondering this myself. It looks like there's some configuration mismatch -- some parameters are found but not used, and a few that are expected are not found.

I'm not sure if this is expected, since the top-level task-specific classifier is correctly NOT pre-trained... or if it's something more.

(question about lower performance moved into new issue)

jplehmann · 2019-03-05T18:16:20Z

I dug up some related issues which confirm my guess above -- this kind of message is expected since the models are not yet fine-tuned to the task.

#161
#180

thomwolf · 2019-03-06T09:46:23Z

Yes, this is the expected behavior.
I don't want the warning message to say this is "all good" because in some cases, depending on the model you are loading, this could be unwanted behavior (not all the weights being loaded).
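
To make the two lists in the log concrete, here is a minimal sketch of how they can arise from a plain comparison of parameter names. It is an illustration only, not the library's actual loading code: `compare_keys` is a hypothetical helper, the `cls.*` and `classifier.*` names are copied from the log above, and `bert.embeddings.word_embeddings.weight` is assumed here as a stand-in for the shared encoder weights.

```python
# Illustrative only: a simplified model of how the two log lists arise.
# compare_keys is a hypothetical helper, not part of pytorch_pretrained_bert.

def compare_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) parameter names, preserving input order."""
    checkpoint = set(checkpoint_keys)
    model = set(model_keys)
    # Expected by the model but absent from the checkpoint:
    # these stay at their fresh (random) initialization.
    missing = [k for k in model_keys if k not in checkpoint]
    # Present in the checkpoint but with no counterpart in the model:
    # these are simply dropped.
    unexpected = [k for k in checkpoint_keys if k not in model]
    return missing, unexpected


# Subset of names from a pre-training checkpoint (encoder + LM heads).
pretrained_checkpoint = [
    "bert.embeddings.word_embeddings.weight",  # assumed stand-in for the encoder
    "cls.predictions.bias",
    "cls.predictions.decoder.weight",
    "cls.seq_relationship.weight",
    "cls.seq_relationship.bias",
]

# Names expected by a classification model (encoder + new classifier head).
classification_model = [
    "bert.embeddings.word_embeddings.weight",  # assumed stand-in for the encoder
    "classifier.weight",
    "classifier.bias",
]

missing, unexpected = compare_keys(pretrained_checkpoint, classification_model)
print("not initialized from pretrained model:", missing)
print("not used from pretrained model:", unexpected)
```

In this sketch the encoder weight appears in neither list because it exists on both sides. Only the task-specific head is reported as not initialized, and only the pre-training heads are reported as unused, which matches the pattern in the original log. If encoder weights showed up in either list, that would be the unwanted case mentioned above.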

viva2202 · 2020-05-31T21:43:33Z

Hello @thomwolf: I continued pre-training with bert-base-uncased, without fine-tuning, on roughly 22K sequences, and the precision @ K for the MaskedLM task did not change at all. Is the result legitimate, or do I have a problem loading the weights? I received the same warning/INFO message. The data set is from the automotive domain. At what point can I expect the weights to change? Thank you very much for sharing any experience you have with this.

ghost · 2020-06-19T19:17:00Z

@viva2202, I did the same here using the "run_language_modeling.py" script directly, but with 11k sequences (I continued pretraining using training data only), and then fine-tuned it using BertForSequenceClassification. I got a 1.75% increase in accuracy compared to not continuing pretraining.

jplehmann mentioned this issue Mar 5, 2019

MRPC Score Lower than Expected #346

Closed

thomwolf closed this as completed Mar 6, 2019

Pawel-Kranzberg mentioned this issue May 29, 2019

weights not initialized when saving/loading utterworks/fast-bert#18

Open

yunju63 mentioned this issue Dec 27, 2019

BertForMultiLabelSequenceClassification not initialized utterworks/fast-bert#132

Open

TingNLP mentioned this issue Apr 13, 2021

What to do about this warning message: "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification" #5421

Closed
