Why are the weights not initialized? #339

Closed
lemo2012 opened this issue Mar 3, 2019 · 5 comments

Comments

lemo2012 commented Mar 3, 2019

03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForMultiLabelSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForMultiLabelSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']

jplehmann commented Mar 5, 2019

I was wondering this myself. It looks like there's some configuration mismatch: some parameters are found that aren't used, and a few that are expected aren't found.

I'm not sure if this is expected, since the top-level task-specific classifier is correctly NOT pre-trained... or if it's something more.

(question about lower performance moved into new issue)

jplehmann commented Mar 5, 2019

I dug up some related issues which confirm my guess above -- this kind of message is expected since the models are not yet fine-tuned to the task.

#161
#180

thomwolf commented Mar 6, 2019

Yes, this is the expected behavior.
I don't want to make the warning messages say this is "all good" because in some cases, depending on the model you are loading, this could be unwanted behavior (not loading all the weights).
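For later readers, a minimal sketch of what produces the message, assuming the current transformers API (the older pytorch_pretrained_bert loader used in this thread behaves the same way); num_labels=2 is just an example value:

```python
# Minimal sketch, assuming the `transformers` library; `num_labels=2` is an
# example value, not something from this thread.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,
)

# The encoder weights are loaded from the checkpoint. The task head
# (`classifier.weight`, `classifier.bias`) has no counterpart in the
# checkpoint, so it is randomly initialized, which is what the
# "not initialized from pretrained model" line reports.
#
# Conversely, the pre-training heads (`cls.predictions.*`,
# `cls.seq_relationship.*`) exist in the checkpoint but have no slot in the
# classification model, which produces the "weights not used" line.
```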

viva2202 commented May 31, 2020

Hello @thomwolf: I continued pre-training bert-base-uncased, without fine-tuning, on roughly 22K sequences, and the precision@K for the MaskedLM task did not change at all. Is that result legitimate, or do I have a problem loading the weights? I received the same warning message/INFO. The data set is from the automotive domain. At what point can I expect the weights to change? Thank you very much for sharing your experience.
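One way to rule out a loading problem is to diff an encoder tensor between the base checkpoint and the continued-pretraining output; if the difference is exactly zero, the MLM run never actually updated (or never saved) the encoder weights. A minimal sketch, assuming the transformers library and a placeholder output directory "./continued-pretraining":

```python
# Minimal sketch, assuming the `transformers` library;
# "./continued-pretraining" is a placeholder for the MLM run's output
# directory, not a path from this thread.
from transformers import BertForMaskedLM

base = BertForMaskedLM.from_pretrained("bert-base-uncased")
cont = BertForMaskedLM.from_pretrained("./continued-pretraining")

# Compare one encoder tensor; a max abs difference of exactly 0.0 would mean
# the continued pre-training did not change this weight at all.
name = "bert.encoder.layer.0.attention.self.query.weight"
p_base = dict(base.named_parameters())[name]
p_cont = dict(cont.named_parameters())[name]
print("max abs diff:", (p_base - p_cont).abs().max().item())
```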

ghost commented Jun 19, 2020

@viva2202, I did the same here, using the run_language_modeling.py script directly, but with 11k sequences (I continued pretraining on the training data only), and then fine-tuned with BertForSequenceClassification. I got a 1.75% increase in accuracy compared to not continuing pretraining.
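A minimal sketch of that second (fine-tuning) step, assuming the MLM run saved its model and tokenizer to a placeholder directory "./domain-bert":

```python
# Minimal sketch, assuming the `transformers` library; "./domain-bert" is a
# placeholder for the directory the run_language_modeling.py run wrote to.
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("./domain-bert")
model = BertForSequenceClassification.from_pretrained(
    "./domain-bert",
    num_labels=2,  # example value; set to the number of classes in your task
)

# The same "classifier.weight / classifier.bias ... not initialized" message
# is expected again here: continued pre-training only updates the encoder
# (and the MLM head), so the classification head is always new.
```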
