Why are the weights not initialized? #339
Comments
I was wondering this myself. It looks like there is some configuration mismatch: some parameters are found that aren't used, and a few that are expected aren't found. I'm not sure whether this is expected, since the top-level task-specific classifier is correctly NOT pre-trained, or whether it's something more. (Question about lower performance moved into a new issue.)
Yes, this is the expected behavior.
Hello @thomwolf: I continued pre-training bert-base-uncased, without fine-tuning, on roughly 22K sequences, and the precision@K for the MaskedLM task did not change at all. Is this result legitimate, or do I have a problem loading the weights? I received the same warning/INFO message. The dataset is from the automotive domain. At what point can I expect the weights to change? Thank you very much for any values from your experience.
@viva2202, I did the same here, directly using the "run_language_modeling.py" script, but with 11k sequences (I continued pretraining using the training data only), and then fine-tuned it using BertForSequenceClassification. I got a 1.75% increase in accuracy compared to not continuing pretraining.
03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForMultiLabelSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
03/03/2019 14:13:01 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForMultiLabelSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
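For readers puzzled by the two INFO lines above, here is a minimal sketch of the bookkeeping they appear to report: a comparison between the parameter names in the pretrained checkpoint and the parameter names the new model expects. This is not the library's actual loading code. The function `compare_keys` is hypothetical, the key sets are abbreviated from the log, and `bert.embeddings.word_embeddings.weight` is an assumed stand-in for the shared encoder weights, which the log does not list.

```python
# Sketch only: plain-Python set arithmetic, not the loader used by
# pytorch_pretrained_bert. Key names are abbreviated from the log above.

# Parameter names stored in the pretrained checkpoint (pre-training heads included).
checkpoint_keys = {
    "bert.embeddings.word_embeddings.weight",  # assumed example of a shared encoder weight
    "cls.predictions.bias",
    "cls.predictions.decoder.weight",
    "cls.seq_relationship.weight",
    "cls.seq_relationship.bias",
}

# Parameter names expected by the classification model (new task-specific head).
model_keys = {
    "bert.embeddings.word_embeddings.weight",  # assumed example of a shared encoder weight
    "classifier.weight",
    "classifier.bias",
}


def compare_keys(model_keys, checkpoint_keys):
    """Hypothetical helper: split parameter names into three groups."""
    # In the model but absent from the checkpoint: left at their fresh initialization.
    not_initialized = sorted(model_keys - checkpoint_keys)
    # In the checkpoint but absent from the model: dropped when loading.
    not_used = sorted(checkpoint_keys - model_keys)
    # Present in both: copied from the checkpoint.
    loaded = sorted(model_keys & checkpoint_keys)
    return not_initialized, not_used, loaded


not_initialized, not_used, loaded = compare_keys(model_keys, checkpoint_keys)
print("Not initialized from pretrained model:", not_initialized)
print("Pretrained weights not used:", not_used)
print("Loaded from pretrained model:", loaded)
```

Under these assumptions, the classifier head falls into the first group and the masked-LM and next-sentence heads fall into the second, which matches the reply that the messages are expected when a pre-training checkpoint is loaded into a model with a new task head.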