Environment info

- transformers version: 4.5.0

Who can help

@patrickvonplaten, @LysandreJik

Information
Hello,
(My problem seems related to #5588)
I fine-tuned a TFGPT2LMHeadModel and saved it with .save_pretrained, which gave me a tf_model.h5 file and a config.json file.
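For context, the save step was along these lines (a minimal sketch; the base checkpoint and the fine-tuning loop are placeholders, not my actual code):

```python
from transformers import TFGPT2LMHeadModel

# Illustrative base checkpoint; any GPT-2 variant saves the same way.
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

# ... fine-tuning happens here ...

# Writes tf_model.h5 and config.json into the target directory.
model.save_pretrained(".")
```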
I then try loading it with:

```python
import transformers

model = transformers.GPT2LMHeadModel.from_pretrained(
    ".",
    from_tf=True,
    config="./config.json",
)
```

The path is fine, but I get the following messages:
```
All TF 2.0 model weights were used when initializing GPT2LMHeadModel.
Some weights of GPT2LMHeadModel were not initialized from the TF 2.0 model and are newly initialized: ['transformer.h.0.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.3.attn.bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.9.attn.masked_bias', 'transformer.h.10.attn.bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.13.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.15.attn.bias', 'transformer.h.15.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.17.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.18.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.20.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.21.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.26.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.27.attn.bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.28.attn.bias', 'transformer.h.28.attn.masked_bias', 'transformer.h.29.attn.bias', 'transformer.h.29.attn.masked_bias', 'transformer.h.30.attn.bias', 'transformer.h.30.attn.masked_bias', 'transformer.h.31.attn.bias', 'transformer.h.31.attn.masked_bias', 'transformer.h.32.attn.bias', 'transformer.h.32.attn.masked_bias', 'transformer.h.33.attn.bias', 'transformer.h.33.attn.masked_bias', 'transformer.h.34.attn.bias', 'transformer.h.34.attn.masked_bias', 'transformer.h.35.attn.bias', 'transformer.h.35.attn.masked_bias', 'transformer.h.36.attn.bias', 'transformer.h.36.attn.masked_bias', 'transformer.h.37.attn.bias', 'transformer.h.37.attn.masked_bias', 'transformer.h.38.attn.bias', 'transformer.h.38.attn.masked_bias', 'transformer.h.39.attn.bias', 'transformer.h.39.attn.masked_bias', 'transformer.h.40.attn.bias', 'transformer.h.40.attn.masked_bias', 'transformer.h.41.attn.bias', 'transformer.h.41.attn.masked_bias', 'transformer.h.42.attn.bias', 'transformer.h.42.attn.masked_bias', 'transformer.h.43.attn.bias', 'transformer.h.43.attn.masked_bias', 'transformer.h.44.attn.bias', 'transformer.h.44.attn.masked_bias', 'transformer.h.45.attn.bias', 'transformer.h.45.attn.masked_bias', 'transformer.h.46.attn.bias', 'transformer.h.46.attn.masked_bias', 'transformer.h.47.attn.bias', 'transformer.h.47.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```
Does this mean the conversion hasn't worked? Can I just use the model for generation? Should I change the way the model is saved?
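For what it's worth, one way I could check is to run the same input through the TF model and the converted PyTorch model and compare the logits (a sketch, assuming both TensorFlow and PyTorch are installed; the token ids are arbitrary):

```python
import numpy as np
import torch
import transformers

# Same checkpoint loaded in both frameworks.
tf_model = transformers.TFGPT2LMHeadModel.from_pretrained(".")
pt_model = transformers.GPT2LMHeadModel.from_pretrained(
    ".", from_tf=True, config="./config.json"
)

input_ids = np.array([[464, 3290, 318]])  # arbitrary token ids

tf_logits = tf_model(input_ids).logits.numpy()
with torch.no_grad():
    pt_logits = pt_model(torch.tensor(input_ids)).logits.numpy()

# If the conversion worked, the difference should be float-level noise.
print(np.abs(tf_logits - pt_logits).max())
```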
Actually, that's not an issue; this warning shouldn't be here. The listed attn.bias and attn.masked_bias tensors are fixed causal-mask buffers that GPT2LMHeadModel recreates at initialization, and lm_head.weight is tied to the input embeddings, so none of your fine-tuned weights are lost. I'll open a PR to remove it shortly.
If you try generating text with it, you should get sensible results!
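For example, something along these lines should produce coherent text (a minimal sketch; the "gpt2" tokenizer checkpoint is an assumption here, use whichever tokenizer you fine-tuned with):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Assumption: the model was fine-tuned with the standard GPT-2 tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained(".", from_tf=True, config="./config.json")
model.eval()

inputs = tokenizer("Once upon a time", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=50, do_sample=True, top_k=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```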
Great to hear, thanks.