
resize_token_embeddings not taken into account in save_pretrained for EncoderDecoderModel #11285

Closed
rahular opened this issue Apr 16, 2021 · 1 comment · Fixed by #11300
Comments

rahular (Contributor) commented Apr 16, 2021

Environment info

  • transformers version: 4.5.0
  • Platform: Darwin-17.7.0-x86_64-i386-64bit
  • Python version: 3.7.2
  • PyTorch version (GPU?): 1.4.0 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

@patrickvonplaten, @patil-suraj

Information

I am extending the embeddings of the decoder of an EncoderDecoderModel. When I save the model, the config does not reflect the new embedding size. The same procedure works fine for models that are not wrapped in an EncoderDecoderModel.

To reproduce

In [1]: import transformers as t

In [2]: model = t.EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased')

In [3]: model.decoder.bert.embeddings.word_embeddings
Out[3]: Embedding(30522, 768, padding_idx=0)

In [4]: model.decoder.resize_token_embeddings(30522 + 100)
Out[4]: Embedding(30622, 768)

In [5]: model.save_pretrained('test-bert')

Expected behavior

The updated embedding size should be saved in config.json
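
For instance, the decoder vocabulary size written to disk could be inspected like this (a minimal sketch; it assumes the model was saved to 'test-bert' as in the repro above):

import json

# Read back the config written by save_pretrained (path from the repro above).
with open('test-bert/config.json') as f:
    cfg = json.load(f)

# Expected: 30622 after the resize; with the bug it still reads 30522.
print(cfg['decoder']['vocab_size'])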

cronoik (Contributor) commented Apr 17, 2021

This is caused by the EncoderDecoderConfig, which initializes independent config objects (link) instead of reusing the ones already attached to the sub-models.

You can fix that for the moment by calling:

model.config.decoder = model.decoder.config
model.config.encoder = model.encoder.config

PR will follow.
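
A rough end-to-end sketch combining the repro with this workaround (assumes transformers 4.5.0 and the same checkpoints and save path as above):

from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased')
model.decoder.resize_token_embeddings(30522 + 100)

# Workaround: copy the sub-model configs back into the top-level
# EncoderDecoderConfig so that save_pretrained writes the new vocab size.
model.config.encoder = model.encoder.config
model.config.decoder = model.decoder.config

model.save_pretrained('test-bert')

# Reloading should now pick up the resized decoder embeddings,
# e.g. reloaded.decoder.get_input_embeddings() -> Embedding(30622, 768)
reloaded = EncoderDecoderModel.from_pretrained('test-bert')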
