[FIX] resize_token_embeddings #26102
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for the catch! Could you add a test? 😉
I'm not so familiar with the transformers repo -- where should a test for this code go?
Should be part of these tests
@ArthurZucker please confirm this test works, I couldn't run it myself since
Sure, I'll test this, can you run
For simplicity let's just use `168` as the target 😉 Tests are green locally.
Make sure to run `make style` and we can merge this!
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
@ArthurZucker Ready for merge
Good job! 🚀
* fix roundup command
* add test for resize_token_embeddings
* Update tests/test_modeling_common.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
What does this PR do?

When `resize_token_embeddings(new_num_tokens, pad_to_multiple)` is called with `new_num_tokens` a multiple of `pad_to_multiple`, the model should be resized to `new_num_tokens`. Due to a math error, it is instead resized to `new_num_tokens + pad_to_multiple`. This PR fixes that bug.

@ArthurZucker #25088
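To make the rounding error concrete, here is a minimal sketch of the arithmetic described above. The helper names `round_up_buggy` and `round_up_fixed` are hypothetical, and the exact expressions are an assumption chosen to reproduce the reported behavior; they are not copied from the library source.

```python
def round_up_buggy(new_num_tokens: int, pad_to_multiple: int) -> int:
    """Hypothetical buggy round-up: floor-divide, then always add one block.

    When new_num_tokens is already a multiple of pad_to_multiple, this
    still adds a full extra block, giving new_num_tokens + pad_to_multiple.
    """
    return ((new_num_tokens // pad_to_multiple) + 1) * pad_to_multiple


def round_up_fixed(new_num_tokens: int, pad_to_multiple: int) -> int:
    """Hypothetical fixed round-up using integer ceiling division.

    Adding pad_to_multiple - 1 before the floor division leaves already
    aligned values unchanged and rounds everything else up to the next
    multiple.
    """
    return ((new_num_tokens + pad_to_multiple - 1) // pad_to_multiple) * pad_to_multiple


if __name__ == "__main__":
    # 128 is already a multiple of 64; 100 is not.
    for n in (128, 100):
        print(n, round_up_buggy(n, 64), round_up_fixed(n, 64))
```

With a padding multiple of 64, both versions round 100 up to 128, but only the fixed version leaves 128 unchanged; the buggy one returns 192.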