Fix 29807, sinusoidal positional encodings overwritten by post_init() #29813
Hey! IMO we should rather place
    if config.sinusoidal_pos_embds:
        create_sinusoidal_embeddings(
            n_pos=config.max_position_embeddings, dim=config.dim, out=self.position_embeddings.weight
        )
in the _init_weights function.
Also, this should be made backward compatible (BC), since older versions did not require _init_weights to be the only way of initializing the weights.
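To illustrate the suggestion, here is a minimal sketch in plain Python, with no PyTorch and no transformers. ToyModel, ToyPositionEmbeddings and sinusoidal_table are hypothetical stand-ins, not the library's classes or its create_sinusoidal_embeddings; the interleaved sin/cos layout is an assumption for illustration. The point is only the ordering: because the sinusoidal fill lives inside _init_weights, it is the last thing post_init() applies.

```python
import math
import random


def sinusoidal_table(n_pos, dim):
    """Hypothetical helper: sin on even columns, cos on odd columns."""
    table = []
    for pos in range(n_pos):
        row = []
        for j in range(dim):
            angle = pos / (10000 ** (2 * (j // 2) / dim))
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


class ToyPositionEmbeddings:
    """Stand-in for an embedding layer: just a list-of-lists weight."""

    def __init__(self, n_pos, dim):
        self.n_pos, self.dim = n_pos, dim
        self.weight = [[0.0] * dim for _ in range(n_pos)]


class ToyModel:
    """Stand-in model whose post_init() routes everything through _init_weights."""

    def __init__(self, n_pos, dim, sinusoidal_pos_embds, seed=0):
        self.sinusoidal_pos_embds = sinusoidal_pos_embds
        self._rng = random.Random(seed)
        self.position_embeddings = ToyPositionEmbeddings(n_pos, dim)
        self.post_init()

    def _init_weights(self, module):
        # Default random initialization, applied to every module.
        module.weight = [
            [self._rng.gauss(0.0, 0.02) for _ in range(module.dim)]
            for _ in range(module.n_pos)
        ]
        # The sinusoidal fill happens here, after the random init,
        # so nothing that runs later can overwrite it.
        if self.sinusoidal_pos_embds:
            module.weight = sinusoidal_table(module.n_pos, module.dim)

    def post_init(self):
        self._init_weights(self.position_embeddings)


model = ToyModel(n_pos=4, dim=6, sinusoidal_pos_embds=True)
first_row = model.position_embeddings.weight[0]  # position 0: sin(0)=0, cos(0)=1
```

With this ordering, re-running post_init() leaves the table unchanged, which is the property the review comment is after.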
@ArthurZucker |
It does but you can check if the name is |
AFAIK |
Anyway, please look at my latest changes. They solve the problem, but I'm not sure whether this is optimal.
Perfect! Thanks for iterating
…#29813)
* Check for requires_grad when initing weights
* Add unit test
* Move sinusoidal positional encoding generation after post_init()
* Add modules to skip init list
* Move create_sinusoidal_embeddings to _init_weights
What does this PR do?
Fixes #29807, sinusoidal positional encodings overwritten by post_init(). First time contributing, please let me know of any issues, comments.
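The failure mode can be reproduced in a few lines of plain Python. This is a toy sketch, not the library code: BuggyToyModel and its attributes are hypothetical, and a small hand-written table stands in for the sinusoidal encodings. The constructor fills the fixed table first, then post_init() re-initializes every weight and silently replaces it.

```python
import random


class BuggyToyModel:
    """Hypothetical stand-in: fixed table set in __init__, then post_init() runs."""

    def __init__(self, fixed_table, seed=0):
        self._rng = random.Random(seed)
        # Intended fixed (non-learned) position encodings.
        self.position_weight = [row[:] for row in fixed_table]
        # Runs last and re-initializes all weights, including the table above.
        self.post_init()

    def _init_weights(self):
        # Generic random init that knows nothing about the fixed table.
        self.position_weight = [
            [self._rng.gauss(0.0, 0.02) for _ in row]
            for row in self.position_weight
        ]

    def post_init(self):
        self._init_weights()


fixed = [[0.0, 1.0], [0.5, 0.5]]
model = BuggyToyModel(fixed)
overwritten = model.position_weight != fixed  # the fixed table did not survive
```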
Linked issue: Sinusoidal positional encodings overwritten by post_init() #29807
Who can review?
@ArthurZucker, @younesbelkada, @amyeroberts