
added unsqueeze_dim to apply_rotary_pos_emb #27117

Conversation

@ShashankMosaicML (Contributor) commented Oct 28, 2023

Making the unsqueeze dimension parameterized in the apply_rotary_pos_emb function in modeling_llama.py

This PR adds a new parameter to the apply_rotary_pos_emb function that specifies the dimension along which the cosine and sine rotary tensors are unsqueezed to make them broadcastable to the query and key tensors. This makes the function compatible with codebases whose query and key tensors have different layouts, without any back-and-forth transposing.
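For context, here is a minimal sketch of the parameterized function, close to what was eventually merged (the pre-existing rotate_half helper from modeling_llama.py is reproduced for completeness):

```python
import torch


def rotate_half(x):
    """Rotates half the hidden dims of the input (standard RoPE helper)."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
    # Gather the cos/sin rows for each position, then unsqueeze along the
    # requested dim so they broadcast against q and k.
    # unsqueeze_dim=1 fits tensors shaped [batch, heads, seq_len, head_dim];
    # unsqueeze_dim=2 fits tensors shaped [batch, seq_len, heads, head_dim].
    cos = cos[position_ids].unsqueeze(unsqueeze_dim)
    sin = sin[position_ids].unsqueeze(unsqueeze_dim)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```

The default unsqueeze_dim=1 matches Llama's existing [batch, heads, seq_len, head_dim] layout, so current callers are unaffected.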

Fixes #26948


Who can review?

@gante , @ArthurZucker

@gante (Member) left a comment:


Thank you for opening the PR @ShashankMosaicML! And nice touch, the docstring :)

CI is failing because of the # Copied from ... statement, which our internal tools and CI use to maintain the one-file-per-model policy while still propagating bugfixes and upgrades across models.
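For reference, the marker is a plain comment sitting directly above the copied function in a downstream model file; the exact path below is illustrative:

```python
# Copied from transformers.models.llama.modeling_llama.apply_rotary_pos_emb
def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
    ...
```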

In this particular case, there is also a bug on our end -- we recently made Llama the base implementation from which the others are copied. As such, you'll have to:

  1. Remove the # Copied from ... statement in L194
  2. Run make fix-copies in your terminal
  3. Push the changes

You may also need to run make fixup afterwards, which applies automatic code formatting.
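Concretely, the command sequence looks something like this (a sketch, assuming a standard transformers development setup, run from the repository root):

```bash
# Regenerate the copies of the updated function in downstream model files
make fix-copies

# Apply automatic code formatting and quality fixes
make fixup

# Commit and push the result
git add -u
git commit -m "run make fix-copies and make fixup"
git push
```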

@gante gante requested review from ArthurZucker and amyeroberts and removed request for ArthurZucker October 28, 2023 09:48
@ShashankMosaicML (Contributor, Author) commented:

Thank you @gante for reviewing and providing the steps to fix the errors! However, I still see some errors after following the steps; on clicking the details for the error, I see the following. Could you please let me know how to fix this? Thank you!
[screenshot: details of the failing CI check, Oct 28, 2023]

@amyeroberts (Collaborator) left a comment:


Thanks for working on this!

Change to the function LGTM - there are just some additional changes to docstrings in this diff which need to be removed before merging.

Review comments (since resolved) on:
- src/transformers/models/beit/image_processing_beit.py
- src/transformers/models/blip/image_processing_blip.py
- src/transformers/models/llama/modeling_llama.py
@ShashankMosaicML (Contributor, Author) commented:

Hi @amyeroberts, thanks for suggesting the changes. I have incorporated them, but some quality checks are still failing. Could you take a look?
[screenshot: details of the failing quality checks, Oct 30, 2023]

@gante (Member) commented Oct 31, 2023

@ShashankMosaicML running make fixup and then committing the changes doesn't fix it?

@ShashankMosaicML (Contributor, Author) commented:

@gante, I think that worked! (I wasn't running make fixup properly earlier 😅)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts (Collaborator) left a comment:


Thanks for iterating!

@amyeroberts amyeroberts merged commit 037fb7d into huggingface:main Nov 1, 2023
19 checks passed
@ShashankMosaicML ShashankMosaicML deleted the adding_unsqueeze_dim_to_apply_rotary_pos_emb branch November 1, 2023 15:47
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* added unsqueeze_dim to apply_rotary_pos_emb

* Added docstring

* Modified docstring

* Modified docstring

* Modified docstring

* Modified docstring

* Modified docstring

* ran make fix-copies and make fixup

* Update src/transformers/models/llama/modeling_llama.py

Accepting the proposed changes in formatting.

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* incorporating PR suggestions

* incorporating PR suggestions

* incorporating PR suggestions

* incorporating PR suggestions

* ..

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>