added unsqueeze_dim to apply_rotary_pos_emb #27117
Conversation
Thank you for opening the PR @ShashankMosaicML! And nice touch, the docstring :)

CI is failing because of the `# Copied from ...` statement, which is used by our internal tools and CI to maintain our one-file-per-model policy while enabling the propagation of bugfixes and upgrades.

In this particular case, there is also a bug on our end -- recently we've made Llama the base implementation, from which the others are copied. As such, you'll have to:
- Remove the `# Copied from ...` statement in L194
- Run `make fix-copies` in your terminal
- Push the changes

You may also need to run `make fixup` afterwards, which applies automatic code formatting.
Thank you @gante for reviewing and providing the steps to fix the errors! However, I still see some errors after following the steps, and on clicking the details for the error, I see the following. Could you please let me know how to fix this error? Thank you!
Thanks for working on this!
The change to the function LGTM - there are just some additional changes to docstrings in this diff which need to be removed before merging.
Accepting the proposed changes in formatting. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Hi @amyeroberts, thanks for suggesting the changes. I have incorporated those, but some quality checks are still failing. Could you take a look?
@ShashankMosaicML running …
@gante, I think that worked! (I wasn't running …)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for iterating!
* added unsqueeze_dim to apply_rotary_pos_emb
* Added docstring
* Modified docstring
* Modified docstring
* Modified docstring
* Modified docstring
* Modified docstring
* ran make fix-copies and make fixup
* Update src/transformers/models/llama/modeling_llama.py

  Accepting the proposed changes in formatting.

  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* incorporating PR suggestions
* incorporating PR suggestions
* incorporating PR suggestions
* incorporating PR suggestions
* ..

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Making the unsqueeze dimension parameterized in the apply_rotary_pos_emb function in modeling_llama.py
This PR introduces a new parameter, `unsqueeze_dim`, to the `apply_rotary_pos_emb` function. It lets callers specify the dimension along which the cosine and sine rotary tensors are unsqueezed so that they broadcast to the query and key tensors. This makes the function compatible with codebases whose query and key tensors have different shapes, without any back-and-forth transposing.
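The broadcasting idea behind the change can be sketched as follows. This is a minimal NumPy stand-in, not the actual transformers implementation (which operates on PyTorch tensors and has additional arguments such as `position_ids`); shapes and names here are illustrative:

```python
import numpy as np

def rotate_half(x):
    # Split the last dimension in half and swap the halves with a sign flip.
    x1, x2 = np.split(x, 2, axis=-1)
    return np.concatenate((-x2, x1), axis=-1)

def apply_rotary_pos_emb(q, k, cos, sin, unsqueeze_dim=1):
    # cos/sin have shape [batch, seq_len, head_dim]; inserting a size-1 axis
    # at `unsqueeze_dim` makes them broadcastable to q and k regardless of
    # whether the heads dimension sits at axis 1 or axis 2.
    cos = np.expand_dims(cos, unsqueeze_dim)
    sin = np.expand_dims(sin, unsqueeze_dim)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed

# Layout A: heads at axis 1, i.e. [batch, heads, seq, head_dim]
q = np.random.randn(2, 4, 8, 16)
k = np.random.randn(2, 4, 8, 16)
cos = np.random.randn(2, 8, 16)
sin = np.random.randn(2, 8, 16)
q1, k1 = apply_rotary_pos_emb(q, k, cos, sin, unsqueeze_dim=1)

# Layout B: heads at axis 2, i.e. [batch, seq, heads, head_dim]
q2, k2 = apply_rotary_pos_emb(
    q.transpose(0, 2, 1, 3), k.transpose(0, 2, 1, 3), cos, sin, unsqueeze_dim=2
)

# Both layouts produce the same embeddings, up to a transpose.
assert np.allclose(q1, q2.transpose(0, 2, 1, 3))
```

The rotation acts only on the last (head) dimension, so parameterizing the unsqueeze axis is all that is needed to support both tensor layouts.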
Fixes #26948
Before submitting
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case. Link to the issue
- Did you make sure to update the documentation with your changes? See the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@gante , @ArthurZucker