[GPTNeoX] Faster rotary embedding for GPTNeoX (based on llama changes) #25830
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks for adding this!
Only question is about the dtype casting in Idefics. Could you run the slow tests and some checks on the effects on the model outputs when rope scaling is used?
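For reference, the slow tests can be run with `RUN_SLOW=1 python -m pytest tests/models/gpt_neox/`. Below is a minimal sketch of the kind of rope-scaling sanity check being asked for; the checkpoint choice and passing `rope_scaling` through `from_pretrained` are assumptions for illustration, not part of this PR:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical check: compare logits with and without RoPE scaling enabled.
model_id = "EleutherAI/pythia-70m"  # assumed small GPTNeoX checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Rotary embeddings should stay numerically stable.", return_tensors="pt")

baseline = AutoModelForCausalLM.from_pretrained(model_id)
scaled = AutoModelForCausalLM.from_pretrained(
    model_id, rope_scaling={"type": "linear", "factor": 2.0}
)

with torch.no_grad():
    base_logits = baseline(**inputs).logits
    scaled_logits = scaled(**inputs).logits

# Linear scaling changes the positional phases, so logits should differ
# past position 0 but remain finite and of comparable magnitude.
print((base_logits - scaled_logits).abs().max())
```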
LGTM (and nice touch adding the copied from!)
cc @StellaAthena FYI, this PR should greatly speed up the RoPE embeddings of the GPTNeoX model, similarly to how it was done for the Llama model. Let us know if you want to review or have any comments!
cc @Narsil: as this touches buffers that will no longer be persistent, we'll wait for you in case this conflicts with TGI.
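To make the buffer concern concrete, here is a minimal sketch of the caching pattern in question (illustrative, not the exact PR code): registering the cos/sin caches with `persistent=False` keeps them out of the `state_dict`, so saved checkpoints are unchanged, which is what could matter for downstream loaders such as TGI.

```python
import torch
from torch import nn


class RotaryEmbeddingSketch(nn.Module):
    """Illustrative Llama-style rotary cache; names are hypothetical."""

    def __init__(self, dim, max_position_embeddings=2048, base=10000):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        # persistent=False: the buffer moves with .to()/.cuda() but is not
        # written to the state dict, so saved checkpoints stay unchanged.
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        self._set_cos_sin_cache(max_position_embeddings)

    def _set_cos_sin_cache(self, seq_len):
        t = torch.arange(seq_len, dtype=self.inv_freq.dtype)
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)  # [seq_len, dim/2]
        emb = torch.cat((freqs, freqs), dim=-1)            # [seq_len, dim]
        self.register_buffer("cos_cached", emb.cos(), persistent=False)
        self.register_buffer("sin_cached", emb.sin(), persistent=False)
```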
…rs into improve-gpt-neox
…rs; branch 'main' of github.com:huggingface/transformers into improve-gpt-neox
…es) (huggingface#25830)
* Faster rotary embedding for GPTNeoX
* there might be un-necessary moves from device
* fixup
* fix dtype issue
* add copied from statements
* fox copies
* oupsy
* add copied from Llama for scaled ones as well
* fixup
* fix
* fix copies
What does this PR do?
Fixes #25813, which indicates that RoPE is slow. It's a follow-up of #22785, where RoPE was improved for the Llama model.
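For context, this is roughly the Llama-style application that the PR ports over: the cos/sin tables are computed once and cached, and each forward pass only indexes them with `position_ids` instead of recomputing the embedding. A minimal sketch (shapes and names are assumptions based on the Llama modeling code, not the exact PR diff):

```python
import torch


def rotate_half(x):
    """Rotate half the hidden dims of the input (Llama-style)."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
    # q, k: [batch, num_heads, seq_len, head_dim]
    # cos, sin: cached tables of shape [max_seq_len, head_dim]
    cos = cos[position_ids].unsqueeze(1)  # [batch, 1, seq_len, head_dim]
    sin = sin[position_ids].unsqueeze(1)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```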