
Generate: support for left-padding on GPTNeoX and Llama #22382

Merged: 6 commits into huggingface:main from llama_gptneox_left_padding on Mar 27, 2023

Conversation

gante (Member) commented Mar 26, 2023

What does this PR do?

As the title indicates, this PR adds left-padding support for GPTNeoX and Llama.

It adds the position_ids input, propagates it all the way to the position embeddings, and gathers the position embeddings according to the values in position_ids. All slow tests are now passing in both models, including the newly added left-padding support test and the GPTNeoX integration test.

Also makes a few changes to Llama to make it more similar to other models 🤗
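
For readers unfamiliar with the mechanics, here is a minimal sketch of the idea (illustrative only, not the exact code in this PR; make_position_ids and select_rotary are hypothetical helper names):

import torch

# Derive position_ids from the attention mask so that left padding does not
# shift the positions of the real tokens (padding slots get a dummy position).
def make_position_ids(attention_mask: torch.Tensor) -> torch.Tensor:
    position_ids = attention_mask.long().cumsum(-1) - 1
    position_ids.masked_fill_(attention_mask == 0, 0)
    return position_ids

# cos/sin: precomputed rotary tables of shape [1, 1, max_seq_len, head_dim].
# Indexing them with position_ids, instead of assuming positions 0..seq_len-1,
# is what lets left-padded batches produce the same embeddings as unpadded ones.
def select_rotary(cos: torch.Tensor, sin: torch.Tensor, position_ids: torch.Tensor):
    cos = cos.squeeze(1).squeeze(0)[position_ids]  # [batch, seq_len, head_dim]
    sin = sin.squeeze(1).squeeze(0)[position_ids]
    return cos, sin

mask = torch.tensor([[0, 0, 1, 1, 1],   # left-padded sequence
                     [1, 1, 1, 1, 1]])  # unpadded sequence
print(make_position_ids(mask))
# tensor([[0, 0, 0, 1, 2],
#         [0, 1, 2, 3, 4]])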

HuggingFaceDocBuilderDev commented Mar 26, 2023

The documentation is not available anymore as the PR was closed or merged.

gante (Member, Author) commented Mar 26, 2023

The failing CI is fixed by #22383 :)

ArthurZucker (Collaborator) left a comment

Works for me! I like the addition of the type hints 😉

@@ -649,18 +629,18 @@ class LlamaForCausalLM(LlamaPreTrainedModel):
     def __init__(self, config):
         super().__init__(config)
-        self.model = LlamaModel(config)
+        self.llama = LlamaModel(config)
Collaborator

This is breaking with regard to the checkpoints on the Hub + the conversion script (which renames using model.xxx), so if this is accepted, you can also update the checkpoints!
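
For context, a small sketch of why the rename breaks existing checkpoints: PyTorch derives state_dict keys from attribute names, so weights saved under model.xxx no longer match a submodule named llama (the classes below are illustrative stand-ins, not the real ones):

import torch.nn as nn

class Inner(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(10, 4)

class OldWrapper(nn.Module):      # attribute named "model" -> key "model.embed.weight"
    def __init__(self):
        super().__init__()
        self.model = Inner()

class NewWrapper(nn.Module):      # attribute named "llama" -> key "llama.embed.weight"
    def __init__(self):
        super().__init__()
        self.llama = Inner()

old_state = OldWrapper().state_dict()
missing, unexpected = NewWrapper().load_state_dict(old_state, strict=False)
print(missing)     # ['llama.embed.weight'] -> checkpoints saved with the old name no longer load
print(unexpected)  # ['model.embed.weight']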

Collaborator

Yes, this is a big no no no no.

Collaborator

There don't seem to be any changes to this file other than this one; should it be part of the PR?

gante (Member, Author)

Not necessarily -- it's a little typo (for which I probably wouldn't spend the time to open a PR 😅 )

Collaborator

Not entirely sure if this also applies here, but the cross_pt_flax test might end up failing, as happened when I tried to fix GPT-J 😉

@@ -237,7 +237,7 @@ def test_feed_forward_chunking(self):
 @require_torch
 class GPTNeoXLanguageGenerationTest(unittest.TestCase):
     @slow
-    def test_lm_generate_codegen(self):
+    def test_lm_generate_gptneox(self):
Collaborator

nice catch

gante (Member, Author)

the error originates from me, though :(

sgugger (Collaborator) left a comment

Thanks for working on this. The change model->llama needs to be reverted as it will break all existing repos of Llama models on the Hub.

Comment on lines 358 to 361
-    base_model_prefix = "model"
+    base_model_prefix = "llama"
Collaborator

Absolutely not. We are not breaking all repos on the Hub with a Llama model.


gante (Member, Author) commented Mar 27, 2023

@ArthurZucker @sgugger woopsie, I forgot that it affected the weight loading code -- I come from a place where weight names have to be specified 👼 Reverted (self.llama is self.model again)!

gante merged commit 7dcd870 into huggingface:main on Mar 27, 2023
gante deleted the llama_gptneox_left_padding branch on Mar 27, 2023 at 14:48
jquesnelle commented
It appears this may have broken FSDP. For example, as specified in the Alpaca repo, finetuning with --fsdp "full_shard auto_wrap" --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer worked before this commit, but afterwards it gives an error such as:

File "/home/fsuser/.local/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 313, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/fsuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: forward() got an unexpected keyword argument 'position_ids'

Reverting the commit fixes it, although perhaps the problem is with accelerate not supporting position_ids? cc: @ArthurZucker

gante (Member, Author) commented Mar 29, 2023

@jquesnelle can you paste the full stack trace? It would allow us to find the root cause :D (maybe, as you mention, the problem is in accelerate... or maybe it comes from the Alpaca repo!)

raghavanone pushed a commit to raghavanone/transformers that referenced this pull request Apr 5, 2023
xloem pushed a commit to xloem/transformers that referenced this pull request Apr 9, 2023
xloem pushed a commit to xloem/transformers that referenced this pull request Apr 10, 2023
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
neggert commented Jun 30, 2023

I'm seeing a pretty significant performance hit on RedPajama-7b-chat that I think is due to this change. I ran the PyTorch profiler and all of the repeat operators in apply_rotary_pos_emb are expensive and run mostly on CPU. Reverting to transformers 4.27.x resolves the performance issue.

ArthurZucker (Collaborator)

You should try the main branch; #22785 removed the repeat, solving this.
