Generate: support for left-padding on GPTNeoX and Llama #22382
Conversation
The documentation is not available anymore as the PR was closed or merged.
The failing CI is fixed by #22383 :)
Works for me! I like the addition of the type hints 😉
@@ -649,18 +629,18 @@ class LlamaForCausalLM(LlamaPreTrainedModel):
     def __init__(self, config):
         super().__init__(config)
-        self.model = LlamaModel(config)
+        self.llama = LlamaModel(config)
This is breaking with regard to the checkpoints on the Hub and the conversion script (which renames using model.xxx), so if this is accepted, the checkpoints will also need to be updated!
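Concretely, the checkpoints on the Hub store weights under keys derived from the attribute name, so renaming self.model to self.llama makes every saved key miss its target. A minimal sketch of the mismatch, using toy modules rather than the real Llama classes:

```python
import torch.nn as nn

# Toy stand-ins for LlamaForCausalLM / LlamaModel, just to show the key mismatch.
class ToyInner(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(10, 4)

class ToyCausalLMOld(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = ToyInner()   # attribute "model" -> checkpoint keys like "model.embed.weight"

class ToyCausalLMNew(nn.Module):
    def __init__(self):
        super().__init__()
        self.llama = ToyInner()   # renamed attribute -> expects keys like "llama.embed.weight"

state_dict = ToyCausalLMOld().state_dict()  # what existing Hub checkpoints effectively contain
result = ToyCausalLMNew().load_state_dict(state_dict, strict=False)
print(result.missing_keys)     # ['llama.embed.weight'] -> would be left randomly initialized
print(result.unexpected_keys)  # ['model.embed.weight'] -> saved weights that no longer match
```

When loading with from_pretrained, those mismatched keys would only produce a warning and the renamed submodule would end up randomly initialized, which is why the rename effectively breaks every existing Llama repo.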
Yes, this is a big no no no no.
There don't seem to be any changes to this file other than that; should this be part of the PR?
Not necessarily -- it's a little typo (for which I probably wouldn't spend the time to open a PR 😅 )
Not entirely sure if this also applies here, but the cross_pt_flax test might end up failing, as happened when I tried to fix GPT-J 😉
@@ -237,7 +237,7 @@ def test_feed_forward_chunking(self):
 @require_torch
 class GPTNeoXLanguageGenerationTest(unittest.TestCase):
     @slow
-    def test_lm_generate_codegen(self):
+    def test_lm_generate_gptneox(self):
nice catch
the error originates from me, though :(
Thanks for working on this. The change model->llama needs to be reverted as it will break all existing repos of Llama models on the Hub.
-    base_model_prefix = "model"
+    base_model_prefix = "llama"
Absolutely not. We are not breaking all repos on the Hub with a Llama model.
@ArthurZucker @sgugger woopsie, I forgot that it affected the weight loading code -- I come from a place where weight names have to be specified 👼 Reverted.
It appears as if this may have broken FSDP. For example, finetuning as specified in the Alpaca repo now fails with:

File "/home/fsuser/.local/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 313, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/fsuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: forward() got an unexpected keyword argument 'position_ids'

Reverting the commit fixes it, although perhaps the problem is with accelerate.
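That particular TypeError usually means some module in the call stack still has the pre-change forward signature (for example, a cached or monkey-patched copy of the attention class), while the updated decoder layer now forwards position_ids down to it. A small illustrative sketch of the mismatch, with made-up toy classes rather than the actual Alpaca/FSDP setup:

```python
import torch
import torch.nn as nn

class OldAttention(nn.Module):
    # Pre-update signature: does not accept position_ids.
    def forward(self, hidden_states, attention_mask=None):
        return hidden_states

class NewDecoderLayer(nn.Module):
    # Updated caller: now passes position_ids to self_attn.
    def __init__(self):
        super().__init__()
        self.self_attn = OldAttention()  # stale module, e.g. from a patched or cached copy

    def forward(self, hidden_states, attention_mask=None, position_ids=None):
        return self.self_attn(
            hidden_states, attention_mask=attention_mask, position_ids=position_ids
        )

layer = NewDecoderLayer()
try:
    layer(torch.zeros(1, 4, 8), position_ids=torch.arange(4)[None])
except TypeError as err:
    print(err)  # forward() got an unexpected keyword argument 'position_ids'
```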
@jquesnelle can you paste the full stack trace? It would allow us to find the root cause :D (maybe, as you mention, the problem is in accelerate... or maybe it comes from the Alpaca repo!)
I'm seeing a pretty significant performance hit on RedPajama-7b-chat that I think is due to this change. I ran the PyTorch profiler and all of the
You should try the
What does this PR do?
As the title indicates, adds left-padding support for GPTNeoX and Llama.
It adds the position_ids input, propagates it all the way to the position embedding, and gathers the position embeddings given the values in position_ids. All slow tests are now passing in both models, including the newly added left-padding support test and the GPTNeoX integration test. Also makes a few changes on Llama to make it more similar to other models 🤗
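For context, the core idea is that with left-padding the absolute positions can no longer be assumed to be 0..seq_len-1, so position_ids are derived from the attention mask and the position embeddings are indexed at those values. A rough sketch of the mask-to-positions step, assuming a simple absolute embedding table (an illustration, not the exact library code):

```python
import torch

# Left-padded batch: 1 = real token, 0 = padding.
attention_mask = torch.tensor([
    [0, 0, 1, 1, 1],   # two pad tokens on the left
    [1, 1, 1, 1, 1],   # no padding
])

# Count only the real tokens; padded slots get a dummy value (they are masked out anyway).
position_ids = attention_mask.long().cumsum(-1) - 1
position_ids.masked_fill_(attention_mask == 0, 1)
print(position_ids)
# tensor([[1, 1, 0, 1, 2],
#         [0, 1, 2, 3, 4]])

# The model can then gather its position embeddings at these indices:
max_len, dim = 16, 8
pos_table = torch.randn(max_len, dim)   # toy stand-in for a position-embedding table
pos_emb = pos_table[position_ids]       # shape: (batch, seq_len, dim)
print(pos_emb.shape)                    # torch.Size([2, 5, 8])
```

Without this, a left-padded sequence would see its positions shifted by the amount of padding, which is what degraded batched generation before the change.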