Allow passing inputs_embeds instead of input_ids #757
Conversation
Resolves huggingface#727

Right now, there is an issue with a few PeftModelForXxx classes when users pass only inputs_embeds but not input_ids. First of all, the batch size was being derived from input_ids; now it is derived from inputs_embeds instead if input_ids is None. Furthermore, a few forward calls to the base model were not passing inputs_embeds along, which resulted in errors down the line. These issues have been fixed now.

During testing, I ran into some problems.

1. For decoder models, the test would fail for GPTBigCodeForCausalLM. I'm not sure why, but that model is already excluded in another test, so I just excluded it for this test too.
2. For feature extraction tests, the test would fail for prefix tuning. I'm not sure exactly why. I thought I only needed to pass the inputs_embeds to the self.base_model call for prefix tuning, but then I got the error
   > TypeError: DebertaV2Model.forward() got an unexpected keyword argument 'past_key_values'
   Therefore, I just excluded those tests for the time being.
3. Similarly, I wanted to add the tests to encoder-decoder models, but there I got the error:
   > ValueError: If no `decoder_input_ids` or `decoder_inputs_embeds` are passed, `input_ids` cannot be `None`. Please pass either `input_ids` or `decoder_input_ids` or `decoder_inputs_embeds`.
   So now, the tests are not run for encoder-decoder.

It would be great if someone else with more expertise could ensure that those cases work correctly.
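To illustrate the use case this PR enables, here is a minimal sketch (not taken from the PR; the model name and config values are illustrative assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Wrap a small causal LM with a PEFT adapter (illustrative choice of model and config).
base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
# Compute the embeddings manually and pass them instead of input_ids; before this PR,
# some PeftModelForXxx classes failed when input_ids was None.
inputs_embeds = model.get_input_embeddings()(input_ids)
outputs = model(inputs_embeds=inputs_embeds)
print(outputs.logits.shape)
```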
The documentation is not available anymore as the PR was closed or merged.
Looking great, thanks a lot!
Regarding gptbigcode + prompt learning: I think that because of the attention mechanism used by that model (MQA), it is not possible to support prompt learning methods, as they involve concatenating the past key values in a specific way that is currently not supported in PEFT. I'm not sure whether the fix should go in PEFT or in transformers, but for now I would advocate skipping it unless there are a lot of requests for that model + prompt learning.
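A rough, hedged illustration of the mismatch described above (the tensor shapes below are assumptions chosen for illustration, not taken from PEFT or transformers internals):

```python
import torch

batch, n_heads, n_virtual, head_dim = 2, 12, 8, 64

# With a conventional multi-head cache, each layer stores separate key/value tensors
# of shape (batch, num_heads, seq_len, head_dim), so learned prefix key/values can
# simply be concatenated along the sequence axis.
prefix_key = torch.zeros(batch, n_heads, n_virtual, head_dim)
past_key = torch.zeros(batch, n_heads, 5, head_dim)
merged_key = torch.cat([prefix_key, past_key], dim=2)  # (2, 12, 13, 64)

# A multi-query-attention cache (as in GPTBigCode) may instead hold a single fused
# key/value tensor per layer, e.g. (batch, seq_len, 2 * head_dim), so the same
# concatenation logic no longer lines up without model-specific handling.
mqa_cache = torch.zeros(batch, 5, 2 * head_dim)
print(merged_key.shape, mqa_cache.shape)
```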
I see, thanks for explaining. This also means that I can re-use the …
Thank you @BenjaminBossan for fixing the issues, LGTM 🚀. Left a comment.
src/peft/peft_model.py (outdated)

```diff
@@ -75,6 +75,22 @@
 }

+def _get_batch_size(input_ids: Optional[torch.Tensor], inputs_embeds: Optional[torch.Tensor]) -> int:
```
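For context, a minimal sketch of what such a helper could look like — an assumption based only on the signature shown in the diff, not the PR's exact implementation:

```python
from typing import Optional

import torch


def _get_batch_size(input_ids: Optional[torch.Tensor], inputs_embeds: Optional[torch.Tensor]) -> int:
    """Derive the batch size from input_ids if given, otherwise from inputs_embeds."""
    if (input_ids is None) and (inputs_embeds is None):
        raise ValueError("You have to provide either input_ids or inputs_embeds")

    if input_ids is not None:
        return input_ids.shape[0]
    return inputs_embeds.shape[0]
```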
can we move it to utils?
Done.
I also found an error in the new test where I forgot to send the model to the torch device, which is also fixed now.
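A minimal, self-contained sketch of the kind of fix described (the model name, config, and device handling are illustrative assumptions, not the actual test code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

torch_device = "cuda" if torch.cuda.is_available() else "cpu"

model = get_peft_model(AutoModelForCausalLM.from_pretrained("gpt2"), LoraConfig(task_type="CAUSAL_LM"))
model = model.to(torch_device)  # the step that was missing: move the model to the torch device

tokenizer = AutoTokenizer.from_pretrained("gpt2")
input_ids = tokenizer("Hello world", return_tensors="pt").input_ids.to(torch_device)
inputs_embeds = model.get_input_embeddings()(input_ids)
output = model(inputs_embeds=inputs_embeds)  # model and inputs now live on the same device
```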
Forgot to send the model to the right torch device.
Thank you @BenjaminBossan! 🚀