
Allow passing inputs_embeds instead of input_ids #757

Merged

Conversation

BenjaminBossan
Member

Resolves #727

Description

Right now, there is an issue with a few PeftModelForXxx classes when users pass only inputs_embeds but not input_ids. First, the batch size used to be derived from input_ids; now it is derived from inputs_embeds when input_ids is None. Furthermore, in PeftModelForCausalLM, the forward call to the base model was not passing inputs_embeds along, which resulted in errors down the line. These issues are now fixed.
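
As a rough illustration of the batch-size change, the helper below is a minimal sketch reconstructed from this description and the `_get_batch_size` signature shown in the review diff further down; the exact body and error message in the PR may differ:

```python
from typing import Optional

import torch


def _get_batch_size(input_ids: Optional[torch.Tensor], inputs_embeds: Optional[torch.Tensor]) -> int:
    # Prefer input_ids when available, otherwise fall back to inputs_embeds.
    if (input_ids is None) and (inputs_embeds is None):
        raise ValueError("You have to provide either input_ids or inputs_embeds")

    if input_ids is not None:
        return input_ids.shape[0]
    return inputs_embeds.shape[0]
```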

Open issues

During testing, I ran into some problems.

  1. For decoder models, the test would fail for GPTBigCodeForCausalLM. I'm not sure why, but that model is already excluded in other tests, so I just excluded it for this test too.

  2. For feature extraction tests, the test would fail for prefix tuning. Not sure exactly why it only fails here. I thought I only needed to pass the inputs_embeds to the self.base_model call for prefix tuning, but then I got the error below, which is why I just excluded those tests for the time being.

     > TypeError: DebertaV2Model.forward() got an unexpected keyword argument 'past_key_values'

  3. Similarly, I wanted to add the tests for encoder-decoder models, but there I got the error:

     > ValueError: If no decoder_input_ids or decoder_inputs_embeds are passed, input_ids cannot be None. Please pass either input_ids or decoder_input_ids or decoder_inputs_embeds.

So for now, the tests are not run for encoder-decoder models.

It would be great if someone else with more expertise could ensure that those cases work correctly.
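
For context, here is a minimal sketch of the usage this PR is meant to enable, i.e. calling a PEFT causal LM with only inputs_embeds and no input_ids. The base model, prompt text, and config values are placeholders chosen for illustration, not taken from the PR or its tests:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

config = PromptTuningConfig(task_type="CAUSAL_LM", num_virtual_tokens=10)
peft_model = get_peft_model(base_model, config)

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
# Compute the embeddings up front and pass them instead of input_ids.
inputs_embeds = peft_model.get_input_embeddings()(input_ids)

with torch.no_grad():
    outputs = peft_model(inputs_embeds=inputs_embeds)  # input_ids stays None
```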

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 27, 2023

The documentation is not available anymore as the PR was closed or merged.

Contributor

@younesbelkada younesbelkada left a comment


Looking great, thanks a lot!
Regarding gptbigcode + prompt learning: I think that because of the current attention mechanism used by that model (MQA), it is not possible to support prompt learning methods, as it involves concatenating the past key values in a specific way that is currently not supported in PEFT. Not sure if the fix should go in PEFT or in transformers, but for now I would advocate skipping it unless there are a lot of requests for that model + prompt learning.

@BenjaminBossan
Member Author

> because of the current attention mechanism used by that model (MQA), it is not possible to support prompt learning methods

I see, thanks for explaining. This also means that I can re-use the skip_non_pt_mqa function instead of implementing a new one; I made that change.

Contributor

@pacman100 pacman100 left a comment


Thank you @BenjaminBossan for fixing the issues, LGTM 🚀. Left a comment.

@@ -75,6 +75,22 @@
}


def _get_batch_size(input_ids: Optional[torch.Tensor], inputs_embeds: Optional[torch.Tensor]) -> int:
Contributor


can we move it to utils?

Member Author


Done.

I also found an error in the new test where I forgot to move the model to the torch device, which is also fixed now.

Contributor

@pacman100 pacman100 left a comment


Thank you @BenjaminBossan! 🚀

@BenjaminBossan BenjaminBossan merged commit ec267c6 into huggingface:main Aug 2, 2023
@BenjaminBossan BenjaminBossan deleted the fix-727-passing-input-embeds branch August 2, 2023 14:59