VeRA not working with Quantization #1921

Closed
2 of 4 tasks
Sharan1712 opened this issue Jul 10, 2024 · 6 comments

Comments

Sharan1712 commented Jul 10, 2024

System Info

Latest stable versions of PEFT, Accelerate and Transformers.

I am working on combining quantization with PEFT techniques and plan to use bitsandbytes 4-bit NF4 double quantization with the new PEFT method VeRA. I set target_modules to "all-linear", just as I do with LoRA, but I keep getting an error saying that the expected dimensions do not match. Here is the traceback:

File "/work/sshyamsu/quantization_peft/common/model_utils.py", line 229, in get_accelerate_model model = get_peft_model(model, peft_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/mapping.py", line 145, in get_peft_model return PeftModel(model, peft_config, adapter_name=adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/peft_model.py", line 138, in __init__ self.base_model = cls(model, {adapter_name: peft_config}, adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 102, in __init__ super().__init__(model, config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 165, in __init__ self._pre_injection_hook(self.model, self.peft_config[adapter_name], adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 162, in _pre_injection_hook self._init_vera_A_vera_B(config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 148, in _init_vera_A_vera_B first_linear_out_dim, first_linear_in_dim = self._find_first_dim(config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 136, in _find_first_dim raise ValueError( ValueError: Multiple target layers with different dimensions were specified. VeRA only supports a single dimension size. Expected shape (8388608, 1), got (22544384, 1).

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Instruction tuning Dataset - Unnatural Instructions Core

Reproduction

Quantize the model with bitsandbytes and call get_peft_model with a VeRA config; a sketch is shown below.
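A minimal sketch of this kind of setup (the model checkpoint and VeRA hyperparameters here are illustrative placeholders, not the reporter's exact script); with a 4-bit quantized base model, the get_peft_model call fails with the ValueError shown above:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import VeraConfig, get_peft_model
import torch

model_id = "facebook/opt-125m"  # placeholder checkpoint

# BnB 4-bit NF4 with double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# VeRA config targeting all linear layers, analogous to a LoRA setup
peft_config = VeraConfig(r=256, target_modules="all-linear")
model = get_peft_model(model, peft_config)  # raises the ValueError from the traceback above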

Expected behavior

Should return a PEFT model with VeRA modules added to it

@Sharan1712
Author

@BenjaminBossan @sayakpaul

@BenjaminBossan
Member

When layers are quantized by bitsandbytes, their weights are packed into flat tensors. This is why you see shapes like (8388608, 1). Here is an example:

from transformers import AutoModelForCausalLM

# model_id is a placeholder for any causal LM checkpoint; the shapes below come from an OPT-style model
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
model.model.decoder.layers[0].self_attn.k_proj.weight.shape
# prints torch.Size([294912, 1])

The true shape of this weight is 768x768. If you check the quant state, you can see it:

model.model.decoder.layers[0].self_attn.k_proj.weight.quant_state.shape
# prints torch.Size([768, 768])

Note that 768*768 = 589824, which is twice the 294912 we saw above. This is because of the way that bitsandbytes packs 4-bit data: two 4-bit values are stored in each byte.
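A small sketch (assuming bitsandbytes is installed; logical_shape is just an illustrative helper, not a PEFT API) of how the unpacked shape can be read from the quant state instead of the packed tensor:

import bitsandbytes as bnb

def logical_shape(module):
    # For a bnb 4-bit layer, module.weight is a flat packed tensor;
    # the original (out_features, in_features) shape lives on its quant state.
    weight = module.weight
    if isinstance(weight, bnb.nn.Params4bit):
        return tuple(weight.quant_state.shape)
    return tuple(weight.shape)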

@Sharan1712
Author

Okay, thank you. Will this PEFT method be able to handle quantization in the future? Usually, whenever a new PEFT method comes out, I think people are very interested to see how it combines with quantization methods.

@BenjaminBossan
Member

Adding quantization support to VeRA would be quite nice. If we find that VeRA is well adopted and there is demand for quantization support, we can put some time into it. If the community contributes this feature, that would be even better ;) It sounded to me like you're actually working on this feature; did I understand that right? If you want, you could create a draft PR, even if it doesn't work yet, and I can take a look.

@Sharan1712
Author

Thanks, Benjamin. That sounds interesting; maybe I can look into it in the future. Unfortunately, right now I have a deadline for my research on Quantization + PEFT.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
