VeRA not working with Quantization #1921
Comments
When layers are quantized by bitsandbytes, their weights are packed into flat tensors. This is why you see shapes like:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
model.model.decoder.layers[0].self_attn.k_proj.weight.shape
# prints torch.Size([294912, 1])
```

The true shape of this weight is 768x768. If you check the quant state, you can see it:

```python
model.model.decoder.layers[0].self_attn.k_proj.weight.quant_state.shape
# prints torch.Size([768, 768])
```

Note that if you calculate 768*768, you get 589824, which is twice the 294912 we saw above. This is because of the way that bitsandbytes packs 4-bit data: two 4-bit values fit into each byte.
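For illustration, here is a rough sketch of how the original 2-D weight can be recovered from the packed tensor, using bitsandbytes' `dequantize_4bit` helper (the module path assumes the same OPT-style model as above; untested):

```python
import bitsandbytes.functional as bnbf

w = model.model.decoder.layers[0].self_attn.k_proj.weight
# dequantize the packed 4-bit tensor back to its original 768x768 shape
w_dequant = bnbf.dequantize_4bit(w.data, quant_state=w.quant_state)
print(w_dequant.shape)  # torch.Size([768, 768])
```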
Okay, thank you. Will the PEFT method be able to handle this in the future? Usually whenever a new PEFT method comes out, people are very interested to see its combination with quantization methods.
Adding quantization support to VeRA would be quite nice. If we find that VeRA is well adopted and there is demand for quantization support, we can put some time into it. If the community contributes this feature, that would be even better ;) It sounded to me like you're actually working on this feature, did I understand that right? If you want, you could create a draft PR, even if it does not work yet, and I can take a look.
Thanks Benjamin. That sounds interesting, maybe I can think about it in the future. Unfortunately, right now I have a deadline for my research on Quantization + PEFT.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
Latest stable versions of PEFT, Accelerate and Transformers.
I am working on Quantization + PEFT techniques. I plan to combine BnB 4-bit NF4 double quantization with the new PEFT method VeRA. I set target_modules to "all-linear", just as I do with LoRA, but I keep getting an error saying the expected dimensions do not match. Here is the error:
File "/work/sshyamsu/quantization_peft/common/model_utils.py", line 229, in get_accelerate_model model = get_peft_model(model, peft_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/mapping.py", line 145, in get_peft_model return PeftModel(model, peft_config, adapter_name=adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/peft_model.py", line 138, in __init__ self.base_model = cls(model, {adapter_name: peft_config}, adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 102, in __init__ super().__init__(model, config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 165, in __init__ self._pre_injection_hook(self.model, self.peft_config[adapter_name], adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 162, in _pre_injection_hook self._init_vera_A_vera_B(config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 148, in _init_vera_A_vera_B first_linear_out_dim, first_linear_in_dim = self._find_first_dim(config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 136, in _find_first_dim raise ValueError( ValueError: Multiple target layers with different dimensions were specified. VeRA only supports a single dimension size. Expected shape (8388608, 1), got (22544384, 1).
Who can help?
No response
Information
Tasks
An officially supported task in the examples folder
Instruction tuning Dataset - Unnatural Instructions Core
Reproduction
Just quantize the model using bnb and call get_peft_model with a VeRA config, as sketched below.
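For reference, a minimal reproduction sketch (the model name and quantization settings are assumptions matching the description above):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import VeraConfig, get_peft_model

# 4-bit NF4 double quantization, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", quantization_config=bnb_config
)

# target all linear layers, as with LoRA
vera_config = VeraConfig(target_modules="all-linear")
model = get_peft_model(model, vera_config)  # raises the ValueError shown above
```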
Expected behavior
Should return a PEFT model with VeRA modules added to it