VeRA not working with Quantization #1921

Closed
2 of 4 tasks
Sharan1712 opened this issue Jul 10, 2024 · 6 comments

Comments

Sharan1712 commented Jul 10, 2024

System Info

Latest stable versions of PEFT, Accelerate and Transformers.

I am working on combining quantization with PEFT techniques and plan to use bitsandbytes 4-bit NF4 double quantization with the new PEFT method VeRA. I set target_modules to "all-linear", just as I do with LoRA, but I keep getting an error saying that the expected dimensions do not match. Here is the traceback:

File "/work/sshyamsu/quantization_peft/common/model_utils.py", line 229, in get_accelerate_model model = get_peft_model(model, peft_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/mapping.py", line 145, in get_peft_model return PeftModel(model, peft_config, adapter_name=adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/peft_model.py", line 138, in __init__ self.base_model = cls(model, {adapter_name: peft_config}, adapter_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 102, in __init__ super().__init__(model, config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 165, in __init__ self._pre_injection_hook(self.model, self.peft_config[adapter_name], adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 162, in _pre_injection_hook self._init_vera_A_vera_B(config, adapter_name) File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 148, in _init_vera_A_vera_B first_linear_out_dim, first_linear_in_dim = self._find_first_dim(config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sshyamsu/miniconda3/envs/thesis311/lib/python3.11/site-packages/peft/tuners/vera/model.py", line 136, in _find_first_dim raise ValueError( ValueError: Multiple target layers with different dimensions were specified. VeRA only supports a single dimension size. Expected shape (8388608, 1), got (22544384, 1).

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Instruction tuning Dataset - Unnatural Instructions Core

Reproduction

Quantize the model with bitsandbytes and call get_peft_model with a VeRA config; a sketch is shown below.
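A minimal sketch of this kind of setup (the model checkpoint and VeRA hyperparameters here are illustrative placeholders, not the reporter's exact script); with a 4-bit quantized base model, the get_peft_model call fails with the ValueError shown above:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import VeraConfig, get_peft_model
import torch

model_id = "facebook/opt-125m"  # placeholder checkpoint

# BnB 4-bit NF4 with double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# VeRA config targeting all linear layers, analogous to a LoRA setup
peft_config = VeraConfig(r=256, target_modules="all-linear")
model = get_peft_model(model, peft_config)  # raises the ValueError from the traceback above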

Expected behavior

Should return a PEFT model with VeRA modules added to it

@Sharan1712
Author

@BenjaminBossan @sayakpaul

@BenjaminBossan
Member

When layers are quantized by bitsandbytes, their weights are packed into flat tensors. This is why you see shapes like (8388608, 1). Here is an example:

from transformers import AutoModelForCausalLM

# model_id is a placeholder for any causal LM checkpoint; the shapes below come from an OPT-style model
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
model.model.decoder.layers[0].self_attn.k_proj.weight.shape
# prints torch.Size([294912, 1])

The true shape of this weight is 768x768. If you check the quant state, you can see it:

model.model.decoder.layers[0].self_attn.k_proj.weight.quant_state.shape
# prints torch.Size([768, 768])

Note that 768*768 = 589824, which is twice the 294912 we saw above. This is because of the way that bitsandbytes packs 4-bit data: two 4-bit values are stored in each byte.
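A small sketch (assuming bitsandbytes is installed; logical_shape is just an illustrative helper, not a PEFT API) of how the unpacked shape can be read from the quant state instead of the packed tensor:

import bitsandbytes as bnb

def logical_shape(module):
    # For a bnb 4-bit layer, module.weight is a flat packed tensor;
    # the original (out_features, in_features) shape lives on its quant state.
    weight = module.weight
    if isinstance(weight, bnb.nn.Params4bit):
        return tuple(weight.quant_state.shape)
    return tuple(weight.shape)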

@Sharan1712
Author

Okay, thank you. Will this PEFT method be able to handle quantization in the future? Usually, whenever a new PEFT method comes out, I think people are very interested to see how it combines with quantization methods.

@BenjaminBossan
Member

Adding quantization support to VeRA would be quite nice. If we find that VeRA is well adopted and there is demand for quantization support, we can put some time into it. If the community contributes this feature, that would be even better ;) It sounded to me like you're actually working on this feature; did I understand that right? If you want, you could create a draft PR, even if it doesn't work yet, and I can take a look.

@Sharan1712
Author

Thanks, Benjamin. That sounds interesting; maybe I can look into it in the future. Unfortunately, right now I have a deadline for my research on Quantization + PEFT.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
