
How to train with a 2-bit quantization model? #10

Open
duany049 opened this issue Dec 15, 2023 · 6 comments

Comments

@duany049

I found the implementation of 4-bit quantization, but I couldn't find a 2-bit one. Can you tell me how to fine-tune a 2-bit quantization model?

@yxli2123
Owner

Hi @duany049, we have moved our quantization framework into PEFT.

You can use the command here to obtain 2-bit weights: https://github.com/yxli2123/LoftQ/tree/main#apply-loftq-and-save. Just change --bits to 2.

Keep in mind that we only provide 2-bit-equivalent fp16 weights, because a 2-bit backend is not supported by bitsandbytes. If you have limited resources, we suggest loading the 2-bit-equivalent fp16 weights in 4-bit with bitsandbytes, which saves 75% of GPU memory compared to fp16.
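
For example, loading that checkpoint in NF4 with bitsandbytes looks roughly like this; it is only a sketch, and the checkpoint path is a placeholder for whatever you saved with the command above:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Placeholder path to the 2-bit-equivalent fp16 checkpoint produced by the command above
    model_path = "path/to/loftq-2bit-equivalent-fp16"

    # bitsandbytes has no 2-bit backend, so the weights are stored as NF4 (~4 bits/weight),
    # which is where the ~75% memory saving over fp16 comes from
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=bnb_config,
        device_map="auto",
    )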

@duany049
Author

Thanks for your reply.
I changed --bits from 4 to 2 as you suggested, but the following exception was thrown:

  File "/data2/duan/miniconda3/envs/loftq/lib/python3.11/site-packages/peft/utils/loftq_utils.py", line 215, in loftq_init
    quantized_weight, max_abs, shape = quantizer.quantize_block(res)
                                       ^^^^^^^^^
UnboundLocalError: cannot access local variable 'quantizer' where it is not associated with a value
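
If it helps to reproduce, I believe the same 2-bit path can also be triggered directly through PEFT's LoftQ initialization, roughly like this (the model name and LoRA settings are just placeholders):

    from transformers import AutoModelForCausalLM
    from peft import LoftQConfig, LoraConfig, get_peft_model

    # Load the base model in full precision; LoftQ quantizes the weights itself during init
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", torch_dtype="auto")

    loftq_config = LoftQConfig(loftq_bits=2, loftq_iter=1)  # the 2-bit setting that raises the error
    lora_config = LoraConfig(
        init_lora_weights="loftq",
        loftq_config=loftq_config,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    model = get_peft_model(base, lora_config)  # calls loftq_init under the hood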

I fixed the problem by adding a new condition, num_bits == 2, at line 201; below is the code:

    if not is_bnb_4bit_available() or num_bits == 2:
        quantizer = NFQuantizer(num_bits=num_bits, device=device, method="normal", block_size=64)

Is my modification correct? Do I need to submit the code?

@duany049
Author

I have fine-tuned a 2-bit llama2-7b with fake quantization. Could I merge the adapter and the 2-bit model into a 2-bit merged model?
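
Concretely, the merge I have in mind is roughly the following (paths are placeholders; since the backbone is only fake-quantized, I expect the merged result to still be fp16 weights):

    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Placeholder paths for the fake-quantized (fp16) backbone and the trained LoRA adapter
    base = AutoModelForCausalLM.from_pretrained("path/to/2bit-equivalent-fp16-backbone")
    model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

    # merge_and_unload folds the LoRA weights into the backbone and returns a plain model;
    # the merged weights generally no longer lie exactly on the 2-bit grid
    merged = model.merge_and_unload()
    merged.save_pretrained("path/to/merged-model")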

@yxli2123
Owner

Hi @duany049, please install the up-to-date peft with pip install git+https://github.com/huggingface/peft.git. This issue has been resolved in the latest version: https://github.com/huggingface/peft/blob/main/src/peft/utils/loftq_utils.py#L201

@duany049
Author

duany049 commented Dec 25, 2023

Thank you for your reply. I have two more questions:

  1. Could I load the 2-bit-equivalent fp16 weights in 2-bit with AutoGPTQ?
  2. Would that save 87.5% of GPU memory compared to fp16?

@yxli2123
Owner

yxli2123 commented Jan 7, 2024

  1. No, because I don't think AutoGPTQ and NF2 (a variant of NF4) use the same quantization function.
  2. No, since it uses NF4 on the GPU. It can only save up to 75% of GPU memory compared to fp16, even though the stored values are mathematically equivalent to 2-bit values.
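
As a rough back-of-envelope for a 7B model (illustrative numbers only, ignoring quantization constants and activation memory):

    # Approximate weight-memory footprint of a 7B-parameter model
    params = 7e9
    fp16_gb = params * 2 / 1e9        # 16 bits/weight -> ~14 GB
    nf4_gb = params * 0.5 / 1e9       # 4 bits/weight  -> ~3.5 GB, i.e. 75% less than fp16
    two_bit_gb = params * 0.25 / 1e9  # 2 bits/weight  -> ~1.75 GB, i.e. 87.5% less, but needs a real 2-bit backend
    print(fp16_gb, nf4_gb, two_bit_gb)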
