AdaLora + bnb not working #1113

Closed
BenjaminBossan opened this issue Nov 10, 2023 · 0 comments · Fixed by #1146
BenjaminBossan commented Nov 10, 2023

System Info

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

The issue is this line:

compute_dtype = lora_A.weight.dtype

In AdaLoRA, lora_A and lora_B are ParameterDicts rather than ModuleDicts, so lora_A[adapter_name].weight.dtype does not exist; it should simply be lora_A[adapter_name].dtype (see the sketch below).
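
For context, a minimal sketch of the difference (adapter name and shapes are made up): regular LoRA layers hold nn.Linear modules in a ModuleDict, while AdaLoRA holds plain nn.Parameter tensors in a ParameterDict.

```python
import torch
import torch.nn as nn

# Regular LoRA: adapters live in a ModuleDict of nn.Linear modules,
# so the indexed entry has a .weight attribute.
lora_A_lora = nn.ModuleDict({"default": nn.Linear(16, 8, bias=False)})
print(lora_A_lora["default"].weight.dtype)  # torch.float32

# AdaLoRA: adapters live in a ParameterDict of nn.Parameter tensors,
# so the indexed entry *is* the tensor and has no .weight attribute.
lora_A_adalora = nn.ParameterDict({"default": nn.Parameter(torch.zeros(8, 16))})
print(lora_A_adalora["default"].dtype)      # torch.float32
# lora_A_adalora["default"].weight.dtype    # would raise AttributeError
```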

Furthermore, using AdaLoRA with 8-bit bnb gives NaNs for me with opt-125m; a rough reproduction sketch follows.
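
The sketch below assumes a CUDA GPU with bitsandbytes installed; the model, rank settings, and prompt are illustrative, not the exact script that produced the NaNs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import AdaLoraConfig, get_peft_model

model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the base model quantized to 8-bit via bitsandbytes
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)

# AdaLoRA config with illustrative rank settings
config = AdaLoraConfig(task_type="CAUSAL_LM", init_r=12, target_r=8, total_step=100)
model = get_peft_model(model, config)

inputs = tokenizer("Hello world", return_tensors="pt").to(model.device)
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)  # came out as NaN in the failing 8-bit setup
```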

Expected behavior

AdaLoRA + bnb should work.

BenjaminBossan changed the title from "AdaLora + 4bit bnb not working" to "AdaLora + bnb not working" on Nov 10, 2023
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Nov 17, 2023
This PR fixes a handful of issues with AdaLora and should resolve huggingface#1113.

Description

1. lora_A.weight.device was called, but for AdaLora, lora_A is an
   nn.Parameter, not an nn.Module, so the weight attribute does not
   exist. lora_A.device is sufficient.
2. For 8bit, an inplace operation failed because it was on a view. Now
   the operation is no longer inplace.
3. The loss term of the model output is not necessarily a torch tensor.
   In the test, it was a dict and did not contain an actual loss.
   Therefore, I added a check to make sure the loss is a torch tensor.
   Is there a better way?

Notes

Running pytest tests/test_gpu_examples.py -k adalora locally (with GPU)
passes. Ideally, someone else can confirm, as normal unit tests won't
catch this.

If this is merged before huggingface#1115, skipping AdaLora tests in that PR can be
removed.
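
For readers skimming the thread, a rough sketch of the shape of these fixes (names, shapes, and control flow are illustrative, not the actual diff in the PR):

```python
import torch
import torch.nn as nn

# 1. AdaLora keeps lora_A / lora_B as nn.Parameter entries, so device and dtype
#    are read from the parameter itself, not from a .weight attribute.
lora_A = nn.ParameterDict({"default": nn.Parameter(torch.zeros(8, 16))})
compute_dtype = lora_A["default"].dtype   # not lora_A["default"].weight.dtype
device = lora_A["default"].device         # not lora_A["default"].weight.device

# 2. In-place ops fail on a view of a leaf tensor that requires grad, so the
#    operation is made out-of-place.
weight = torch.randn(4, 16, requires_grad=True)
view = weight.t()                         # a view of a leaf that requires grad
scaled = view * 0.5                       # out-of-place; view.mul_(0.5) would raise

# 3. The output's loss is not guaranteed to be a tensor (it was a dict in the
#    failing test), so it is only used after an isinstance check.
loss = {"not": "a tensor"}                # stand-in for a non-tensor loss
if isinstance(loss, torch.Tensor):
    print(loss.item())
```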
BenjaminBossan added a commit that referenced this issue Nov 17, 2023
(same commit description as above)