AdaLora + bnb not working #1113
BenjaminBossan changed the title from "AdaLora + 4bit bnb not working" to "AdaLora + bnb not working" on Nov 10, 2023.
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue on Nov 17, 2023:
This PR fixes a handful of issues with AdaLora and should resolve huggingface#1113.

Description

1. `lora_A.weight.device` was called, but for AdaLora `lora_A` is an `nn.Parameter`, not an `nn.Module`, so the `weight` attribute does not exist; `lora_A.device` is sufficient.
2. For 8bit, an in-place operation failed because it was performed on a view. The operation is no longer in-place.
3. The loss term of the model output is not necessarily a torch tensor. In the test, it was a dict and did not contain an actual loss, so a check was added to make sure the loss is a torch tensor. Is there a better way?

Notes

Running `pytest tests/test_gpu_examples.py -k adalora` locally (with GPU) passes. Ideally, someone else can confirm, as normal unit tests won't catch this. If this is merged before huggingface#1115, the skipping of AdaLora tests in that PR can be removed.
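A minimal sketch of what fixes 2 and 3 amount to in plain PyTorch (illustrative only, not the actual PEFT source):

```python
import torch

# Fix 2: an in-place op on a view of a leaf tensor that requires grad
# raises a RuntimeError; the out-of-place variant avoids this.
w = torch.randn(4, 4, requires_grad=True)
v = w[0]
# v.mul_(0.5)  # RuntimeError: a view of a leaf Variable that requires
#              # grad is being used in an in-place operation
v = v * 0.5    # out-of-place version works

# Fix 3: only use the loss when the model output actually carries a
# loss tensor (a dict output without a real loss should be skipped).
outputs = {"loss": None}  # stand-in for an output lacking a real loss
loss = outputs["loss"]
if isinstance(loss, torch.Tensor):
    loss.backward()  # only reached when loss really is a tensor
```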
BenjaminBossan added a commit that referenced this issue on Nov 17, 2023, with the same description as above.
System Info
Who can help?
No response
Information
Tasks
- An officially supported task in the examples folder

Reproduction
The issue is this line:
peft/src/peft/tuners/adalora/bnb.py, line 144 (at commit 49ddefa)
In AdaLoRA, `lora_A` and `lora_B` are not `ModuleDict`s but `ParameterDict`s, so `lora_A[adapter_name].weight.dtype` does not exist; it should just be `lora_A[adapter_name].dtype`.

Furthermore, using AdaLoRA with 8bit bnb gives me NaNs for opt-125m.
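A minimal sketch of the `ModuleDict` vs. `ParameterDict` difference (illustrative, not the actual `bnb.py` code):

```python
import torch
import torch.nn as nn

adapter_name = "default"

# LoRA keeps adapters as nn.Linear modules inside an nn.ModuleDict,
# so each entry has a .weight attribute:
lora_modules = nn.ModuleDict({adapter_name: nn.Linear(16, 8, bias=False)})
print(lora_modules[adapter_name].weight.dtype)  # works for LoRA

# AdaLoRA keeps adapters as raw nn.Parameters inside an nn.ParameterDict,
# so there is no .weight; .dtype must be read off the parameter directly:
lora_A = nn.ParameterDict({adapter_name: nn.Parameter(torch.zeros(8, 16))})
# lora_A[adapter_name].weight.dtype  # AttributeError: 'Parameter' object
#                                    # has no attribute 'weight'
print(lora_A[adapter_name].dtype)    # the correct access
```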
Expected behavior
AdaLoRA + bnb should work.
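For instance, a forward pass along these lines should succeed (a hedged sketch; the config values are illustrative, not from the original report):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import AdaLoraConfig, get_peft_model

# Load a small base model in 8-bit via bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
model = get_peft_model(model, AdaLoraConfig(task_type="CAUSAL_LM"))

# The forward pass goes through the AdaLora bnb layers, which is where
# the lora_A[adapter_name].weight.dtype access currently fails.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
inputs = tokenizer("Hello world", return_tensors="pt").to(model.device)
outputs = model(**inputs)
```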