support multiple ranks and alphas for LoRA #873
Conversation
The documentation is not available anymore as the PR was closed or merged.
Co-Authored-By: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
LGTM, thanks for adding this so quickly.
There are a few things I'd like to see in relation to this PR, but given that it helps with the diffusers PR, I'm fine with adding those later to speed things up:
- unit tests
- an example in the docs or examples folder
- similar config options for other tuners where it makes sense, like AdaLoRA
- maybe: regex matching
Hi, @pacman100! This feature is a super useful addition! I am currently working on proper support of LoRA-C3Lier for Stable Diffusion. It's a special type of LoRA in which the Conv2d 3x3 layers may have different ranks and alphas, so this code will be super useful for me to finish implementing this type of LoRA. Do you by chance know the ETA for this feature getting into the main branch?
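For a rough sense of how this PR's options could express that kind of per-layer setup, here is an illustrative sketch only; the module names and rank/alpha values are assumptions for illustration, not the actual LoRA-C3Lier mapping:

```python
# Illustrative sketch: hypothetical 3x3 conv module names get their own
# rank/alpha via rank_pattern/alpha_pattern, while other targets keep defaults.
from peft import LoraConfig

config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["to_q", "to_k", "to_v", "conv1", "conv2"],
    rank_pattern={"conv1": 4, "conv2": 4},   # assumed names of the Conv2d 3x3 layers
    alpha_pattern={"conv1": 2, "conv2": 2},
)
```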
@pacman100, @BenjaminBossan sorry to disturb you. I've been trying out the code from this branch, and I've faced an issue related to the usage of LoRAs with a provided rank_pattern/alpha_pattern. In short - if you need to create 2 adapters for a model - the first one without a provided rank_pattern/alpha_pattern and the second one with them - the custom ranks and alphas of the second adapter are not applied. Here is a demo notebook which demonstrates how to reproduce this issue. As far as I can see, there is a possible problem in these specific lines: peft/src/peft/tuners/lora/model.py, lines 164 to 166 (commit 95fa3c1).
When the first adapter is initialized, the name of the module is passed as usual (e.g. "path.to.module"), but when the second adapter is added, the name of the module is nested inside the peft prefix (e.g. "base_model.model.path.to.module"), which is why the custom alphas and ranks are not propagated properly.
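To make the failure mode concrete, here is a minimal, self-contained sketch; the helper names and matching rules are assumptions for illustration, not the actual PEFT implementation in src/peft/tuners/lora/model.py:

```python
# Minimal sketch of the lookup problem described above; resolve_rank_* and the
# matching strategies are hypothetical, not PEFT's real code.
import re

rank_pattern = {"path.to.module": 16}  # custom rank requested by the user
default_rank = 8

def resolve_rank_exact(module_name: str) -> int:
    # Lookup that only accepts a full-regex match of the pattern key.
    for key, rank in rank_pattern.items():
        if re.fullmatch(key, module_name):
            return rank
    return default_rank

def resolve_rank_suffix(module_name: str) -> int:
    # Lookup that also accepts the key as a dot-delimited suffix, so a
    # "base_model.model." prefix does not break the match.
    for key, rank in rank_pattern.items():
        if re.fullmatch(key, module_name) or module_name.endswith("." + key):
            return rank
    return default_rank

print(resolve_rank_exact("path.to.module"))                    # 16 (first adapter)
print(resolve_rank_exact("base_model.model.path.to.module"))   # 8  (custom rank silently dropped)
print(resolve_rank_suffix("base_model.model.path.to.module"))  # 16 (prefix tolerated)
```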
@pacman100 @BenjaminBossan I've addressed this issue and worked on 3 out of 4 comments @BenjaminBossan left (all except AdaLoRA). Please let me know if there is a way for me to somehow add my modifications to this PR.
Hello @kovalexal, Thank you for actively working on this! Please raise a PR to merge your changes into this PR; we will be happy to review it quickly and incorporate the changes.
Next week is the plan |
…ort of LoRA-C3Lier conversion (#937)
* Fixed multirank multialpha for sequential loras, added tests, fixed docs
* Refactored kohya_ss conversion script for proper support of LoRA-C3Lier
* Fixed styling
* Removed old comment from docstring
…ll the child classes
Co-Authored-By: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Thanks a lot Sourab! I left a few minor comments as a first pass! Let me know what you think.
Great work, this is a huge addition to PEFT.
I left a couple of comments; please take a look and see if they make sense. I have mostly reviewed the LoRA code closely, as I think IA³ and AdaLoRA just follow the same logic, so some of my comments apply there as well.
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Overall, this LGTM, thanks for addressing the issues. I think there is one bug, however, please check my comment.
It is important that we also add tests for multiple active adapters, but if this is time critical, I'm okay with adding them later. Docs/examples can also be added later.
LGTM, thanks for addressing the comments.
Let's not forget to add tests + docs + examples later.
I think using multiple adapters with dropout could be problematic, as more and more values will be set to 0. But I guess that's on the user to address, not sure if there is anything that we could do about it.
I believe the support for multiple active adapters should be used only during inference in eval mode. In that case, dropout won't be an issue.
I see. But in theory, it would also work during training, right? And there isn't anything in the code right now that would prevent that.
Great work @pacman100 , as always! Thanks!
Are there any clear examples for this PR?
Could anyone provide an example here? Thanks!
Sorry that we haven't added any documentation yet. The idea is that you can do something like this:

```python
config = LoraConfig(
    r=8,
    rank_pattern={"foo": 16},
    lora_alpha=10,
    alpha_pattern={"bar": 20, "baz": 30},
)
```

In this case, all LoRA layers would use rank 8, except for layers that match "foo", which would have rank 16. Similarly, LoRA alpha would be 10 by default except for "bar", which would be 20, and "baz", which would be 30.
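As a rough end-to-end illustration of the config above (the model name and target module names below are assumptions; check your own model's named_modules() for the real targets):

```python
# Hedged sketch: apply a LoraConfig with rank_pattern/alpha_pattern and
# inspect the per-layer ranks. Model and module names are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = LoraConfig(
    r=8,
    lora_alpha=10,
    target_modules=["q_proj", "v_proj"],
    rank_pattern={"q_proj": 16},   # q_proj layers get rank 16, everything else rank 8
    alpha_pattern={"v_proj": 20},  # v_proj layers get alpha 20, everything else alpha 10
)
peft_model = get_peft_model(model, config)

# In recent PEFT versions each LoRA layer stores its per-adapter rank in a
# dict attribute `r`; printing it shows the pattern being applied.
for name, module in peft_model.named_modules():
    if hasattr(module, "r") and isinstance(module.r, dict):
        print(name, module.r)
```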
What does this PR do?
To Do: