ENH Argument to enable bias for LoRA B #2237

Merged: BenjaminBossan merged 4 commits into huggingface:main from BenjaminBossan:enh-lora-initialization-with-lora-b-bias on Nov 27, 2024.

Commits (4):
- 2ddbf88  ENH Argument to enable bias for LoRA B (BenjaminBossan)
- b397352  Merge branch 'main' into enh-lora-initialization-with-lora-b-bias (BenjaminBossan)
- 95e32e2  Fix failing Eva test (BenjaminBossan)
- 0b7590c  Reviewer feedback: Fix bug in merging code (BenjaminBossan)
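For context, a minimal sketch of how the new option is meant to be used: passing `lora_bias=True` to `LoraConfig` gives the LoRA B module a trainable bias term. The base model and target modules below are placeholders chosen for illustration, not taken from this PR:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; any transformers model supported by PEFT works here.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed module names for this model
    lora_bias=True,  # new argument from this PR: enables a bias on LoRA B
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```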
Conversation
Reviewer: Should this be `lora_B_bias`? I find that to be a bit more informative.
BenjaminBossan (author): Indeed, I considered this. My main reasoning for going with the more generic `lora_bias` was that it leaves the door open for extending this argument in the future. Say someone finds that LoRA works much better when also adding a bias to LoRA A; then we can adapt this argument to allow that too. Otherwise, we'd have to add a new argument (and we don't want to rename arguments for obvious reasons). LMK what you think of that reasoning.
Reviewer: I think that would still be preferable over having a single argument controlling the bias setup for LoRAs, as I think this is still in its infancy. Later, if it becomes a common standard to add biases for both LoRA matrices, we can deprecate `lora_B_bias` and `lora_A_bias` (if we introduce such an argument) in favor of a single argument called `lora_bias`. This is where I stand, but I am not too opinionated about it.
Reviewer: Do we care about reproducibility after upgrading PEFT? If so, it seems detrimental to possibly merge control of the A and B biases into one flag in the future, and they should be separated into two flags from the start.

Otherwise, in terms of opportunity cost for experimentation on the user's side, I think having two separate parameters (`lora_bias_A`, `lora_bias_B`) is better. That said, having only one parameter appears simpler: let the implementation decide what the current best approach to adding biases is. So if you are just someone who wants current LoRA best practice, it would be helpful to have only one flag. This becomes harder with two flags, since there is no obvious 'no bias at all' vs. 'best practice' setting. If we put simplicity first (and don't care about reproducibility after upgrading), then one parameter is the way to go, I think. What's the stance here?

Ideally there would be another, more low-level layer of abstraction that has two bias parameters, and one above it that decides what the best choice is at the moment, i.e. `BaseLoRA(..., lora_bias_A, lora_bias_B) -> LoRA(..., lora_bias)`.
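A rough sketch of the layering this reviewer is proposing; the class and parameter names (`BaseLoRA`, `LoRA`, `lora_bias_A`, `lora_bias_B`) are hypothetical and not part of PEFT's API:

```python
class BaseLoRA:
    """Low-level layer: exposes both bias switches explicitly."""

    def __init__(self, rank: int, lora_bias_A: bool = False, lora_bias_B: bool = False):
        self.rank = rank
        self.lora_bias_A = lora_bias_A
        self.lora_bias_B = lora_bias_B


class LoRA(BaseLoRA):
    """High-level layer: a single `lora_bias` flag mapped to whatever is
    considered best practice at the time (currently: bias on B only)."""

    def __init__(self, rank: int, lora_bias: bool = False):
        super().__init__(rank, lora_bias_A=False, lora_bias_B=lora_bias)
```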
BenjaminBossan (author): To clarify, my idea is that if we later want to add the possibility of a bias for LoRA A, the option would be something like `lora_bias="a"`, or for both, `lora_bias="both"`. We should not change the meaning of `lora_bias=True`, in order to ensure reproducibility, as you mentioned.

If we find that the parameter gets overloaded, we can add the option for a sub-config, e.g. `LoraConfig(..., lora_bias=LoraBiasConfig(bias_a=True, bias_b=True, ...))`.
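The sub-config mentioned here is only a fallback idea; `LoraBiasConfig` and its fields do not exist in PEFT, where `lora_bias` is a plain boolean. A minimal sketch of what it could look like:

```python
from dataclasses import dataclass


# Hypothetical sub-config illustrating the fallback idea above; not part of PEFT.
@dataclass
class LoraBiasConfig:
    bias_a: bool = False  # bias on LoRA A
    bias_b: bool = True   # bias on LoRA B (the current behavior of lora_bias=True)


# Hypothetical usage, as sketched in the comment above:
# LoraConfig(..., lora_bias=LoraBiasConfig(bias_a=True, bias_b=True))
```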
Reviewer: Seems like `lora_bias` should be fine for now.
BenjaminBossan (author): Thanks for the feedback, I merged the PR as is.