LoraConfig conflict when using layers_to_transform in LlamaModel #2155
Comments
Thanks for reporting the issue. Indeed, the usage of layers_to_transform and layers_pattern […]. The idea here is that if we have a config like

```python
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)
```

[…] However, as you noted, using […]. The TODOs from this issue are: […]
For point 3, would you be interested in tackling this, @JINO-ROHIT, since you refactored that part in #2102?
@BenjaminBossan yep, I'll be happy to work on this
Addresses part of huggingface#2155. Also fix type annotations where appropriate.
Addresses part of huggingface#2155.

Description

So far, the layers_pattern argument would only work if there was a prefix to the pattern. As an example, if the module name is decoder.layer.0.attn.to_q and we pass layers_pattern="layer", this would match. However, if the module name was layer.0.attn.to_q, it would not work. Usually, when we create a model with AutoModelForFoo.from_pretrained, the "layer" part would never be first. However, if we load a model directly, e.g. through LlamaModel.from_pretrained, there is actually no prefix. As a consequence, we get no match there. With this PR, the prefix is made optional, so that the second pattern also matches.

Status

I'm not sure yet if this should be merged, as it is technically backwards incompatible. Users can still target the desired modules by carefully crafting a regex for target_modules so that it only matches the desired layer indices. However, this is tedious, and layers_pattern was introduced to avoid having to do this.
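To illustrate the matching behaviour described above (a minimal sketch with a hypothetical pattern, not the exact regex used in PEFT): requiring a prefix only matches module names that have something before "layer", while making the prefix optional also covers names that start with "layer" directly.

```python
import re

names = ["decoder.layer.0.attn.to_q", "layer.0.attn.to_q"]

# Sketch of the old behaviour: the pattern requires a prefix before "layer".
required_prefix = re.compile(r".*\.layer\.(\d+)\.")
# Sketch of the new behaviour: the prefix is optional, so a bare "layer.0..." also matches.
optional_prefix = re.compile(r"(?:.*\.)?layer\.(\d+)\.")

for name in names:
    print(name, bool(required_prefix.match(name)), bool(optional_prefix.match(name)))
# decoder.layer.0.attn.to_q True True
# layer.0.attn.to_q False True
```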
@Evan02580 I created a PR to improve the docs in #2157 and another PR to adapt the regex in #2158. For the latter, I'm unsure if we should proceed though, as technically this is a backwards-incompatible change.
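(As a side note, the workaround mentioned in the PR description above, i.e. crafting a regex for target_modules, could look roughly like the sketch below. The module-name layout and the exact pattern are assumptions for illustration; when target_modules is a string, PEFT treats it as a regex matched against the full module name.)

```python
from peft import LoraConfig

# Hypothetical regex that targets q/k/v projections only in layers 0 and 31,
# with or without a prefix before "layers" in the module name.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=r"(?:.*\.)?layers\.(?:0|31)\.self_attn\.(?:q_proj|k_proj|v_proj)",
    lora_dropout=0,
    bias="none",
)
```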
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
peft: 0.13.2
transformers: 4.43.1
Who can help?
@BenjaminBossan @sayakpaul
Information
Tasks
An officially supported task in the examples folder

Reproduction
When I tried to use LoraConfig, aiming to apply LoRA to the first and last layers, like:
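(The original snippet was not captured here; the following is a hedged reconstruction based on the config quoted in the reply above, assuming the model is loaded directly via LlamaModel.from_pretrained, as discussed in the comments above.)

```python
import torch
from transformers import LlamaModel
from peft import LoraConfig, get_peft_model

# Loading the bare LlamaModel yields module names like "layers.0.self_attn.q_proj",
# i.e. without a prefix before "layers".
model = LlamaModel.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],  # intended: only the first and last layer
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)

# With peft 0.13.2, no modules are matched here because the layers_pattern regex
# expects a prefix before "layers", so this call errors out.
model = get_peft_model(model, lora_config)
```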
It raised the following error: […]
A similar thing happens if I use layers_pattern instead of target_modules (but it could be my misunderstanding of layers_pattern): […] This time, though, the problem should be with the default value of target_modules. However, when I use

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True)

instead, it works.

Expected behavior
I'm not sure whether this is a problem of LlamaModel. I am also confused about the use of layers_pattern, since the LoRA docs mention:

layers_to_transform: List of layers to be transformed by LoRA. If not specified, all layers in target_modules are transformed.
layers_pattern: Pattern to match layer names in target_modules, if layers_to_transform is specified. By default PeftModel will look at common layer pattern (layers, h, blocks, etc.), use it for exotic and custom models.

It should work with layers_to_transform; however, I didn't find a suitable approach to use it. Maybe some examples could be added to class LoraConfig(PeftConfig)?
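For reference, here is a minimal sketch of the setup that, per the report above, does work, assuming the model is loaded through AutoModelForCausalLM so that module names carry a prefix before "layers":

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Loading via AutoModelForCausalLM yields module names like
# "model.layers.0.self_attn.q_proj", which the layers_pattern handling can match.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, trust_remote_code=True
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    layers_to_transform=[0, 31],  # apply LoRA only to the first and last decoder layer
    layers_pattern="layers",
    lora_dropout=0,
    bias="none",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```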