Cannot implement LoRA on a custom model containing transformer encoder from pytorch #161

wsuSaiman · 2024-03-05T22:32:50Z

I am using LoRA for my custom model. The model contains a transformer encoder block and a series of linear layers. I want to apply LoRA specifically to the q, k, and v projection weights in the self-attention block. However, in PyTorch's transformer encoder layer I cannot find any module names corresponding to the q, k, and v projections.

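import torch
import torch.nn as nn

class CustomModel(nn.Module):
    def __init__(self, model_dim=512):  # 512 assumed from the embedding output size
        super().__init__()
        self.model_dim = model_dim
        self.embedding = nn.Linear(100, model_dim)
        self.encoder_layer_1 = nn.TransformerEncoderLayer(d_model=self.model_dim, nhead=16)
        self.transformers_1 = nn.TransformerEncoder(self.encoder_layer_1, num_layers=24)
        self.output = nn.Linear(200 * model_dim, 200)

    def forward(self, x):
        ...  # forward pass omitted in the original post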

The code above represents the architecture of my custom model.

Has anybody applied LoRA to the q, k, and v projection weights of nn.Transformer before? If so, what are the module names for those weights?


qianggegeshiyihao · 2024-11-21T09:19:21Z

I ran into the same problem. First, nn.TransformerEncoderLayer uses MultiheadAttention, and MultiheadAttention stores the q, k, and v projections as one packed parameter rather than as separate modules:
self.in_proj_weight = Parameter(torch.empty((3 * embed_dim, embed_dim), **factory_kwargs))
self.register_parameter('q_proj_weight', None)
self.register_parameter('k_proj_weight', None)
self.register_parameter('v_proj_weight', None)
If you use PEFT to find q, k, and v, it only matches nn.Module instances (nn.Linear, nn.Conv1d, etc.), not Parameter objects. So even if you print the model, you will not see in_proj_weight; you will only see out_proj, which is defined as
self.out_proj = NonDynamicallyQuantizableLinear(embed_dim, embed_dim, bias=bias, **factory_kwargs)
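
The difference between modules and parameters described above can be checked directly. This is a minimal sketch, assuming a single nn.TransformerEncoderLayer with the same d_model and nhead as in the question; it lists what named_modules() and named_parameters() report.

```python
import torch.nn as nn

# Same sizes as in the question; any valid d_model / nhead pair shows the same thing.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=16)

# Names a module-based matcher can see.
module_names = [name for name, _ in layer.named_modules()]
# Names of the underlying parameters.
param_names = [name for name, _ in layer.named_parameters()]

print("modules:", module_names)
print("parameters:", param_names)

# The packed q/k/v weight is a Parameter on self_attn, with shape (3 * d_model, d_model).
print(layer.self_attn.in_proj_weight.shape)
```

The module list contains self_attn and self_attn.out_proj but nothing for q, k, or v, while the parameter list contains self_attn.in_proj_weight.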
Second, if you separate q, k, and v and register them as nn.Linear modules, the forward pass still goes through F.multi_head_attention_forward, which calls
q, k, v = _in_projection(query, key, value, q_proj_weight, k_proj_weight, v_proj_weight, b_q, b_k, b_v)
and _in_projection ends with
return linear(q, w_q, b_q), linear(k, w_k, b_k), linear(v, w_v, b_v)
These calls take raw weight tensors, not modules, so that approach fails too. Setting in_proj_weight=None with use_separate_proj_weight=False does not work either, because of this check:
if not use_separate_proj_weight:
    assert in_proj_weight is not None, "use_separate_proj_weight is False but in_proj_weight is None"
So I think the only options are to write the transformer encoder yourself or to find another repository.
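
As a rough illustration of the "write it yourself" route, here is a minimal sketch of a self-attention block whose q, k, and v projections are separate nn.Linear modules, so a module-based LoRA tool has names to target. The class name SeparateQKVAttention and the helper load_from_mha are hypothetical, not part of PyTorch or any LoRA library. The sketch assumes PyTorch 2.0 or later (for F.scaled_dot_product_attention), batch-first input, no masks, and no dropout.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SeparateQKVAttention(nn.Module):
    """Self-attention with q/k/v as separate nn.Linear modules (hypothetical class)."""

    def __init__(self, embed_dim, num_heads):
        super().__init__()
        assert embed_dim % num_heads == 0, "embed_dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        # x: (batch, seq_len, embed_dim)
        b, s, e = x.shape

        def split_heads(t):
            # (batch, seq_len, embed_dim) -> (batch, num_heads, seq_len, head_dim)
            return t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)

        q = split_heads(self.q_proj(x))
        k = split_heads(self.k_proj(x))
        v = split_heads(self.v_proj(x))
        attn = F.scaled_dot_product_attention(q, k, v)
        # Merge heads back: (batch, num_heads, seq_len, head_dim) -> (batch, seq_len, embed_dim)
        attn = attn.transpose(1, 2).reshape(b, s, e)
        return self.out_proj(attn)

    @torch.no_grad()
    def load_from_mha(self, mha):
        """Copy weights from an nn.MultiheadAttention that uses the packed in_proj_weight."""
        # in_proj_weight stacks q, k, v along dim 0: shape (3 * embed_dim, embed_dim).
        w_q, w_k, w_v = mha.in_proj_weight.chunk(3, dim=0)
        b_q, b_k, b_v = mha.in_proj_bias.chunk(3, dim=0)
        for proj, w, bias in ((self.q_proj, w_q, b_q),
                              (self.k_proj, w_k, b_k),
                              (self.v_proj, w_v, b_v)):
            proj.weight.copy_(w)
            proj.bias.copy_(bias)
        self.out_proj.weight.copy_(mha.out_proj.weight)
        self.out_proj.bias.copy_(mha.out_proj.bias)


# Usage: copy weights from a stock attention layer and list the Linear module names.
mha = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
attn = SeparateQKVAttention(embed_dim=64, num_heads=4)
attn.load_from_mha(mha)
linear_names = [n for n, m in attn.named_modules() if isinstance(m, nn.Linear)]
print(linear_names)
```

With the projections exposed as q_proj, k_proj, and v_proj, they can be named as LoRA targets. A full replacement for nn.TransformerEncoderLayer would still need the feed-forward layers, layer norms, residual connections, and any masking the model relies on.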
I hope this helps.
