Conv1d and Conv3d are not working #115
Comments
Good catch! I had Conv2d in mind when I first wrote it. Looks like we just need to instantiate lora_A and lora_B differently depending on the kind of convolution. Happy to review and merge it if someone wants to implement and test it. Otherwise, I'll do it in the near future.
I have this idea but it will change the way current Conv2d LoRA works. We can treat convolution as a matmul with the input as a flattened "window". For example, for Conv2d, the input is a window with
There are two benefits to the above implementation: (1) the kernel size doesn't need to be the same in all spatial dimensions, and (2) we can use convolution in the LoRA branch in the forward pass instead of merging weights, similar to the Linear implementation (relevant issue: #54). The first convolution (with lora_A) is a normal convolution with the same kernel size, but the second convolution (with lora_B) is a point-wise (aka 1x1) convolution. I haven't tested it, but from what I understand, it should work. The situation becomes slightly complicated when grouped convolution is involved ( Another way of using Let me know what you think @edwardjhu. Thank you!
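To make the two-convolution idea above concrete, here is a minimal, untested-in-production sketch for the ungrouped Conv2d case. The class name ConvLoRABranch, the merged_weight helper, and the scaling argument are hypothetical names for illustration, not part of loralib. It only assumes standard PyTorch modules (nn.Conv2d, F.conv2d, torch.einsum).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvLoRABranch(nn.Module):
    """Hypothetical LoRA branch: a k x k conv down to rank r, then a 1 x 1 conv up."""

    def __init__(self, base: nn.Conv2d, r: int = 4, scaling: float = 1.0):
        super().__init__()
        # Assumption: grouped convolution is out of scope for this sketch.
        assert base.groups == 1, "sketch covers groups == 1 only"
        self.base = base
        self.scaling = scaling
        # lora_A: ordinary convolution with the base layer's kernel geometry.
        self.lora_A = nn.Conv2d(
            base.in_channels, r, base.kernel_size,
            stride=base.stride, padding=base.padding,
            dilation=base.dilation, bias=False,
        )
        # lora_B: point-wise (1 x 1) convolution, zero-initialised so the
        # branch contributes nothing at the start of training.
        self.lora_B = nn.Conv2d(r, base.out_channels, kernel_size=1, bias=False)
        nn.init.zeros_(self.lora_B.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No weight merging needed: the LoRA branch runs as two convolutions.
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

    def merged_weight(self) -> torch.Tensor:
        # Equivalent single kernel: delta[o, i, h, w] = sum_r B[o, r] * A[r, i, h, w]
        B = self.lora_B.weight.flatten(1)  # (out_channels, r)
        A = self.lora_A.weight             # (r, in_channels, kh, kw)
        delta = torch.einsum("or,rihw->oihw", B, A)
        return self.base.weight + self.scaling * delta


# Usage: a non-square kernel works because lora_A simply copies the geometry.
base = nn.Conv2d(3, 8, kernel_size=(3, 5), padding=(1, 2))
layer = ConvLoRABranch(base, r=2)
out = layer(torch.randn(2, 3, 10, 12))
```

The same pattern should carry over to Conv1d and Conv3d by swapping the module class and the einsum subscripts, since nothing here depends on the number of spatial dimensions. Grouped convolution would need extra care and is deliberately left out.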
Hello, I also encountered the same problem. Is this improvement feasible in your subsequent experiments? @gau-nernst @edwardjhu
I have changed the initialization of the lora_B parameter so that the new implementation works for more than the 2d case (Pull Request #157). I have tested it, and it works for 1d to 3d. Also, the LoRA parameters' shapes are the same as before in the 2d case. I didn't test the group case. Please let me know if the group case needs to be fixed.
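For readers who want to see what such a change could look like, here is a small shape-only sketch. It is an assumption about one shape choice that fits the description (works for 1d to 3d, unchanged in 2d), not necessarily what the pull request does. The function names lora_shapes and delta_numel are made up for illustration, and a single integer kernel_size per dimension is assumed.

```python
from math import prod


def conv_weight_shape(in_channels, out_channels, kernel_size, ndim, groups=1):
    # Weight shape of an N-d convolution: (out, in // groups, k, ..., k).
    return (out_channels, in_channels // groups) + (kernel_size,) * ndim


def lora_shapes(in_channels, out_channels, kernel_size, r, ndim, groups=1):
    # lora_A keeps the existing shape; lora_B's first dimension is scaled by
    # k ** (ndim - 1) so that B @ A has as many elements as the conv weight.
    # (Hypothetical choice; for ndim == 2 it reduces to the existing shape.)
    a_shape = (r * kernel_size, in_channels * kernel_size)
    b_shape = (out_channels // groups * kernel_size ** (ndim - 1), r * kernel_size)
    return a_shape, b_shape


def delta_numel(a_shape, b_shape):
    # Number of elements in B @ A, which is later reshaped to the weight shape.
    assert b_shape[1] == a_shape[0], "inner dimensions must agree"
    return b_shape[0] * a_shape[1]


# Check that the element counts line up for 1d, 2d and 3d convolutions.
for ndim in (1, 2, 3):
    a, b = lora_shapes(in_channels=4, out_channels=8, kernel_size=3, r=2, ndim=ndim)
    weight = conv_weight_shape(4, 8, 3, ndim)
    print(ndim, a, b, delta_numel(a, b) == prod(weight))
```

Matching element counts is only a necessary condition for reshaping B @ A into the weight; whether the grouped case behaves correctly would still need a real test.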
Class ConvLoRA currently only works for Conv2d. By inspecting the shape of B @ A, which is (out_channels // groups * kernel_size, in_channels * kernel_size), we can see that it is only compatible with Conv2d. For reference, weight shape for
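The mismatch can be checked with plain arithmetic. The sketch below compares the element count of B @ A, using the shape quoted above, with the element count of an N-d convolution weight, which in PyTorch has shape (out_channels, in_channels // groups, k, ..., k). The helper names are hypothetical, and a single integer kernel_size is assumed.

```python
from math import prod


def lora_delta_shape(in_channels, out_channels, kernel_size, groups=1):
    # Shape of B @ A as quoted in the issue.
    return (out_channels // groups * kernel_size, in_channels * kernel_size)


def weight_shape(in_channels, out_channels, kernel_size, ndim, groups=1):
    # PyTorch ConvNd weight: (out, in // groups) followed by ndim kernel dims.
    return (out_channels, in_channels // groups) + (kernel_size,) * ndim


def is_compatible(in_channels, out_channels, kernel_size, ndim, groups=1):
    # B @ A can only be reshaped into the weight if the element counts match.
    delta = lora_delta_shape(in_channels, out_channels, kernel_size, groups)
    weight = weight_shape(in_channels, out_channels, kernel_size, ndim, groups)
    return prod(delta) == prod(weight)


for ndim in (1, 2, 3):
    print(f"Conv{ndim}d compatible:", is_compatible(16, 32, 3, ndim))
```

B @ A always carries a factor of kernel_size squared, while the weight carries kernel_size to the power of the number of spatial dimensions, so the counts agree only in the 2d case (or trivially when kernel_size is 1).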