Feature requests: Add "AND gate" merge method #72
Very interesting suggestion. I will think about the implementation.
Wouldn't this basically spit out a 1.5 SD model with only the most common concepts the two models have? What exactly would be a practical use of this?
Basically, concept extraction: in the case of models A + (B AND C), replace weights in A with the similar weights from B and C. In the case of LoRA, when you have multiple epochs you can simply do C = A AND B, which would purify these LoRAs of the redundant things that differ between A and B.
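A minimal sketch of what such an AND gate could look like, assuming "AND" means keeping only the weights where the two tensors roughly agree and zeroing the rest (the function name, tolerance parameter, and agreement criterion here are my own illustration, not code from the project):

```python
import torch

def and_gate(a: torch.Tensor, b: torch.Tensor, tol: float = 0.05) -> torch.Tensor:
    """Keep only the weights where a and b roughly agree; zero the rest."""
    # Element-wise similarity test: relative difference below tol.
    scale = torch.maximum(a.abs(), b.abs()).clamp(min=1e-8)
    agree = (a - b).abs() / scale < tol
    # Where the tensors agree, take the mean; elsewhere drop the weight.
    return torch.where(agree, (a + b) / 2, torch.zeros_like(a))
```

For LoRA purification (C = A AND B), this would be applied key-by-key across the two state dicts; weights that the two epochs disagree on are treated as noise and dropped.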
This is the exact idea I have in mind for LoRA merging. I noticed that when training 3 LoRAs (for the same person) using 3 different models and then merging them together, the new LoRA becomes very stable and flexible. I'm not saying it's the best, but it's something like this:
I think that if we can extract the exact concept without it being polluted by other elements, then we can freely increase its strength to improve quality and flexibility while also reducing the file size.
Is there any news on this one? I think this could massively increase LoRA usability and put it back up on par with DreamBooth again.
I have created this: https://github.com/JilekJosef/loli-diffusion-merger It's sort of a fork of SuperMerger. However, I have implemented the AND gate for models only, and the calculation method works on a single-value-vs-single-value basis; at least tensor-level comparison should be implemented to make it more usable, I believe. @Deathawaits4
This suggestion is similar to a weighted geometric average:

```python
import torch

def multiply_difference(a, b, c, alpha):
    # Work in the complex plane so fractional powers of negative
    # differences are well defined.
    a = torch.complex(a - c, torch.zeros_like(a))
    b = torch.complex(b - c, torch.zeros_like(b))
    # Weighted geometric mean of the two differences from c.
    res = a**(1 - alpha) * b**alpha
    return c + res.real
```

If any parameter is 0 in A or B, then the corresponding parameter will be 0 in the output. If both parameters are close, then the output doesn't change much.
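A quick self-contained check of the behavior described above, for the case where A and B agree (the function is repeated here so the snippet runs on its own):

```python
import torch

def multiply_difference(a, b, c, alpha):
    # Complex promotion so fractional powers of negative differences work.
    a = torch.complex(a - c, torch.zeros_like(a))
    b = torch.complex(b - c, torch.zeros_like(b))
    res = a**(1 - alpha) * b**alpha
    return c + res.real

a = torch.tensor([-1.0, 2.0])
b = torch.tensor([-1.0, 2.0])
c = torch.tensor([0.0, 1.0])
# When A and B agree, the output reproduces them (up to float error),
# even where the difference from C is negative; without the complex
# promotion, (-1.0) ** 0.5 would produce NaN.
out = multiply_difference(a, b, c, 0.5)
```

This is why the complex dtype is used: a plain real-valued power of a negative difference would be undefined for fractional alpha.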
Basically it's the exact opposite of the add difference method. Instead of checking whether the values (in models B and C) are different enough, you check whether they are similar enough, and then you place those values into model A.
Purpose?
I was able to successfully test the second point with 2 LoRA epochs. I used torch.norm(A-B) to determine how different they are, which was a bit tricky to configure correctly (I didn't standardise the vectors first, which was probably part of the issue). There is probably a better method than the basic norm that I don't know about, since this was the first time I was doing something like this (I am mostly used to web development in Java). But in the end I believe it proved the concept. As for the first point, I don't know how well it will work, but I believe it shouldn't be hard for you to test, since it probably requires nothing more than a slight modification of the add difference method.
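A sketch of the standardisation step mentioned above, assuming it means bringing each tensor to zero mean and unit variance before taking the norm (the function name and the size normalisation are my own illustration, not code from the comment):

```python
import torch

def standardized_distance(a: torch.Tensor, b: torch.Tensor) -> float:
    """Distance between two weight tensors after standardizing each to
    zero mean and unit variance, so overall scale and offset differences
    between epochs don't dominate the comparison."""
    a = (a - a.mean()) / a.std().clamp(min=1e-8)
    b = (b - b.mean()) / b.std().clamp(min=1e-8)
    # Divide by sqrt(numel) so one similarity threshold works for
    # tensors of different sizes.
    return (torch.norm(a - b) / a.numel() ** 0.5).item()
```

With this, a single threshold can be tuned once and reused across layers, whereas the raw torch.norm(A-B) value grows with tensor size and scale.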