
Feature request: Add "AND gate" merge method #72

Open
JilekJosef opened this issue May 4, 2023 · 7 comments

@JilekJosef

JilekJosef commented May 4, 2023

Basically it's the exact opposite of the add difference method. Instead of checking whether values (in models B and C) are different enough, you check whether they are similar enough, and then you place those values into model A.
Purpose?

  1. Extracting the concepts that two models have in common into a third model
  2. Fixing model errors / merging without transferring the error values (well, this is probably more of a LoRA thing, since you don't need the third model here; you just preserve the similar vectors and trash the rest)

I was able to successfully test the second point with 2 LoRA epochs. I used torch.norm(A-B) to determine how different they are, which was a bit tricky to configure correctly (I didn't standardise the vectors first, which was probably part of the issue). There is probably a better method than the basic norm that I don't know about, since this was the first time I was doing something like this (I am mostly used to web development in Java). But in the end I believe it proved the concept. As for the first point, I don't know how well it will work, but I believe it shouldn't be hard for you to test, since it probably requires nothing more than a slight modification of the add difference method.
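
A minimal sketch of what such an "AND gate" merge could look like (the element-wise absolute-difference test, the averaging of B and C where they agree, and the threshold value are assumptions for illustration, not something the comment specifies):

import torch

def and_gate_merge(a, b, c, threshold=0.01):
    # "AND gate": take a value from B/C only where B and C are similar enough;
    # everywhere else fall back to the corresponding value from A.
    similar = (b - c).abs() <= threshold         # element-wise similarity mask
    return torch.where(similar, (b + c) / 2, a)  # average B and C where they agree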

@hako-mikan
Owner

Very interesting suggestion. I will think about the implementation.

@zethfoxster

Wouldn't this basically spit out SD 1.5 with only the most common concepts the two models have? What exactly would be a practical use of this?

@JilekJosef
Author

Wouldn't this basically spit out SD 1.5 with only the most common concepts the two models have? What exactly would be a practical use of this?

Basically, concept extraction: in the models case, A + (B AND C) replaces weights in A with the similar weights from B and C. In the LoRA case, when you have multiple epochs, you can just do C = A AND B, which would purify those LoRAs of the redundant things that differ between A and B.
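
In terms of the hypothetical and_gate_merge sketch above, the two cases could look roughly like this (model_a, model_b, model_c, epoch_a, epoch_b and the thresholds are placeholders):

# Models: A + (B AND C) -- replace weights in A where B and C agree.
merged_model = and_gate_merge(model_a, model_b, model_c, threshold=0.01)

# LoRA epochs: C = A AND B -- keep only what both epochs agree on, zero out the rest.
purified_lora = and_gate_merge(torch.zeros_like(epoch_a), epoch_a, epoch_b, threshold=0.01)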

@le-khang

This is the exact idea I have in mind for LoRA merging. I noticed that when training 3 LoRAs (for the same person) using 3 different models and then merging them together, the new LoRA becomes very stable and flexible. I'm not saying it's the best, but it's something like:

  • LoRA A can produce good results, around 9/10, if used with Model A, but with other models it can be a bit random (between 4 and 9/10).
  • The merged LoRA ABC can produce good results across many models (around 7-8/10).
  • I tried merging more LoRAs to see what would happen, but in my experience, with anywhere from more than 4 up to 10 LoRAs, the results average out to around 7-7.5/10.

I think that if we can extract the exact concept without it being polluted by other elements, then we can freely increase its strength to improve the quality & flexibility while also reducing the file size.

@Deathawaits4

Is there any news on this one? I think this could massively increase LoRA usability and put it back up on par with Dreambooth again.

@JilekJosef
Author

I have created this: https://github.com/JilekJosef/loli-diffusion-merger It's sort of a fork of supermerger. However, I have implemented the AND gate for models only, and the calculation method works on a single-value-vs-single-value basis; I believe at least tensor-level comparison should be implemented to make it more usable. @Deathawaits4
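
For reference, a tensor-level comparison could look roughly like this (a hypothetical sketch; the cosine-similarity criterion and the 0.9 cutoff are assumptions, not what loli-diffusion-merger implements):

import torch

def and_gate_merge_tensorwise(a, b, c, min_cos_sim=0.9):
    # Compare B and C per tensor rather than per value: if the flattened tensors
    # point in a similar enough direction, take their average, otherwise keep A.
    cos_sim = torch.nn.functional.cosine_similarity(b.flatten(), c.flatten(), dim=0)
    return (b + c) / 2 if cos_sim >= min_cos_sim else a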

@ljleb

ljleb commented Feb 19, 2024

This suggestion is similar to a weighted geometric average:

import torch

def multiply_difference(a, b, c, alpha):
    # Work on the differences from C in the complex plane so that negative
    # values can be raised to fractional powers.
    a = torch.complex(a - c, torch.zeros_like(a))
    b = torch.complex(b - c, torch.zeros_like(b))
    # Weighted geometric average of the two differences, added back onto C.
    res = a**(1 - alpha) * b**alpha
    return c + res.real

If any parameter difference (A - C or B - C) is 0, then the corresponding difference in the output will be 0, i.e. that parameter falls back to C. If both parameters are close, the output doesn't change much.
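
A quick illustration on toy tensors (hypothetical values, not from the original comment):

a = torch.tensor([0.11, 0.6, -0.1])
b = torch.tensor([0.30, 0.7,  0.4])
c = torch.tensor([0.10, 0.5,  0.0])
print(multiply_difference(a, b, c, alpha=0.5))
# First slot: A barely differs from C, so the merged difference stays small.
# Last slot: the differences have opposite signs, so their geometric average is
# essentially purely imaginary and the real part added back onto C is ~0.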
