
Merged lora model forgets lora when converted to ggml. (with llama-cpp-python, DOES NOT repro with ./main) #1631

Closed
richkcho opened this issue May 29, 2023 · 6 comments

richkcho commented May 29, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

After merging a LoRA with an HF model, I can convert it to GGML with convert-pth-to-ggml and observe that the converted model behaves the same as the original merged model.

Context:

  • LoRA was generated by text-generation-webui
  • One LoRA was trained using the GPTQ 4-bit training monkeypatch; another was trained with QLoRA-based code (both fail the same way on merge + convert)
  • Model merge was done using code very similar to this PR (a minimal sketch of the merge step follows this list)
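
A minimal sketch of that merge step (paths are placeholders; this assumes a standard PEFT merge on an fp16 base model, as in that PR):

```python
# Minimal merge sketch: load the base model in fp16, apply the LoRA,
# fold its deltas into the base weights, and save a plain HF checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "path/to/base-model", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "path/to/lora")
model = model.merge_and_unload()  # folds lora_A/lora_B into the base tensors

model.save_pretrained("path/to/merged-model")
AutoTokenizer.from_pretrained("path/to/base-model").save_pretrained("path/to/merged-model")
```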

Current Behavior

For some reason, the converted model behaves like the base model, as if the LoRA merge never happened.

Environment and Context

Using the latest llama-cpp-python for inference and the latest llama.cpp for conversion.

  • llama.cpp: master-3b126f6

  • llama-cpp-python: 1.55.0

  • The rest of the environment uses the dev commits of the Hugging Face libraries; see the QLoRA blog post

  • Physical (or virtual) hardware you are using, e.g. for Linux:

AMD Ryzen Threadripper 3970X 32-Core Processor
WSL 2

  • Operating System, e.g. for Linux:

Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
Python 3.10.9
GNU Make 4.2.1
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

Failure Information (for bugs)

See above - the converted model is missing some state.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. Train a simple LoRA via the QLoRA method or GPTQ in text-generation-webui.
  2. Merge the LoRA into the original model; see above for a script example.
  3. Verify the merged model's behavior (see the sketch after this list).
  4. Convert the model to GGML.
  5. Observe that the GGML model has forgotten what the LoRA learned.
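
A sketch of the verification in step 3 (hypothetical prompt and paths; assumes the merged model is a plain HF checkpoint):

```python
# Verify step 3: generate from the merged HF model with greedy decoding and
# check that the output reflects what the LoRA was trained to do.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/merged-model")
model = AutoModelForCausalLM.from_pretrained(
    "path/to/merged-model", torch_dtype=torch.float16, device_map="auto"
)

prompt = "### Instruction: ...\n### Response:"  # something the LoRA changed
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```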

Failure Logs

Unfortunately, this failure does not produce any failure logs; it manifests only in model behavior.

@richkcho richkcho changed the title [User] Insert summary of your issue or enhancement.. Merged lora model forgets lora when converted to ggml. May 29, 2023
KerfuffleV2 (Collaborator) commented:
I'd guess it's not actually merging the LoRA into the existing tensors, but instead saving them under a different name, or just storing the LoRA weights in the model to be merged when it gets loaded. I don't think there's another way the GGML conversion could produce the result you're describing.

The conversion scripts here would probably need to be adapted to look for the correct tensor names in the model (and/or merge them if necessary). Or you'd need to merge in a different way so that the result actually looks like a normal non-LoRA model as far as things like tensor names go.
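
One quick way to check that would be to dump the tensor names from whatever was saved (a sketch; it assumes a single non-sharded pytorch_model.bin and that unmerged PEFT adapters keep the usual lora_A/lora_B naming):

```python
# Sketch: list the tensor names in the saved checkpoint and flag anything that
# still looks like an unmerged adapter (PEFT names these lora_A / lora_B).
import torch

state = torch.load("path/to/merged-model/pytorch_model.bin", map_location="cpu")
for name in sorted(state):
    print(name, tuple(state[name].shape))

leftovers = [n for n in state if "lora" in n.lower()]
print("possible unmerged LoRA tensors:", leftovers or "none")
```

If the merge worked, every name should look like a normal LLaMA tensor and the leftover list should be empty.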

richkcho (Author) commented:
That's my best guess as well, but I have no idea why this would happen, or how to interpret what I'm looking at when I inspect the model's tensors. (I think others have managed to successfully merge LoRAs into base models and convert to GGML.) I'm hoping someone already has that knowledge and can chime in on what I need to do, or what I need to look for.

FNsi (Contributor) commented May 29, 2023

Maybe try #1531; also see that PEFT PR.

Also, check the code: a 4-bit QLoRA targets bnb.nn.Linear4bit.

Try changing your lora.py the way the PR handles the embedding layers, and see if you can make it work.

By the way, if it actually works, I suppose you would need to load the model in 4-bit to merge it.

richkcho (Author) commented May 29, 2023

> Maybe try #1531; also see that PEFT PR.

That PEFT PR did not fix the issue here, unfortunately.

> By the way, if it actually works, I suppose you would need to load the model in 4-bit to merge it.

This will not work; PEFT will error out in merge_and_unload (see this line).
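
For reference, this is roughly the failing path (a sketch; paths are placeholders and the exact error depends on the PEFT version):

```python
# Sketch of the failing path: merging while the base model is loaded in 4-bit.
# PEFT refuses to merge adapters into quantized bnb.nn.Linear4bit layers and
# raises at the line referenced above (assumed behavior of this PEFT version).
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-model", load_in_4bit=True)
model = PeftModel.from_pretrained(base, "path/to/lora")
model.merge_and_unload()  # errors out: cannot merge into a quantized model
```

Merging on an fp16 reload of the base model (as in the sketch above) avoids that check.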

FNsi (Contributor) commented May 29, 2023

> > Maybe try #1531; also see that PEFT PR.
>
> That PEFT PR did not fix the issue here, unfortunately.
>
> > By the way, if it actually works, I suppose you would need to load the model in 4-bit to merge it.
>
> This will not work; PEFT will error out in merge_and_unload (see this line).

😅 Then you have to check inside it yourself, like the very early alpaca scripts did. But I am afraid that may still not work for QLoRA's bnb.nn.Linear4bit.

And I guess ggml's LoRA support is not working for your QLoRA either.

richkcho (Author) commented:

> And I guess ggml's LoRA support is not working for your QLoRA either.

Strangely enough, that does seem to work, which led me to my next test...

Both the merged GGML model and the LoRA DO work with ./main, but not with llama-cpp-python. So either something is wrong with my Python test, my llama-cpp-python install, the shared object, or something else. I should have looked into this earlier. Thanks for all the help and suggestions, everyone!
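
For anyone hitting the same thing, this is roughly the llama-cpp-python side of the A/B test (a sketch; model path and prompt are placeholders):

```python
# Sketch of the A/B test: run the same prompt through llama-cpp-python that
# was used with ./main. Greedy decoding keeps the comparison deterministic.
from llama_cpp import Llama

llm = Llama(model_path="path/to/merged-ggml-model-f16.bin")
out = llm("### Instruction: ...\n### Response:", max_tokens=64, temperature=0.0)
print(out["choices"][0]["text"])
```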

@richkcho richkcho changed the title Merged lora model forgets lora when converted to ggml. Merged lora model forgets lora when converted to ggml. (WITH llama-cpp-python, DOES NOT REPRO WITH ./main) May 30, 2023
@richkcho richkcho changed the title Merged lora model forgets lora when converted to ggml. (WITH llama-cpp-python, DOES NOT REPRO WITH ./main) Merged lora model forgets lora when converted to ggml. (with llama-cpp-python, DOES NOT repro with ./main) May 30, 2023