
Integrate Lora fine-tuned model with HF #2025

Closed
GLorenzo679 opened this issue Nov 19, 2024 · 9 comments

@GLorenzo679

Hello, I'm having some issues when using a LoRA fine-tuned model with Hugging Face's from_pretrained().
I saw some discussions in the issues, and #933 caught my attention.
There it's suggested to load the fine-tuned model like this:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# hub ID of the base model from the above fine-tune
model_id = "meta-llama/Llama-2-7b-hf" 

# output_dir from tune command
checkpoint_dir = "/my/output/dir" 

model = AutoModelForCausalLM.from_pretrained(model_id)
peft_model = PeftModel.from_pretrained(model, checkpoint_dir)

Isn't it possible to load the base model (AutoModelForCausalLM.from_pretrained(model_id)) from a previously downloaded copy (the one you get with tune download), instead of having to download it again from HF?
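Concretely, what I'd expect to work is something along these lines (just a sketch; the local paths are placeholders for wherever tune download and tune run wrote their files):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# placeholder paths: wherever `tune download` put the base model and
# wherever `tune run` wrote the LoRA adapter checkpoint
base_model_dir = "/my/models/Llama-2-7b-hf"
checkpoint_dir = "/my/output/dir"

# from_pretrained also accepts a local directory instead of a hub ID
model = AutoModelForCausalLM.from_pretrained(base_model_dir)
peft_model = PeftModel.from_pretrained(model, checkpoint_dir)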

If I try to load the previously downloaded model I get this error:

venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 514, in from_pretrained
    pretrained_model_name_or_path = adapter_config["base_model_name_or_path"]
                                    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'base_model_name_or_path' 

Looking at the code that generates the error, it seems the transformers library checks whether the peft package is installed. If it is, it looks for the adapter_config.json file (in my case the one generated with tune run), which is missing the base_model_name_or_path field.
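As a temporary workaround, I imagine the missing field could be added to the generated adapter config by hand, roughly like this (untested sketch; paths are placeholders):

import json
from pathlib import Path

# untested sketch: patch the adapter_config.json written by `tune run`
# so that transformers can resolve the base model (paths are placeholders)
adapter_config_path = Path("/my/output/dir") / "adapter_config.json"
config = json.loads(adapter_config_path.read_text())
config.setdefault("base_model_name_or_path", "/my/models/Llama-2-7b-hf")
adapter_config_path.write_text(json.dumps(config, indent=2))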

@ebsmothers
Contributor

Hi @GLorenzo679, thanks for creating this issue. We recommended the first approach because it lets you easily specify an HF model hub ID for the base model, and that ID is not easy to save in the adapter config from within the torchtune training loop (tune download is a separate command, so we don't actually have access to the HF model hub ID inside the training loop). However, I realized that we may be able to point to the local downloaded path instead. Can you try patching in the change from #2026 and see if it works for you? I've only made it in our single-device LoRA recipe for now, so you can copy-paste it into whatever recipe you're using. In my test, the KeyError goes away after this change. Please let me know if you see the same.

@GLorenzo679
Author

Thanks for the fast reply.
Your fix solved the problem, but now I get another error:

venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 555, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: No such file or directory: "models/codeLlama-7B/model-00001-of-00002.safetensors"

I solved this by deleting the model.safetensors.index.json file in the model folder.

@GLorenzo679
Author

GLorenzo679 commented Nov 21, 2024

I have an update on this issue.
It seems that AutoModelForCausalLM.from_pretrained is loading the base model (the safetensors files) and not the base+adapter model generated with tune run.
Is there a way to use a fine-tuned model generated with tune run in Hugging Face?
I found in the documentation (https://pytorch.org/torchtune/stable/tutorials/e2e_flow.html#using-torchtune-checkpoints-with-other-libraries) that it's recommended to create a new state_dict with the corresponding index file.
This process is a bit tedious though, isn't there a more straightforward way?
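The kind of shortcut I was hoping for would be something along these lines (untested idea; merge_and_unload() from peft folds the LoRA weights back into the base model, and the paths are placeholders):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# untested idea: merge the LoRA adapter into the base weights and re-save
# a plain HF checkpoint, instead of rebuilding the state_dict by hand
base = AutoModelForCausalLM.from_pretrained("/my/models/Llama-2-7b-hf")  # placeholder path
peft_model = PeftModel.from_pretrained(base, "/my/output/dir")           # placeholder path
merged = peft_model.merge_and_unload()
merged.save_pretrained("/my/output/dir/merged_hf")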

I've attached my model folder:
model

@felipemello1
Contributor

related: #2048

@felipemello1
Contributor

felipemello1 commented Nov 22, 2024

"This process is a bit tedious though"

@GLorenzo679, I agree 100%. I am working on fixing it so that this is invisible to the user. Sorry about that. Meanwhile, to unblock you, please do it manually and delete the other .safetensors files from the folder. You may have to delete the safetensors.index.json too.
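Roughly, the manual cleanup I mean would look like this (untested sketch, placeholder path; adjust it to your own checkpoint folder):

from pathlib import Path

# untested sketch: remove the base-model safetensors shards (and the index
# file, if present) from the checkpoint folder so that from_pretrained
# falls back to the fine-tuned .bin weights
ckpt_dir = Path("/my/output/dir")  # placeholder path
for shard in ckpt_dir.glob("*.safetensors"):
    shard.unlink()
index_file = ckpt_dir / "model.safetensors.index.json"
if index_file.exists():
    index_file.unlink()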

@GLorenzo679
Author

Thanks for pointing out that related issue!
I locally changed my save_checkpoint method as described in #2048 and now I can properly load the fine-tuned model.
from_pretrained() loads safetensors by default, so to load the .bin files I just needed to pass use_safetensors=False to from_pretrained().
With the other fix suggested by @ebsmothers, I can now confirm that it's possible to load LoRA fine-tuned models in HF.
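For anyone else landing here, the final loading call ends up looking roughly like this (the paths are placeholders for my local base model and torchtune output directory):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# use_safetensors=False makes from_pretrained read the .bin shards
# instead of looking for .safetensors files (paths are placeholders)
base = AutoModelForCausalLM.from_pretrained(
    "/my/models/Llama-2-7b-hf", use_safetensors=False
)
peft_model = PeftModel.from_pretrained(base, "/my/output/dir")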

@felipemello1
Contributor

felipemello1 commented Dec 6, 2024

hey folks, PR is merged: #2074

Now it should be much easier to use vLLM/Hugging Face. Instructions are in the readme.

We will update the docs soon. Let us know if you find any issues and thanks for your patience :).

@joecummings
Contributor

"Instructions are in the readme."

@felipemello1 Where in the README are the instructions? I don't see them.

@felipemello1
Contributor

Sorry, I meant to say that they are in the pr description -.-
