System Info
peft 0.11

Who can help?
No response

Reproduction
I currently have the original LLM weights (llama3-8B) and the corresponding LoRA weights. When loading them, I use the following script:
from typing import List
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel


class load_test:
    def __init__(self) -> None:
        pass

    def _load_model_tokenizer(self, args):
        # Load the base model and move it to the target device.
        self.model = AutoModelForCausalLM.from_pretrained(
            args.rp_path, torch_dtype=torch.bfloat16
        ).to(self.device)
        if args.rp_lora_path != '':
            # Attach the LoRA adapter on top of the base model.
            self.model = PeftModel.from_pretrained(
                self.model, args.rp_lora_path, adapter_name="default11"
            ).to(self.device)
            self.model.set_adapter("default11")
        torch.cuda.empty_cache()


lt = load_test()
lt.device = 'cuda:1'
lt._load_model_tokenizer(args)  # args holds rp_path / rp_lora_path
The expected behavior is that all weights are loaded onto the same device (cuda:1), but some memory is still allocated on cuda:0.

Expected behavior
All weights are loaded onto the specified device.
Additionally, I have checked the previous issues, but none of them provided a solution.
Could you pass the torch_device argument, i.e.:
PeftModel.from_pretrained(self.model, args.rp_lora_path, adapter_name="default11", torch_device=self.device)
For me, that solved the issue; otherwise, PEFT guesses which device to load the PEFT weights onto.
This is hard to find as the argument is not documented. I created a PR (#1843) to do that.
Strangely, I also had to remove the torch.cuda.empty_cache() call in my tests; otherwise memory would be assigned to cuda:0. No idea how that's possible.
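For concreteness, here is a minimal sketch of the loading routine from the reproduction with that suggestion applied: the adapter is loaded with torch_device, and the empty_cache() call is dropped. It assumes the same rp_path / rp_lora_path arguments as the script above.

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

def load_on_device(args, device: str = "cuda:1"):
    # Load the base model and move it to the target device.
    model = AutoModelForCausalLM.from_pretrained(
        args.rp_path, torch_dtype=torch.bfloat16
    ).to(device)
    if args.rp_lora_path != '':
        # torch_device tells PEFT which device to load the adapter
        # weights onto, instead of letting it guess.
        model = PeftModel.from_pretrained(
            model,
            args.rp_lora_path,
            adapter_name="default11",
            torch_device=device,
        )
        model.set_adapter("default11")
    # No torch.cuda.empty_cache() here, per the observation above.
    return model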
Thanks for your help! It actually works. The torch.cuda.empty_cache() call still causes the unwanted allocation on cuda:0, though.
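As a side note, one way to check where memory actually ends up is to query PyTorch's allocator per device. The snippet below is only a diagnostic sketch; the set_device call is an assumption worth testing, based on the idea that CUDA calls made without an explicit device may touch cuda:0.

import torch

# Hypothesis to test (not a confirmed explanation of the behaviour above):
# pin the current CUDA device before any CUDA work, so that calls which
# operate on the "current" device, such as torch.cuda.empty_cache(),
# do not touch cuda:0.
torch.cuda.set_device("cuda:1")

# Report what PyTorch's caching allocator holds on each visible GPU.
# Note: the CUDA context overhead shown by nvidia-smi is not included.
for i in range(torch.cuda.device_count()):
    dev = f"cuda:{i}"
    allocated = torch.cuda.memory_allocated(dev) / 1024**2
    reserved = torch.cuda.memory_reserved(dev) / 1024**2
    print(f"{dev}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")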