
merge_and_unload issue? #868

Closed
2 of 4 tasks
Remmet577 opened this issue Aug 27, 2023 · 18 comments · Fixed by #1190
Labels: bug (Something isn't working)

Comments

Remmet577 commented Aug 27, 2023

System Info

I am using the latest dev versions of transformers, accelerate and peft in Google Colab, installed via !pip install -q -U git+.

This worked a few days ago, but now when I merge an adapter back into the base model and push the result to the Hub, it is much smaller than the base model and cannot be loaded (loading it raises "Cannot copy out of meta tensor; no data!").

The function I am using to merge the PeftModel back into the base model is:
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

def merge_adapter(lora_id, merged_id):
    config = PeftConfig.from_pretrained(lora_id)

    # Load the base model; dtype is defined elsewhere in the notebook (e.g. torch.float16).
    model_id = config.base_model_name_or_path
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=dtype,
        device_map="auto",
        offload_folder="offload",
    )

    # Attach the LoRA adapter to the base model.
    adapter = PeftModel.from_pretrained(
        model,
        lora_id,
        torch_dtype=dtype,
        device_map="auto",
        offload_folder="offload",
    )

    # Merge the adapter weights into the base model and drop the LoRA layers.
    model = adapter.merge_and_unload(progressbar=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)  # , use_fast=False)

    model.save_pretrained(
        merged_id,
        push_to_hub=True,
        repo_id=merged_id,
        private=True,
        # max_shard_size="4GB",
    )

    tokenizer.save_pretrained(
        merged_id,
        push_to_hub=True,
        repo_id=merged_id,
        private=True,
    )

The base model is 'meta-llama/Llama-2-7b-hf', which has two bin files, one at 9.98 GB and one at 3.5 GB. Previously when I ran this code, the merged model would be the same size. Now it only produces a single file at 1.07 GB.

This may be a bug in the library, although I don't see any existing reports of it. It could also be a problem with my training code in the first place, my upload code, the HF libraries, or something else entirely.

If anyone has any solutions, please let me know. Otherwise, if this is a bug, I guess this is my first bug report.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

This is the Colab I have been using. The data and LoRAs are private, but the two adapters I am playing with are Llama 2 7B QLoRAs fine-tuned on 1) a chat dataset and 2) a large block of raw text split into paragraphs (a piece of fanfiction).

https://colab.research.google.com/drive/15g2NU2wJ9fOvY3PJCCN5dVDYV8KSXbeS?usp=sharing

Expected behavior

The merged model should be created and be the same size as the base model, and I should be able to load it using AutoModelForCausalLM.from_pretrained as I was able to a few days ago.

@zhurui-xiaozhuzaizai

I am hitting the same issue: when I merge a LoRA into a Llama model and then save it, the saved model is smaller than before.

@BenjaminBossan
Member

I think the issue is that the layers that are on "meta" device are not properly handled in this case. Their weights are not loaded and the lora weights are also just on meta device. The result you see probably only contains the weights that were actually loaded, missing all the meta device weights.
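
To check whether that is what is happening, something along these lines (a quick diagnostic using the adapter object from the snippet above, not an official PEFT API) should list any parameters that are still on the meta device after loading:

meta_params = [name for name, param in adapter.named_parameters() if param.device.type == "meta"]
if meta_params:
    print(f"{len(meta_params)} parameters are on the meta device, e.g.: {meta_params[:5]}")
    print("merge_and_unload() would likely produce an incomplete model in this state.")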

Honestly, I'm not completely sure how best to handle this situation. Maybe @pacman100 or @younesbelkada have a good idea.

@zhurui-xiaozhuzaizai

@BenjaminBossan, yes. When I checked, I found that some weights load on the CPU and some on the GPU, and the CPU weights are the ones on the meta device.
How should I move the weights off the meta device and onto the CPU or GPU?

@BenjaminBossan
Member

Sorry, I'm really not sure what a solution would look like here. The reason why the weights are on meta device is that they needed to be offloaded for lack of memory, so it's not as simple as just loading everything. Hopefully one of the others can shine some light on this.

@xhwang22

This issue may occur when the GPU is occupied by other processes: device_map="auto" may then be unable to load the model fully onto the GPUs, so try to ensure the GPU is free or set device_map="cpu" instead.
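
For example, a CPU-only variant of the merge step (a sketch, assuming model_id, lora_id, merged_id and dtype are defined as in the function above, and that there is enough CPU RAM) would look like this:

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=dtype,  # if half-precision ops error out on CPU, load in torch.float32 instead
    device_map="cpu",   # keep every weight in RAM so nothing is offloaded to the meta device
)
adapter = PeftModel.from_pretrained(model, lora_id)
merged = adapter.merge_and_unload()
merged.save_pretrained(merged_id)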

@indiejoseph

I've encountered this issue too. The model was trained and merged on the same GPU (3090, 24 GB); VRAM is sufficient for LoRA training and inference, but after merging I hit this issue.

lukasld commented Sep 24, 2023

This issue may occur when the GPU is occupied by other processes: device_map="auto" may then be unable to load the model fully onto the GPUs, so try to ensure the GPU is free or set device_map="cpu" instead.

Switching device_map from "auto" to "cpu" solved the issue for me.

@younesbelkada
Contributor

I think that merge_and_unload is not supported for models that are offloaded to disk / CPU, as accelerate puts the offloaded weights on the meta device. This is a bug we should look into.
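
As a rough check before merging (an illustration, not an official PEFT API): when the model is loaded with device_map="auto", the placement accelerate chose is recorded on the model, so anything mapped to "cpu" or "disk" was offloaded and may sit on the meta device:

device_map = getattr(model, "hf_device_map", {})
offloaded = {name: dev for name, dev in device_map.items() if dev in ("cpu", "disk")}
if offloaded:
    print("These modules were offloaded and may be on the meta device:", offloaded)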

@xiaobai52HZ

How do I merge a p-tuning model?

@younesbelkada
Contributor

Hi @xiaobai52HZ, I think that, sadly, merging p-tuning models is currently not supported. cc @pacman100 @BenjaminBossan

@BenjaminBossan
Member

Indeed.

chiragjn commented Oct 19, 2023

Can we add a boolean argument when inferring the device map to disable all offloading?
I still want the benefits of using multiple GPUs, but would rather get an error if the weights cannot fit on the GPUs alone.

@BenjaminBossan
Member

Can we add a boolean argument when inferring the device map to disable all offloading?

Offloading is not performed by PEFT. If you want to have this feature, please consider opening an issue in accelerate.
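
In the meantime, one rough way to approximate this with accelerate (just a sketch, not a built-in flag; the max_memory value and no_split_module_classes below are illustrative for the Llama-2-7B case in this thread) is to build the device map from GPU memory only and refuse to proceed if anything spills over:

from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"
config = AutoConfig.from_pretrained(model_id)
with init_empty_weights():
    empty_model = AutoModelForCausalLM.from_config(config)

device_map = infer_auto_device_map(
    empty_model,
    max_memory={0: "22GiB"},                       # list only your GPUs, no "cpu" entry
    no_split_module_classes=["LlamaDecoderLayer"],
)
if any(dev in ("cpu", "disk") for dev in device_map.values()):
    raise RuntimeError("Model does not fit on the GPUs alone; refusing to offload.")

model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device_map, torch_dtype="auto")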

@krzysiekpodk

This issue may occur when the GPU is occupied by other processes: device_map="auto" may then be unable to load the model fully onto the GPUs, so try to ensure the GPU is free or set device_map="cpu" instead.

This fixed the issue for me as well


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@younesbelkada
Contributor

Closing this issue! Feel free to re-open if you think your concerns are not being addressed.

@SuperBruceJia

Still facing the problem:

import torch
from peft import PeftModel

# `trainer` and `saver_dir` are defined earlier in the training script.
trainer.train()

# Save the adapter
trainer.save_model(saver_dir + '/adapter')

# Retrieve the base model from the trainer
model = trainer.model.base_model

# Load the adapter
model = PeftModel.from_pretrained(model, saver_dir + '/adapter', torch_dtype=torch.float16, device_map="auto")

# Merge the base model and the adapter
model = model.merge_and_unload()

# Save the overall model
model.save_pretrained(saver_dir)

It seems that the base model is saved without the LoRA adapter merged in. The saved config.json is:

{
  "_name_or_path": "meta-llama/Llama-2-7b-hf",
  "architectures": [
    "LlamaModel"
  ],
  "attention_bias": false,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pad_token_id": 2,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 32001
}

@younesbelkada
Contributor

This is expected @indiejoseph
If you call merge_and_unload(), it merges the LoRA adapters into the base model, unloads the LoRA layers, and returns the base transformers model.
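
A minimal sketch of the usual merge-and-save pattern (saver_dir as in the snippet above; the base model id is taken from the config shown there) is to reload the full causal-LM base model, attach the saved adapter, merge, and save the result:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, saver_dir + "/adapter")
model = model.merge_and_unload()  # returns the plain transformers model with the LoRA weights folded in
model.save_pretrained(saver_dir)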
