OOM with Phi-3-mini (3.8B) on 83.5GB RAM due to LoftQ #1708

@adamamer20

Description

System Info

System: Linux-6.1.58+-x86_64-with-glibc2.35 / Google Colab
peft: 0.10.0
transformers: 4.40.1
accelerate: 0.30.0
Python 3.10.12
RAM: 83.5 GB
GPU: A100 40GB
CPU: Intel(R) Xeon(R) CPU @ 2.20GHz

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

import gc

import torch as th
from peft import LoraConfig, LoftQConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

checkpoint_path = "microsoft/Phi-3-mini-4k-instruct"
# checkpoint_path = "microsoft/Phi-3-mini-128k-instruct"
model_kwargs = dict(
    use_cache=False,
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    torch_dtype=th.bfloat16,
    device_map="auto",
)
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules="all-linear",
    modules_to_save=None,
    loftq_config=LoftQConfig(loftq_bits=8),
    init_lora_weights="loftq",
    use_rslora=True,
)

model = AutoModelForCausalLM.from_pretrained(checkpoint_path, **model_kwargs)
model = prepare_model_for_kbit_training(model).to("cpu")
th.cuda.empty_cache()
gc.collect()

Up to this point everything is fine: the model is about 7 GB and sits in RAM.

However, when I try to get the PEFT model:

model = get_peft_model(model, peft_config)

this call crashes the entire runtime, even though there is plenty of free RAM.

Note that when I remove LoftQ, the problem does not occur:

peft_config = LoraConfig(
    r = 8,
    lora_alpha = 32,
    lora_dropout = 0.05,
    bias = "none",
    task_type =  "CAUSAL_LM",
    target_modules = "all-linear",
    modules_to_save = None,
    use_rslora = True, 
)

Expected behavior

The model should comfortably fit in RAM.
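A quick back-of-the-envelope check (3.8 B parameters; the figures are approximations and ignore activations and LoftQ temporaries):

```python
params = 3.8e9  # approximate Phi-3-mini parameter count

bf16_gb = params * 2 / 1e9  # resident model in bfloat16
fp32_gb = params * 4 / 1e9  # same weights if upcast to float32

print(bf16_gb)  # 7.6  -- matches the ~7 GB observed in RAM
print(fp32_gb)  # 15.2 -- even a full fp32 copy fits easily in 83.5 GB
```

So even several full-precision copies of the model should fit, which supports the report that the crash is not a simple model-size problem.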
