
Getting DoRA Model Is Very Slow #1593

@mallorbc

Description

System Info

Package Version


accelerate 0.29.0.dev0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
appdirs 1.4.4
async-timeout 4.0.3
attrs 23.2.0
bitsandbytes 0.43.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
datasets 2.18.0
deepspeed 0.14.0+ce78a632
dill 0.3.8
docker-pycreds 0.4.0
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
filelock 3.13.3
flash-attn 2.5.6
frozenlist 1.4.1
fsspec 2024.2.0
gitdb 4.0.11
GitPython 3.1.42
hjson 3.1.0
huggingface-hub 0.22.1
idna 3.6
iniconfig 2.0.0
Jinja2 3.1.3
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.1
ninja 1.11.1.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
packaging 24.0
pandas 2.0.3
peft 0.10.1.dev0
pillow 10.2.0
pip 24.0
pluggy 1.4.0
protobuf 3.20.1
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pydantic 2.6.4
pydantic_core 2.16.3
Pygments 2.17.2
pynvml 11.5.0
pytest 8.1.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2023.12.25
requests 2.31.0
rich 13.7.1
safetensors 0.4.2
scipy 1.10.1
sentencepiece 0.2.0
sentry-sdk 1.43.0
setproctitle 1.3.3
setuptools 69.2.0
shtab 1.7.1
six 1.16.0
smmap 5.0.1
sympy 1.12
text-generation 0.7.0
tokenizers 0.15.2
tomli 2.0.1
torch 2.2.1
torchaudio 2.2.1
torchvision 0.17.1
tqdm 4.66.2
transformers 4.40.0.dev0
triton 2.2.0
trl 0.8.1
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
urllib3 2.2.1
wandb 0.16.5
wheel 0.43.0
xxhash 3.4.1
yarl 1.9.4

Python 3.11

I have tested this on both a dual-A100 and a dual-3090 system, using the same Docker image.

Who can help?

@pacman100 @younesbelkada @sayakpaul

When calling the get_peft_model method with a config that has use_dora=True, the time to get the model is very long (several minutes). Meanwhile, if I just use a regular LoRA model, I get the model almost immediately. Oddly enough, I also do not have this issue when using a QDoRA model (DoRA on a quantized base model).
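
For reference, here is a minimal, self-contained timing sketch of what I mean (the rank r=8 is an arbitrary choice for illustration; everything else matches the reproduction below):

import time

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"

for use_dora in (False, True):
    # Reload the base model each time so the comparison is fair.
    base_model = AutoModelForCausalLM.from_pretrained(model_name)
    config = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, use_dora=use_dora)
    start = time.perf_counter()
    get_peft_model(base_model, config)
    print(f"use_dora={use_dora}: {time.perf_counter() - start:.1f}s")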

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token, use_flash_attention_2=True)
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM, inference_mode=False, r=args.lora_rank,
    lora_alpha=args.lora_alpha, lora_dropout=args.lora_dropout,
    target_modules=target_modules, modules_to_save=modules_to_save,
    use_dora=args.dora,
)
model = get_peft_model(model, peft_config)  # very slow when use_dora=True

I removed some code to keep the example simple. If you want to see a more complete example of how I am running this, please see the code here
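
A guess on my part (not confirmed): DoRA initialization computes a per-module weight-norm (magnitude) vector on top of the usual LoRA setup, and if the base model is still on CPU at that point, those norm computations could dominate the load time. A quick experiment along those lines:

# Untested idea: move the base model to GPU before wrapping it, so that
# DoRA's extra init work (the weight-norm computation) runs on GPU.
model = model.to("cuda")
model = get_peft_model(model, peft_config)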

Expected behavior

I would expect DoRA to load as quickly as LoRA, or at least not several orders of magnitude slower.
