Description
System Info
Package Version
accelerate 0.29.0.dev0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
appdirs 1.4.4
async-timeout 4.0.3
attrs 23.2.0
bitsandbytes 0.43.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
datasets 2.18.0
deepspeed 0.14.0+ce78a632
dill 0.3.8
docker-pycreds 0.4.0
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
filelock 3.13.3
flash-attn 2.5.6
frozenlist 1.4.1
fsspec 2024.2.0
gitdb 4.0.11
GitPython 3.1.42
hjson 3.1.0
huggingface-hub 0.22.1
idna 3.6
iniconfig 2.0.0
Jinja2 3.1.3
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.1
ninja 1.11.1.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
packaging 24.0
pandas 2.0.3
peft 0.10.1.dev0
pillow 10.2.0
pip 24.0
pluggy 1.4.0
protobuf 3.20.1
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pydantic 2.6.4
pydantic_core 2.16.3
Pygments 2.17.2
pynvml 11.5.0
pytest 8.1.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2023.12.25
requests 2.31.0
rich 13.7.1
safetensors 0.4.2
scipy 1.10.1
sentencepiece 0.2.0
sentry-sdk 1.43.0
setproctitle 1.3.3
setuptools 69.2.0
shtab 1.7.1
six 1.16.0
smmap 5.0.1
sympy 1.12
text-generation 0.7.0
tokenizers 0.15.2
tomli 2.0.1
torch 2.2.1
torchaudio 2.2.1
torchvision 0.17.1
tqdm 4.66.2
transformers 4.40.0.dev0
triton 2.2.0
trl 0.8.1
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
urllib3 2.2.1
wandb 0.16.5
wheel 0.43.0
xxhash 3.4.1
yarl 1.9.4
python 3.11
I have tested this on both a dual-A100 and a dual-3090 system, using the same Docker image in both cases.
Who can help?
@pacman100 @younesbelkada @sayakpaul
When calling the get_peft_model method with a config that has use_dora=True, it takes a very long time (several minutes) to get the model back. Meanwhile, if I use a regular LoRA config, I get the model almost immediately. Oddly enough, I also do not have this issue when using QDoRA.
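To make the comparison concrete, here is a minimal, self-contained timing sketch along the lines of what I am describing (the small gpt2 model and r=16 are placeholders so the snippet runs quickly anywhere; my actual setup uses Mistral-7B, as shown under Reproduction below):

```python
import time

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

for use_dora in (False, True):
    # Load a fresh base model each time so the two wrap timings are comparable.
    base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder small model
    cfg = LoraConfig(task_type=TaskType.CAUSAL_LM, r=16, use_dora=use_dora)
    start = time.perf_counter()
    get_peft_model(base, cfg)  # the step that is slow for me when use_dora=True
    print(f"use_dora={use_dora}: {time.perf_counter() - start:.2f}s")
```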
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# access_token, args, target_modules, and modules_to_save are defined earlier in the full script linked below
model_name = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token, use_flash_attention_2=True)
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=args.lora_rank,
    lora_alpha=args.lora_alpha,
    lora_dropout=args.lora_dropout,
    target_modules=target_modules,
    modules_to_save=modules_to_save,
    use_dora=args.dora,
)
model = get_peft_model(model, peft_config)  # this is the call that takes several minutes when use_dora=True

I removed some things to keep it simple. If you want to see a more complete example of how I am running this, please see the code here.
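One extra data point that might help triage, reusing the names from the snippet above: a variant that moves the base model to the GPU before wrapping. This is only a guess on my part that the extra DoRA initialization work is CPU-bound; I have not confirmed it.

```python
# Variant of the repro above (same model_name, access_token, peft_config):
# put the weights on the GPU first, then wrap, to check whether the DoRA
# setup is faster when it does not run on CPU tensors.
model = AutoModelForCausalLM.from_pretrained(
    model_name, token=access_token, use_flash_attention_2=True
).to("cuda")
model = get_peft_model(model, peft_config)
```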
Expected behavior
I would expect DoRA to load as quickly as LoRA, or at least not several orders of magnitude slower.