Accelerate Error #2216
Again, please state info about your env as I asked in the other issue. bitsandbytes and other similar libraries will init CUDA on import. You need to hide this import inside your training function so it gets imported after you've launched your notebook launcher. Later versions of Accelerate will warn if this happens.
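For reference, a minimal sketch of the pattern being described (the function name and `num_processes` value are just placeholders): any import that can touch CUDA, such as bitsandbytes or peft, stays inside the function passed to `notebook_launcher`, so CUDA is only initialized in the spawned processes.

```python
from accelerate import notebook_launcher

def training_function():
    # Deferred imports: libraries that may initialize CUDA on import
    # (e.g. bitsandbytes, peft) are only imported inside the launched processes.
    import bitsandbytes as bnb  # noqa: F401
    from accelerate import Accelerator

    accelerator = Accelerator()
    accelerator.print("Running on", accelerator.device)
    # ... build the model/optimizer and train here ...

# Launched from a notebook cell; adjust num_processes to the number of GPUs.
notebook_launcher(training_function, num_processes=2)
```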
Thank you, that makes sense, but I had removed bitsandbytes and rebooted. Not sure why it says there is no default config, because I walked through the config wizard and successfully had accelerate working.
I reconfigured accelerate, but I'm still getting the same error.
Can you try installing from accelerate main?
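In a notebook that usually means something like the cell below (assuming a standard pip setup; restart the kernel afterwards):

```
# Notebook cell: install accelerate from the current main branch
%pip install git+https://github.com/huggingface/accelerate
```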
Still experiencing the error. I rebooted as well. I'm also not running any other notebooks or processes which would use CUDA. I don't see anything in the code that would initialize CUDA before the notebook launcher is called.
That shouldn't be the case / shouldn't be happening 👀 I ran the notebook launcher just fine. Can you give me the relevant output? All the imports in Accelerate are very CUDA-careful for this exact reason.
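A quick way to sanity-check that (just `torch.cuda.is_initialized()` before and after the suspect import):

```python
import torch
import accelerate  # importing accelerate alone should not initialize CUDA

# False means no import so far has touched CUDA; if a later import flips this
# to True, that import is what breaks notebook_launcher.
print(torch.cuda.is_initialized())
```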
Nevermind! This was definitely the case with accelerate-0.21.0; after a pip update to accelerate-0.25.0 it's gone. Sorry for the distraction.

Edit:

```python
import torch

# Before importing peft, CUDA should not be initialized yet
display(torch.cuda.is_initialized())

from peft import (
    get_peft_config,
    get_peft_model,
    get_peft_model_state_dict,
    set_peft_model_state_dict,
    LoraConfig,
    PeftType,
    PrefixTuningConfig,
    PromptEncoderConfig,
)

# Check whether the peft import itself initialized CUDA
display(torch.cuda.is_initialized())
```

Output:
Yes, IIRC I opened an issue on the peft side for this. Nothing we can do here; they have to address it on their side :) (So just import it inside your training func.)
Yes, we should revisit this in PEFT!
Following the above regarding moving the accelerate import into the loop, and after rebooting my machine to clear the GPU RAM, I am still getting the error message. Should I move peft into the loop as well?
Error Message:
Yes, please test that as well and let us know if it solves the problem.
Nevermind, I see: moving both modules into the loop solved it. Thank you!
My notebook was working and then stopped working when I duplicated it to test out a small revision that required bitsandbytes. I installed bitsandbytes into the same virtual environment, which I wouldn't expect to cause any issues. When I went back to the original notebook, it no longer ran successfully; I'm now getting the error below. I have since uninstalled bitsandbytes and restarted the kernel. I'm not sure what happened, and I cannot find anyone else experiencing this issue on Stack Overflow or elsewhere.
LoRA-multi-gpu-working.zip
Error: