Just importing peft leads to use of CUDA and introduces a CUDA context that makes forks impossible #559
Older peft did not do this, but newer versions do, at least since 3714aa2 and likely earlier. This means that if peft is imported in the global scope as normal, no forks are possible in Python anymore, ruining multiprocessing and the like. The use of CUDA should be lazy and on-demand, not forced by importing peft; that is, CUDA should only be introduced when the model itself is put on CUDA, not merely because peft was imported. Because of this, after such an import, some things still work, but forking and then using CUDA fails with a CUDA re-initialization error.
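A minimal sketch of the reported failure (a reconstruction, since the original snippets are not reproduced here; it assumes Linux fork semantics, a CUDA GPU, and bitsandbytes installed; single-process CUDA use still works, it is the forked children that break):

```python
import torch
import peft  # noqa: F401 -- per the report, this import alone creates a CUDA context

from multiprocessing import get_context

def child(_):
    # Without the peft import this runs fine in a forked child; after the
    # import it fails with something like:
    #   RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use
    #   CUDA with multiprocessing, you must use the 'spawn' start method
    return torch.ones(1, device="cuda").item()

if __name__ == "__main__":
    with get_context("fork").Pool(2) as pool:
        print(pool.map(child, [0, 1]))
```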
Comments
A local-scope import isn't a good workaround: if peft is used at all, the same problems occur (see the sketch below). CUDA needs to be isolated to the point when the model is put onto CUDA, and only then.
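For illustration, this is the kind of deferred import being referred to (a sketch; load_adapter is a made-up helper). It only postpones the peft import, so the first call still drags in bnb:

```python
def load_adapter(base_model, adapter_path):
    # Deferring the import only delays the problem: the first call still
    # imports peft, which transitively imports bitsandbytes and creates the
    # CUDA context that breaks later forks.
    from peft import PeftModel
    return PeftModel.from_pretrained(base_model, adapter_path)
```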
Hi @pseudotensor, from peft import PeftModel has always led to importing bnb-related modules if bnb is installed.
It is definitely new behavior. I can’t just wrap the import.
Thanks, I see. Would you be able to share exactly which commit this happens from? I will also investigate on my side and let you know.
I don't know the exact one, but the above code can be used to bisect.
@younesbelkada Any update? It seems like it should be easy to fix.
Hi @pseudotensor
I ran a check, and it is indeed the top-level import of bnb that causes the issue. Commenting out the import fixes it. With the current state of the code base, it might be possible to prevent any top-level imports of bnb, but it wouldn't be trivial. I do wonder, however, whether it would be possible for bnb to make a change to avoid the issue. I don't know that library well, so maybe someone else can comment on that?
Hi, but with that change, peft now depends fully on bitsandbytes, even though it is just one component. bitsandbytes is not trivial to install on every system, e.g. Windows or Mac. So this limits peft quite a bit unless it is fixed inside peft itself.
I agree on that point. We should probably think about making bitsandbytes an optional dependency, which should fix this issue. wdyt @pacman100 ?
But that won't solve the issue here, because bitsandbytes messes up multiprocessing due to its CUDA import. I should be able to use CUDA and have bitsandbytes, but not have an import of peft necessarily load bitsandbytes globally. bitsandbytes should only be loaded when the model object requires it, not at global import time.
A simple fix for peft is to put the Linear8bitLt class in a separate file and only import it locally. Then it will live inside LoraModel and never cause any problems, because it is not imported until the LoraModel is created (i.e. a model-time import).
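A minimal sketch of that model-time import pattern (the function name and structure here are illustrative, not peft's actual layout):

```python
import torch.nn as nn

def build_linear(in_features: int, out_features: int, load_in_8bit: bool) -> nn.Module:
    """Create the layer to wrap with LoRA, touching bnb only if needed."""
    if load_in_8bit:
        # Deferred import: bitsandbytes (and its CUDA initialization) is only
        # pulled in when an 8-bit layer is actually requested.
        import bitsandbytes as bnb
        return bnb.nn.Linear8bitLt(in_features, out_features)
    return nn.Linear(in_features, out_features)
```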
As I mentioned, I think it should be possible to load bnb lazily; what you suggest is one possibility. But it isn't a trivial change: for instance, we would have to ensure in our test suite that nothing breaks whether bnb is installed or not.
Assuming you do have bnb, then this is irrelevant, right?
That last part should be fixed by making bnb an optional dependency, right? IMO if bnb could do something about this issue, it would still be a win (but I don't know how trivial it is for them to fix it).
I presume you have tests of LoRA that use bitsandbytes, so such local imports would be tested. In order to test that bitsandbytes is no longer breaking anything, just add the trivial repro I provided in this issue to your testing. I don't see these as non-trivial.
Maybe I'm missing something, but what I mean is that to test that the dependency on bnb is indeed optional, we have to create an env in our test setup (or mock the existing one) to pretend that bnb is not installed, and then run all the tests (except the bnb-specific ones) to ensure that they still pass without bnb installed. Otherwise, we might have code that depends on bnb when it shouldn't, but if bnb is installed in the test env, we wouldn't notice. Does that make sense?
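One way to exercise the "bnb not installed" path without a second environment (a sketch assuming pytest; in Python, a None entry in sys.modules makes the corresponding import raise ImportError):

```python
import sys

def test_import_peft_without_bnb(monkeypatch):
    # Simulate "bitsandbytes not installed": a None entry in sys.modules
    # causes `import bitsandbytes` to raise ImportError.
    monkeypatch.setitem(sys.modules, "bitsandbytes", None)
    # Drop any cached peft modules so the import runs fresh.
    for name in [m for m in sys.modules if m.split(".")[0] == "peft"]:
        monkeypatch.delitem(sys.modules, name)
    import peft  # noqa: F401 -- must succeed with bnb "missing"
```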
Btw, as a workaround until we have a fix, you should be able to patch peft's check for whether bnb is available.
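Presumably (the exact target of the patch was cut off above) this meant something like forcing the availability check to report False; is_bnb_available here is an assumption about the installed version:

```python
import peft.import_utils

# Hypothetical workaround: make peft believe bitsandbytes is unavailable by
# patching its availability check before any PEFT model is created.
peft.import_utils.is_bnb_available = lambda: False
```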
This would be an improvement even over the situation before the problem started, and would be nice to have. But the fixes I suggested are, I think, critical.
I still want to use bitsandbytes; I just don't want it imported early, contaminating the global scope with a CUDA context and making forking and CUDA tasks impossible (the original issue I reported).
Ah, I see; in that case this suggestion wouldn't actually work.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Any updates? This shouldn't be closed, IMO.
Yes, re-opened; we're still discussing this internally.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Sorry that this took so long. #1230 should have fixed the issue.