Just importing peft leads to use of CUDA and introduces a CUDA context that makes forks impossible #559
Older peft did not do this, but newer versions do, at least since 3714aa2 and likely earlier. This means that if peft is imported in the global scope as normal, no forks are possible in Python anymore, ruining multiprocessing and the like. The use of CUDA should be lazy and on-demand, not forced by importing peft; that is, CUDA should only be introduced when the model itself is put on CUDA, not merely because peft was imported. Because of this, after such an import, some things still work, but forking and then using CUDA fails with a CUDA re-initialization error.
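A minimal sketch of the reported failure (a reconstruction, since the original snippets are not reproduced here; it assumes Linux fork semantics, a CUDA GPU, and bitsandbytes installed; single-process CUDA use still works, it is the forked children that break):

```python
import torch
import peft  # noqa: F401 -- per the report, this import alone creates a CUDA context

from multiprocessing import get_context

def child(_):
    # Without the peft import this runs fine in a forked child; after the
    # import it fails with something like:
    #   RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use
    #   CUDA with multiprocessing, you must use the 'spawn' start method
    return torch.ones(1, device="cuda").item()

if __name__ == "__main__":
    with get_context("fork").Pool(2) as pool:
        print(pool.map(child, [0, 1]))
```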
Comments
A local-scope import isn't a good workaround: if peft is used at all, the same problems occur (see the sketch below). CUDA needs to be isolated to the point when the model is put onto CUDA, and only then.
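For illustration, this is the kind of deferred import being referred to (a sketch; load_adapter is a made-up helper). It only postpones the peft import, so the first call still drags in bnb:

```python
def load_adapter(base_model, adapter_path):
    # Deferring the import only delays the problem: the first call still
    # imports peft, which transitively imports bitsandbytes and creates the
    # CUDA context that breaks later forks.
    from peft import PeftModel
    return PeftModel.from_pretrained(base_model, adapter_path)
```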
Hi @pseudotensor, from peft import PeftModel has always led to importing bnb-related modules if bnb is installed.
It is definitely new behavior. I can’t just wrap the import.
Thanks, I see. Would you be able to share exactly which commit this happens from? I will also investigate on my side and let you know.
I don't know the exact one, but the above code can be used to bisect.
@younesbelkada Any update? It seems like it should be easy to fix.
Hi @pseudotensor
I ran a check, and it is indeed the top-level import of bnb that causes the issue. Commenting out the import fixes it. With the current state of the code base, it might be possible to prevent any top-level imports of bnb, but it wouldn't be trivial. I do wonder, however, whether it would be possible for bnb to make a change to avoid the issue. I don't know that library well, so maybe someone else can comment on that?
Hi, but with that change, peft now depends fully on bitsandbytes, even though it is just one component. bitsandbytes is not trivial to install on every system, e.g. Windows or Mac. So this limits peft quite a bit unless it is fixed inside peft itself.
I agree on that point. We should probably think about making bitsandbytes an optional dependency, which should fix this issue. wdyt @pacman100 ?
But that won't solve the issue here, because bitsandbytes messes up multiprocessing due to its CUDA import. I should be able to use CUDA and have bitsandbytes, but not have an import of peft necessarily load bitsandbytes globally. bitsandbytes should only be loaded when the model object requires it, not at global import time.
A simple fix for peft is to put the Linear8bitLt class in a separate file and only import it locally. Then it will live inside LoraModel and never cause any problems, because it is not imported until the LoraModel is created (i.e. a model-time import).
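A minimal sketch of that model-time import pattern (the function name and structure here are illustrative, not peft's actual layout):

```python
import torch.nn as nn

def build_linear(in_features: int, out_features: int, load_in_8bit: bool) -> nn.Module:
    """Create the layer to wrap with LoRA, touching bnb only if needed."""
    if load_in_8bit:
        # Deferred import: bitsandbytes (and its CUDA initialization) is only
        # pulled in when an 8-bit layer is actually requested.
        import bitsandbytes as bnb
        return bnb.nn.Linear8bitLt(in_features, out_features)
    return nn.Linear(in_features, out_features)
```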
As I mentioned, I think it should be possible to load bnb lazily; what you suggest is one possibility. But it isn't a trivial change: for instance, we would have to ensure in our test suite that nothing breaks whether bnb is installed or not.
Assuming you do have bnb, then this is irrelevant, right?
That last part should be fixed by making bnb an optional dependency, right? IMO if bnb could do something about this issue, it would still be a win (but I don't know how trivial it is for them to fix it).
I presume you have tests of LoRA that use bitsandbytes, so such local imports would be tested. In order to test that bitsandbytes is no longer breaking anything, just add the trivial repro I provided in this issue to your testing. I don't see these as non-trivial.
Maybe I'm missing something, but what I mean is that to test that the dependency on bnb is indeed optional, we have to create an env in our test setup (or mock the existing one) to pretend that bnb is not installed, and then run all the tests (except the bnb-specific ones) to ensure that they still pass without bnb installed. Otherwise, we might have code that depends on bnb when it shouldn't, but if bnb is installed in the test env, we wouldn't notice. Does that make sense?
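One way to exercise the "bnb not installed" path without a second environment (a sketch assuming pytest; in Python, a None entry in sys.modules makes the corresponding import raise ImportError):

```python
import sys

def test_import_peft_without_bnb(monkeypatch):
    # Simulate "bitsandbytes not installed": a None entry in sys.modules
    # causes `import bitsandbytes` to raise ImportError.
    monkeypatch.setitem(sys.modules, "bitsandbytes", None)
    # Drop any cached peft modules so the import runs fresh.
    for name in [m for m in sys.modules if m.split(".")[0] == "peft"]:
        monkeypatch.delitem(sys.modules, name)
    import peft  # noqa: F401 -- must succeed with bnb "missing"
```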
Btw, as a workaround until we have a fix, you should be able to patch peft's check for whether bnb is available.
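Presumably (the exact target of the patch was cut off above) this meant something like forcing the availability check to report False; is_bnb_available here is an assumption about the installed version:

```python
import peft.import_utils

# Hypothetical workaround: make peft believe bitsandbytes is unavailable by
# patching its availability check before any PEFT model is created.
peft.import_utils.is_bnb_available = lambda: False
```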
This would be an improvement even over the situation before the problem started, and would be nice to have. But the fixes I suggested are, I think, critical.
I still want to use bitsandbytes; I just don't want it imported early, contaminating the global scope with a CUDA context and making forking and CUDA tasks impossible (the original issue I reported).
Ah, I see; in that case this suggestion wouldn't actually work.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Any updates? This shouldn't be closed, IMO.
Yes, re-opened; we're still discussing this internally.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Sorry that this took so long. #1230 should have fixed the issue.