QLoRA / bnb.nf4 quantization causes issues in recent PyTorch Lightning/Fabric versions #1604
Comments
Not related to the Gemma 2 branch; this also occurs in main.
It doesn't seem to be related to the bitsandbytes and Lightning Fabric versions (the issue also occurs with bnb 0.41.3 and lightning 0.2.2). Maybe something in LitGPT has changed.
Not only QLoRA.
I am not sure what's changed that could be causing this; we have bitsandbytes and lightning/fabric pinned.
It's caused by PyTorch Lightning. Running pip install lightning==2.3.0.dev20240428 restores the package version that the repo used before.
This kind of issue needs to be caught by tests.
Ohhh, so basically #1579. We can revert to an older version, but the question is whether something needs to be updated in PyTorch Lightning (in case this was an accidental change) or in LitGPT (so that we can support newer PTL versions moving forward).
Added a quick PR to add a test and revert the lightning version until we have more time to investigate: #1605
It's not really fixed. Downgrading avoids the problem, but isn't it conceivable that at some point LitGPT might want to support newer versions of Lightning? What happens then? I think in such situations we should at least open a ticket with the library in question (lightning in this case). Plus, the stack trace hints at bitsandbytes being involved, so we'd also need to collect the bnb version used. These are all essential steps that would help us resolve these issues efficiently.
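For anyone gathering the version details asked for above, a small script along these lines may help. It is only a sketch: the package list is an assumption based on the libraries named in this thread, and `collect_versions` and `format_report` are hypothetical helper names, not part of LitGPT.

```python
from importlib import metadata

# Distribution names mentioned in this thread. The exact list is an
# assumption; adjust it to whatever the bug report needs.
PACKAGES = ("litgpt", "lightning", "bitsandbytes", "torch")


def collect_versions(packages=PACKAGES):
    """Map each package name to its installed version, or "not installed"."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def format_report(versions):
    """Render the versions as one "name: version" line per package."""
    return "\n".join(f"{name}: {version}" for name, version in versions.items())


if __name__ == "__main__":
    print(format_report(collect_versions()))
```

Pasting the printed lines into the upstream ticket makes it easier to tell whether a given lightning or bitsandbytes build is involved.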
Yes, I just realized this too and reopened the issue a few seconds before you posted. Let me prepare an issue for the PyTorch Lightning issue tracker.
See issue: Lightning-AI/pytorch-lightning#20119
With the fix in Lightning-AI/pytorch-lightning#20121, you can try updating the lightning package to the nightly build produced next Sunday, or wait for the next regular release.
Sounds great, thanks. I will set a reminder to test this on Sunday/Monday!
Bug description
Either I'm doing something dumb or QLoRA seems to be broken. Tried it with different models:
LoRA (fine)
QLoRA from config file (not fine)
QLoRA without config file
What operating system are you using?
Unknown
LitGPT Version