[GPU/OpenCL] Can use at most 8.3GB GPU memory #4480
Comments
Got the same issue. I'm using the C API. When the training dataset is too large, the program simply crashes when I call `LGBM_BoosterCreate()`. With 7.5 GB of data it works fine; if I add more data (estimated ~8.9 GB), it crashes before the last line. It happens whether I increase the number of features or the number of data points, as long as the total size is more than ~8 GB.
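The sizes reported above can be checked with simple arithmetic. The following is a minimal sketch (pure Python); the row counts are hypothetical values chosen to match the ~7.5 GB / ~8.9 GB figures in this comment, and the ~8 GB ceiling is the empirical limit observed in this thread, not a documented constant:

```python
# Rough estimate of the raw feature-matrix size handed to the C API,
# assuming dense float64 input (8 bytes per value).

BYTES_PER_VALUE = 8          # assumed: C_API_DTYPE_FLOAT64 input
OBSERVED_CEILING_GB = 8.0    # empirical threshold reported in this thread

def matrix_gb(n_rows, n_cols, bytes_per_value=BYTES_PER_VALUE):
    """Raw size of a dense n_rows x n_cols matrix, in GB."""
    return n_rows * n_cols * bytes_per_value / 1e9

# Hypothetical shapes matching the ~7.5 GB / ~8.9 GB sizes described above.
ok_gb = matrix_gb(937_500, 1_000)      # ~7.5 GB -> works
bad_gb = matrix_gb(1_112_500, 1_000)   # ~8.9 GB -> crashes in LGBM_BoosterCreate

print(f"ok:  {ok_gb:.1f} GB (under the ~{OBSERVED_CEILING_GB} GB ceiling)")
print(f"bad: {bad_gb:.1f} GB (over the ~{OBSERVED_CEILING_GB} GB ceiling)")
```

This also matches the observation that it does not matter whether rows or columns grow: only the product `n_rows * n_cols * bytes_per_value` crosses the threshold.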
I have this same problem! My company has two types of GPUs: V100-32G and A100-40G. When X_train is larger than shape=(850w, 1000) (i.e. 8.5 million rows), LightGBM-GPU hits the same problem. It looks like an OOM, but I'm not sure. The GPU memory usage is about 8.3 GB, same as yours.
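A back-of-the-envelope check of the (850w, 1000) case. Two assumptions here, neither of which is a documented fact: "850w" is Chinese numeric shorthand for 8.5 million rows, and the GPU-side binned representation costs roughly one byte per feature value:

```python
# Estimate of the GPU-side dataset size for shape=(850w, 1000).
# Assumptions (not documented facts): "850w" = 8,500,000 rows, and
# roughly one byte per binned feature value on the GPU.

ROWS = 8_500_000             # "850w" (w = wan = 10,000)
COLS = 1_000
BYTES_PER_BINNED_VALUE = 1   # assumed GPU-side bin width

binned_gb = ROWS * COLS * BYTES_PER_BINNED_VALUE / 1e9
print(f"estimated GPU-side size: {binned_gb:.1f} GB")
```

Under those assumptions the estimate lands at ~8.5 GB, just past the ~8.3 GB usage reported above, which is consistent with an internal allocation ceiling rather than exhaustion of the 32 GB or 40 GB cards.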
I see this bug is fixed by #4928.
Thank you so much for getting back and sharing this observation!
May I ask which release has this fix? I'm using v3.3.5 and the problem can still be reproduced.
It will be in release v4.0.0. You can follow #5153 to be notified when that release is published.
This issue has been automatically locked because there has not been any recent activity since it was closed.
Hi, thanks for the package! I noticed a similar issue to #3899. I am using LightGBM version 3.2.1 on an NVIDIA Tesla V100-SXM2-16GB (16 GB memory). Running the following code, we can see the first run takes only 8299 MB of GPU memory, which means the second one should fit, since the number of data points only slightly increases. However, it turns out we get the following error message. Could someone let me know if there is an internal memory limit in the LightGBM library? Thanks very much for the help!
Code:
Error message:
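The original code and traceback were not preserved in this thread. The reporter's reasoning can still be made explicit with a small sketch; the row counts below are hypothetical, and only the 8299 MB usage and the 16 GB card are taken from the report above:

```python
# The reporter's argument: the first run fits in 8299 MB on a 16 GB V100,
# so a run with only slightly more data points should also fit, yet it fails.
# This makes the proportionality explicit under a linear-scaling assumption.

CARD_MB = 16_000           # Tesla V100-SXM2-16GB
FIRST_RUN_MB = 8_299       # observed GPU memory usage of the first run

def scaled_usage_mb(rows, baseline_rows, baseline_mb=FIRST_RUN_MB):
    """Assume GPU memory usage scales linearly with the number of rows."""
    return baseline_mb * rows / baseline_rows

# A ~5% increase in rows (hypothetical counts) should still fit easily...
second_run_mb = scaled_usage_mb(rows=1_050_000, baseline_rows=1_000_000)
print(f"predicted second run: {second_run_mb:.0f} MB of {CARD_MB} MB")
```

The predicted usage stays far below the card's 16 GB, so a crash at that point suggests an internal allocation limit around 8.3 GB rather than genuine GPU memory exhaustion, which is what #4928 later addressed.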