Xgboost GPU models do not release memory after training #3045
Can you try calling bst.__delete__() after each round? Python is garbage collected, so it may keep the booster object around. If the error persists after this, then it may be a bug.
Closing as no response. Can reopen if the issue persists.
Sorry, I haven't gotten around to testing it on the original system. I will give it a try and see what happens.
Okay. I have called
This raises a question: is there any way to purge the data off the GPU but keep the trained model? P.S. I am by no means an expert in how things are handled in this amazing package. I will understand if it is necessary to keep the training data after
Saving the model, deleting the booster, then loading the model again should achieve this.
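A minimal sketch of that save/delete/reload approach, assuming an already-trained booster `bst` (the file name and the explicit gc.collect() call are my additions, not from the thread):

```python
import gc
import xgboost as xgb

# Persist the trained model so the booster object can be discarded.
bst.save_model("model.bin")  # "model.bin" is an example file name

# Drop the Python reference and force a garbage-collection pass so the
# booster's GPU allocations can actually be freed.
del bst
gc.collect()

# Reload the trained model for later prediction; the reloaded booster no
# longer holds the training data on the GPU.
bst = xgb.Booster()
bst.load_model("model.bin")
```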
Sounds good, thanks for the help!
I am having what appears to be the same problem, but using R. I'm not sure what the equivalent of "deleting the booster" in R would be, since what is returned in R is considered a model object. There also does not appear to be a close match in R to the approach suggested above. Since this is a closely related issue, I'm hoping to piggyback on this ticket rather than opening a nearly-duplicate one.
@jpbowman01 "deleting the booster" in R would be rm(bst) followed by gc().
I have the same problem.
@aliyesilkanat Typo above: it needs to be bst.__del__().
Nonetheless, this is not working for me: single process, calling .__del__(), and nvidia-smi even shows the GPU memory being cleared, yet I still run into this issue, predictably. I have compiled with different NVIDIA drivers, GCCs, Linux headers, and CMake versions. I don't understand why this issue is closed.
se-I, I had the same problem and was able to solve it by running garbage collection with gc.collect() after the __del__() call.
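As a rough sketch of that workaround in a repeated-training loop (the training parameters, data names, and run count below are placeholders, not taken from the original report):

```python
import gc
import xgboost as xgb

params = {"tree_method": "gpu_hist"}  # example GPU training parameters

for run in range(3):
    dtrain = xgb.DMatrix(X_train, label=y_train)  # X_train, y_train: your data
    bst = xgb.train(params, dtrain, num_boost_round=100)

    # ... use bst for prediction, evaluation, etc. ...

    # Drop the booster and the DMatrix, then force garbage collection so
    # the GPU memory can be released before the next run allocates again.
    del bst
    del dtrain
    gc.collect()
```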
I also have this problem on a Windows machine with xgboost 0.7; deleting the booster does not seem to release the GPU memory (but a kernel restart does :).
Trying to call
I run my models with
Xgboost doesn't release GPU memory after training/predicting the model on large data.
Every further rerun of .fit causes more memory allocation until the kernel eventually crashes because the GPU runs out of memory.
Environment info
Operating System: Ubuntu 16.04 on PowerPC
Compiler:
Package used (python/R/jvm/C++): python
xgboost version used: built from source, commit 84ab74f (git rev-parse HEAD)
Steps to reproduce
The following code should not cause issues, but it produces out-of-memory errors if you run it twice. You might have to decrease the repeat factor for the data depending on the GPU memory you have (16 GB on my side).
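The original code snippet is not preserved in this copy of the issue. Purely as a hypothetical illustration of the scenario described (inflating a dataset by repeating a smaller one, then fitting a GPU model more than once), it might look roughly like this; every name and parameter here is an assumption, not the reporter's code:

```python
import numpy as np
import xgboost as xgb

# Hypothetical stand-in for the missing reproduction code.
# Repeat a small dataset to inflate it; lower REPEAT if you have less GPU memory.
REPEAT = 100
X_small = np.random.rand(10000, 50)
y_small = np.random.randint(0, 2, size=10000)
X = np.repeat(X_small, REPEAT, axis=0)
y = np.repeat(y_small, REPEAT)

model = xgb.XGBClassifier(tree_method="gpu_hist", n_estimators=50)
model.fit(X, y)  # first fit succeeds
model.fit(X, y)  # a second fit allocates GPU memory again and can exhaust it
```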