Calling multithreaded functions sets global number of OMP threads #4705
Comments
Ah, just wrote a comment in the related issue! 🙂 Could you please share your expert opinion in #4607 on the chosen design?
Related: dmlc/xgboost#7537.
@david-cortes if you use the default value (-1), the current OpenMP setting is not changed.
@guolinke : But the issue still persists: if one wants to pass a number of threads that's different from the OMP default, it will alter the current OMP default. Also, scikit-learn's glossary gives negative n_jobs values a defined meaning (for example, -1 means use all processors), and I think users would typically expect scikit-learn-compatible packages to follow that kind of convention.
I see. A quick solution is to have a context manager. For example, when LightGBM starts running, it records the current OMP number of threads; then, when exiting LightGBM, it resets the OMP number of threads to the previously recorded value.
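A minimal C++ sketch of that record-and-restore idea, assuming an RAII guard wrapped around LightGBM's entry points; the class and function names here are hypothetical, not LightGBM's actual code:

```cpp
#include <omp.h>

// Hypothetical RAII guard: saves the current OpenMP max-thread setting on
// construction and restores it on destruction, so a caller-supplied thread
// count never leaks into the rest of the process.
class OmpThreadGuard {
 public:
  explicit OmpThreadGuard(int requested_threads)
      : saved_threads_(omp_get_max_threads()) {
    if (requested_threads > 0) {
      omp_set_num_threads(requested_threads);
    }
  }
  ~OmpThreadGuard() {
    omp_set_num_threads(saved_threads_);  // restore the previous setting
  }

 private:
  int saved_threads_;
};

// Illustrative usage inside an entry point:
// void SomeEntryPoint(int num_threads) {
//   OmpThreadGuard guard(num_threads);
//   // ... parallel work ...
// }  // previous thread count restored here
```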
Just wanted to mention that this problem is quite nasty when working on AutoML systems (I develop AutoGluon), as LightGBM can alter the OpenMP thread count at inference time, dramatically slowing down torch neural network models running on CPU when they are ensembled together. It would be great if this was fixed. We currently use a context manager to avoid the issue, but it slows down inference in small-batch-size scenarios.
Thanks for the comment @Innixma ! That's important context. We've really been struggling with a lack of maintainer attention and availability here, which is why this has been open so long. Are you interested in working on this problem and submitting a proposal in a pull request? That is the best way to ensure it's fixed in the next release of LightGBM.
Hi @jameslamb, thanks for your attention! I'm sorry to hear that LightGBM is struggling to obtain prioritization. It is an amazing package and foundational to many AutoML systems. There is a reason the only model I have more than one of in our ensemble is LightGBM, and we use it 3 times!

Unfortunately, to say that my C++ skills are rusty would be an understatement, and I'd expect the rest of the AutoGluon developers would say the same. We recently implemented a work-around in AutoGluon that fixes the majority of the slowdown (2.5x faster ensemble inference). The work-around mostly boils down to avoiding the use of a context manager entirely and always relying on the default thread count, so LightGBM picks up the current OpenMP thread limit.

This solution is a bit shaky, since it depends on the assumption that LightGBM will default to the current OpenMP thread limit, and that the current OpenMP thread limit is set to a value that all model types are efficient with (not really possible in practice, as some models prefer the logical core count and others the physical core count, but hopefully it is close enough not to be too slow). Also, the optimal OpenMP thread limit differs based on inference batch size, but that is a separate dimension of optimization. Thankfully, it is pretty easy to benchmark, and for most cases I think the work-around generally solves the problem on our end.
Assigning this to myself, as I'm actively working on it right now. |
Calling LightGBM functions which use multithreading will forcibly change the configured number of OpenMP threads for the whole process from which LightGBM was called, which is bad practice (for example, depending on how NumPy was built, it might change the number of threads that matrix multiplications in NumPy use afterwards). This can be checked in lines such as this one:
LightGBM/src/c_api.cpp, line 2144 (commit a2b60e8)
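For context, a minimal sketch of the problematic pattern, assuming the referenced line applies the caller-supplied thread count via omp_set_num_threads(); the Predict function and its parameters are hypothetical:

```cpp
#include <omp.h>

// Problematic pattern: the caller-supplied thread count is applied
// process-wide, so it leaks into any other OpenMP-based code (e.g. NumPy's
// BLAS) that runs after this call returns.
void Predict(int num_threads, int num_rows) {
  omp_set_num_threads(num_threads);  // changes the global OpenMP setting
  #pragma omp parallel for
  for (int i = 0; i < num_rows; ++i) {
    // ... per-row work ...
  }
  // The previous thread count is never restored.
}
```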
The fix is conceptually simple but requires some design changes in the overall code structure: don't set the global number of OpenMP threads; instead, pass the thread count as a parameter to the OpenMP pragmas.
Example:
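A minimal sketch of the proposed approach, with hypothetical function and variable names: the thread count is passed to the parallel region itself through the num_threads() clause, so the process-wide OpenMP setting is left untouched.

```cpp
#include <omp.h>

// Scoped alternative: the thread count applies only to this parallel region,
// and the process-wide OpenMP setting is never modified.
void Predict(int num_threads, int num_rows) {
  #pragma omp parallel for num_threads(num_threads) schedule(static)
  for (int i = 0; i < num_rows; ++i) {
    // ... per-row work ...
  }
}
```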