Fix perf gap in thread safe prediction #6696
Conversation
I thought we had prediction caching enabled?
For the subsampling case we still don't have prediction caching (it seems it's still in progress: #6683; the major complexity relates to the multiclass classification case, as a separate index subset is generated for each group).
@ShvetsKS Would you like to take a look into the GPU implementation of subsampling? Or my WIP rewrite for CPU hist? The subsampling doesn't have to conflict with the cache.
Last time you found it to be slower than the master branch, which is expected, as I haven't had time to incorporate many recent optimizations into it. But I believe my rewrite can at least offer some insight into how to implement some of the features in a different way.
Current changes are applicable even for the inference stage, as it's better to initialize …
Force-pushed from 4e8ed8c to 198747f.
Codecov Report

@@           Coverage Diff           @@
##           master    #6696   +/-   ##
=======================================
  Coverage   81.56%   81.56%
=======================================
  Files          13       13
  Lines        3759     3759
=======================================
  Hits         3066     3066
  Misses        693      693
Force-pushed from 198747f to 1d4e4d1.
The problem with continuous-integration/travis-ci/pr is related to the server node.
Sorry for the long wait. Still on vacation, but I will try to look into it as soon as possible.
As in #6648, local RegTree::FVec storage was introduced to preserve thread safety, each training iteration requires buffer initialization in the subsampling case. This PR introduces threading for the initialization of those local buffers.
But since some performance gaps still exist, would it be possible to use a non-thread-safe PredictDMatrix call during training?