[dask] Add a dummy sample to infer output shape. #6645
This PR also fixes a performance issue in the predict function, where the booster might get serialized multiple times in |
This PR should significantly improve prediction performance.
#6648 needs to be merged first.
Codecov Report
@@ Coverage Diff @@
## master #6645 +/- ##
==========================================
+ Coverage 81.01% 81.12% +0.10%
==========================================
Files 13 13
Lines 3703 3703
==========================================
+ Hits 3000 3004 +4
+ Misses 703 699 -4
c3cc599 to 593c8e6
This is for inferring shape with direct prediction (without DaskDMatrix). A few things require a known output shape before the actual prediction is carried out, including dask metadata and output dataframe columns.

* Infer output shape based on local prediction.
* Remove set param in the predict function, as it is neither thread safe nor necessary now that we let dask decide the parallelism.
* Simplify prediction on `DaskDMatrix`.
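The idea of inferring the output shape from a local prediction can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `infer_output_meta`, `regression_predict` and `multiclass_predict` are hypothetical names, and plain NumPy callables stand in for a real booster. A single dummy row is pushed through the predictor, and the trailing dimensions and dtype of the result are used to build an empty "meta" array of the kind dask needs before any real prediction runs.

```python
import numpy as np


def infer_output_meta(predict_fn, n_features):
    """Predict on one dummy row and return (per-row shape, empty meta array).

    `predict_fn` is a hypothetical stand-in for a local, in-process
    prediction call; it is not part of any real library API.
    """
    # One row of zeros is enough: only the shape and dtype of the result matter.
    dummy = np.zeros((1, n_features), dtype=np.float32)
    out = np.asarray(predict_fn(dummy))
    # Drop the leading row dimension: () for 1-D output, (k,) for 2-D output.
    row_shape = out.shape[1:]
    # An empty array with the right trailing dimensions and dtype.
    meta = np.empty((0, *row_shape), dtype=out.dtype)
    return row_shape, meta


def regression_predict(x):
    # Hypothetical predictor: one value per row.
    return x.sum(axis=1)


def multiclass_predict(x, n_classes=3):
    # Hypothetical predictor: one column per class.
    return np.zeros((x.shape[0], n_classes), dtype=np.float32)


reg_shape, reg_meta = infer_output_meta(regression_predict, n_features=4)
cls_shape, cls_meta = infer_output_meta(multiclass_predict, n_features=4)
```

With a per-row shape of `(k,)`, the number of output dataframe columns is known up front to be `k`; an empty per-row shape means the result is a single column.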
0a417a8 to 4a40f80
A small part extracted from #6638, with an added test and redundant serialization removed.