mgpu predictor using explicit offsets #4438
Conversation
Force-pushed from 8472e27 to 024ca7a
Force-pushed from 024ca7a to 5c98525
@RAMitchell thanks for your review. This PR can optimize away those copies for batch prediction, as the other PR does. But I think this PR still has to work for smaller batch sizes; I have provided some comments to this effect earlier.
@RAMitchell @sriramch I ran some experiments on these two PRs using 1 billion rows, written to a tmpfs directory on a GCP VM with 4x T4 GPUs. Comparing the timing of this PR against #4437, this PR looks slightly faster.
@canonizer @RAMitchell @sriramch @trivialfis All the comments have been addressed. Please take another look.
rest lgtm...
LGTM aside from one comment
@hcho3 looks like the build machine ran out of disk space?
Force-pushed from 7b407f4 to 5994a21
@rongou I will go ahead and expand disk space for the slave workers.
Force-pushed from de96c65 to 0b8bff5
@rongou @RAMitchell I think this PR should be part of 0.90, since it's a follow-up to #4284. Can we merge this now?
Alternative approach to #4437. Uses explicit offsets to slide over the output prediction vector. I feel this is slightly cleaner, but I'm ok with either approach.
@canonizer @RAMitchell @sriramch