-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize tree-building for neptune/opencl (gpu2 flag). #1422
Conversation
Okay, I moved @vmx's CI work over to this branch and adapted it to run lifecycle tests for both I've also verified that both the I think this is ready to go once the rest of CI passes. |
let layer_data: &mut Vec<_> = &mut layer_data; | ||
|
||
// gather all layer data in parallel. | ||
s.spawn(move |_| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you may want to spawn in the iteration.The current implemention seems doesn't paralleled actually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that this spawn
seems unnecessary — not sure if it was always like that or got introduced in a refactor at some point. The change you suggest doesn't seem to help though (in my benchmark at least): the bottleneck is in the conversion that follows. Since the goal now is to release with minimal changes, and since all options (status quo, actually spawning in loop, just removing the whole scope/spawn) seem to perform equivalently, I think we should just leave as-is for now — to minimize change and risk, given that this has already been tested/benchmarked/etc quite a bit. It can be removed completely in some future cleanup that aims to improve further.
let layer_data: &mut Vec<_> = &mut layer_data; | ||
|
||
// gather all layer data in parallel. | ||
s.spawn(move |_| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@porcuquine lets remove the unneeded spawn
at least from the new code path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have this in testing now with a new issue prepped for the various improvements noted here. If my testing goes ok, I'll merge this and open that issue.
This PR supersedes #1420 — since we have decided to keep
neptune/opencl
behind thegpu2
feature flag for now.@vmx Can you rebase your CI work on this branch and continue here? I made one modification to the line that actually runs the lifecycle tests. The previous version didn't set the flags, but the new one should correctly specify
gpu2
.