-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement pure OpenCL batch hashing. #78
Conversation
93a8f10
to
2d6acaa
Compare
Benchmarks on RTX3090
|
f39156c
to
678b26c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two small notes, nice work
CHANGELOG.md
Outdated
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://book.async.rs/overview | |||
|
|||
## Unreleased | |||
|
|||
## 2.4.1 - 2021-1-15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn’t this go under unreleased technically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change it. I was originally hoping to just release this version immediately after.
src/batch_hasher.rs
Outdated
GPU(GPUBatchHasher<A>), | ||
#[cfg(not(feature = "gpu"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn’t this be not any gpu or opencl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... I think that whole variant can be removed now. My latest does that and seems to build fine on macos now.
79c97b9
to
569db95
Compare
This PR implements GPU batch hashing in pure OpenCL, implemented in the new
proteus
module. (Proteus and Triton are both moons of Neptune, hence the naming.)This work is intended to introduce no change of behavior when the
gpu
feature flag is provided. If instead, theopencl
feature flag is provided, a newBatcherType
,BatcherType::OpenCL
can be used instead ofBatcherType::GPU
.This implementation provides the following benefits when compared to the extant
neptune-triton
GPU implementation:neptune-triton
code path).Once the
opencl
feature has been tested and stabilized, it should be made the default, for all of these reasons.Historical context: although replacing
neptune-triton
is an obvious next step now, its replacement benefits from the design which went into the Rust interface toneptune-triton
, to the development ofrust-gpu-tools
andcl-ff-gen
(neither of which existed at the time of the initial GPU implementation), and from the significant learning which went intoneptune-triton
's development. Although the current result is simpler, the path to it was not obvious from the outset.Speedup is ~2x on column tree building. See
gbench
output using the same 2080Ti for both methods.gbench
withgpu
feature:gbench
withopencl
feature: