[External] Adding ankerl unordered_dense #12861
I like the idea of finally using a hash table with better memory layout, but sorry in advance: I'm going to be a picky ass here. This is a very basic data structure and there's a mountain of options to choose from so I don't want us to make a poor decision.
I'd be wary of any advertising based on benchmarks someone does on their own libraries. I skimmed through the comparison you linked, and am missing some things:
- I didn't find the source code of the tests
- He didn't provide any scaling studies
- lack of hardware diversity
I ask you to write benchmarks using Google's benchmark library for a few hash map implementations and run them on some different hardware.
What to compare
Specifically I'd like the following implementations compared:
- std::unordered_map for reference
- this lib
- google's dense and sparse hash tables (https://github.com/sparsehash/sparsehash)
- tsl::robin_map
What to benchmark
As for what to benchmark, we're almost exclusively inserting/searching and practically never erasing anything from existing tables, so I'd like to see
- an insertion benchmark
- a search benchmark
- with integer keys
- with std::string keys. Specifically, longer ones that don't benefit from short-string-optimizations (make sure that sizeof(std::string) < key.size())
- concurrent search with all physical cores participating
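For the std::string keys, a minimal sketch of how the inputs could be generated. make_long_keys is a hypothetical helper name, not part of any library discussed here. It pads every key past sizeof(std::string), which should be enough to rule out the small-string buffer, since that buffer lives inside the string object itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds `count` distinct keys, each strictly longer than
// sizeof(std::string), so none of them can fit into an in-object SSO buffer.
std::vector<std::string> make_long_keys(std::size_t count) {
    const std::size_t min_length = sizeof(std::string) + 1;
    std::vector<std::string> keys;
    keys.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        // The numeric part keeps keys distinct; the padding only adds length.
        std::string key = "key_" + std::to_string(i);
        if (key.size() < min_length) {
            key.append(min_length - key.size(), 'x');
        }
        assert(key.size() > sizeof(std::string));
        keys.push_back(std::move(key));
    }
    return keys;
}
```

The same key vector can then be reused for both the insertion and the search benchmark, so every implementation sees identical input.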
What's important is that you run this with different sizes so we can get an idea of how these operations scale.
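To make the scaling request concrete, here is a rough sketch using plain std::chrono rather than Google's framework, so it builds with nothing but the standard library. Timing, make_key, time_insert_and_search and sweep are made-up names, and the Map parameter is assumed to offer emplace and count with std::unordered_map semantics. Treat it as an outline of the loop structure only: there is no warm-up, no repetition and no statistics, all of which a real benchmark harness would add.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Timing {
    std::size_t size = 0;
    double insert_ns_per_op = 0.0;
    double search_ns_per_op = 0.0;
    std::size_t hits = 0;  // successful lookups; also keeps the search loop observable
};

// Spreads consecutive integers over the key space. The multiplier is odd, so
// the mapping stays one-to-one modulo 2^64 and all keys are distinct.
inline std::uint64_t make_key(std::size_t i) {
    return static_cast<std::uint64_t>(i) * 0x9E3779B97F4A7C15ULL;
}

// Times `size` insertions followed by `size` successful lookups.
template <class Map>
Timing time_insert_and_search(std::size_t size) {
    assert(size > 0);
    using Clock = std::chrono::steady_clock;
    using Nanos = std::chrono::duration<double, std::nano>;
    Map map;
    Timing result;
    result.size = size;

    const auto t0 = Clock::now();
    for (std::size_t i = 0; i < size; ++i) map.emplace(make_key(i), i);
    const auto t1 = Clock::now();
    for (std::size_t i = 0; i < size; ++i) result.hits += map.count(make_key(i));
    const auto t2 = Clock::now();

    result.insert_ns_per_op = Nanos(t1 - t0).count() / static_cast<double>(size);
    result.search_ns_per_op = Nanos(t2 - t1).count() / static_cast<double>(size);
    return result;
}

// Runs the measurement once per requested size.
template <class Map>
std::vector<Timing> sweep(const std::vector<std::size_t>& sizes) {
    std::vector<Timing> results;
    results.reserve(sizes.size());
    for (std::size_t size : sizes) results.push_back(time_insert_and_search<Map>(size));
    return results;
}
```

Calling sweep<std::unordered_map<std::uint64_t, std::size_t>>({1000, 100000, 10000000}) yields one Timing per size. Because the map type is a template parameter, other implementations can be swapped in as long as they expose the same two member functions.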
Hardware
I'm interested in benchmarks running on
- a decent desktop with an x86-based CPU
- a NUMA cluster if you have access to one (if you don't I can run it on one)
- some shitty laptop or a raspberry pi (optional, not super important but good to know because we have a lot of student users)
- a Mac with an M-chip (optional, I'm just curious. I can run it on my machine if you don't have access to one)
I know this is a lot of work, but I think it's absolutely necessary for such a basic data structure.
If you are not familiar with google's benchmark framework, I can shoot you an example with std::unordered_map and you can build on top of that for the other implementations.
This one is super slow at least for moderate sizes in my own tests.
I would like to have something standardized in Kratos instead of something handmade each time. Is it not possible to reuse our GTest infrastructure?
Maybe, but that just highlights the dangers of taking devs' benchmarks of their own libs at face value. The author of
I'm all for something standardized, but GTest is definitely not a benchmarking library. Google's framework is pretty simple and very popular, but I'm open to other suggestions.
Yes, in fact this was the first one I tried and it was the slowest one of all I tried.
Maybe we can at least add a cmake loop to compile the benchmarks ...
I'd put the benchmarks in a different repo, similar to how we deal with examples.
Usually the code of benchmarks is not very different from tests. Examples are huge in comparison.
To add my two cents to Mate's comments:
Merging master after #12867, I will write a benchmark...
📝 Description
Introduction
Adding ankerl unordered_dense, which provides top-performance hashed containers: https://martin.ankerl.com/2022/08/27/hashmap-bench-01/. MIT license and header only.
Initial testing was done during efforts to modernize data_value_container. In the end, the current brute-force solution is faster for a small number of variables, but among the hash containers I tested, ankerl was the fastest. The idea is to use them to replace our current containers (not used anywhere yet).
Key Features
- Performance:
- Robin Hood Hashing:
- Template Customization:
- Hashing Algorithm: … wyhash, a fast and high-quality hashing algorithm.
- API Compatibility: … std::unordered_map and std::unordered_set. … extract for moving data and replace for bulk updates.
- Exception Safety:
- Modular and Extensible:
- C++17 and Higher: … std::optional, std::tuple, and advanced template metaprogramming.
Main Components
- Hashing: … hash template, supporting standard types, strings, and custom objects.
- Buckets: … Bucket structure stores metadata (distance and fingerprint) and an index into the value container.
- Data Storage: … segmented_vector or std::vector to store data contiguously. … (segmented_map or segmented_set) improves memory management for large datasets.
- Load Factor:
- Transparent Lookup: … (std::string_view for std::string keys).
- Iterators: … std::unordered_map
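To illustrate the layout described under Buckets and Data Storage, here is a toy sketch. It is not the library's code: ToyDenseMap, the fixed bucket count, the std::string/int types and the choice of fingerprint bits are all assumptions made for illustration, and erase and rehashing are left out entirely. It only shows the general idea of keeping the values contiguous in a vector while a separate bucket array stores a probe distance, a hash fingerprint and an index into that vector, with Robin Hood displacement on insert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class ToyDenseMap {
public:
    using Entry = std::pair<std::string, int>;

    // bucket_count must be a power of two so that `hash & mask_` selects a bucket.
    explicit ToyDenseMap(std::size_t bucket_count = 64)
        : buckets_(bucket_count), mask_(bucket_count - 1) {
        assert(bucket_count >= 2 && (bucket_count & mask_) == 0);
    }

    // Returns false if the key already exists or the fixed-size table is full.
    bool insert(const std::string& key, int value) {
        if (values_.size() + 1 >= buckets_.size() || find(key) != nullptr) return false;
        values_.emplace_back(key, value);  // values stay densely packed
        const std::size_t h = std::hash<std::string>{}(key);
        Bucket incoming{1, fingerprint(h), static_cast<std::uint32_t>(values_.size() - 1)};
        std::size_t pos = h & mask_;
        while (buckets_[pos].dist != 0) {
            // Robin Hood: the entry further from its home slot keeps the bucket.
            if (buckets_[pos].dist < incoming.dist) std::swap(buckets_[pos], incoming);
            ++incoming.dist;
            pos = (pos + 1) & mask_;
        }
        buckets_[pos] = incoming;
        return true;
    }

    // Returns a pointer to the mapped value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const std::size_t h = std::hash<std::string>{}(key);
        const std::uint8_t fp = fingerprint(h);
        std::uint32_t dist = 1;
        std::size_t pos = h & mask_;
        // Stop at an empty bucket or at an entry closer to its home than we are.
        while (buckets_[pos].dist >= dist) {
            const Bucket& b = buckets_[pos];
            if (b.fingerprint == fp && values_[b.value_idx].first == key) {
                return &values_[b.value_idx].second;
            }
            ++dist;
            pos = (pos + 1) & mask_;
        }
        return nullptr;
    }

    // Contiguous storage in insertion order; iterating it never touches buckets.
    const std::vector<Entry>& values() const { return values_; }

private:
    struct Bucket {
        std::uint32_t dist = 0;         // 0 = empty, 1 = entry sits in its home slot
        std::uint8_t fingerprint = 0;   // a few hash bits to skip most key comparisons
        std::uint32_t value_idx = 0;    // index into values_
    };

    // Uses the top byte of the hash, since the low bits already pick the bucket.
    static std::uint8_t fingerprint(std::size_t h) {
        return static_cast<std::uint8_t>(h >> (sizeof(std::size_t) * 8 - 8));
    }

    std::vector<Bucket> buckets_;
    std::size_t mask_;
    std::vector<Entry> values_;
};
```

Because the entries live in one vector, iteration is a plain linear walk over memory, and the bucket array stays small since it only holds metadata and indices.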
🆕 Changelog
- ankerl unordered_dense