[MCLEAN-109] Add METRO hash implementation #58
Hi gnodet, I don't mind adding new hashing algorithms, but I wouldn't make it the default. Hash speed is highly dependent on the operating system: as far as I remember, in our tests the Memory Mapped version was faster on Linux and slower on Windows (that's why we use both). It may also depend on the file system, CPU, etc.
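To make the two read strategies under discussion concrete, here is a minimal sketch of hashing a file through a buffered stream versus through a memory-mapped buffer. It is an illustration only, not the extension's code: the class and method names are hypothetical, and `java.util.zip.CRC32` from the JDK stands in for the XX / METRO hashes so that the example needs no external dependency.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical demo class; CRC32 is a stand-in for the real hash algorithms.
public class MappedVsStreamHash {

    // Stream variant: copy the file through a heap buffer, chunk by chunk.
    static long hashWithStream(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }

    // Memory-mapped variant: let the OS page the file in and hash the mapped
    // buffer directly. A single mapping is limited to Integer.MAX_VALUE bytes,
    // so larger files are mapped in consecutive windows.
    static long hashWithMemoryMap(Path file) throws IOException {
        CRC32 crc = new CRC32();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            long position = 0;
            while (position < size) {
                long length = Math.min(Integer.MAX_VALUE, size - position);
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                crc.update(mapped);
                position += length;
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hash-demo", ".bin");
        file.toFile().deleteOnExit();
        byte[] data = new byte[1 << 20]; // 1 MiB of pseudo-random content
        new Random(42).nextBytes(data);
        Files.write(file, data);
        // Both strategies read the same bytes, so the two values should match.
        System.out.println(hashWithStream(file) == hashWithMemoryMap(file));
    }
}
```

Both methods produce the same hash; only the I/O path differs, which is why the relative speed can vary with the operating system, file system and file size.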
I have a couple more comments on your benchmark.
In this benchmark, XX with a Memory Mapped file was 6 - 29% slower.
With randomly generated files of 128 KB .. 128 MB, 16 files for each size (total size ~4 GB, which is a more realistic suite in our case):
In this benchmark, XX with a Memory Mapped file was 132 - 170% faster.
You can adjust the benchmark according to your load, but the main problem is that a test run takes a lot of time.
Code generated by GPT-4 ©
It would be nice to provide some guidelines for choosing an algorithm. The fact that they are missing is the original driver for this PR...
The XX algorithm is enough for most cases.
Maybe it would be good to add a comment regarding usage of
@gnodet happy to merge it?
Do you recall why you indicated some requirements on jdk17+? The tests seem to run fine without this flag, so I'm not sure it's actually required.
It's not a requirement on jdk17. I'm just saying with 1.0.1 using
I'm using https://github.com/eclipse/jetty.project.git branch
This PR provides new METRO / METRO_MM hash algorithms based on the Zero-Allocation-Hashing Metro hash implementation.
Given that it is the fastest, the default is changed to it and the website is modified accordingly.
It is not very clear when, or whether, people should change algorithms.
Also, I would have assumed the MM versions of the hashes would provide better performance, but the added perf test does not seem to indicate the same: