md5 is faster than fnv1a #91
Well, md5 has a 128-bit digest and fnv1a has a 32-bit digest. That's why I looked for crc32 and hashCode, as they also have 32-bit digests. Here is a run with the murmurhash mentioned in the SO thread, by the way (I renamed hashCode to djb2):
murmurhash x 766,803 ops/sec ±0.41% (194 runs sampled)
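For reference, the djb2 variant mentioned above is only a few lines of JS. A minimal sketch of what such a 32-bit string hash could look like (illustrative only, not the exact code used in the benchmark):

```js
// djb2: classic 32-bit string hash (hash * 33 + charCode), kept in 32-bit range.
function djb2 (str) {
  let hash = 5381
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash + str.charCodeAt(i)) | 0
  }
  // convert to unsigned 32-bit before stringifying for use in an ETag
  return (hash >>> 0).toString(36)
}
```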
If a new algorithm is needed because of speed, I will go for
Do you have a good, performant implementation of murmur2 you would recommend?
The 32-bit digest is great for this case. Adding bytes to the response will slow down transfers long term.
No, I vote for the algorithm, but I never use it in my application. The comparison is pretty solid and I believe it is a good choice. Refs #3
Well, my recommendation is not to set md5 as the default hashing algorithm. It just seems odd that the weak entity tag is 50% slower than using md5. Even hashing through md5 and then taking just 4 bytes should be faster than fnv1a. So I assume that at the time fastify-etag was developed, md5/sha1 was slower in Node.js than doing the fnv1a hashing. So I am actually suggesting to replace fnv1a with a faster weak hashing algorithm. Currently the crc32 implementation seems to be the fastest on my machine. According to the comparison of @climba03003, murmur can be faster. So let's see if there is already a faster implementation somewhere than the one I found.
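As a rough illustration of the "hash through md5 and take 4 bytes" idea from the previous paragraph, a sketch (not what fastify-etag actually does) could look like this:

```js
const { createHash } = require('node:crypto')

// Sketch: use the native md5 implementation but keep only the first 4 bytes,
// so the resulting weak ETag stays as short as a 32-bit digest.
function weakMd5Etag (payload) {
  const digest = createHash('md5').update(payload).digest() // 16-byte Buffer
  return 'W/"' + digest.subarray(0, 4).toString('base64') + '"'
}
```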
Here is a benchmark; you can see the 32b / 2k results:
When are we sending JSON data of 2 MB? It does not seem like a usual case. Anyway, I'm happy to add murmur2 to the options (why not murmur3? https://en.m.wikipedia.org/wiki/MurmurHash)
I have a local branch with murmur2 and djb2 integrated. crc32 is loaded from the crc-32 package. Do you want it?
I prefer not to take an additional dependency. How about adding an option to specify a function to compute this hash?
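A sketch of what such an option could look like from the user's side (the function-valued option is hypothetical; it is exactly what this comment proposes and was not part of the plugin's API at the time):

```js
const fastify = require('fastify')()
const { createHash } = require('node:crypto')

// Hypothetical: pass a custom hash function instead of an algorithm name.
fastify.register(require('fastify-etag'), {
  algorithm: (payload) => createHash('sha1').update(payload).digest('base64')
})
```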
I benchmarked on node v18.13.0, M1 Pro:
@kurtextrem
I'd go for:
Also, since fnv1a is very slow, I'd say we should remove it. Since sha1 isn't drastically slower (apart from the 2 MB benchmark) in Buffer scenarios, I think setting the default to sha1 is actually the most sensible action to take, as we don't need additional dependencies.
Opinions?
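For reference, the dependency-free sha1 path only needs the built-in crypto module; a minimal sketch of the ETag generation (the plugin's actual formatting may differ slightly):

```js
const { createHash } = require('node:crypto')

// Strong ETag from a sha1 digest, base64-encoded to keep the header short.
function sha1Etag (payload) {
  return '"' + createHash('sha1').update(payload).digest('base64') + '"'
}
```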
It would be either one of them. I still see
Since murmur2 would introduce another dependency, I still see sha1 as the most sensible default (for real-world usage), as it's dependency-free (available in the crypto module).
I don't know how you are running those benchmarks, can we get a PR to verify/check the results? These results make sense: for small payloads, running a pure-JS implementation outperforms all the native ones. For large payloads, running a native implementation outperforms the pure-JS ones. My take is we should optimize for the 2KB - 256KB range, which usually is the output of bundlers; looking at the results above, using sha1 seems the best option.
@mcollina I can't make a PR because 1) I fucked up the source code by having the prettier config from my company running, and 2) I made changes to the index in order to be able to benchmark the other algos. However, you can verify the results by running it from my fork: https://github.com/kurtextrem/fastify-etag. Here's the map for the algos in my fork: https://github.com/kurtextrem/fastify-etag/blob/master/index.mjs#L37 But I'd definitely be happy to submit a correct PR for changing the default.
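For readers who want to reproduce the numbers without checking out the fork, a minimal benchmark.js-style harness along these lines should give comparable output (my own sketch, not the repo's benchmark script):

```js
const Benchmark = require('benchmark')
const { createHash } = require('node:crypto')

const payload = Buffer.alloc(2 * 1024, 'x') // 2KB buffer, as in the 2k case above

new Benchmark.Suite()
  .add('sha1', () => createHash('sha1').update(payload).digest('base64'))
  .add('md5', () => createHash('md5').update(payload).digest('base64'))
  .on('cycle', (e) => console.log(String(e.target)))
  .run()
```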
Here are the results from my benchmark machine:
In Node v18 it seems that
I benchmarked 2KB and, newly, 256KB on Node 18.13 on the same M1 Pro. I've just realized two things, however:
Our benchmarks align with these: https://medium.com/@chris_72272/what-is-the-fastest-node-js-hashing-algorithm-c15c1a0e164e and https://blog.shevlyagin.com/2021/10/28/fastest-node-js-hashing-algorithm-for-large-strings/. I took the benchmark to my home PC:
So I'd say sha1 is the overall winner (unless you use a bigger buffer).
Just want to remark that sha1 should always result in an etag of about 52 chars (160 bits = 40 bytes + 1/3 base64 overhead).
I see 28 chars (without
Hmm. Well, I have no objection tbh.
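For reference, the digest lengths can be checked directly against Node's crypto module (a quick sketch):

```js
const { createHash } = require('node:crypto')

const digest = createHash('sha1').update('hello').digest()
console.log(digest.length)                     // 20 bytes (160 bits)
console.log(digest.toString('base64').length)  // 28 chars
console.log(digest.toString('hex').length)     // 40 chars
```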
Higher performance at the cost of a higher collision rate.
@jimmiehansson
@Uzlopak You don't need any password for cache deception or smuggling. Either way, there are a number of well-known attack surfaces against hash algorithms with higher collision rates, e.g. https://en.wikipedia.org/wiki/Preimage_attack
What is the attack vector? By guessing an etag you don't get back the body, and crafting a collision also doesn't tell you anything else that might be security related.
Exactly. It is not like the etag is used to request a cached response. On the contrary: if the etag matches you don't get more info; if the etag does not match you get the response. There is no attack vector.
I do see your points, but I wasn't so much referring to the contents being exposed as sensitive as to the possibility of exploiting the cache.
Mainly the abuse of the request handler's state at the time of the cache hit/miss by claiming more CPU as part of a flood. Vector or not, it's perfectly doable.
Ok, sounds good to me!
I'm curious, have you tried with the hash function since this patch: https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V20.md#crypto-implement-cryptohash? |
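For anyone re-running the numbers on newer Node versions, the one-shot helper from that changelog entry skips the Hash object allocation; a minimal sketch of the two call styles (availability of crypto.hash depends on the Node release):

```js
const crypto = require('node:crypto')

// Classic streaming-style API vs. the newer one-shot crypto.hash helper.
const viaCreateHash = crypto.createHash('sha1').update('payload').digest('base64')
const viaOneShot = crypto.hash('sha1', 'payload', 'base64')

console.log(viaCreateHash === viaOneShot) // true
```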
Prerequisites
Issue
We use fnv1a by default for weak hashing, but md5 is faster than fnv1a.
Here is a benchmark with an implementation of the Java hashCode algorithm and crc32.
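A minimal sketch of the Java-style hashCode used for the comparison (my reconstruction; the actual benchmarked code may differ):

```js
// Java's String.hashCode: h = 31 * h + charCode, as a signed 32-bit integer.
function hashCode (str) {
  let h = 0
  for (let i = 0; i < str.length; i++) {
    h = (Math.imul(h, 31) + str.charCodeAt(i)) | 0
  }
  return h
}
```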
result:
node benchmarks/weakHash.js
crc32 x 1,314,889 ops/sec ±0.19% (195 runs sampled)
md5 x 293,537 ops/sec ±2.10% (175 runs sampled)
hashCode x 1,019,507 ops/sec ±0.40% (190 runs sampled)
fnv1a x 154,034 ops/sec ±0.11% (195 runs sampled)