Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why this numbers are called 'primes' ? #1

Open
funny-falcon opened this issue May 10, 2016 · 5 comments
Open

Why this numbers are called 'primes' ? #1

funny-falcon opened this issue May 10, 2016 · 5 comments

Comments

@funny-falcon
Copy link

https://github.com/vnmakarov/mum-hash/blob/master/mum.h#L312-L320

There is no any guarantee, than _mum_next_factor will return prime number.
Neither it is guaranteed to be odd (but it looks like algorithm doesn't depends on it).
If generator behind rand is a good random generator, then _mem_next_factor even
allowed to return zero.

@funny-falcon
Copy link
Author

funny-falcon commented May 10, 2016

Currently, it does not provide resistance from SIC ("seed independent collisions", or "hash flood"), if default _mum_primes are used.

Seed has no impact on result of neighbor input scrambling https://github.com/vnmakarov/mum-hash/blob/master/mum.h#L209 , so it is possible to choose next block that will cancel diff from previous block. Yes, it is cpu consuming operation, but it should be done only once for known _mum_primes.
Also, since lo(x*C) + hi(x*C) is not bijective (1-to-1) function, even one-block collision could be found.

(and, it also leads to conclusion: while MUM is (probably) resistant to first preimage attack, it is certainly not resistant to second preimage attack (for known _mum_primes). Worse than that, this attack doesn't depends on seed, only on _mum_primes).

@funny-falcon
Copy link
Author

funny-falcon commented May 10, 2016

As an example, there is fallback hash functions for Golang:
https://github.com/golang/go/blob/master/src/runtime/hash32.go
https://github.com/golang/go/blob/master/src/runtime/hash64.go

They looks to be safer against SIC/"hash flood", although multipliers are constants.
And they have impressive performance.

@vnmakarov
Copy link
Owner

Thank you for your feedback, Yura.

The initial constants are prime numbers. I started with them thinking about prime number factorization as a hard problem (although it is not crypto-level hard for only 64-bit numbers). The code changing them was added later but the names were not changed. The names of the multiplication constants are misleading. I'll modify the names.

Actually I tested mum hash on smhasher with different multiplications constants (including randomly generated). So you are right the number primeness is not important. You are right that randomly generated numbers can be even or zero. I should add the code preventing this.

As for hash flood. I wrote the function for my purposes (an interpreter). I am going to use it the following way: get random constant at the interpreter work beginning and then use the function. I don't think it is even necessary to add a code in hash tables to recognize a denial attack.

Also I don't pretend that MUM hash function is a crypto-level hash function. lo(x*C) + hi (x*C) = a probably can be found on modern equipment but it still will require a lot of time. It would be interesting for me, how much time you will need to find X for C=0X258c4af23da09e33 and a=0X11d934b3c798d46d

Thanks for adding references for hash functions in GO. I'll investigate them. I am getting new references frequently after releasing the code. There are too many hash functions to try. But I tried yours too.

@funny-falcon
Copy link
Author

funny-falcon commented May 10, 2016

But I tried yours too.

:-) thanks. Mine are not fast, though I believe they are safe.
Go's are really fast and look to be safe.

@funny-falcon
Copy link
Author

And hash table does not need code for recognition of "dos", if hash function is resistant to "hash-flood" (seed independent collisions). No one will try to attack such function, cause it is waste of time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants