Replace ISAAC with HC-128 #53
Comments
That's a good argument. The other thing that jumps out from your results is that there is only one reason to use 32-bit ISAAC over the 64-bit version: memory usage (which is in any case still poor). At some point in time |
HC-128 is at least 13 years old and appears to be designed for 32-bit hardware. Most comparable generators are also 32-bit and of similar age, if not older. Can the arithmetic be parallelised well for modern CPUs? Are there any more recent stream cyphers we should consider? |
Good questions. For cryptographic RNGs it seems sensible not to use something very recent, because there has not been enough time to analyse it. The variant of HC-128 I worked from is from 2009, and has seen a few small improvements. It produces different results from the one in the eSTREAM contest, mostly for performance it seems. It is written to take good advantage of instruction-level parallelism, much more than ISAAC. SIMD does not make sense here, because what it does is mostly reading from different parts of an array. Something like seven times. The other ciphers from eSTREAM also seem interesting. HC-128 is said to be clearly the fastest one, although I haven't seen benchmarks comparing them. I don't know how optimal our implementation of ChaCha is. LLVM seems to be able to vectorize it... HC-128 is 4~6x faster. It seems all (?) cryptographic RNGs / ciphers get designed for 32 bit, so they work well on all sorts of devices. I don't expect a good 64-bit RNG to get the necessary analysis soon. But I am just an outsider doing some reading... Interestingly, HC-128 with just 32-bit arithmetic can rival the performance of 64-bit ISAAC. I can give updated benchmarks on my implementation of HC-128:
x86_64 (all numbers in Mb/s):
I have held off on making a PR because I use unsafe to do unchecked indexing, and would like to find a trick to avoid bounds checking with just safe code. But with every variant I tried, LLVM just can't figure it out. I think I will just make a PR and leave that as an exercise for later. |
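To illustrate the trade-off mentioned above, here is a minimal sketch of the two indexing styles on a 512-word table. It is not the HC-128 code from the PR: the names TABLE_LEN, lookup_unchecked and lookup_masked are hypothetical, and whether LLVM actually removes the bounds check in the masked version depends on the surrounding code and compiler version.

```rust
/// Hypothetical table size, matching one 512-word table.
pub const TABLE_LEN: usize = 512;

/// Unchecked variant: the caller must guarantee `i < TABLE_LEN`.
pub fn lookup_unchecked(table: &[u32; TABLE_LEN], i: usize) -> u32 {
    // Only verified in debug builds.
    debug_assert!(i < TABLE_LEN);
    // SAFETY: relies entirely on the caller's guarantee above.
    unsafe { *table.get_unchecked(i) }
}

/// Safe variant: masking keeps the index in 0..512 by construction.
/// This gives the optimiser a chance (not a guarantee) to drop the check.
pub fn lookup_masked(table: &[u32; TABLE_LEN], i: usize) -> u32 {
    table[i & (TABLE_LEN - 1)]
}
```

Both functions return the same value for any in-range index; the masked one additionally wraps out-of-range indices instead of panicking.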
I would like to voice my support for making thread_rng a full-strength cryptographic PRNG. I do not have an opinion on which specific algorithm should be used, though. In another context (backing up the C library's |
That is a good suggestion, and already what ISAAC does, and HC-128 to a lesser extent. The small amount of logic to read the next result, and generate new ones if necessary, is already 20~40% of the time it takes to generate the next random number. |
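The "read the next result, generate new ones if necessary" logic can be sketched roughly as follows. This is an illustration only: BufferedRng is a hypothetical name, and the core is a placeholder linear congruential step, not ISAAC or HC-128.

```rust
/// Hypothetical buffered generator with a 16-word results array.
pub struct BufferedRng {
    state: u32,
    results: [u32; 16],
    index: usize,
}

impl BufferedRng {
    pub fn new(seed: u32) -> Self {
        // Start with an "empty" buffer so the first call refills it.
        BufferedRng { state: seed, results: [0; 16], index: 16 }
    }

    /// Placeholder core (a simple LCG), standing in for the real block function.
    fn generate(&mut self) {
        for r in self.results.iter_mut() {
            self.state = self.state.wrapping_mul(1664525).wrapping_add(1013904223);
            *r = self.state;
        }
        self.index = 0;
    }

    /// The per-call bookkeeping: one comparison, one load, one increment.
    pub fn next_u32(&mut self) -> u32 {
        if self.index >= self.results.len() {
            self.generate();
        }
        let value = self.results[self.index];
        self.index += 1;
        value
    }
}
```

Even with this little bookkeeping, the branch and index update run on every call, which is why they can account for a noticeable share of the per-number cost.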
I measured the initialization time weeks ago, but forgot to post it. Initialization time (ns):
|
But 2482 + 4347 >> 4686, and anyway 2482 is not much different than 2636? Those numbers make no sense. Anyway, I don't think we have to worry too much since |
Sloppy copy & paste, sorry for the confusion. The column 'OS' is my estimate of the time spent outside the control of the RNG init function, by measuring the difference between initializing from The impact of reseeding the RNG in This matches the benchmarks surprisingly well. Maybe it is better to set the reseeding threshold of But I suppose any threshold that is less than the number of rounds it takes to recover the internal state of an RNG ('crack' it) is good. HC-128 claims there is no better algorithm than a brute-force search of 2^128 values. Such numbers do not drop quickly: an improvement that could crack it in half the time would still take 2^127 tries. A quick look at the much weaker (and not actually recognised as cryptographically secure) RC-4 shows it has an attack that can recover the state after 2^26 rounds. I think we can safely say that for an actual cryptographic RNG the number should never drop that low. 2^26 rounds of 4 bytes means reseeding after generating 128 MB. That seems like a nicer compromise to me between not reseeding, and reseeding unnecessarily often. With HC-128 running at its best speed it still means reseeding 14 times per second. |
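The idea of reseeding once a byte threshold is reached can be sketched like this. It is a toy under stated assumptions: ReseedingSketch, threshold_bytes and fresh_seed are hypothetical names, the generator is a placeholder LCG rather than HC-128, and the threshold is left as a parameter rather than fixed to any value discussed above.

```rust
/// Hypothetical wrapper that counts generated bytes and reseeds past a threshold.
pub struct ReseedingSketch<F: FnMut() -> u32> {
    state: u32,
    bytes_since_reseed: u64,
    threshold_bytes: u64,
    reseed_count: u64,
    fresh_seed: F, // stand-in for asking the OS for new entropy
}

impl<F: FnMut() -> u32> ReseedingSketch<F> {
    pub fn new(seed: u32, threshold_bytes: u64, fresh_seed: F) -> Self {
        ReseedingSketch {
            state: seed,
            bytes_since_reseed: 0,
            threshold_bytes,
            reseed_count: 0,
            fresh_seed,
        }
    }

    pub fn next_u32(&mut self) -> u32 {
        // Reseed before generating once the byte budget is used up.
        if self.bytes_since_reseed >= self.threshold_bytes {
            self.state = (self.fresh_seed)();
            self.bytes_since_reseed = 0;
            self.reseed_count += 1;
        }
        // Placeholder LCG step; each output accounts for 4 bytes.
        self.state = self.state.wrapping_mul(1664525).wrapping_add(1013904223);
        self.bytes_since_reseed += 4;
        self.state
    }

    pub fn reseed_count(&self) -> u64 {
        self.reseed_count
    }
}
```

A larger threshold means the reseed branch is taken less often, so the cost of fetching fresh entropy is spread over more output.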
Should we also consider Kravatte? I don't know a lot about it, but it's supposed to be a pseudo-random function from the Keccak team. |
Nice find! I think there are quite a few cryptographically good RNGs to choose from. I saw replacing ISAAC with HC-128 as a simple step because they are so similar. They follow roughly the same method: both are based on indirect array indexing. And they have roughly the same performance, reportedly much faster than the alternatives. I probably sell it short by saying this, but HC-128 can be seen as an improved variant of ISAAC. So it seems like a not too controversial change. Exploring / collecting more |
@Lokathor can I get you to do some more Raspberry Pi benchmarks? HC-128 is merged into upstream master now; it would be nice to see the result of |
yeah, I'll get to this when I have a free moment. |
|
Thanks @Lokathor. Looks like @pitdicker was right that ISAAC64 is faster than the 32-bit version everywhere; unfortunately HC-128 doesn't perform well. Comparison on my laptop (Haswell):
|
Wow! I did not expect such a large difference between ISAAC and HC-128. I wonder what causes it?
If the problem is code size or register pressure HC-128 can be optimized some more for this target. But if the performance difference is just because of the larger tables or the number of instructions this is it... |
I should be careful not to claim too much... Actually I didn't notice it before your comment #53 (comment) 😄.
|
Implemented: rust-random#277 |
Okay, this is somewhat funny. After putting quite some effort into optimising our implementation of ISAAC, I now propose we replace it with HC-128.
Just like ISAAC and RC-4 before it, HC-128 is an array based stream cipher that we use as an RNG. Comparison:
Security / predictability
ISAAC is designed to be usable for applications that need a cryptographically secure PRNG. But it does not really meet the current standards for one. This makes ISAAC overkill for applications that just need a statistically good PRNG, and not good enough for anything that needs guarantees for security.
HC-128 is designed by Hongjun Wu, one of the experts in the field. It is well received, and selected as one of the "stream ciphers suitable for widespread adoption" by eSTREAM. A very comprehensive analysis of the current state of attacks / known weaknesses of HC-128 is given in "Some Results On Analysis And Implementation Of HC-128 Stream Cipher" by Shashwat Raizada.
HC-128 has no known weaknesses that are easier to exploit than doing a brute-force search of 2^128. It makes clear security promises, and also has a clear story around initialization.
Performance
HC-128 is a 32-bit algorithm. It usually performs about 30% faster than the 32-bit ISAAC, and mostly the same as the 64-bit ISAAC-64. Only for fill_bytes is ISAAC-64 about 45% faster. Note that this still makes it as fast as our implementation was until a week ago. The results here may improve a little bit, because I have not yet been able to have the compiler optimise out all bounds checks. It now does one per u32.
x86 (all numbers in Mb/s):
x86_64 (all numbers in Mb/s):
I don't yet have good numbers for the time it takes to initialise. With the unoptimised routine from hc128 it takes about as much time as ISAAC-64. It can probably be improved by ~30%.
Memory
ISAAC uses two arrays: one to hold 256 state words, and one to hold 256 results. HC-128 needs two arrays of 512 words to hold its state. As an optimisation we include a 16-word array of results.
Memory usage (excl. counters etc.):
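As a rough cross-check, the array sizes described above can be computed directly. This is a sketch derived only from the word counts in the text, not the original table: the struct and field names are hypothetical, and the ISAAC-64 entry assumes the same two 256-word arrays with 64-bit words.

```rust
use std::mem::size_of;

/// ISAAC: 256 state words plus 256 result words, 32 bits each.
pub struct IsaacArrays {
    state: [u32; 256],
    results: [u32; 256],
}

/// ISAAC-64 (assumed layout): the same arrays with 64-bit words.
pub struct Isaac64Arrays {
    state: [u64; 256],
    results: [u64; 256],
}

/// HC-128: two 512-word tables plus the 16-word results buffer.
pub struct Hc128Arrays {
    p: [u32; 512],
    q: [u32; 512],
    results: [u32; 16],
}

/// Returns the array sizes in bytes as (ISAAC, ISAAC-64, HC-128).
pub fn array_bytes() -> (usize, usize, usize) {
    (
        size_of::<IsaacArrays>(),
        size_of::<Isaac64Arrays>(),
        size_of::<Hc128Arrays>(),
    )
}
```

Under these assumptions the arrays take 2048 bytes for ISAAC, 4096 bytes for ISAAC-64, and 4160 bytes for HC-128, all excluding counters and other bookkeeping fields.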
Overall HC-128 seems like a clear improvement over ISAAC. For ISAAC it is not really clear what its place is in the world. HC-128 is a super-fast cryptographic PRNG that works similarly.