Improve Random (performance, APIs, ...) #47085
Note: This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs: please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off on that change.
Would it be clearer to break out the two impls into their own files? |
With respect to the algorithm, it looks like we didn't really do a "survey" of the PRNG options (as we did, e.g., for hashing algorithms a while ago). Since we aren't implementing a seeded API here, which we can always change later, perhaps that doesn't matter very much so long as we're reasonably confident that it's not "worse", since folks seem to be mostly wanting more performance and asking for more randomness. Looking around at other stacks and utility libraries like Boost, there doesn't seem to be convergence on anything, and where there is, it's not clear whether it's particularly fast. (Or their limitations are accepted, e.g., the java.util.Random.nextLong documentation says "this algorithm will not return all possible long values".) I was a bit perturbed by the link in the original issue to a criticism of xoshiro256** by another PRNG purveyor, but on her site the discussion of it seems to conclude its flaws are unlikely to be easily discoverable. So I guess that algorithm seems good enough for me.
There are some standard PRNG test frameworks that run their own 'battery of tests' against lots of PRNGs, and which publish results, e.g. https://github.com/lemire/testingRNG and possibly (related to hash functions, but a PRNG is essentially serial application of a hash function): https://github.com/rurban/smhasher I've been suggesting folks use xoshiro256** for a while now, but based on those results I do wonder if wyrand would be a better choice. I don't think there's a problem with using xoshiro256** to be honest (the tests are quite exhaustive!), but since wyrand is available and appears to be faster, and passes more of those tests, maybe it's worth a little bit of effort looking into that a bit more. |
I am cool with this. |
76d70ae to 5b69e72
I believe the rejection sampling loops aren't quite right. Or rather, they can be tightened up to run fewer loops. The issue relates to the calculation of bits:
As an example, let's take maxValue = 8. As 8 is a power of two (2^3 == 8), we ought to be generating 3 bits and just returning without having to loop a second time. However, the bits expression above gives a result of 4, i.e. Log2(8) = 3, and then we add one. I believe the correct/better way is to use a method such as:
(from https://github.com/colgreen/Redzen/blob/master/Redzen/MathUtils.cs ... and recently fixed, so just be aware if you have an out of date local copy) Now you can do:
Which more than halves the number of rejection sampling loops, on average, when maxValue is a power of two. |
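The point about the bit count can be illustrated with a small sketch. This is a hypothetical Python illustration, not the PR's C# code: `log2_ceiling` and `next_below` are names invented here, and Python's `random.getrandbits` stands in for the generator's raw output.

```python
import random

def log2_ceiling(x):
    """Smallest n with 2**n >= x, for x >= 1 (hypothetical helper)."""
    return (x - 1).bit_length()

def next_below(max_value, bits, rng):
    """Rejection-sample a value in [0, max_value); also return the draw count."""
    draws = 0
    while True:
        draws += 1
        # Draw `bits` random bits; zero bits can only ever yield 0.
        candidate = rng.getrandbits(bits) if bits > 0 else 0
        if candidate < max_value:
            return candidate, draws

rng = random.Random(12345)  # fixed seed only to make the sketch repeatable
max_value = 8
loose_bits = max_value.bit_length()   # floor(log2(8)) + 1 == 4
tight_bits = log2_ceiling(max_value)  # ceil(log2(8)) == 3

# Total number of draws needed to produce 10,000 accepted values each way.
loose_draws = sum(next_below(max_value, loose_bits, rng)[1] for _ in range(10_000))
tight_draws = sum(next_below(max_value, tight_bits, rng)[1] for _ in range(10_000))
```

With 4 bits, candidates 8 through 15 are rejected, so roughly two draws are needed per value; with 3 bits every candidate is accepted on the first draw. For values of `max_value` that are not powers of two, both expressions give the same bit count.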
Specifically for powers of two, yes, the bound could be tighter. |
5b69e72 to 5d4dc64
@stephentoub @tannergooding How does this PR interact with #23198 and #41457? Does it resolve those issues? |
From my perspective it resolves both. (With the caveat that those issues still exist for a type derived from Random or constructing a Random with a specific seed.) |
Tweaked things a bit further and added a 32-bit implementation. It’s now looking pretty reasonable on Windows. Still need to measure on Linux. 64-bit:
32-bit:
src/libraries/System.Private.CoreLib/src/System.Private.CoreLib.Shared.projitems
- Changes the algorithm used by `new Random()` to be one that's smaller and faster and produces better results
- Refactors the implementation to make it internally pluggable
- Moves the existing implementation to be used when it's necessary for back compat
- Adds NextInt64 and NextSingle methods
e3288ae to a8ce8a5
Linux 64-bit:
src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs
Just curious, why are the array-filler overloads so much faster on 32 bit vs old impl, when the Next() like methods are slower? I don't see it.
| Next | master_x86\corerun.exe | 8.975 ns | 1.00 | - |
Because they previously were calling Next per byte and now aren't. |
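To illustrate the difference being described, here is a hypothetical Python sketch (not the PR's C# code) comparing one generator call per byte with a bulk fill that uses every byte of each output. `CountingSource`, `fill_per_byte`, and `fill_bulk` are names invented for the example, and the 64-bit output width is an assumption.

```python
import random

class CountingSource:
    """Hypothetical generator wrapper that counts how many values it produces."""
    def __init__(self, seed):
        self._rng = random.Random(seed)
        self.calls = 0

    def next_u64(self):
        self.calls += 1
        return self._rng.getrandbits(64)

def fill_per_byte(src, n):
    """One generator call per output byte, keeping only the low 8 bits."""
    return bytes(src.next_u64() & 0xFF for _ in range(n))

def fill_bulk(src, n):
    """Use all 8 bytes of each 64-bit output, then trim to the requested length."""
    out = bytearray()
    while len(out) < n:
        out += src.next_u64().to_bytes(8, "little")
    return bytes(out[:n])

a = CountingSource(1)
per_byte = fill_per_byte(a, 1000)
b = CountingSource(1)
bulk = fill_bulk(b, 1000)
```

Filling 1000 bytes takes 1000 generator calls in the per-byte version and 125 in the bulk version, which is the kind of gap that can outweigh a slower single `Next()` call.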
https://github.com/dotnet/runtime/pull/47085/files#diff-92b19116b63bd5daf93516a0908297653d54d08263b754c8a1bbb2930e14db1bR167 |
That's the LegacyImpl. Xoshiro128StarStarImpl's is: |
I have no more feedback. LGTM, but I haven't gone through everything so I can't sign off at this moment.
844b3fa to 972b5e3
src/libraries/System.Private.CoreLib/src/System/Random.ImplBase.cs
A XorShift-style algorithm for ... For some reason, it is not widely known that really good algorithms for random numbers exist. This family of algorithms is of high quality and super fast. It's great that this is finding its way into .NET. As far as I understand, things like the Mersenne twister and other popular algorithms are strongly inferior. For reference, I'm linking my thoughts on this: #6203 (comment)
I think so, yes. Also, the original xor-shift PRNG as devised by George Marsaglia was a good fast PRNG for its time (early 2000s?), but these new 'xoshiro' derivatives from it pass more statistical tests while maintaining good performance. There might be an argument in future for switching again to a wyrand PRNG (or a derivative if one emerges), or something else, but I think using xoshiro is the right decision today; it gives a nice performance boost and a much better quality of randomness, i.e. fixes all of the raised issues with the old System.Random to the best of my knowledge. Ultimately if you want really good random noise you flip to the slower crypto random generators of course, so there's a kind of spectrum of PRNGs, and trade-offs, and this work is at a sweet spot I think between performance and RNG quality.
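For readers unfamiliar with the algorithm under discussion, here is a minimal Python sketch of the xoshiro256** step function, following the publicly available reference description. It is not the PR's C# implementation; the class name, the explicit 64-bit masking (needed because Python integers are unbounded), and seeding from `secrets` are assumptions made for the example.

```python
import secrets

MASK64 = (1 << 64) - 1

def rotl(x, k):
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64

class Xoshiro256StarStar:
    def __init__(self, state=None):
        if state is None:
            # Assumption: seed from the OS; the state must not be all zero.
            state = [secrets.randbits(64) for _ in range(4)]
            while not any(state):
                state = [secrets.randbits(64) for _ in range(4)]
        state = [w & MASK64 for w in state]
        if len(state) != 4 or not any(state):
            raise ValueError("state must be four 64-bit words, not all zero")
        self.s = state

    def next_u64(self):
        s = self.s
        # Output scrambler: multiply, rotate, multiply.
        result = (rotl((s[1] * 5) & MASK64, 7) * 9) & MASK64
        t = (s[1] << 17) & MASK64
        # Xor/shift/rotate update of the 256-bit state.
        s[2] ^= s[0]
        s[3] ^= s[1]
        s[1] ^= s[2]
        s[0] ^= s[3]
        s[2] ^= t
        s[3] = rotl(s[3], 45)
        return result

gen = Xoshiro256StarStar([1, 2, 3, 4])
first_three = [gen.next_u64() for _ in range(3)]
```

With the state `[1, 2, 3, 4]`, the first three outputs are 11520, 0, and 1509978240. The whole step is a handful of shifts, rotates, xors, and two multiplications, which is why the performance discussion in this thread centers on it.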
... `new Random()` to be one that's smaller and faster
... `new Random()` in the future
Fixes #26741
Fixes #23198
Fixes #41457
cc: @tannergooding, @danmosemsft, @colgreen
Based on looking at lots of Random usage, `new Random()` is by far the most common way Random is used in real apps. A seed is generally only specified when either a) it's test code and a repeatable sequence is desired or b) sometimes trying to work around the poor default seed quality we had in .NET Framework, where the seed was based on Environment.TickCount and thus lots of Randoms created in the same quantum would end up with the same seed. There's also a smattering of derived types, often to add thread-safety with each override locking around a call to the base or some such thing. As such, I focused on optimizing `new Random()` (where no guarantees are made about the sequence or what methods may or may not delegate to other methods) and moved the existing implementation to be used only if a seed is provided or if a derived type is being instantiated.
A few open issues:
Related to #46890
cc: @tarekgh, @noahfalk