-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster rand!(::MersenneTwister, ::UnitRange) #25047
Conversation
This sampler was based on the idea that generating a UInt32 value is cheaper than generating a UInt64 one. Historically, this is based on the performance characteristics of MersenneTwister. Here, we refine this idea, knowing that this RNG produces natively 52 bits of entropy. So the thresholds determining how native values are generated each time are 52 bits and 104 bits. In particular, this becomes much more efficient for small (of length <= 2^52) Int128 ranges, for which previously 3 native values had to be generated instead of one.
This could make it slightly faster in some cases, by avoiding a branch, but the main idea so to prepare the unification of code between this Sampler and SamplerRangeInt
f07f17d
to
5bb6b4d
Compare
@nanosoldier |
@rfourquet You have a bunch of PRs open - and its unclear (at least to me) what is the right order to merge (all or some of them). Maybe good to discuss it on slack to co-ordinate and get the reviews going. I think the first thing we should do is get things into stdlib, then apply any API changing PRs. The performance ones can be applied at a later point or in 1.1 as well. |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
@ViralBShah Yes sorry to flood github with last minute work! I'm pushing those performance PR now because they are breaking, so would be good to have them for 1.0. Besides this one, I am preparing a last one today... but yes we can discuss on slack, thanks! |
The benchmarks are consistent: while computing |
I will merge in few hours if no objection, in the meantime I will rebase the excision PR on top of this one, so that I have only one rebase to do. My other RNG PRs are easier to rebase after the excision and require a triage decision anyway. |
Ses commits message for details. The main performance improvement is for
[U]Int128
, e.g.a = zeros(Int128, 1000); rand!(a, Int128(1):Int128(10))
sees a 3x speed-up.I plan also to unify further the 2 samplers for ranges, they look now very similar, but this wouldn't need to change the streams, so can be done in a non-breaking way later.