-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MersenneTwister: more efficient integer generation with caching #25058
Conversation
cb2a525
to
450ead6
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
I tested systematically all reasonable values for the size of the cache, and it appeared that there is a threshold (501 Now, this change had a consequence I didn't think about: we test that Now, I have a big quesion mark in my head: if I use |
@jrevels, @andreasnoack ^ any ideas about the nanosoldier results? |
Keep in mind that if you run the BaseBenchmarks suite yourself, you technically should be tuning the benchmarks for your machine and Julia build rather than use the tuning parameters cached in the repo. Those parameters were tuned specifically with Nanosoldier in mind. It's not a problem for all benchmarks, but can be for some. Also, unless one employs low-level randomization schemes between experiments, benchmark performance will theoretically depend on prior machine state, such that running a set of benchmarks in one sequence can produce different results than running the set in a different sequence. Again, some benchmarks are more vulnerable to these effects than others. Can you show an example where the manual |
Ok great info thanks!
For example
I even deleted all the lines in this file and added a simplified:
But this doesn't change anything. |
Could we create fixed fake |
I don't relly understand, but I will guess your are more talking to Jarrett ;-) |
450ead6
to
e6ccbb1
Compare
I pushed what is hopefully a fix for the regressions, except for @nanosoldier |
Looks like Nanosoldier gave a 400 for your request. @nanosoldier |
...and mine too. @nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
e6ccbb1
to
d32e267
Compare
At least the regressions affecting
And if you have even more time to spend, you can checkout instead the branch at #25197 ("rf/rand/cache-float") which contains this PR plus more |
|
Thanks so much @maleadt ! this confirms at least that I don't hallucinate with performance improvents :) |
d32e267
to
2a293e2
Compare
I just checked on yet another computer which also shows similar speed improvements, so I will merge when this turn green. |
2a293e2
to
4f78032
Compare
@rfourquet Is it correct that this PR changed the values of 0.6julia> srand(1); rand(Int)
-2795697449271335806 0.7-julia> srand(1); rand(Int)
6592875438459116351 Just wanted to confirm since some package tests rely on reproducibility here so we'd need to find a workaround in the packages if the change is intended. |
See also #17426, in particular Simon's comment:
|
@andreasnoack yes the change is intended, sorry for the inconvenience! This should not change anymore before 1.0 (but I may propose a last change for generation from a range). |
@rfourquet is there a way to seed 0.7 such that it will reproduce the results from 0.6? |
The only way I imagine to get the same numbers would be to create a dummy |
This is why we need a way to dispatch to provide a random number generator providing the entropy and a |
This is the last "performance" PR I wanted in 1.0. It depends on the PR on which it is based off of for getting better performance. But the gains are less significative than when I first implemented it some 3 years ago. Hence RFC, to discuss the trade-off.
The idea is that, as
MersenneTwister
is natively efficient to randomize an array, we can store an array of integers as aMersenneTwister
field and randomize it; each time a random integer is requested, we give one of the cached values. This is exactly like we do already forFloat64
values. So do we wantMersenneTwister
to be heavier by some kB for better performance? I don't see any drawback, but I'm not sure. I played a little bit with the size of the cache, the "best" one so far is 10kB, with speed-ups as follows on my machine (Int==Int64
):Int8
-> +20%Int16
-> +15%Int32
-> +20%Int64
-> +60%Int128
-> +30%This is WIP: MersenneTwister is untouched, the new algorithm is used when instanciating a helper RNG like
r = IntCached{625}(MersenneTwister()); rand(r, Int)
, where625
is the size of the cache in terms ofUInt128
integers. It will be very fast to remove the WIP if we decide so.