attempt faster randn implementation #5105
Comments
Getting this right would be cool. I've written some (not always correct) hash functions in Julia that might also be nice performance tests at some point. |
Not necessary. |
I think this would be a fun project, even if not a necessary project. Probably does not require an open issue. |
Why not have an open up-for-grabs issue? Do we need a separate repo for issues that Jeff doesn't want to have open but that are nice to have around somewhere? This is fairly unlikely for Viral to forget, but an issue is an excellent place to discuss what would be required. |
I just think this is too minor to put effort into. Of course anybody is welcome to do it, but I'd like to avoid wasting time. |
I think that there is potential for a much faster @StefanKarpinski Is this the kind of thing that a Hacker School student might like to try? |
We could have an open issue in |
Fine by me. |
@ViralBShah, where do you see the potential performance gain coming from? Better algorithms? Inlining? |
If there is a real performance reason, this is fine. I thought it was just
|
Does anyone have experience with mprng? |
I think there is potential to experiment with the number of ziggurat levels, using ziggurat recursively in the tail, and autotuning for the architecture. These may or may not help. The other main thing to do is to generate multiple random numbers simultaneously using SIMD types. A translation of the C code would be a waste of time, but I am thinking of a ground-up implementation that is faster. |
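To make the "number of ziggurat levels" knob concrete, here is a minimal floating-point sketch with the layer count `N` as a parameter. It is not a translation of randmtzig.c and is not tuned for speed. The names `Ziggurat`, `zigrandn`, `layers`, `tail_area` and `density` are made up for illustration, and the tail area is computed with Simpson's rule only because Base has no `erfc`.

```julia
using Random

density(x) = exp(-x^2 / 2)   # unnormalized half-normal density

# Tail area ∫_r^∞ density(x) dx via composite Simpson on [r, r + 10].
function tail_area(r; n = 2000)
    h = 10 / n
    s = density(r) + density(r + 10)
    for k in 1:n-1
        s += (isodd(k) ? 4 : 2) * density(r + k * h)
    end
    return s * h / 3
end

# Build the layer edges for a trial r. The residual is positive when r is
# too small (layers overshoot the peak) and negative when r is too large.
function layers(r, N)
    v = r * density(r) + tail_area(r)     # common area of every layer
    x = zeros(N + 1); y = zeros(N + 1)
    x[1] = v / density(r)                 # pseudo-width of the base layer
    x[2] = r; y[2] = density(r)
    for i in 2:N-1
        y[i+1] = y[i] + v / x[i]
        y[i+1] >= 1 && return (Inf, x, y)
        x[i+1] = sqrt(-2 * log(y[i+1]))
    end
    resid = y[N] + v / x[N] - 1
    x[N+1] = 0.0; y[N+1] = 1.0
    return (resid, x, y)
end

struct Ziggurat
    x::Vector{Float64}
    y::Vector{Float64}
    r::Float64
end

# Find r by bisection so that all N layers have equal area.
function Ziggurat(N::Integer)
    N >= 2 || throw(ArgumentError("need at least 2 layers"))
    lo, hi = 1.0, 10.0
    for _ in 1:80
        mid = (lo + hi) / 2
        layers(mid, N)[1] > 0 ? (lo = mid) : (hi = mid)
    end
    _, x, y = layers(hi, N)
    return Ziggurat(x, y, hi)
end

function zigrandn(rng::AbstractRNG, z::Ziggurat)
    N = length(z.x) - 1
    while true
        i = rand(rng, 1:N)                    # pick a layer uniformly
        x = rand(rng) * z.x[i]
        s = rand(rng, Bool) ? 1.0 : -1.0      # random sign
        x < z.x[i+1] && return s * x          # fast path: inside the core
        if i == 1
            while true                        # Marsaglia's tail method
                a = randexp(rng) / z.r
                b = randexp(rng)
                2 * b > a^2 && return s * (z.r + a)
            end
        else                                  # wedge: test against the density
            yv = z.y[i] + rand(rng) * (z.y[i+1] - z.y[i])
            yv < density(x) && return s * x
        end
    end
end
```

Usage would be something like `z = Ziggurat(256); zigrandn(Random.default_rng(), z)`. Changing `N` trades table size against how often the slow wedge and tail paths are hit, which is the part that would need benchmarking.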
If you want to do parallel rng, Random123 seems like the sanest approach. |
Yes, when I do attempt this, I would really like to try using Random123. |
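For anyone unfamiliar with why Random123 suits parallel use: its generators are counter-based, so each output is a pure function of a key and a counter, with no shared state to advance. The toy below only illustrates that idea. It uses the SplitMix64 finalizer as a stand-in mixer, not Random123's Philox or Threefry, and makes no claim about statistical quality. `mix64`, `cbrand`, `cbuniform` and `fill_uniform!` are hypothetical names.

```julia
# Stand-in mixing function (SplitMix64 finalizer), NOT a Random123 function.
function mix64(z::UInt64)
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

# Output number `ctr` of the stream `key`; no state is read or written.
cbrand(key::UInt64, ctr::UInt64) =
    mix64(key + (ctr + one(ctr)) * 0x9e3779b97f4a7c15)

# Map the top 53 bits to a Float64 in [0, 1).
cbuniform(key::UInt64, ctr::UInt64) = ldexp(Float64(cbrand(key, ctr) >> 11), -53)

# Element i depends only on (key, i), so any visiting order, or any split
# of the index range across threads, produces the same array.
function fill_uniform!(out::Vector{Float64}, key::UInt64, order = eachindex(out))
    for i in order
        out[i] = cbuniform(key, UInt64(i))
    end
    return out
end
```

Because the loop body has no loop-carried dependency, it is also the kind of code one would hope the compiler can vectorize, though whether it actually does would need checking.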
We have had a nice bump due to the transition from C to Julia. Further improvements will potentially happen from better vectorization in the codegen, but closing this for now. |
Well, and the day after, @rfourquet gives us a nice speedup (#9126). Seems like people are doing interesting things in this space, and the potential for SIMD is much better now with some of the stuff we are discussing and recent updates to the compiler. http://arxiv.org/pdf/1403.6870.pdf I will reopen this issue. |
Another interesting thread at Wilmott: http://www.wilmott.com/messageview.cfm?catid=44&threadid=95982&STARTPAGE=2&FTVAR_MSGDBTABLE= In a numpy issue: Also, a paper and code here that talks about performance optimizations, with code for vectorized implementations: |
I downloaded the source code linked in the arXiv paper you referenced; you can get the performance numbers on your machine via: |
|
Just for the record, since parallel RNGs have been mentioned above... I've been playing a bit with http://www.pcg-random.org/ Now as for In addition I have found it very hard to get anything involving unsigned ints to vectorize with |
try |
Is the |
I believe that @ViralBShah experimented with Random123 and found the performance to be less good than hoped. |
Not sure about the conversion section, but last time I checked this method was not documented. |
I think it was @andreasnoack |
I made a Julia implementation a long time ago, but couldn't get anywhere near the speed they advertised. I should try with Cxx.jl and see how fast it really is. |
Currently, our randn implementation is in C: https://github.com/JuliaLang/julia/blob/master/deps/random/randmtzig.c
For the longest time, I have wanted to have a pure Julia implementation, but back then, it was not quite fast enough for something so crucial. We have now come a long way. I feel that there is a chance today that the Julia randn implementation can be as fast as the C implementation.
Also, once it is a pure Julia implementation, we can try different numbers of Ziggurat levels, and even SIMD types. We may be able to beat most randn implementations out there, but there is no way to find out without attempting to do so.
Our perf benchmark has a pure Julia implementation, but it is not a translation of the C implementation that is used internally:
https://github.com/JuliaLang/julia/blob/master/test/perf/kernel/ziggurat.jl
https://github.com/JuliaLang/julia/blob/master/test/perf/kernel/ziggurat.c
I am opening this issue mainly to discuss possible implementation strategies for a fast pure Julia randn.
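As a starting point for comparing candidates, a small harness along these lines could check the first two moments and time a scalar sampler against the built-in `randn`. This is only a sketch: `report` and `polar_randn` are hypothetical names, the candidate shown is the Marsaglia polar method used as a stand-in (not a ziggurat), and a moment check is a smoke test, not a proper statistical test.

```julia
using Random, Statistics

# Stand-in candidate sampler: Marsaglia polar method (one value per call).
function polar_randn(rng::AbstractRNG)
    while true
        u = 2 * rand(rng) - 1
        v = 2 * rand(rng) - 1
        s = u^2 + v^2
        if 0 < s < 1
            return u * sqrt(-2 * log(s) / s)
        end
    end
end

# Draw n samples from `sampler(rng)` and report mean, variance and wall time.
function report(sampler, rng::AbstractRNG, n::Integer)
    sampler(rng)                       # warm up so compilation is not timed
    xs = Vector{Float64}(undef, n)
    t = @elapsed for i in 1:n
        xs[i] = sampler(rng)
    end
    return (avg = mean(xs), variance = var(xs), seconds = t)
end

candidate = report(polar_randn, MersenneTwister(42), 1_000_000)
baseline  = report(randn, MersenneTwister(42), 1_000_000)
```

Both results should show a mean near 0 and a variance near 1; the `seconds` fields give a rough per-sampler comparison on the local machine.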