Replies: 7 comments
-
This was on my list of things to look at, because I noticed the increase in fan speed when the random number sequence test is running. The current PRSG implementation is in fact very efficient if you care about correlation between successive numbers, as it generates 32 new bits at a time, not just one. But for this application, that's probably not necessary. |
Beta Was this translation helpful? Give feedback.
-
This looks like a good alternative: https://en.wikipedia.org/wiki/Xorshift |
Beta Was this translation helpful? Give feedback.
-
That's indeed fewer shifts than a 64-bit Fibonacci LFSR, but more than a Galois LFSR :) Xorshift may mix the bits better; the choice depends on whether we need that, or a higher duty cycle for the memory writes we're interested in, and thereby faster turnaround time for |
Beta Was this translation helpful? Give feedback.
-
Agreed, if we don't care that the random sequence is actually a walking bit pattern with only one random new bit per word, the Galois LFSR is fine. |
Beta Was this translation helpful? Give feedback.
-
I've rewritten the code to in-line the random number generation and keep its state in a local variable. Using the xorshift algorithm roughly halved the time taken for test 8 on my main machine. Changing that to a Galois LFSR algorithm made no noticeable further improvement, so I've left it with the xorshift algorithm. Feel free to improve on that! |
Beta Was this translation helpful? Give feedback.
-
That's good :) |
Beta Was this translation helpful? Give feedback.
-
I did test it, because there used to be hand written assembler for all the tests. The compiler generated code was just as fast. That saved having to write 64-bit variants of all the assembler code. I could always beat a compiler when writing 68000 assembler. With a super-scalar, out-of-order, speculatively executing processor, it's a lot harder. |
Beta Was this translation helpful? Give feedback.
-
In
tests/mov_inv_random.c::test_mov_inv_random()
,tests/test_helper.c::random()
is being called in a loop, and itself callstests/test_helper.c::prsg()
, which contains a software implementation of a Fibonacci LFSR: on every CPU, a memory read + multiple shifts + a memory write. That's 20+ instructions (more on 32-bit CPUs), with a control flow change, for generating a random value, dwarfing the memory write caused by the test's loop.I think that RNG can be made faster at low positive or negative size cost:
prsg()
, a Galois LFSR is usually both faster and smaller than a Fibonacci LFSR, which is more efficient for a hardware implementation;test_mov_inv_random()
, the LFSR could be moved inline inside the loop, which would allow keeping it in a CPU register and directly taking advantage of the carry flag (for a Galois LFSR, at least).WDYT ?
Beta Was this translation helpful? Give feedback.
All reactions