feat(p/demo): PCG pseudo-random number generator #1993
That's great! Ideally, we should complete porting
Thanks! I'm continuing to work on supporting the full functionality of
@notJoon even though it's still in draft, amazing effort 👏 I'm not the expert on this, but since we are talking about blockchains and determinism, I assume you plan to use this package just as a demo package, since the values should be guessable with enough effort from an attacker's perspective. Is this correct?
It is almost impossible to predict random values without knowing the initial state and seed value. However, I have not yet run the NIST SP 800-22 tests (a well-known test suite for PRNGs), and much more verification is needed to ensure complete security. Therefore, I made it at the demo level.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #1993 +/- ##
==========================================
- Coverage 48.44% 46.72% -1.72%
==========================================
Files 409 492 +83
Lines 61965 69614 +7649
==========================================
+ Hits 30019 32530 +2511
- Misses 29446 34371 +4925
- Partials 2500 2713 +213
Over the past few days, I built a test framework based on the NIST SP 800-22 paper and used it to verify the random numbers generated by this implementation. I set the significance level to 0.01 (a p-value above this threshold means the sequence passes the test, i.e., the hypothesis of randomness is not rejected). For the remaining parameters, I used the input values recommended in the paper. The results are as follows (the universal test was run separately because it requires a large input size):
While these tests can demonstrate that the generated random numbers have statistical randomness, they do not guarantee cryptographic security. Although it is difficult to reverse-engineer the seed from the generated values alone, without knowing the initial state or seed, there is still a possibility, however small. Nevertheless, Go's Currently, the initial seed value must be specified by the user, as hiding it remains a challenge. If it becomes possible to use the operating system's entropy source in the future, this aspect could be improved.
Thanks for the reply, @thehowl. I implemented this as a temporary measure when the random package was not available, so it may not have much significance now. There was also some remaining ambiguity because the entropy source still could not be accessed.
Description
Random number generation package using the PCG algorithm
Overview
I used the PCG (Permuted Congruential Generator) algorithm to support most of the methods provided in `math/rand`. PCG is a pseudo-random number generator (PRNG): it is fully deterministic and produces the same random number sequence whenever it is given the same initial seed (state). Its permutation step makes output patterns difficult to predict, and it has a long period [1].
The implementation is built on a linear congruential method but modifies each result through additional bitwise operations such as XOR. This approach improves the statistical quality of the random numbers and reduces predictability.
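The following is a minimal sketch in Go of the idea described above: a linear congruential state update followed by an output permutation (xorshift plus rotation). It follows the commonly published PCG-XSH-RR 32-bit variant and is not necessarily the code in this PR; the names `PCG32`, `NewPCG32`, and `multiplier` are chosen for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// multiplier is the 64-bit LCG multiplier used by the reference PCG32.
const multiplier = 6364136223846793005

// PCG32 is an illustrative generator type (name is an assumption).
type PCG32 struct {
	state uint64 // LCG state
	inc   uint64 // stream selector; always odd
}

// NewPCG32 seeds the generator the same way the reference C code does.
func NewPCG32(seed, stream uint64) *PCG32 {
	p := &PCG32{inc: stream<<1 | 1}
	p.Uint32()
	p.state += seed
	p.Uint32()
	return p
}

// Uint32 advances the LCG and permutes the previous state into the output.
func (p *PCG32) Uint32() uint32 {
	old := p.state
	// Linear congruential step (wraps modulo 2^64).
	p.state = old*multiplier + p.inc
	// Permutation: xorshift the high bits, then rotate right by the
	// amount stored in the top 5 bits of the old state.
	xorshifted := uint32(((old >> 18) ^ old) >> 27)
	rot := int(old >> 59)
	return bits.RotateLeft32(xorshifted, -rot)
}

func main() {
	a := NewPCG32(42, 54)
	b := NewPCG32(42, 54)
	for i := 0; i < 3; i++ {
		x, y := a.Uint32(), b.Uint32()
		// The same seed and stream always give the same sequence.
		fmt.Printf("%#08x %#08x equal=%v\n", x, y, x == y)
	}
}
```

Because the output depends only on the seed and stream, two generators created with the same arguments stay in lockstep, which is the deterministic behavior a blockchain environment requires.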
Compatibility
I tried to match the methods provided in the `math/rand` package as much as possible, but there are a few differences. The table below compares the main function names in each implementation and includes descriptions.
Seed
Seed
Int
Uint32
Uint64
Intn
Uint32n
Uint64n
Float64
Float64
Perm
Perm
Read
Read
Shuffle
Shuffle
NormFloat64
ExpFloat64
Performance and Benchmarks
Currently, direct benchmarking is not possible in Gno, so I measured using Go. In the benchmark identifiers, the Go implementation is named `MathRand` and PCG is named `PCG`. Please refer to the following link for detailed benchmark code.
Based on the benchmarks I ran against the Go implementations, PCG generally outperforms the standard `math/rand` package when generating single random numbers: PCG32 and PCG64 are approximately twice as fast as their `math/rand` counterparts. When generating random numbers within a given range, PCG32 and PCG64 are nearly 3 times faster than `math/rand`'s `Intn` method. For shuffling, PCG32 and PCG64 also perform better than `math/rand`, with PCG32 being the fastest.
However, when generating random floating-point numbers, the PCG implementation is slightly slower than Go's `math/rand` package. Overall, PCG provides improved performance in most scenarios while maintaining high-quality random number generation.
Generate Single Random Number
Generate Single Number In Given Range
Read
Shuffle
Float
Generating random floating-point numbers is slightly slower than Go's implementation.
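As an illustration of how such a comparison can be set up in Go, the sketch below times single-number generation for a small PCG32 and for `math/rand` using `testing.Benchmark`. It is not the benchmark code linked above: `pcg32`, `benchPCG`, and `benchMathRand` are hypothetical names, and the generator is the commonly published PCG-XSH-RR variant rather than the one in this PR.

```go
package main

import (
	"fmt"
	"math/bits"
	"math/rand"
	"testing"
)

// pcg32 is a hypothetical minimal PCG-XSH-RR generator for this sketch.
type pcg32 struct{ state, inc uint64 }

func newPCG32(seed, stream uint64) *pcg32 {
	p := &pcg32{inc: stream<<1 | 1}
	p.Uint32()
	p.state += seed
	p.Uint32()
	return p
}

func (p *pcg32) Uint32() uint32 {
	old := p.state
	p.state = old*6364136223846793005 + p.inc
	xorshifted := uint32(((old >> 18) ^ old) >> 27)
	return bits.RotateLeft32(xorshifted, -int(old>>59))
}

// sink keeps the compiler from optimizing the generator calls away.
var sink uint32

func benchPCG(b *testing.B) {
	p := newPCG32(42, 54)
	for i := 0; i < b.N; i++ {
		sink = p.Uint32()
	}
}

func benchMathRand(b *testing.B) {
	r := rand.New(rand.NewSource(42))
	for i := 0; i < b.N; i++ {
		sink = r.Uint32()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("PCG32:   ", testing.Benchmark(benchPCG))
	fmt.Println("MathRand:", testing.Benchmark(benchMathRand))
}
```

The absolute numbers depend on the machine and Go version, so only the ratio between the two lines is meaningful.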
Limitations
However, the current PCG implementation does not guarantee a completely uniform distribution: when a chi-square [2] test is applied, the p-value [3] is smaller than the significance level, so the null hypothesis is rejected. Nevertheless, the output exhibits a certain level of uniformity, so this is not expected to pose a significant problem for simple purposes.
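To make the uniformity check concrete, here is a minimal sketch in Go of a chi-square goodness-of-fit statistic against a uniform distribution. It is not the code used for the result above: `chiSquare` is a hypothetical helper, `math/rand` is a stand-in source, and the bucket count and sample size are arbitrary. Instead of computing a p-value, it compares the statistic with the critical value for 9 degrees of freedom at the 0.05 level (16.919), which leads to the same accept/reject decision.

```go
package main

import (
	"fmt"
	"math/rand"
)

// chiSquare is a hypothetical helper that returns the chi-square
// statistic of the observed bucket counts against a uniform
// distribution over the same number of buckets.
func chiSquare(observed []int) float64 {
	total := 0
	for _, c := range observed {
		total += c
	}
	expected := float64(total) / float64(len(observed))

	stat := 0.0
	for _, c := range observed {
		d := float64(c) - expected
		stat += d * d / expected
	}
	return stat
}

func main() {
	const (
		buckets  = 10     // arbitrary choice for this sketch
		samples  = 100000 // arbitrary choice for this sketch
		critical = 16.919 // chi-square critical value, df = 9, alpha = 0.05
	)

	// Stand-in source (assumption): math/rand with a fixed seed.
	r := rand.New(rand.NewSource(1))
	counts := make([]int, buckets)
	for i := 0; i < samples; i++ {
		counts[r.Intn(buckets)]++
	}

	stat := chiSquare(counts)
	// A statistic above the critical value rejects the uniformity hypothesis.
	fmt.Printf("chi-square = %.3f, reject uniformity = %v\n", stat, stat > critical)
}
```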
Below are the distributions of random numbers generated by PCG and `math/rand` under the same conditions. PCG is shown in blue (top). See the link for the code that generates the distribution graph.
Footnotes
1. In a PRNG, the period refers to the number of generated values before the sequence starts repeating itself. It is a crucial property that affects the quality and effectiveness of the PRNG algorithm. A longer period means the generated sequence is less likely to show repeating, predictable patterns. PCG is designed to have an extremely long period, typically on the order of `2^64` or higher.
2. The chi-square test is a statistical hypothesis test that measures the goodness of fit between observed and expected frequencies under the assumption of a specific distribution. In this context, it is used to assess the uniformity of the generated random numbers by comparing their distribution to the expected uniform distribution.
3. The p-value is the probability of observing a test statistic at least as extreme as the one calculated from the sample data, assuming the null hypothesis (`H0`) is true. In hypothesis testing, if the p-value is smaller than the predetermined significance level (most commonly 0.05), `H0` is rejected, indicating that the observed data is unlikely to have occurred by chance alone. The alternative hypothesis (`H1`), in this case, would state that the generated random numbers do not follow a uniform distribution.