Tracking Issue for deterministic random number generation #131606
Comments
ACP: rust-lang/libs-team#394 Tracking issue: rust-lang#131606 The version implemented here uses ChaCha8 as RNG. Whether this is the right choice is still open for debate, so I've included the RNG name in the feature gate to make sure people need to recheck their code if we change the RNG. Also, I've made one minor change to the API proposed in the ACP: in accordance with [C-GETTER](https://rust-lang.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter), `get_seed` is now named `seed`.
https://go.dev/blog/randv2 might be relevant here.
If a counter-based PRNG is chosen instead, exposing the current state (in addition to, or instead of, the original seed?) would be valuable.
Could the source also support seeding from a u64 (e.g. by adopting how rand expands the seed), so that there is one well-supported approach for this very common use case?
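To make the u64 seeding request concrete, here is a minimal sketch of expanding a `u64` into a longer byte seed. It assumes a 32-byte seed (the ChaCha key size; the thread does not confirm the seed type) and uses SplitMix64 for the expansion, which is a swapped-in technique and not the procedure rand uses. The name `seed_from_u64` is hypothetical here.

```rust
/// One SplitMix64 step: advances `state` and returns the next 64-bit output.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: expands a `u64` into a 32-byte seed (assumed size).
/// Little-endian byte order keeps the result identical across platforms.
fn seed_from_u64(seed: u64) -> [u8; 32] {
    let mut state = seed;
    let mut out = [0u8; 32];
    for chunk in out.chunks_exact_mut(8) {
        chunk.copy_from_slice(&splitmix64(&mut state).to_le_bytes());
    }
    out
}
```

Whatever expansion is chosen would have to be frozen at stabilization, since changing it would change every stream derived from a `u64` seed.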
Since it's always supposed to give the same output forever after stabilization, it seems prudent to choose an algorithm that can be expected to last for a decade or more without regrets. It would be awkward to deprecate this part of std because it later turns out to have some flaws that can't be fixed without breaking reproducibility (see the Go math/rand/v2 story). A well-established cipher that's considered secure today almost certainly won't turn out to have major flaws as a source of statistical randomness: any feasible way to distinguish the output from random is a big deal for cryptanalysis, and even an "academic break" (say, key recovery in 2^100 time) doesn't necessarily mean anything about the suitability for Monte Carlo simulations. The same can't be said about the myriad of non-cryptographic designs, where it's often only a matter of time and eyeballs until serious flaws are discovered. For example, AES (Rijndael) is from 1998 and still going strong while the non-cryptographic MT19937 from 1997 was very popular for many years but is now considered flawed and obsolete. ChaCha is from 2008. Several years later, people were still publishing new non-crypto RNGs that fail TestU01, a suite of statistical tests dating to 2007--2009.
Forgive me, but why re-invent this functionality in the Provision of an OS-getrandom API makes sense since much of the code is in Yes, I get it:
PRNGs
There is no particular best PRNG. ChaCha8 should be fine for this use-case, but it's fundamentally a block-based RNG whereas a word-based PRNG like Xoshiro or PCG will be substantially faster for many use-cases. Since this is explicitly about a user-seeded PRNG with user-managed-state there is no good reason not to let the user choose the algorithm too (from some set of options). The name
Random trait
The proposed Uniform ranged sampling is the obvious other application. There are quite a few algorithms for this. If you want reproducible outputs, pick one of these and stick with it (or keep the impl unstable until the algorithm is fixed). Now implement for On the topic of reproducibility, we recently removed support for sampling
Additional scope?
There are several obvious possibilities for scope creep:
This is most of what we elected to keep in
Proposal
What I'd propose therefore is:
I believe this would reduce many people's issues with By itself, this wouldn't remove all usages of
Under your proposal, why even have the And if std drops the trait, then I can see how this would remove some of the most tricky parts of
let mut buf = [0; size_of::<usize>()];
DefaultRandomSource::default().fill_bytes(&mut buf);
// ^ or a hand-rolled seedable PRNG of choice
let r = usize::from_ne_bytes(buf);
let elem = slice[r % slice.len()];
... and that's a worse outcome than std providing
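The `r % slice.len()` pattern in the snippet above is biased whenever the range size does not divide the number of possible values of `r`. The sketch below shows the bias exhaustively for `u8` values and one common fix, rejection sampling. The names `modulo_counts` and `random_below` are hypothetical, and `next` stands in for any source of uniform `u64` values; this is one of several known range-sampling algorithms, not the one std or rand is committed to.

```rust
/// Counts how often each residue appears when every `u8` value is reduced with `% n`.
fn modulo_counts(n: u8) -> Vec<u32> {
    let mut counts = vec![0u32; n as usize];
    for r in 0..=u8::MAX {
        counts[(r % n) as usize] += 1;
    }
    counts
}

/// Unbiased sample in `0..n` by rejection.
/// `next` is an assumed source of uniformly distributed `u64` values.
fn random_below(n: u64, mut next: impl FnMut() -> u64) -> u64 {
    assert!(n > 0, "range must be non-empty");
    // 2^64 mod n, computed without overflowing u64.
    let rem = (u64::MAX % n + 1) % n;
    // Values in 0..=zone form a whole number of complete cycles of length n.
    let zone = u64::MAX - rem;
    loop {
        let r = next();
        if r <= zone {
            return r % n;
        }
        // Otherwise r falls in the incomplete final cycle: draw again.
    }
}
```

For a six-sided die drawn from a single byte, `modulo_counts(6)` gives 43 hits each for residues 0 to 3 but only 42 for residues 4 and 5, because 256 = 42 * 6 + 4. With 64-bit inputs and small ranges the bias is tiny, which is why the modulo shortcut is often judged good enough, but the rejection loop removes it entirely at the cost of an occasional redraw.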
This is a good point, but I think it's workable (though there is possibly reason to keep For block-based PRNGs like ChaCha as well as for For word-based PRNGs, it's questionable whether these should implement the
This goes for some other areas too, e.g.
I mean, we can tell them they shouldn't do that. But if you do want such functionality, then:
// surely it's better to support this:
let elt = slice.choose(&mut rng);
// instead of expecting people to write this:
let elt = slice[rng.random_range(..slice.len())];
? (This doesn't remove the need for And even if the above functionality is in
I believe only the following things should be in
The "system" sources should allow registration of an alternative implementation similarly to Everything else should be just part of a third-party crate like
OS interfaces are generally byte-based, but anything ChaCha-based (and related constructions like Blake3 used as XOF) does all the computations over words. If your output buffer is
This would make
Just telling people to not do it is unhelpful and unlikely to be heeded. I've done the modulo thing myself more than once, in full awareness of its issues, just because it was more expedient in that case and I judged it to be good enough. Including "random integer in range" is a sweet spot on the slippery slope to re-inventing all of
I think the situation with
Yes, you are right. Replacing the
impl<R: BlockRngCore<Item = u32>> RngCore for BlockRng<R> {
    #[inline]
    fn next_u32(&mut self) -> u32 {
        super::impls::next_u32_via_fill(self)
    }
    #[inline]
    fn next_u64(&mut self) -> u64 {
        super::impls::next_u64_via_fill(self)
    }
    // ...
}
causes a large regression (approx 2-4x cost for
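To illustrate the two code paths being compared, here is a minimal sketch of a block-buffered generator with a direct word read and a byte-fill detour. `ToyBlockRng` and its mixing function are hypothetical placeholders (this is not ChaCha and not rand's `BlockRng`); only the buffering structure is the point. Both paths return the same values, but the second one round-trips every word through a byte buffer.

```rust
/// Toy block generator. The block function is a placeholder mixer, NOT ChaCha.
struct ToyBlockRng {
    key: u64,
    counter: u64,
    buf: [u32; 16],
    index: usize, // next unread word in `buf`; 16 means the buffer is exhausted
}

impl ToyBlockRng {
    fn new(key: u64) -> Self {
        ToyBlockRng { key, counter: 0, buf: [0; 16], index: 16 }
    }

    /// Produces a fresh block of 16 words and resets the read position.
    fn refill(&mut self) {
        let (key, counter) = (self.key, self.counter);
        for (lane, word) in self.buf.iter_mut().enumerate() {
            let mut z = key ^ counter
                .wrapping_mul(16)
                .wrapping_add(lane as u64)
                .wrapping_add(1)
                .wrapping_mul(0x9E37_79B9_7F4A_7C15);
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            *word = ((z ^ (z >> 31)) >> 32) as u32;
        }
        self.counter = self.counter.wrapping_add(1);
        self.index = 0;
    }

    /// Direct path: read one word straight out of the block buffer.
    fn next_u32(&mut self) -> u32 {
        if self.index >= self.buf.len() {
            self.refill();
        }
        let word = self.buf[self.index];
        self.index += 1;
        word
    }

    /// Byte path: serialise buffered words into `dest`, little-endian.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(4) {
            let bytes = self.next_u32().to_le_bytes();
            chunk.copy_from_slice(&bytes[..chunk.len()]);
        }
    }

    /// Detour path: obtain a word by filling a 4-byte buffer and decoding it.
    fn next_u32_via_fill(&mut self) -> u32 {
        let mut bytes = [0u8; 4];
        self.fill_bytes(&mut bytes);
        u32::from_le_bytes(bytes)
    }
}
```

Measuring the actual cost difference would need a benchmark with a real block function; this sketch only shows where the extra work comes from.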
This is the wrong place to discuss As for why this should be in
My apology for the tongue-in-cheek comment ("tell them they shouldn't do that"). So there is a need for "random integer in a range". The first question regarding random-value-generation-in-std is when do you stop the scope creep? Slice shuffling seems useful, as does The second question is whether bringing this into The third question is whether there might be some other advantages to merging a subset of
Yes, time zones are complicated. An API for "give me a UTC time stamp for now" is not, though its implementation may be. The fact that there are plenty of details to discuss regarding random generators and algorithms despite many people "just wanting a random integer in a range" shows the cases are not entirely dissimilar. No, I am not fundamentally opposed to incorporating random-value functionality into
If it'd be substantially more efficient (or enable types of generators which are substantially more efficient) to add methods like There was no desire or attempt to ignore Along similar lines, while I think "give me a random value of this type" is a useful trait, it's not by any means the only one, and we've already discussed having some mechanism for sampling a distribution so that we can support random floats and similar.
I expect that these at a minimum would make the cut, along with random-in-range functionality (e.g. for things like die rolls).
Short answer (from memory): this is important for small (non-block) RNGs for the word size of the output (usually If you want benchmarks, we should be able to hack The other important point here is the generator (and intended application). If This is the point I'm getting a little lost on here: is the intended scope to cover only
So this is another point of potential scope creep: do we want a Should a Without having answers to these questions it's hard to know what exactly should be included in
Some targets (Hermit, RDRAND, RNDR, WASI p2) directly generate random
Feature gate: #![feature(deterministic_random_chacha8)]
This is a tracking issue for DeterministicRandomSource, a deterministic random number generator that will always generate the same values for a given seed.
Public API
Steps / History
Unresolved Questions
seed function make sense?
Footnotes
https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html