Reseeding std::rand. #722
cc @sfackler, @nagisa, @aturon, @alexcrichton
/// mediated by information contained in `Data`.
trait Random<Data = FullRange> {
    /// Create a random value of type `Self`
    fn random<R: ?Sized + RngBase>(data: Data, rng: &mut Rng) -> Self;
`Rng` should be `R`.
`RngBase` should be `Rng`, I think.
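For reference, here is a minimal sketch of how the signature might look with both suggested corrections applied (`&mut R` for the argument, `Rng` for the bound). The `Rng` trait body, the `Below` data type, and the `XorShift32` generator are assumptions made up for illustration; none of them come from the RFC text quoted above.

```rust
/// Assumed minimal shape of the RFC's `Rng` trait (illustration only).
pub trait Rng {
    fn next_u32(&mut self) -> u32;
}

/// Marker meaning "sample from the whole range of the type".
pub struct FullRange;

/// Hypothetical extra `Data` type: sample uniformly from `0..bound`.
pub struct Below(pub u32);

/// The trait from the hunk, with `rng: &mut R` and `R: ?Sized + Rng`.
pub trait Random<Data = FullRange> {
    /// Create a random value of type `Self`.
    fn random<R: ?Sized + Rng>(data: Data, rng: &mut R) -> Self;
}

impl Random for u32 {
    fn random<R: ?Sized + Rng>(_: FullRange, rng: &mut R) -> u32 {
        rng.next_u32()
    }
}

impl Random<Below> for u32 {
    fn random<R: ?Sized + Rng>(data: Below, rng: &mut R) -> u32 {
        assert!(data.0 > 0, "bound must be non-zero");
        // Largest multiple of `bound` that fits in a u32; values at or
        // above it are rejected so that `x % bound` stays unbiased.
        let zone = u32::MAX - (u32::MAX % data.0);
        loop {
            let x = rng.next_u32();
            if x < zone {
                return x % data.0;
            }
        }
    }
}

/// Toy xorshift32 generator, for illustration only (not secure).
pub struct XorShift32(pub u32);

impl Rng for XorShift32 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}
```

Because of the `?Sized` bound, the same method also accepts a `&mut dyn Rng` trait object, which is presumably the reason for that bound in the hunk.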
Notes to self; extra alternatives (will incorporate them into the document in the near future):
WRT the lack of collections traits: the only reason they don't exist is that we can't define them in a reasonable way without HKT. Unless the same is true here, I'd prefer to keep the traits.
a program to have forcibly deterministic execution by seeding the
`ThreadRng`. Changing this is backwards compatible.

This may make more sense to exist in `std::thread` or
I think this is the right spot for it.
This == `std::rand`?
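To make the reseeding idea in this hunk concrete, here is a rough sketch of a per-thread generator that can be reseeded for deterministic runs. The function names (`reseed_thread_rng`, `thread_random_u64`), the fixed placeholder seed, and the toy xorshift64 state are all assumptions for illustration, not the API proposed in the RFC.

```rust
use std::cell::RefCell;

/// Toy xorshift64 state standing in for a real generator (not secure).
struct ThreadRngState(u64);

impl ThreadRngState {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

thread_local! {
    // Fixed placeholder seed; a real implementation would presumably
    // seed each thread from an unpredictable source instead.
    static THREAD_RNG: RefCell<ThreadRngState> =
        RefCell::new(ThreadRngState(0x9E37_79B9_7F4A_7C15));
}

/// Hypothetical API: replace the current thread's generator state.
pub fn reseed_thread_rng(seed: u64) {
    // xorshift must never be left in the all-zero state.
    let seed = if seed == 0 { 1 } else { seed };
    THREAD_RNG.with(|rng| {
        rng.borrow_mut().0 = seed;
    });
}

/// Hypothetical API: draw the next value from the current thread's generator.
pub fn thread_random_u64() -> u64 {
    THREAD_RNG.with(|rng| rng.borrow_mut().next_u64())
}
```

Reseeding with the same value twice replays the same sequence on that thread, which is the "forcibly deterministic execution" the hunk refers to. Other threads keep their own state and are unaffected.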
This is really nice! I'm slightly hesitant about the
Would it be possible to provide both ranged and non-ranged API variants while still performing some of the unifications of this RFC?
You may consider using the term RBG (random bit generator) instead of RNG (random number generator) to avoid a common source of confusion. Here's the explanation: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3847.pdf from the author of
As today, the relationship between output of the methods of `Rng`, and
between them and `io::Reader` is unspecified.
s/, and between them//?
I mean to say that the relationship between `next_u32`, `next_u64`, `next_f32`, `next_f64` and `Reader::read` is pairwise unspecified (I thought "between the methods of `Rng` and `Reader`" was ambiguous, as it doesn't clearly say anything about intra-`Rng` connections.)
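A small sketch of what "pairwise unspecified" permits in practice: two generators over the same underlying `u32` stream that are both valid, yet disagree on `next_u64`. The trait shape and both counter types are assumptions for illustration only.

```rust
/// Assumed minimal shape of `Rng` (illustration only).
pub trait Rng {
    fn next_u32(&mut self) -> u32;

    /// One possible default: two `next_u32` outputs, high word first.
    fn next_u64(&mut self) -> u64 {
        let hi = self.next_u32() as u64;
        let lo = self.next_u32() as u64;
        (hi << 32) | lo
    }
}

/// Counts 1, 2, 3, ... and uses the default `next_u64`.
pub struct Counter(pub u32);

impl Rng for Counter {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(1);
        self.0
    }
}

/// Same `u32` stream, but `next_u64` fills the low word first.
pub struct CounterLowFirst(pub u32);

impl Rng for CounterLowFirst {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(1);
        self.0
    }

    fn next_u64(&mut self) -> u64 {
        let lo = self.next_u32() as u64;
        let hi = self.next_u32() as u64;
        (hi << 32) | lo
    }
}
```

Code that relies on `next_u64` being "two `next_u32` calls glued together in a particular order" would break when swapping one of these generators for the other, which is why leaving the relationship unspecified matters.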
like seeding a hashmap to avoid algorithmic complexity DoS attacks
does not require a (possibly expensive) call into the operating system
for every `HashMap::new()` call; a high-quality user-space RNG with an
unpredictable seed (e.g. from the OS) is almost certainly good enough.
This makes it sound like the operating system's random number generator is somehow inherently better than a userspace one can be, which is totally untrue.
I think it's irresponsible to attempt to pass the responsibility of implementing a CSPRNG on to someone else, when it's known that the implementation people are being told to use is less secure and extremely slow. In fact, it's such a bad practice that it is directly violating the guidelines given by OpenBSD for the usage of the OS random-number generator. If there's no claim that the userspace RNG is cryptographically secure, then it can't be used for hash tables without documenting that they have no guarantees of protection against a DoS attack. It's a vulnerability if the RNG is not regarded as a CSPRNG.
The fact that the OS random number generators don't scale at all means that Rust will have nothing available in many cases. Typical usage of cryptography in software is very performance critical, and if it's not fast enough then it won't be used. There's a reason that performance is one of the most important aspects in the design / choice of cryptographic primitives. The OS RNG on Linux is not designed by cryptographers or based on proven cryptographic primitives. It is poorly understood, and the few people who have done real cryptanalysis have found significant weaknesses. The appeal to the kernel's authority is an empty one. In fact, the kernel intentionally eschews proven cryptography based on what is essentially an attempt at "security through obscurity".
One example, finding that the entropy mixing / estimation in the Linux kernel is counter-productive from a security perspective and that the hand-rolled cryptography it uses is no good:
I think that Rust should not specify a specific cipher for the default RNG. For example, on CPUs with hardware AES instructions, using AES instead of ChaCha20 for the cipher might be faster, while ChaCha20 or other ciphers might be preferred elsewhere.
It's also nice to be able to configure a Rust program to run with a deterministic seed without changing it (perhaps even with an environment variable), so exposing the fact that the userspace RNG is seeded from OsRng seems a bad idea as well. For this reason, as well as the fact that an unbuffered /dev/urandom should not be used for anything other than seeding another userspace RNG, it might make sense to also not expose OsRng.
Overall, it seems the best solution might be to just expose a single opaque "DefaultRng": a thread-local ChaCha20/AES/... RNG that is seeded from a global ChaCha20/AES/... RNG (or possibly from the thread-local RNG in the parent thread), which is in turn seeded from /dev/urandom or from a seed specified either at program initialization or in an environment variable. Plus explicit RNGs with deterministic seeding, but maybe those don't really need to be in the standard library.
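A rough sketch of the seeding policy described in this comment: an opaque `DefaultRng` whose seed comes from an environment variable when one is set, and from an unpredictable source otherwise. Everything concrete here is an assumption for illustration: the variable name `RNG_SEED`, the use of `RandomState` as a stand-in for reading /dev/urandom, and the SplitMix64 mixer standing in for a real ChaCha20/AES-based generator (SplitMix64 is not cryptographically secure).

```rust
use std::collections::hash_map::RandomState;
use std::env;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical variable name; the thread does not specify one.
const SEED_VAR: &str = "RNG_SEED";

/// Parse a user-supplied seed, ignoring anything that is not a u64.
fn parse_seed(raw: Option<&str>) -> Option<u64> {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
}

/// Stand-in for an OS-provided seed: `RandomState` is randomly keyed by
/// the standard library, so its hasher output is used as a seed here.
fn unpredictable_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Environment override first, unpredictable seed otherwise.
pub fn initial_seed() -> u64 {
    let raw = env::var(SEED_VAR).ok();
    parse_seed(raw.as_deref()).unwrap_or_else(unpredictable_seed)
}

/// Opaque generator: callers never see which algorithm or seed is used.
pub struct DefaultRng {
    state: u64,
}

impl DefaultRng {
    pub fn new() -> DefaultRng {
        DefaultRng::from_seed(initial_seed())
    }

    pub fn from_seed(seed: u64) -> DefaultRng {
        DefaultRng { state: seed }
    }

    /// SplitMix64 step, a placeholder for a cipher-based generator.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}
```

Keeping the type opaque is what allows the algorithm to differ per platform and the seed source to be overridden without changing the program, as the comment argues.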
@bill-myers what do you mean by "default RNG" and "the userspace RNG"? Based on your second last paragraph the
CC @apoelstra
@thestinger thank you for pointing out the poor motivation I'd given the design written here. The discussion of cryptography was particularly ungreat. The use of the operating system was meant to be a default position, since no other generator in
A better motivation, in line with the designs for the rest of std, would be: provide a small set of (mostly) opinionated primitives,
However, I'm moving more and more towards a version of @bill-myers's final suggestion: move the whole of
Doing the latter gives us much more flexibility to iterate on the nits in the design of the
(The intention would be to adopt the design and feedback here in the move, where relevant, but there would be dramatically less need for deletions.)
I think a fast CSPRNG is a must-have, because otherwise the easiest way to get random data will be calling the standard C library via FFI, despite the many caveats. Stripping away functionality like range generation over backwards-compatibility concerns would encourage people to roll their own code, and it will likely be wrong. In this case it's better to be stuck with a few imperfect APIs forever than to have widespread misuse of random numbers across the ecosystem.
It's important that it's fast (unlike the OS RNGs), because slowness is another reason people reach for a non-cryptographically secure RNG, even though a CSPRNG should be fast enough for nearly all use cases. This is why I think the risk of bad practices with random data spreading throughout the ecosystem far outweighs the tiny risk that you've implemented it incorrectly. I really doubt that cryptographers could do a better job implementing a clear-cut algorithm like ChaCha20, which is explicitly designed to avoid implementation issues by avoiding branches and table lookups (unlike algorithms like AES, which are super hard to implement correctly). It's a really overblown risk and it shouldn't drive the design process. The risk for all the snippets of
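To illustrate the "no branches, no table lookups" point: the core of ChaCha is the quarter round, which uses only 32-bit addition, XOR, and fixed-distance rotation. The sketch below shows just that building block, not a complete cipher or generator, and the function name and tuple-based signature are choices made here for brevity.

```rust
/// One ChaCha quarter round over four 32-bit words.
/// Only wrapping addition, XOR and fixed rotations are used, so there is
/// no data-dependent branch or memory lookup.
pub fn quarter_round(mut a: u32, mut b: u32, mut c: u32, mut d: u32) -> (u32, u32, u32, u32) {
    a = a.wrapping_add(b);
    d ^= a;
    d = d.rotate_left(16);

    c = c.wrapping_add(d);
    b ^= c;
    b = b.rotate_left(12);

    a = a.wrapping_add(b);
    d ^= a;
    d = d.rotate_left(8);

    c = c.wrapping_add(d);
    b ^= c;
    b = b.rotate_left(7);

    (a, b, c, d)
}
```

A full implementation applies this round to the columns and diagonals of a 16-word state for 20 rounds and then adds the original state back in. Checking the result against published test vectors is still advisable even for an algorithm this regular.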
I agree that it is important to provide tools to allow users to not shoot themselves in the foot, but the intention after moving it out of
Lastly, other "core" functionality will only be available via crates.io (and their git repos) at 1.0, so accessing libraries there will be the norm, and functionality like rust-lang/cargo#4 will make it even easier to load them: what is now
Closing as discussed. I aim to have the removal happen in the next few days.
Stabilise the `std::rand` module by focusing it on a smaller set of functionality, moving "fancy" functionality to a crates.io crate.