Documentation #423
Still polishing. Moved a lot of comments and documentation around in the |
I'm not really sure about this new `rngs` module:
- first, because we have both `prng` and `rngs` modules, though quite possibly the contents of `prng` will all get moved to other crates later
- second, because it's lumping external/true RNGs, shims like `ReadRng`, the `thread_rng` convenience and the "standard" PRNGs all together, though those are quite different things
- third, since if we maintain sub-modules like `rand::rngs::mock::StepRng` this gets quite tedious to use, and there are reasons to retain at least some of these sub-modules
Another option might be to create an `external` or `trng` module for the external/true RNGs (os, jitter, entropy and maybe read).
As far as documentation goes, I'd rather just add an empty module with nothing but doc, since we can move the documentation elsewhere later anyway.
src/rngs/mod.rs
Outdated
//! [`thread_rng`]: ../fn.thread_rng.html

mod entropy_rng;
`_rng` in the name is redundant; since you're moving anyway maybe remove the suffix?
src/rngs/mod.rs
Outdated
mod reseeding;
mod small_rng;
mod std_rng;
#[cfg(feature="std")] mod thread_rng;
`small_rng` and `std_rng` are small; is it worth having a module for each instead of just putting the code here? I did consider putting the `thread_rng` stuff into a `std_rng` module once but not sure it's the best option.
Again, `_rng` is redundant on these... simply using `thread` for the mod name is a bit funny but may be okay.
I quite like putting `ThreadRng` and `StdRng` together. They are basically supposed to be the same RNG anyway, except for the location in memory (and details...)
> I did consider putting the `thread_rng` stuff into a `std_rng` module once but not sure it's the best option.

On second thought maybe not. Are we sure `thread_rng` will keep wrapping `StdRng`, also on things like WebAssembly and `no_std`?
src/rngs/mod.rs
Outdated
//!
//! This module provides the three standard RNGs exposed by Rand (besides
//! specific PRNG algorithms in the `prng` module). Those are [`ThreadRng`],
//! [`StdRng`], and [`SmallRng`].
`ThreadRng` is not really a standard generator but merely a handle... I think it would be better to say the two standard PRNGs
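To illustrate the "merely a handle" point, here is a minimal std-only sketch of a thread-local generator accessed through a cheap, clonable handle. The names `Counter`, `THREAD_STATE`, `Handle` and `thread_handle` are hypothetical, and the counter merely stands in for a real PRNG; this is not Rand's actual implementation.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in generator state: a plain counter, purely for illustration.
pub struct Counter(u64);

thread_local! {
    // One shared state per thread, created lazily on first use.
    static THREAD_STATE: Rc<RefCell<Counter>> = Rc::new(RefCell::new(Counter(0)));
}

/// A cheap, clonable handle. It owns no generator of its own; it only
/// points at the per-thread state.
#[derive(Clone)]
pub struct Handle {
    state: Rc<RefCell<Counter>>,
}

/// Obtain a handle to the current thread's state.
pub fn thread_handle() -> Handle {
    Handle { state: THREAD_STATE.with(|s| s.clone()) }
}

impl Handle {
    /// Advance the shared state and return the new value.
    pub fn next(&mut self) -> u64 {
        let mut counter = self.state.borrow_mut();
        counter.0 += 1;
        counter.0
    }
}
```

Two handles obtained on the same thread advance the same underlying state, which is why the handle itself is arguably not a generator in its own right.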
src/rngs/mod.rs
Outdated
//!
//! [`ReadRng`] is an adapter to turn any `Read` into an RNG.
//!
//! Finally there is [`StepRng`], which is not much more than a counter
Can we call this `mock::StepRng` please? It emphasises that this is for testing only, and we might add more mock RNGs later. I.e. don't re-export `StepRng` below.
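As a rough sketch of what keeping the generator behind a `mock` path looks like, the following std-only example defines a counter-style generator inside a `mock` sub-module without re-exporting it. The struct body, method names and `first_three` helper are assumptions for illustration, not the crate's code.

```rust
/// Hypothetical layout: mock generators live in their own sub-module and
/// are deliberately not re-exported at the parent level.
pub mod mock {
    /// A counter-based generator for tests (illustrative stand-in).
    #[derive(Debug, Clone)]
    pub struct StepRng {
        value: u64,
        step: u64,
    }

    impl StepRng {
        /// Start at `initial` and advance by `step` on every call.
        pub fn new(initial: u64, step: u64) -> Self {
            StepRng { value: initial, step }
        }

        /// Return the current value, then advance (wrapping on overflow).
        pub fn next_u64(&mut self) -> u64 {
            let out = self.value;
            self.value = self.value.wrapping_add(self.step);
            out
        }
    }
}

/// Usage: the `mock::` prefix at the call site signals test-only intent.
pub fn first_three() -> [u64; 3] {
    let mut rng = mock::StepRng::new(2, 3);
    [rng.next_u64(), rng.next_u64(), rng.next_u64()]
}
```

The trade-off raised earlier in the thread still applies: the longer path is more tedious to type, but it makes the purpose visible wherever the type is used.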
src/rngs/mod.rs
Outdated
//!
//! To get a seed for those PRNGs at runtime, [`EntropyRng`], [`OsRng`] and
//! [`JitterRng`] are sources of external randomness. Those are mostly
//! implementation details for Rand, prefer to use the [`FromEntropy`] trait
This implies there are no direct uses of `OsRng`; we've previously recommended using it directly for important keys.
I'll add a note.
Using `OsRng` is also essentially what crypto libraries like ring do.
It started as just an experiment, but I am starting to like it to be honest. But you have good concerns. I also thought having both a
In a way they were already together, in the top dir. I think that now having a module with documentation for them helps make the uses and differences more clear than what we had before. But I don't mind much where
Yes, a good alternative (don't like the names much though). But if we move almost everything out as you suggest the module is empty except for Would it help to look at things that way? Not in how the RNGs provided are different, but that they are all implementers of |
B.t.w. if this is acceptable I will split out the reorganisation into another PR, to make it easier to review.
Yes, this is a good idea. I'm still not convinced about the reorganisation though; I'll take another look (but likely I won't have time until Sunday or Monday).
I kind of like the reorganization. I think it should be consistent with |
//! if rand::random() { // generates a boolean
//!     println!("Heads!");
//! }
//! ```
I'm not a fan of advertising `random`, because it is kind of an antipattern. I don't think it is very useful to tell how to generate "some random value" without saying what kind of random value. I would expect to at least mention that the values are sampled from a uniform distribution and that the float interval is `[0, 1)`.
A week ago I would have agreed, but there seems to be a sizeable number of people for whom it is pretty much what they need.
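To make the "uniform, float interval `[0, 1)`" remark concrete, here is a small std-only sketch of one common way to map raw random bits to such values. The function names `unit_f64` and `coin` are hypothetical, and this is not necessarily the conversion Rand itself uses.

```rust
/// Map 64 random bits to an `f64` in the half-open interval [0, 1).
/// Keeps the top 53 bits, which is the precision of an `f64` significand,
/// so every result is an exact multiple of 2^-53.
pub fn unit_f64(bits: u64) -> f64 {
    const SCALE: f64 = 1.0 / ((1u64 << 53) as f64);
    (bits >> 11) as f64 * SCALE
}

/// Map random bits to a boolean by inspecting the most significant bit.
pub fn coin(bits: u64) -> bool {
    (bits >> 63) == 1
}
```

With uniformly distributed input bits, the largest possible output of `unit_f64` is `1 - 2^-53`, so `1.0` itself is never returned.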
68953a4 to bf84acc
Rebased on master and split the commits into sort-of logical pieces to help reviewing. I plan to go over the |
Regarding the re-org:
- good job putting `StdRng` and `SmallRng` in their own modules
- putting all the generators (`OsRng`, `SmallRng`, `ThreadRng`) in an `rngs` module kind of works, although it might still be nice to put the external generators elsewhere? Not sure.
- `ReseedingRng` and `ReadRng` are merely adaptors, and feel a bit out of place
@vks we don't need to worry much about the consistency between `rngs` and `prng` because `prng` will likely disappear. My main concern is that half of `rand` is about "RNGs" so the name is very generic.
src/rngs/thread.rs
Outdated
ThreadRng { rng: THREAD_RNG_KEY.with(|t| t.clone()) }
impl ThreadRng {
/// Get a reference to the thread-local random number.
pub fn new() -> ThreadRng {
Until #404 is decided I think this constructor should be `#[doc(hidden)]`.
You want to avoid having both?
I think we should at least avoid me pushing through a solution without proper discussion 😄.
In this case it made code organization a little nicer: the types live in the `rngs` module, the two convenience functions `random()` and `thread_rng()` in `src`.
I like to use modules to remove common prefixes or suffixes from names: |
Finished the PRNG module. Probably far from perfect. I tried to cover the aspects of performance, quality, period, and security, which seem like the practical problems to me, without going into too much detail.
@dhardy I don't think all that much of my writing or my English. If you basically agree with something but see it could use some tweaking, is it easier to do that directly with a commit?
@pitdicker okay, I'll consider tweaking. I may get enough time today to read through.
I went over the doc for the distribution module: dhardy@64bb0c4 I'll probably end up re-arranging the commits, but think this particular module is more-or-less done. One thing I noted is that the |
I think this is fine, the uniform distribution is usually defined as a range in statistics (see Wikipedia).
Thank you, these look like good changes! I did split it a little more into sentences and paragraphs.
I don't really like the trait name |
Those changes look good. You're right about |
I've read over the OsRng, JitterRng and ThreadRng changes; mostly looks good but some notes.
src/rngs/os.rs
Outdated
/// especially on virtual machines, `/dev/urandom` may return data that is less
/// random.
///
/// As a countermeasure we try to do a single read from `/dev/random` in
I know you like short paragraphs, but this countermeasure is directly related to the above problem and should be part of the same paragraph. This is more a problem here than where the text was before.
src/rngs/jitter.rs
Outdated
@@ -47,10 +47,90 @@ const MEMORY_SIZE: usize = MEMORY_BLOCKS * MEMORY_BLOCKSIZE;
/// Use of `JitterRng` is recommended for initializing cryptographic PRNGs when
/// [`OsRng`] is not available.
///
/// `JitterRng` can be used without the standard library, but not conveniently.
/// You have to provide a high-precision timer, and have to carefully follow the
... not conveniently, since you must provide ... and carefully follow ...
src/rngs/jitter.rs
Outdated
/// # Quality testing
///
/// [`JitterRng::new()`] automatically does its own quality testing. But before
/// using `JitterRng` on untested hardware, after changes that could effect how
[`JitterRng::new()`] has built-in quality-of-entropy tests, however before using `JitterRng` on untested hardware, ..., or after ... (such as a new LLVM version), ...
src/rngs/jitter.rs
Outdated
/// [NIST SP 800-90B Entropy Estimation Suite](
/// https://github.com/usnistgov/SP800-90B_EntropyAssessment).
///
/// Use the following code using [`timer_stats`] to collect the data:
Including the quality-testing warning in this documentation is fine, but the code and instructions are a bit much IMO. Have you considered putting this program in `examples/` to make it simpler to run?
I think that is a good idea. Sounds like a bit more work than I'd like to put in it at the moment t.b.h. A few things I don't yet have the answer to: Do examples have a way to not run them when testing? Where to keep the instructions, in the `JitterRng` documentation or in the source of the 'example'? How do we make clear to users that this is not like the other examples (to not scare them away 😄)? Maybe by adding some readme to the examples dir?
59e9ae2 to 2d6bafb
Cycles per byte is a very good measurement. First, because unlike
throughput it is at least independent of SKUs within a single architecture.
Second, because it can be measured without worrying about noise.
While I agree that it does not provide a good overview of the absolute level
of performance, it definitely does a good job of providing an overview of
relative performance levels (using chacha, but I need 2x faster? We'll just
gotta pick an algorithm that has double the bytes per cycle).
Finally, you can standardise and aggregate scores with varying weights.
Running the code on a cycle-accurate simulator for a few variants of x86 and
arm and then aggregating the scores into a single number will provide a fairly
accurate estimate for any use-case.
As for what to benchmark, I'd say we either pick the worst case for an
algorithm or the best case for the algorithm, noting which case was
benchmarked and what function exhibited said case.
…On Tue, May 15, 2018, 11:05 Diggory Hardy ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/prng/mod.rs
<#423 (comment)>
:
> //!
//! Currently Rand provides only one PRNG, and not a very good one at that:
//!
-//! | name | performance | worst-case | memory usage | initialisation | quality | predictability |
-//! |----- |------------ |----------- | -------------|--------------- |-------- |--------------- |
-//! | [`XorShiftRng`] | fast | fast | 16 bytes | fast | poor | trivial after 4 words |
+//! | name | full name | performance | memory | quality | period | features |
+//! |------|-----------|-------------|--------|---------|--------|----------|
+//! | [`XorShiftRng`] | Xorshift 32/128 | ⭐⭐⭐ | 16 bytes | ⭐ | `u32` * 2<sup>128</sup> - 1 | — |
Fine, remove the "predictability" stuff for basic PRNGs then. You're right
about potential misunderstanding.
Cycles / bytes sounds good in theory but (a) is still highly dependent on
CPU type, (b) is less easy to interpret for users and (c) unfortunately
there is a significant difference in performance depending on which of
next_u32/next_u64/fill_bytes is used.
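The "16 bytes" and "trivial after 4 words" entries for the Xorshift 32/128 row in the quoted table can be illustrated with a std-only sketch of the textbook Marsaglia xorshift on four 32-bit words. The struct name and the shift constants (11, 8, 19) are assumptions taken from the classic description; this is not copied from the crate's source.

```rust
/// Textbook 128-bit xorshift on 32-bit words: four `u32`s, 16 bytes of state.
pub struct Xorshift128 {
    x: u32,
    y: u32,
    z: u32,
    w: u32,
}

impl Xorshift128 {
    /// Build a generator from four words. An all-zero state is rejected,
    /// because the generator would then output zeros forever.
    pub fn new(seed: [u32; 4]) -> Option<Self> {
        if seed == [0; 4] {
            return None;
        }
        Some(Xorshift128 { x: seed[0], y: seed[1], z: seed[2], w: seed[3] })
    }

    /// Shift the words along and mix the oldest word into the newest.
    pub fn next_u32(&mut self) -> u32 {
        let t = self.x ^ (self.x << 11);
        self.x = self.y;
        self.y = self.z;
        self.z = self.w;
        self.w = self.w ^ (self.w >> 19) ^ (t ^ (t >> 8));
        self.w
    }
}
```

Each output becomes the newest state word unchanged, so after four consecutive outputs the internal state consists of exactly those four values. An observer can seed a copy with them and predict everything that follows, which is presumably what the removed "predictability" column meant.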
Really? Is there a way of measuring it other than running a benchmark then dividing by the frequency? The rest of what you say makes sense. I'm less sure about choosing best/worst; I was wondering about benchmarking next_x and fill_bytes separately.
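The benchmark-based estimate mentioned here comes down to simple arithmetic: cycles are elapsed time multiplied by clock frequency, and cycles per byte divides that by the number of bytes generated. A minimal sketch, assuming the caller supplies a nominal frequency (`cycles_per_byte` and its parameters are hypothetical names):

```rust
use std::time::Duration;

/// Estimate cycles per byte from a wall-clock benchmark result.
/// `freq_hz` is an assumed, caller-supplied clock frequency; with frequency
/// scaling or turbo modes the real value may differ during the run.
pub fn cycles_per_byte(elapsed: Duration, bytes: u64, freq_hz: f64) -> f64 {
    let cycles = elapsed.as_secs_f64() * freq_hz;
    cycles / bytes as f64
}
```

For example, generating 10^9 bytes in one second on a core assumed to run at 3 GHz gives an estimate of 3 cycles per byte. The dependence on an assumed frequency is the weak point of this approach compared to reading hardware cycle counters.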
We wrote in the documentation that the performance depends on a lot of factors, among them the surrounding code. We then go on to say that this is a reason not to include exact numbers, and that users should benchmark a few different RNGs in their use case. So I'd like to keep things slightly vague, and give just enough of an indication to see whether an RNG can be expected to be clearly faster than another one.
src/prng/mod.rs
Outdated
//!
//! Normal PRNGs often use very little memory, commonly only a few words, where
//! Simple PRNGs often use very little memory, commonly only a few words, where
//! a *word* is usually either `u32` or `u64`. This is not universal however,
Okay; since this starts with "simple PRNGs", change "This is not universal" to "This is not true for all non-cryptographic PRNGs".
Or don't use "Simple". I don't know, but it doesn't seem like quite a lot of non-crypto PRNGs are simple.
Didn't you want to use 'simple' here? #423 (comment)
Yes, what you've done with stars sounds good enough for now. Actual benchmark numbers would be more informative and separate |
Benchmarking |
The current benchmarks do so just fine, by doing it 1000 times.
On modern x86s hardware counters exist for cycles and retired instructions. Those are fairly accurate and noise-resistant as they only measure cycles and instructions for the process being measured. As mentioned before they can be read on linux with |
Looking very good now, but a few comments still.
src/prng/mod.rs
Outdated
//! dependence on the CPU architecture as well as the impact of the size of
//! data requested. Because of all this, we do not include performance numbers
//! here but merely a qualitative description.
//! here but merely a qualitative rating (which is not comparable between PRNGs
//! and CSPRNGs).
The performance is now comparable, so you can remove this comment.
src/prng/mod.rs
Outdated
//! dependence on the CPU architecture as well as the impact of the size of
//! data requested. Because of all this, we do not include performance numbers
//! here but merely a qualitative rating (which is not comparable between PRNGs
//! and CSPRNGs).
The ratings are now comparable
src/prng/mod.rs
Outdated
//!
//! ### Worst-case performance
//! Because CSPRNGs usually produce a block of values into a cache, they have
//! poor worst case performance (in contrast to regular PRNGs).
regular PRNGs, which are usually constant-time
I don't really want to use the term "constant-time" here, as it has special meaning in cryptography. But I'll try something.
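The worst-case behaviour under discussion can be sketched with a toy block-buffered generator: most calls are a cheap buffer read, but every sixteenth call pays for producing a whole block. Everything here is hypothetical and for illustration only; in particular the placeholder mixing in `refill` is NOT cryptographic.

```rust
/// Toy block generator: fills a buffer of 16 words at once, then serves
/// values from it one at a time.
pub struct BlockBuffered {
    buffer: [u32; 16],
    /// Next unread position; 16 means the buffer is exhausted.
    index: usize,
    /// Block counter fed to the placeholder block function.
    counter: u32,
    /// Number of times a full block has been generated.
    pub refills: u32,
}

impl BlockBuffered {
    /// Start with an exhausted buffer so the first call triggers a refill.
    pub fn new() -> Self {
        BlockBuffered { buffer: [0; 16], index: 16, counter: 0, refills: 0 }
    }

    /// Expensive step: produce a whole block in one go.
    fn refill(&mut self) {
        for (i, slot) in self.buffer.iter_mut().enumerate() {
            // Placeholder mixing; a real CSPRNG would run its block
            // function (for example a stream cipher) here.
            *slot = self.counter.wrapping_mul(0x9E37_79B9).wrapping_add(i as u32);
        }
        self.counter = self.counter.wrapping_add(1);
        self.index = 0;
        self.refills += 1;
    }

    /// Cheap in the common case, slow whenever a refill is due.
    pub fn next_u32(&mut self) -> u32 {
        if self.index == self.buffer.len() {
            self.refill();
        }
        let value = self.buffer[self.index];
        self.index += 1;
        value
    }
}
```

The average cost per value can still be low, since the refill is amortised over sixteen outputs; it is only the latency of the individual slow call that differs from a generator doing the same small amount of work on every call.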
src/prng/mod.rs
Outdated
//! Because CSPRNGs usually produce a block of values into a cache, they have
//! poor worst case performance (in contrast to regular PRNGs).
//!
//! Simple PRNGs often use very little memory, commonly only a few words, where
I think another title is needed (memory usage) so that this doesn't come under "worst case performance"
Oops, lost a line. Good catches!
Ready to merge?
Yes, thank you!
You too, you did half of the writing 😄.
Split out from #422.
As I commented in #422 (comment) I tried something a bit more crazy: move all `*Rng` stuff into a separate module. I am sure there are quite a few things still messy, but this is just in a way reopening the other PR. Will work on it some more tomorrow.
I have tried to set up the documentation online. The top-level documentation is now of a reasonable length, and I added more documentation in the `rngs` and `distribution` modules.