Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation #423

Merged
merged 12 commits into from
May 15, 2018
Merged

Documentation #423

merged 12 commits into from
May 15, 2018

Conversation

pitdicker
Copy link
Contributor

Split out from #422.

As I commented in #422 (comment) I tried something a bit more crazy: more all *Rng stuff into a separate module.

I am sure there quite a few things still messy, but this is just in a way reopening the other PR. Will work on it some more tomorrow.

I have tried to set up the documentation online. The top-level documentation is now of a reasonable length, and I added more documentation in the rngs and distribution modules.

@pitdicker
Copy link
Contributor Author

Still polishing. Moved a lot of comments and documentation around in the uniform module.

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure about this new rngs module:

  • first, because we have both prng and rngs modules, though quite possibly the contents of prng will all get moved to other crates later
  • second, because it's lumping external/true RNGs, shims like ReadRng, the thread_rng convenience and the "standard" PRNGs all together, though those are quite different things
  • third since if we maintain sub-modules like rand::rngs::mock::StepRng this gets quite tedious to use, and there are reasons to retain at least some of these sub-modules

Another option might be to create an external or trng module for the external/true RNGs (os, jitter, entropy and maybe read).

As far as documentation goes, I'd rather just add an empty module with nothing but doc, since we can move the documentation elsewhere later anyway.

src/rngs/mod.rs Outdated
//! [`thread_rng`]: ../fn.thread_rng.html


mod entropy_rng;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_rng in the name is redundant; since you're moving anyway maybe remove the suffix?

src/rngs/mod.rs Outdated
mod reseeding;
mod small_rng;
mod std_rng;
#[cfg(feature="std")] mod thread_rng;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small_rng and std_rng are small — is it worth having a module for each instead of just putting the code here? I did consider putting the thread_rng stuff into a std_rng module once but not sure it's the best option.

Again, _rng is redundant on these... simply using thread for the mod name is a bit funny but may be okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite like putting ThreadRng and StdRng together. They are basically supposed to be the same RNG anyway, except for the location in memory (and details...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider putting the thread_rng stuff into a std_rng module once but not sure it's the best option.

On second though maybe not. Are we sure thread_rng will keep wrapping StdRng, also on things like WebAssembly and no_std?

src/rngs/mod.rs Outdated
//!
//! This module provides the three standard RNGs exposed by Rand (besides
//! specific PRNG algorithms in the `prng` module). Those are [`ThreadRng`],
//! [`StdRng`], and [`SmallRng`].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThreadRng is not really a standard generator but merely a handle... I think it would be better to say the two standard PRNGs

src/rngs/mod.rs Outdated
//!
//! [`ReadRng`] is an adapter to turn any `Read` into an RNG.
//!
//! Finally there is [`StepRng`], which is not much more than a counter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this mock::StepRng please? It emphasises that this is for testing only, and we might add more mock RNGs later. I.e. don't re-export StepRng below.

src/rngs/mod.rs Outdated
//!
//! To get a seed for those PRNGs at runtime, [`EntropyRng`], [`OsRng`] and
//! [`JitterRng`] are sources of external randomness. Those are mostly
//! implementation details for Rand, prefer to use the [`FromEntropy`] trait
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies there are no direct uses of OsRng — we've previously recommended using it directly for important keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a note.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using OsRng is also essentially what crypto libraries like ring do.

@pitdicker
Copy link
Contributor Author

It started as just an experiment, but I am starting to like it to be honest. But you have good concerns.

I also thought having both a prng and rngs module can be a bit confusing. But we decided on only exposing wrappers, not specific PRNGs. So I also just assumed this module would not live long.

  • second, because it's lumping external/true RNGs, shims like ReadRng, the thread_rng convenience and the "standard" PRNGs all together, though those are quite different things

In a way they were already together, in the top dir. I think that now having a module with documentation for them helps make the uses and differences more clear than what we had before.

But I don't mind much where ReadRng and StepRng live. ReadRng seems like something odd enough to not need a top level module. What did you have in mind for adding extra testing RNGs?

Another option might be to create an external or trng module for the external/true RNGs (os, jitter, entropy and maybe read).

Yes, a good alternative (don't like the names much though). But if we move almost everything out as you suggest the module is empty except for StdRng, SmallRng, and ThreadRng. That would lose the advantage of combining all RNG into one place.

Would it help to look at things that way? Not in how the RNGs provided are different, but that they are all implementers of RngCore. It mirrors how we bundle everything Distribution-related in one module.

@pitdicker
Copy link
Contributor Author

B.t.w. if this is acceptable I will split out the reorganisation into another PR, to make it better reviewable.

@pitdicker pitdicker mentioned this pull request May 3, 2018
@dhardy
Copy link
Member

dhardy commented May 3, 2018

B.t.w. if this is acceptable I will split out the reorganisation into another PR, to make it better reviewable.

Yes, this is a good idea. I'm still not convinced about the reorganisation though; I'll take another look (but likely I won't have time until Sunday or Monday).

@vks
Copy link
Collaborator

vks commented May 3, 2018

I kind of like the reorganization. I think it should be consistent with prng, so the module should probably be named rng. Maybe they should even be merged?

//! if rand::random() { // generates a boolean
//! println!("Heads!");
//! }
//! ```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of advertising random, because it is kind of an antipattern. I don't think it is very useful to tell how to generate "some random value" without saying what kind of random value. I would expect to at least mention that the values are sampled from a uniform distribution and that the float interval is [0, 1).

Copy link
Contributor Author

@pitdicker pitdicker May 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A week ago I would have agreed, but there seems a sizeable number of people for which it is pretty much what they need.

@pitdicker pitdicker force-pushed the documentation branch 3 times, most recently from 68953a4 to bf84acc Compare May 4, 2018 04:36
@pitdicker
Copy link
Contributor Author

Rebased on master and split the commits into sort-of logical pieces to help reviewing.

I plan to go over the prng module today and tomorrow, and may move some stuff around in the rest of the documentation to make it 'fit'.

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the re-org:

  • good job putting StdRng and SmallRng in their own modules
  • putting all the generators (OsRng, SmallRng, ThreadRng) in an rngs module kind of works, although it might still be nice to put the external generators elsewhere? Not sure.
  • ReseedingRng and ReadRng are merely adaptors, and feel a bit out of place

@vks we don't need to worry much about the consistency between rngs and prng because prng will likely disappear. My main concern is that half of rand is about "RNGs" so the name is very generic.

ThreadRng { rng: THREAD_RNG_KEY.with(|t| t.clone()) }
impl ThreadRng {
/// Get a reference to the thread-local random number.
pub fn new() -> ThreadRng {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until #404 is decided I think this constructor should be #[doc(hidden)].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to avoid having both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at least avoid to that I push through a solution without proper discussion 😄.

In this case it made code organization a little nicer: the types live in the rngs module, the two convenience functions random() and tread_rng() in src.

@vks
Copy link
Collaborator

vks commented May 4, 2018

@dhardy

My main concern is that half of rand is about "RNGs" so the name is very generic.

I like to use modules to remove common prefixes or suffixes from names: rng::Std instead of StdRng. But I guess changing the names at this point is a lot of unnecessary churn...

@pitdicker
Copy link
Contributor Author

Finished the PRNG module. Probably far from perfect.

I tried to cover the aspects of performance, quality, period, and security, which seem like the practical problems to me, without going into too much detail.

@pitdicker
Copy link
Contributor Author

@dhardy I don't think al that much of my writing or my English. If you basically agree with something but see it could use some tweaking, is it easier to do that directly with a commit?

@dhardy
Copy link
Member

dhardy commented May 7, 2018

@pitdicker okay, I'll consider tweaking. I may get enough time today to read through.

@dhardy
Copy link
Member

dhardy commented May 7, 2018

I went over the doc for the distribution module: dhardy@64bb0c4

I'll probably end up re-arranging the commits, but think this particular module is more-or-less done.

One thing I noted is that the Uniform distribution implements sampling over a space defined as a range: in a way it is unfortunate that we renamed it; hypothetically it might be interesting to define uniform sampling over a space which is not a single contiguous range (though currently I can only think of extension to tuples, which would still be contiguous ranges even if not ordered).

@vks
Copy link
Collaborator

vks commented May 7, 2018

One thing I noted is that the Uniform distribution implements sampling over a space defined as a range: in a way it is unfortunate that we renamed it; hypothetically it might be interesting to define uniform sampling over a space which is not a single contiguous range (though currently I can only think of extension to tuples, which would still be contiguous ranges even if not ordered).

I think this is fine, the uniform distribution is usually defined as a range in statistics (see Wikipedia).

@pitdicker
Copy link
Contributor Author

I went over the doc for the distribution module: dhardy/rand@64bb0c4

Thank you, look like good changes! I did split it a little more in sentences and paragraphs.

@pitdicker
Copy link
Contributor Author

I don't really like the trait name UniformImpl, what do you think of UniformSampler instead?

@dhardy
Copy link
Member

dhardy commented May 8, 2018

Those changes look good.

You're right about UniformImpl; UniformSampler sounds fine. Ideally though not all in the same PR (it might make sense to pull all the "distributions" stuff into a different PR).

@pitdicker pitdicker added the C-docs Documentation label May 8, 2018
Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read over the OsRng, JitterRng and ThreadRng changes; mostly looks good but some notes.

src/rngs/os.rs Outdated
/// especially on virtual machines, `/dev/urandom` may return data that is less
/// random.
///
/// As a countermeasure we try to do a single read from `/dev/random` in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you like short paragraphs, but this countermeasure is directly related to the above problem and should be part of the same paragraph. This is more a problem here than where the text was before.

@@ -47,10 +47,90 @@ const MEMORY_SIZE: usize = MEMORY_BLOCKS * MEMORY_BLOCKSIZE;
/// Use of `JitterRng` is recommended for initializing cryptographic PRNGs when
/// [`OsRng`] is not available.
///
/// `JitterRng` can be used without the standard library, but not conveniently.
/// You have to provide a high-precision timer, and have to carefully follow the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... not conveniently, since you must provide ... and carefully follow ...

/// # Quality testing
///
/// [`JitterRng::new()`] automatically does its own quality testing. But before
/// using `JitterRng` on untested hardware, after changes that could effect how
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[JitterRng::new()] has built-in quality-of-entropy tests, however before using JitterRng on untested hardware, ..., or after ... (such as a new LLVM version), ...

/// [NIST SP 800-90B Entropy Estimation Suite](
/// https://github.com/usnistgov/SP800-90B_EntropyAssessment).
///
/// Use the following code using [`timer_stats`] to collect the data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including the quality-testing warning in this documentation is fine, but the code and instructions is a bit much IMO. Have you considered putting this program in examples/ to make it simpler to run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is a good idea. Sounds like a bit more work than I'd like to put in it at the moment t.b.h. A few things I don't yet have the answer to: Do examples have a way to not run them when testing? Where to keep the instructions, in the JitterRng documentation in in the source of the 'example'? How do we make clear to users that this is not like the other examples (to not scare them away 😄)? Maybe by adding some readme to the examples dir?

@pitdicker pitdicker force-pushed the documentation branch 2 times, most recently from 59e9ae2 to 2d6bafb Compare May 15, 2018 09:11
@nagisa
Copy link
Contributor

nagisa commented May 15, 2018 via email

@dhardy
Copy link
Member

dhardy commented May 15, 2018

because it can be measured without worrying about noise

Really? Is there a way of measuring it other than running a benchmark then dividing by the frequency?

The rest of what you say makes sense. I'm less sure about choosing best/worst; I was wondering about benchmarking next_x and fill_bytes separately.

@pitdicker
Copy link
Contributor Author

We wrote in the documentation that the performance depends on a lot of factors, among which the surrounding code. Then continue to write that is a reason to not include exact numbers, and that users should benchmark a few different RNGs in their use case.

So I'd like to keep things slightly vague, and give just enough of an indication to see whether an RNG can be expected to be clearly faster than another one.

next_u64 seems like a good function to me to use for rough comparison. 64 bits is what you need for f64 and on many systems usize. It can highlight the difference between 32- en 64-bit RNGs, which next_u32 often can't.

src/prng/mod.rs Outdated
//!
//! Normal PRNGs often use very little memory, commonly only a few words, where
//! Simple PRNGs often use very little memory, commonly only a few words, where
//! a *word* is usually either `u32` or `u64`. This is not universal however,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay — since this starts about "simple PRNGs", change "This is not universal" to "This is not true for all non cryptographic PRNGs"

Or don't use Simple. I don't know, but doesn't seem like quite a lot of non-crypto PRNGs are simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't you want to use 'simple' here? #423 (comment)

@dhardy
Copy link
Member

dhardy commented May 15, 2018

Yes, what you've done with stars sounds good enough for now. Actual benchmark numbers would be more informative and separate next_u64 and fill_bytes would be good, but this will do for now.

@vks
Copy link
Collaborator

vks commented May 15, 2018

perf can measure cycles on Linux, but I'm not sure whether it does that by multiplying time with frequency.

Benchmarking next_* instead of fill_bytes is probably not feasible for block RNGs, were the performance of next_* is extremely dependent on whether the next block has to be computed.

@pitdicker
Copy link
Contributor Author

Benchmarking next_* instead of fill_bytes is probably not feasible for block RNGs, were the performance of next_* is extremely dependent on whether the next block has to be computed.

The current benchmarks do so just fine, by doing it a 1000 times.

@nagisa
Copy link
Contributor

nagisa commented May 15, 2018

Really? Is there a way of measuring it other than running a benchmark then dividing by the frequency?

On modern x86s hardware counters exist for cycles and retired instructions. Those are fairly accurate and noise-resistant as they only measure cycles and instructions for the process being measured. As mentioned before they can be read on linux with perf-stat.

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good now, but a few comments still.

src/prng/mod.rs Outdated
//! dependence on the CPU architecture as well as the impact of the size of
//! data requested. Because of all this, we do not include performance numbers
//! here but merely a qualitative description.
//! here but merely a qualitative rating (which is not comparable between PRNGs
//! and CSPRNGs).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance is now comparable, so you can remove this comment.

src/prng/mod.rs Outdated
//! dependence on the CPU architecture as well as the impact of the size of
//! data requested. Because of all this, we do not include performance numbers
//! here but merely a qualitative rating (which is not comparable between PRNGs
//! and CSPRNGs).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ratings are now comparable

src/prng/mod.rs Outdated
//!
//! ### Worst-case performance
//! Because CSPRNGs usually produce a block of values into a cache, they have
//! poor worst case performance (in contrast to regular PRNGs).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regular PRNGs, which are usually constant-time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really want to use the term "constant-time" here, as it has special meaning in cryptography. But I'll try something.

src/prng/mod.rs Outdated
//! Because CSPRNGs usually produce a block of values into a cache, they have
//! poor worst case performance (in contrast to regular PRNGs).
//!
//! Simple PRNGs often use very little memory, commonly only a few words, where
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think another title is needed (memory usage) so that this doesn't come under "worst case performance"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, lost a line. Good catches!

@pitdicker
Copy link
Contributor Author

Ready to merge?

@dhardy dhardy merged commit 6290e99 into rust-random:master May 15, 2018
@dhardy
Copy link
Member

dhardy commented May 15, 2018

Yes, thank you!

@pitdicker pitdicker deleted the documentation branch May 15, 2018 11:15
@pitdicker
Copy link
Contributor Author

You too, you did half of the writing 😄.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-docs Documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants