Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an iterator to Distribution #361

Merged
merged 3 commits into from
Apr 1, 2018
Merged

Conversation

pitdicker
Copy link
Contributor

This tries to solve the same problem as #275, having a nice way to sample from distributions using an iterator. See also #58.

I have added a sample_iter method to Distribution for now. But I don't mind if it is a method on Rng instead, or on both...

With benchmarks similar to those in #275, the results are all over the place. In all situations it is slightly slower than iter::repeat, but usually not by more than a couple of percent. The exception is Uniform, where we use that values generated by the RNG directly:

test gen_1k_fill                   ... bench:       1,129 ns/iter (+/- 1) = 906 MB/s
test gen_1k_gen_iter               ... bench:         729 ns/iter (+/- 1) = 1404 MB/s
test gen_1k_iter_repeat            ... bench:         372 ns/iter (+/- 0) = 2752 MB/s
test gen_1k_sample_iter            ... bench:         626 ns/iter (+/- 1) = 1635 MB/s <- new

Example of using Distribution::sample_iter instead of Rng::gen_iter:

use distributions::{Alphanumeric, Range, Uniform};
let rng = thread_rng();

// Vec of 16 x f32:
// old
let v: Vec<f32> = iter::repeat(()).map(|()| rng.gen()).take(16).collect();
// new
let v: Vec<f32> = Uniform.sample_iter(&mut rng).take(16).collect();

// String
// old
let s: String = rng.gen_ascii_chars().take(10).collect();
// new
let s: String = Alphanumeric.sample_iter(&mut rng).take(7).collect();

// Combined values
// old
println!("{:?}", rng.gen_iter::<(f64, bool)>().take(5)
                    .collect::<Vec<(f64, bool)>>());
// new
println!("{:?}", Uniform.sample_iter(&mut rng).take(5)
                        .collect::<Vec<(f64, bool)>>());

// Dice-rolling:
// old
let die_range = Range::new_inclusive(1, 6);
let mut roll_die = iter::repeat(()).map(|()| rng.sample(die_range));
while roll_die.next().unwrap() != 6 {
    println!("Not a 6; rolling again!");
}
// new
let die_range = Range::new_inclusive(1, 6);
let mut roll_die = die_range.sample_iter(&mut rng);
while roll_die.next().unwrap() != 6 {
    println!("Not a 6; rolling again!");
}

@pitdicker
Copy link
Contributor Author

I also added sample_iter to Rng, and it is slightly nicer to use in most cases:

use rand::{thread_rng, Rng};
use rand::distributions::{Alphanumeric, Range, Uniform};

let mut rng = thread_rng();

// Vec of 16 x f32:
let v: Vec<f32> = thread_rng().sample_iter(&Uniform).take(16).collect();

// String:
let s: String = rng.sample_iter(&Alphanumeric).take(7).collect();

// Combined values
println!("{:?}", thread_rng().sample_iter(&Uniform).take(5)
                             .collect::<Vec<(f64, bool)>>());

// Dice-rolling:
let die_range = Range::new_inclusive(1, 6);
let mut roll_die = rng.sample_iter(&die_range);
while roll_die.next().unwrap() != 6 {
    println!("Not a 6; rolling again!");
}

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking nice — perhaps it is worth having both methods!

/// let mut rng = thread_rng();
///
/// // Vec of 16 x f32:
/// let v: Vec<f32> = thread_rng().sample_iter(&Uniform).take(16).collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, but lets not promote bad styles in examples by creating a local rng handle then not using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to show that it was also possible to use thread_rng() directly. But yes, bad style. Will change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most people will figure that out.

/// }
/// ```
fn sample_iter<'a, T, D: Distribution<T>>(&'a mut self, distr: &'a D)
-> distributions::DistIter<'a, D, Self, T> where Self: Sized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that this takes the distribution by reference while sample takes it by value. I'm not saying this is wrong. If we have both, then distr.iter(&mut rng) is still an option when passing by reference is required, so this could take the distribution by value (which would consume die_range above but only if the & is removed — we also implement Distribution for the references).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we can have both, because the iterator expects an reference instead of something owned. Or do we want to use some sort of Cow or MaybeOwned trick?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will work if you just change to distr: D here; essentially Distribution::iter receives a reference from this function stack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error[E0597]: `distr` does not live long enough
   --> src/lib.rs:414:9
    |
414 |         distr.sample_iter(self)
    |         ^^^^^ borrowed value does not live long enough
415 |     }
    |     - borrowed value only lives until here
    |
note: borrowed value must be valid for the lifetime 'a as defined on the method body at 411:5...
   --> src/lib.rs:411:5
    |
411 | /     fn sample_iter<'a, T, D: Distribution<T>>(&'a mut self, distr: D)
412 | |         -> distributions::DistIter<'a, D, Self, T> where Self: Sized
413 | |     {
414 | |         distr.sample_iter(self)
415 | |     }
    | |_____^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0597`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, of course the iterator is returned from sample_iter!

@pitdicker
Copy link
Contributor Author

Do you want to do the honours of merging today (and I the rebasing...)?

@dhardy dhardy merged commit 8c099c3 into rust-random:master Apr 1, 2018
@pitdicker pitdicker deleted the sample_iter branch April 1, 2018 17:28
pitdicker pushed a commit that referenced this pull request Apr 4, 2018
Add an iterator to `Distribution`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants