SeedableRng::Seed and from_hashable #62

dhardy · 2017-11-24T16:51:14Z

Edit: PR no longer shows the code thanks to our rebasing of Rand history, but the commit is still available here.

Implement seeding via hash function, as discussed in #18.

This is a very rough implementation.

To discuss, regarding the hash function:

- SeaHash is exposed for general usage, simply because it can be. (If we're going to do this, we must stabilise it, so there's no worry that it will be replaced.) This is a little odd, but seems sensible on the whole.
- We could hide SeaHash and from_hashable behind a feature gate (hash or seahash) until we're more comfortable with the idea. In theory, that allows stabilising rand-core without this feature. But hopefully we don't need to do this.
- More AsBytesFixed impls are needed, e.g. for u64.
- I'm unsure that AsBytesFixed is the best way of going about this. I'm considering moving the hashing into the trait implementation with `trait Hash<H> { fn hash(state: &mut H, val: Self); }` or similar. This allows specialised implementations of SeaHash.
- How general do we make our SeaHash impl? I mean, do we allow pushing multiple items to it? This is functionality we don't need. It requires very little extra code and possibly costs a small performance hit, but it does make the SeaHash impl we expose significantly more useful to other users. (We could also consider whether from_hashable should be more general, e.g. consume a SeaHash struct.)
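To make the `Hash<H>` idea in the list above concrete, here is a minimal sketch of a trait where each type feeds itself into the hasher state, which also allows pushing several items. Everything in it is hypothetical: `SeedHasher`, `SeedHash` and `Fnv1a` are illustrative names, and FNV-1a stands in for SeaHash only because it fits in a few lines. It is not the code from this PR.

```rust
/// Hypothetical hasher state: accepts bytes, yields a 64-bit result.
pub trait SeedHasher {
    fn write(&mut self, bytes: &[u8]);
    fn finish(&self) -> u64;
}

/// Hypothetical replacement for `AsBytesFixed`: each type decides how it
/// feeds itself into the hasher, so an impl may specialise per type.
pub trait SeedHash<H: SeedHasher> {
    fn hash(state: &mut H, val: Self);
}

/// FNV-1a (64-bit), a stand-in for SeaHash in this sketch.
pub struct Fnv1a(u64);

impl Fnv1a {
    pub fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV offset basis
    }
}

impl SeedHasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

impl<H: SeedHasher> SeedHash<H> for u64 {
    fn hash(state: &mut H, val: u64) {
        // Fixed little-endian encoding keeps the result platform-independent.
        state.write(&val.to_le_bytes());
    }
}

impl<'a, H: SeedHasher> SeedHash<H> for &'a str {
    fn hash(state: &mut H, val: &'a str) {
        state.write(val.as_bytes());
    }
}
```

With this shape, a caller can push more than one item before finishing, e.g. `SeedHash::hash(&mut h, 7u64)` followed by `SeedHash::hash(&mut h, "label")`, and only then call `h.finish()`.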

Regarding SeedableRng:

- Making Seed an associated type is a compromise; a SeedSize associated constant doesn't appear to work. We do, however, restrict Seed to types supported by Finalize.
- Making Seed a u64 array instead of a u8 array is certainly more convenient for us; any reason we shouldn't do this?
- The new SeedableRng::from_seed impls for Isaac aren't 100% backwards compatible (the old version allowed much larger seeds).
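As a rough illustration of the associated-type approach in the list above, the sketch below restricts `Seed` to a few byte-array sizes through a sealed marker trait. The name `SeedRestriction` appears in this PR's commit list, but the definitions here, the chosen sizes, and `ToyRng` (a plain xorshift64) are assumptions made for the example, not the PR's code.

```rust
mod private {
    /// Sealed: only this module can implement it, so other crates
    /// cannot add new seed types.
    pub trait Sealed {}
    impl Sealed for [u8; 8] {}
    impl Sealed for [u8; 16] {}
    impl Sealed for [u8; 32] {}
}

/// Marker for the seed types we accept (sizes chosen for illustration).
pub trait SeedRestriction: private::Sealed + Default + AsMut<[u8]> {}
impl SeedRestriction for [u8; 8] {}
impl SeedRestriction for [u8; 16] {}
impl SeedRestriction for [u8; 32] {}

pub trait SeedableRng: Sized {
    /// Each generator names its own seed type.
    type Seed: SeedRestriction;
    fn from_seed(seed: Self::Seed) -> Self;
}

/// Toy generator (xorshift64), here only to show an impl.
pub struct ToyRng {
    state: u64,
}

impl SeedableRng for ToyRng {
    type Seed = [u8; 8];

    fn from_seed(seed: [u8; 8]) -> Self {
        // Little-endian decoding keeps results identical across platforms.
        let s = u64::from_le_bytes(seed);
        // xorshift must not start from an all-zero state.
        ToyRng { state: if s == 0 { 0x9E37_79B9_7F4A_7C15 } else { s } }
    }
}

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}
```

Generic code can build a seed with `<R as SeedableRng>::Seed::default()` and fill it through `as_mut()`, without knowing its length at the call site.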

Lots to discuss, but I think this is moving forward. (@ticki if you get the time, I'd love your thoughts.)

Also removed SeedableRng impl for IsaacWordRng: it's not reproducible across platforms

These changes make it possible to sample from closed ranges, not only from open. Included is a small optimisation for the modulus operator, and an optimisation for the types i8/u8 and i16/u16.
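For readers unfamiliar with closed-range sampling, here is one common way to do it without bias, using a widening multiply plus rejection. This is a sketch under assumptions, not the code from the commit described above (which mentions a modulus-based optimisation); `sample_inclusive` and the `SplitMix64` helper are illustrative names.

```rust
/// Toy generator (SplitMix64), here only to drive the example.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        ((z ^ (z >> 31)) >> 32) as u32
    }
}

/// Sample uniformly from the closed range `[low, high]`.
pub fn sample_inclusive(rng: &mut SplitMix64, low: u32, high: u32) -> u32 {
    assert!(low <= high);
    // Number of values in the range; wraps to 0 when the range is all of u32.
    let range = high.wrapping_sub(low).wrapping_add(1);
    if range == 0 {
        return rng.next_u32();
    }
    // Accept only low halves in [0, zone], an interval whose length is the
    // largest multiple of `range` that fits in 2^32. This removes the bias.
    let zone = u32::MAX - (u32::MAX - range + 1) % range;
    loop {
        let v = rng.next_u32();
        // Widening multiply: the high 32 bits land in [0, range).
        let m = u64::from(v) * u64::from(range);
        let (hi, lo) = ((m >> 32) as u32, m as u32);
        if lo <= zone {
            return low.wrapping_add(hi);
        }
    }
}
```

An open range `[low, high)` with `low < high` is then just `sample_inclusive(rng, low, high - 1)`, while the closed form can also express the full `0..=u32::MAX` range, which an open range cannot.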

The license declaration in the README is non-specific. I think this is a hold-over from extraction from the Rust repo. The Rust repo has a file that details the other licenses involved. I scanned through this code and most of it has a rust standard mit/apache header. Some files have no header, and could be under BSD, but if that's the case, that specific license text needs to be added somewhere to this repo.

Update README.md license section

fuchsia: magenta was renamed zircon

Bump to 0.3.17

Allow sampling from a closed integer range

No reseeding

This makes documentation work correctly with the new pulldown-cmark Markdown parser (rust-lang/rust#44229).

Fix formatting warnings with commonmark enabled

I have implemented it as a function instead of a trait, as that makes it easy to add it to every RNG one at a time. I split the `init` function in two, instead of the current version that uses a bool to select between two paths. This makes it clearer how the seed is used. The current `mix` macro has to be defined in the function, and would have to be duplicated, so I converted it to a separate function. I precalculated the values a...h, but am not sure this is a good idea. It makes the resulting code smaller, and gives a small performance win. Because they are 'magic' values anyway, I thought why not?
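A sketch of what "mix as a separate function" and "precalculated a...h" could look like for 32-bit ISAAC. The shift amounts and the golden-ratio starting value follow the reference ISAAC code as I recall it; the function names and the array-based signature are assumptions, not the code from the commit.

```rust
/// One round of the ISAAC-32 mixing step, as a function over eight words
/// instead of a macro.
fn mix(s: &mut [u32; 8]) {
    let [mut a, mut b, mut c, mut d, mut e, mut f, mut g, mut h] = *s;
    a ^= b << 11; d = d.wrapping_add(a); b = b.wrapping_add(c);
    b ^= c >> 2;  e = e.wrapping_add(b); c = c.wrapping_add(d);
    c ^= d << 8;  f = f.wrapping_add(c); d = d.wrapping_add(e);
    d ^= e >> 16; g = g.wrapping_add(d); e = e.wrapping_add(f);
    e ^= f << 10; h = h.wrapping_add(e); f = f.wrapping_add(g);
    f ^= g >> 4;  a = a.wrapping_add(f); g = g.wrapping_add(h);
    g ^= h << 8;  b = b.wrapping_add(g); h = h.wrapping_add(a);
    h ^= a >> 9;  c = c.wrapping_add(h); a = a.wrapping_add(b);
    *s = [a, b, c, d, e, f, g, h];
}

/// Starting value for all eight words (the golden ratio constant).
const GOLDEN_RATIO: u32 = 0x9e37_79b9;

/// The values a...h after the four warm-up rounds. They never depend on
/// the seed, so their result can be hard-coded instead of recomputed.
fn initial_words() -> [u32; 8] {
    let mut s = [GOLDEN_RATIO; 8];
    for _ in 0..4 {
        mix(&mut s);
    }
    s
}
```

Precalculating then means replacing the call to `initial_words()` with a constant array holding its output, which removes four `mix` rounds from every initialisation.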

Also moved the `impl_uint_from_fill` macro from `os.rs` to `randcore`. I had to modify its error handling anyway, and it is shared with `OsRng`.

dhardy · 2017-12-20T14:45:11Z

I found a way to move from_hashable out of rand-core, which I think is preferable for most people. It does unfortunately mean users now have to write `use rand::FromHashable;`, but I guess we can live with that.
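One way such a split can work is an extension trait with a blanket impl, which is why the extra `use` becomes necessary. The sketch below is an assumption about the general shape, not the PR's code: `SeedableRng` is redefined locally in simplified form, FNV-1a stands in for SeaHash, and the counter-based seed expansion and `ToyRng` are invented for the example.

```rust
use std::hash::{Hash, Hasher};

/// Stand-in for the core trait; note that it knows nothing about hashing.
pub trait SeedableRng: Sized {
    type Seed: Default + AsMut<[u8]>;
    fn from_seed(seed: Self::Seed) -> Self;
}

/// FNV-1a (64-bit), standing in for SeaHash to keep the sketch short.
struct Fnv1a(u64);

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical extension trait, defined outside the core crate.
pub trait FromHashable: SeedableRng {
    fn from_hashable<T: Hash>(value: T) -> Self {
        let mut hasher = Fnv1a(0xcbf2_9ce4_8422_2325);
        // Caveat: std's `Hash` impls for integers feed native-endian bytes,
        // so results from this sketch may differ across platforms.
        value.hash(&mut hasher);
        let mut seed: Self::Seed = Default::default();
        // Fill the seed 8 bytes at a time, mixing in a counter per chunk.
        for (i, chunk) in seed.as_mut().chunks_mut(8).enumerate() {
            hasher.write(&(i as u64).to_le_bytes());
            let out = hasher.finish().to_le_bytes();
            chunk.copy_from_slice(&out[..chunk.len()]);
        }
        Self::from_seed(seed)
    }
}

/// Blanket impl: every seedable generator gets `from_hashable` for free.
impl<R: SeedableRng> FromHashable for R {}

/// Toy type that just stores its seed, to show the call site.
pub struct ToyRng {
    pub seed: [u8; 16],
}

impl SeedableRng for ToyRng {
    type Seed = [u8; 16];
    fn from_seed(seed: [u8; 16]) -> Self {
        ToyRng { seed }
    }
}
```

A call such as `ToyRng::from_hashable("my experiment")` only resolves when `FromHashable` is in scope, which is the cost mentioned above.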

Add HC-128 RNG

And a little cleanup around the init functions

Replace `convert_slice_{32,64}` with `read_u{32,64}_into`

Restrict the seed type to a few more array sizes

Make u128 range use widening multiply

Also minor doc fix

dhardy · 2017-12-31T13:18:04Z

Updated: rebased and squashed. I still have the old history locally, but don't see much value.

pitdicker · 2017-12-31T14:07:06Z

High time to get this merged 😄. Moving from_hashable out of rand-core seems like a good idea!

How confident are you in the custom finalizers? And is there anything specific I should look at?

I would like it if the RNG tests would only depend on things in rand-core. Especially if they are to be split out into another crate. Does it make sense to add a test_stdng_construction test in rand instead?

dhardy · 2017-12-31T14:57:53Z

Actually, I think I will replace SeaHash with MetroHash; it's better known and reviewed and has similar performance, as well as native 128-bit output and 256 bits of state (IIRC), and a permutation function which should make it relatively easy to output any amount of state. But I haven't got that done yet.

pitdicker · 2017-12-31T19:00:21Z

If we go with 128-bit hashes, there is a lot more choice. MetroHash, CityHash, FarmHash, MurmurHash, SpookyHash, maybe more.
I think that it is good to look at the performance, but that simplicity is more important. Maybe I am wrong, but it seems to me MetroHash, CityHash and FarmHash (and maybe xxHash) are part of a series of hashes that come and go. It would be nice if we could pick something that does not seem badly outdated in ~5 years. I have a slight preference for MurmurHash3 there.

This repo contains a list of hashes with benchmarks: https://github.com/rurban/smhasher

128-bit MetroHash and MurmurHash3 seem about evenly matched for small inputs. I wonder what their performance on x86 is.

dhardy · 2018-01-01T10:37:45Z

That repo benchmarks hash functions by throughput on reasonably large data sizes. We don't care much about that. It also doesn't say much about security (other than some stuff about hash tables which I don't even think is correct).

I agree that simplicity is fairly important; it seems most hash functions' innards aren't that complex, but the input/output functions can get complex in order to handle multiple input sizes, optimise special cases, and provide some other functionality.

Hmm, this conversation is split between two threads?

pitdicker · 2018-01-01T11:06:28Z

> That repo benchmarks hash functions by throughput on reasonably large data sizes.

If you click on the name of the hash, there are also the results for hashing 1, 2, 3, 4, etc. bytes. And the last column is useful for us, because that shows if it is statistically ok. We don't really have security considerations here, right?

> Hmm, this conversation is split between two threads?

Sorry 😄 Seemed to fit there. Link for others that may read this: #18 (comment)

dhardy and others added 30 commits September 13, 2017 09:15

Fix OsRng for other platforms (hopefully)

2cb9acd

Merge remote-tracking branch 'pitdicker/isaac-rewrite'

37faeb5

Also removed SeedableRng impl for IsaacWordRng: it's not reproducible across platforms

rand_core: improve a few comments/doc

da1ff46

rand_core: update Cargo.toml

67b22bd

rand_core: add impl<R: Rng> Rng for &mut R

ec70309

Allow sampling from a closed integer range

d8b8474

These changes make it possible to sample from closed ranges, not only from open. Included is a small optimisation for the modulus operator, and an optimisation for the types i8/u8 and i16/u16.

Remove range.rs

7edd06b

Replace range with range2

96503f7

Remove reseed from SeedableRng

86de12b

fuchsia: magenta was renamed zircon

6024981

Merge pull request rust-random#172 from raggi/master

ace47cf

Update README.md license section

Merge pull request rust-random#173 from raggi/zircon

7b0c29f

fuchsia: magenta was renamed zircon

Bump to 0.3.17

611d55b

Merge pull request rust-random#174 from raggi/0.3.17

cbfb7cf

Bump to 0.3.17

Merge pull request #2 from pitdicker/range_int

97ab178

Allow sampling from a closed integer range

Merge pull request #6 from pitdicker/no_reseeding

dfdf89c

No reseeding

Fix formatting warnings with commonmark enabled

9d44ab6

This makes documentation work correctly with the new pulldown-cmark Markdown parser (rust-lang/rust#44229).

Merge pull request rust-random#178 from mbrubeck/doc

6fd1009

Fix formatting warnings with commonmark enabled

Fix FlatMap::size_hint (thanks @bluss)

e3784ab

Merge remote-tracking branch 'origin/master'

250aa65

Fix i128_support in range.rs

496c0fc

Add initialization benchmarks

a6528a3

Clean up ISAAC tests

08ac750

Remove range2 from benchmarks

2e714f2

Add Error type and ErrorKind

849f01a

Add back fill_bytes

dbbe143

Improve error handeling of ReadRng

019d9c1

Also moved the `impl_uint_from_fill` macro from `os.rs` to `randcore`. I had to modify its error handling anyway, and it is shared with `OsRng`.

Remove default implementation from try_fill

44ef65f

dhardy added 2 commits December 19, 2017 15:12

Merge: reorder use/mod declarations

f58b2f1

Fix comment

4e088f1

dhardy and others added 13 commits December 20, 2017 15:14

Remove log dependency; make i128_support transitive

729644c

Merge pull request #74 from pitdicker/hc-128

eaeee11

Add HC-128 RNG

Restrict Seed type

4f1deb7

Replace convert_slice_{32,64} with read_u{32,64}_into

92831f3

And a little cleanup around the init functions

Fold from_rng into SeedableRng

cc6da7a

Merge pull request #77 from pitdicker/blockrng

5d9acdc

Replace `convert_slice_{32,64}` with `read_u{32,64}_into`

Fix sealed implementation of SeedRestriction to prevent extension

e9359ac

Make u128 range use widening multiply

945e8ff

Restrict the seed type to a few more array sizes

4ea098b

Merge pull request #80 from pitdicker/seed_sizes

45ace88

Restrict the seed type to a few more array sizes

Merge pull request #79 from pitdicker/range_128

1f9ce3a

Make u128 range use widening multiply

Add rand_core::le::test_read unit test

8990da2

Also minor doc fix

Add from_hashable with SeaHash implementation

0195188

dhardy force-pushed the seedable branch from ed88cf4 to 0195188 Compare December 31, 2017 13:17

dhardy added the not ready label Jan 2, 2018

dhardy mentioned this pull request Jan 12, 2018

Tracker: planned changes for 0.5 rust-random/rand#232

Closed

33 tasks

dhardy mentioned this pull request Mar 4, 2018

Seed RNGs with a uniform value rust-random/rand#80

Closed

dhardy closed this May 30, 2018

dhardy force-pushed the master branch from 64f7c88 to c4d1446 Compare May 30, 2018 16:00

dhardy mentioned this pull request Jun 21, 2018

Convenient PRNG construction: seed_from_u64 / from_hashable rust-random/rand#522

Closed

dhardy mentioned this pull request Jul 12, 2018

Add finish_buf output function for 128-bit hasher jedisct1/rust-siphash#8

Closed
