Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace fastrand with getrandom and base64 #188

Closed
wants to merge 3 commits into from

Conversation

vks
Copy link

@vks vks commented Aug 16, 2022

  • Instead of setting up a predictable userspace RNG, we get unpredictable random bytes directly from the OS. This fixes Use of predictable RNG #178.
  • To obtain a uniformly distributed alphanumeric string, we convert the random bytes to base64 and throw away any letters we don't want (+ and /). With a low probability, this may result in obtaining too few alphanumeric letters, in which case we request more randomness from the OS until we have enough.
  • Because we cannot control the seed anymore, a test manufacturing collisions by setting the same seed for several threads had to be removed.

- Instead of setting up a predictable userspace RNG, we get
  unpredictable random bytes directly from the OS. This fixes Stebalien#178.
- To obtain a uniformly distributed alphanumeric string, we convert the
  the random bytes to base64 and throw away any letters we don't want
  (`+` and `/`). With a low probability, this may result in obtaining
  too few alphanumeric letters, in which case we request more randomness
  from the OS until we have enough.
- Because we cannot control the seed anymore, a test manufacturing
  collisions by setting the same seed for several threads had to be removed.
@tarcieri
Copy link

Wouldn't using the URL-safe Base64 alphabet avoid the issues with + and /? (using - and _ instead, which are both valid in filenames)

@Stebalien
Copy link
Owner

  • I'd prefer not to introduce a dependency on base64. We can achieve the same result by taking each byte mod 64, throwing out 62 and 63, then mapping it to an alphabet. That will give us uniform randomness.
  • We should just read more randomness than we need (e.g., 256 bytes) and store it in thread-local storage.
  • Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).
  • We should absolutely not panic (e.g., unwrap the result from getrandom), under any circumstances.

@Stebalien
Copy link
Owner

Hm. It also looks like we have a gap in our CI (#189). We should at least build on WASM (without WASI), even if we can't actually create temporary files.

@Stebalien
Copy link
Owner

Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).

I guess the question here is: are we worse off? I haven't taken a look at rand or fastrand, they may have the same early-boot issue.

I'd prefer not to introduce a dependency on base64.

Basically, people tend to look at dependencies with skepticism. It's a small dependency, but it's still yet another thing to audit (and having a temporary file library depend on it is a bit strange).

@5225225
Copy link

5225225 commented Aug 17, 2022

I haven't taken a look at rand or fastrand, they may have the same early-boot issue.

Fastrand seeds its RNG initially with the thread id and Instant::now. It does not use any other forms of entropy.

Stracing rand generating a i32 with just rand::generate::<i32>(), it calls

getrandom(NULL, 0, GRND_NONBLOCK)     = 0
getrandom("\x73[...]\x41\xb1", 32, 0) = 32

I can't quite figure out where the call to NONBLOCK is coming from. Checking the docs, rand does seem problematic to use in early boot.

It is possible that when used during early boot the first call to OsRng will block until the system’s RNG is initialised. It is also possible (though highly unlikely) for OsRng to fail on some platforms, most likely due to system mis-configuration.

(https://docs.rs/rand/0.8.5/rand/rngs/struct.OsRng.html)

And OsRng is what rand's default RNG uses (as a base source of entropy)

In any case, if we can't get entropy at all, or if getrandom fails, then we'd need to fallback to... Current time and thread id, I suppose? So, keep on using fastrand, or roll our own generation.


#[cfg(unix)]
#[test]
fn test_make_uds_conflict() {
Copy link

@5225225 5225225 Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can presumably still exist, but less deterministically, by setting the rand_bytes to 1, then trying to generate 62 files. (Which will require that some of the threads retry unless they all get incredibly lucky, which won't happen).

I'd also use a https://doc.rust-lang.org/std/sync/struct.Barrier.html inside the thread but before the generation, to ensure they all wait until everyone's ready, so that if thread spawning is slower than file generation, it's not just sequential.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is possible, but the number of retries will no longer be deterministic.

@vks
Copy link
Author

vks commented Aug 17, 2022

@Stebalien Would you be ok with a base64 alphabet (e.g. by adding - and _) to simplify the implementation?

I can get rid of the base64 dependency, but this would essentially require vendoring a base64 encoder (or using the random bytes inefficiently).

@vks
Copy link
Author

vks commented Aug 17, 2022

Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).

Couldn't such a special application generate its own random file name?

@Stebalien
Copy link
Owner

Stebalien commented Aug 17, 2022

Would you be ok with a base64 alphabet (e.g. by adding - and _) to simplify the implementation?

I want to stick to alpha-numeric (least surprise).

Couldn't such a special application generate its own random file name?

Not really. I really don't want to limit the use-cases of this library.


But this is getting significantly more complicated than I had anticipated. It feels like we're basically re-inventing the rand crate.

I have limited time for the next few weeks, but I'm going to think about this a bit and come back to you.

@Stebalien
Copy link
Owner

Replaced by #314

@Stebalien Stebalien closed this Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use of predictable RNG
4 participants