Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of predictable RNG #178

Open
vks opened this issue Apr 30, 2022 · 13 comments · May be fixed by #314
Open

Use of predictable RNG #178

vks opened this issue Apr 30, 2022 · 13 comments · May be fixed by #314

Comments

@vks
Copy link

vks commented Apr 30, 2022

This library recently switched to non-CSPRNGs (see #141 and #162), which are inherently predictable based on observations of their output. Doesn't this make the library vulnerable to attacks listed here, because the generated filenames can be predicted?

@Stebalien
Copy link
Owner

Stebalien commented May 2, 2022

This temporary file library is equivalent to mkstemp. The relevant part of the document is:

However, mkstemp() still suffers from the use of predictable file names and can leave an application vulnerable to denial of service attacks if an attacker causes mkstemp() to fail by predicting and pre-creating the filenames to be used.

This is one of those cases where you need to understand the threat-model. There are many theoretical attacks that simply aren't worth the effort to guard against:

  1. It relies on being able to observer the creation of many temporary files (to be able to predict future ones).
  2. It relies on being able to create arbitrarily named temporary files in the same directory as the target application. If you can do this, you can likely exhaust inodes, use up all disk space, etc.
  3. It is, at worst, a DoS vector. It can't be used to actually trick the target application into doing anything.

Basically, if an attacker is in a place where they can exploit this issue, you likely have bigger issues.

Also note, python, golang, and C all behave the same way (use non-cryptographic randomness for performance).


If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

edited: My original comment made statements about security "experts" which could have been taken as a comment about security researchers in general. This was not my intention, nor was it relevant.

@vks
Copy link
Author

vks commented May 2, 2022

I disagree with 1., "many" can be three, depending on the RNG.

I agree that 2. is a problem anyway, and that ideally the temporary files should not share a namespace with other applications, but temporary files are often used like this.

Non-cryptographic randomness is not a performance advantage, CSPRNGs have comparable speed, because they are optimized for SIMD while the non-CSPRNGs are not. Non-CSPRNGs are simpler, not necessarily faster.

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

@Stebalien
Copy link
Owner

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

This is the main reason this crate exists. It handles all the platform quirks around O_CREAT and O_EXCL, and does not depend on randomness for security. This library will never re-use an existing file when creating a temporary file (and will use O_EXCL where possible). In practice:

  1. On Macos and Windows, the temporary file directory is always per-user.
  2. On Linux, O_TMPFILE will be used when creating unnamed temporary files.

The only thing affected would be named temporary files/directories on Linux where the user is using the shared /tmp directory directly.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

Complexity and dependency weight. Performance isn't really an issue given that the syscall cost to create a file will likely dominate in most cases.

@pinkforest
Copy link

pinkforest commented Aug 14, 2022

Can this crate be used in production where the filenames must be unpredictable ?

If the crate does not guarantee predictability then it perhaps should simply be documented as such.

This crate is advertising itself as:

A secure, cross-platform, temporary file library for Rust.

Thanks

@5225225
Copy link

5225225 commented Aug 14, 2022

(got sent here from the advisory-db link)

If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

In my testing, on linux at least, getrandom is fast enough to the point where it's reasonable to use it directly for entropy (a few million [u8; 32]'s per second), so the whole RNG could maybe be removed. Re-seeding / only using entropy directly as a fallback would also be fine.

Especially since you're doing a syscall and probably file IO afterwards, the perf of getting entropy is likely fine.

@Stebalien
Copy link
Owner

Can this crate be used in production where the filenames must be unpredictable ?

Yes, it absolutely can. Same as python and go which use the same randomness function for temporary files.

Worst-case, there's a DoS vector. But if you have some untrusted application that can create many files with chosen names in a shared temporary directory of temporary files, you this is the least of your concerns.

Security is all about understanding your threat model and making trade-offs (performance, complexity, etc.).

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

A call to getrandom would be fine given, as you say, we're already performing a syscall. I'm happy to accept a patch to do so (but it's not really a priority).

@pinkforest
Copy link

pinkforest commented Aug 14, 2022

Cool thanks for clarifying -

We are pretty practical at RustSec and we have recommended your crate over others -

We recently had to file advisory on temporary and previously tempdir and your crate popped up so I wanted to clarify this.

https://rustsec.org/advisories/RUSTSEC-2018-0022.html
https://rustsec.org/advisories/RUSTSEC-2018-0017.html

Thanks for working on a crate that focuses on security and portability 🦄

@vks
Copy link
Author

vks commented Aug 16, 2022

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

@Stebalien
Copy link
Owner

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

  1. Define a dictionary (A-Za-z0-9).
  2. Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

@Stebalien
Copy link
Owner

The best solution is probably to read, maybe, 1k of bytes into a thread-local buffer, then draw randomness from that until we run out.

vks added a commit to vks/tempfile that referenced this issue Aug 16, 2022
- Instead of setting up a predictable userspace RNG, we get
  unpredictable random bytes directly from the OS. This fixes Stebalien#178.
- To obtain a uniformly distributed alphanumeric string, we convert the
  the random bytes to base64 and throw away any letters we don't want
  (`+` and `/`). With a low probability, this may result in obtaining
  too few alphanumeric letters, in which case we request more randomness
  from the OS until we have enough.
- Because we cannot control the seed anymore, a test manufacturing
  collisions by setting the same seed for several threads had to be removed.
@vks
Copy link
Author

vks commented Aug 16, 2022

Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

@Stebalien
Copy link
Owner

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

Hm, that slightly biases the first 8 characters.

@5225225
Copy link

5225225 commented Aug 17, 2022

Modulo bias would be arguably Fine here since we only need a value that can't be perfectly predicted consistently, not a value that can be predicted even slightly better than average.

Stebalien added a commit that referenced this issue Nov 19, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
@Stebalien Stebalien linked a pull request Nov 19, 2024 that will close this issue
Stebalien added a commit that referenced this issue Nov 19, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Stebalien added a commit that referenced this issue Nov 19, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Stebalien added a commit that referenced this issue Nov 19, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Stebalien added a commit that referenced this issue Nov 19, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Stebalien added a commit that referenced this issue Nov 20, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Stebalien added a commit that referenced this issue Dec 6, 2024
Re-seed thread-local RNG from system randomness if we run into a
temporary file-name collision. This should address the concerns about
using a predictable RNG without hurting performance in the common case
where nobody is trying to predict our filenames. I'm only re-seeding
once because if we _still_ fail to create a temporary file, the
collision was likely due to too many temporary files instead of an
attacker predicting our random temporary file names.

I've also reduced the number of tries from 2^31 to 2^16. If it takes
more than that to create a temporary file, something else is wrong.
Pausing for a long time is usually worse than just failing.

fixes #178
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants