Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of predictable RNG #178

Open
vks opened this issue Apr 30, 2022 · 13 comments · May be fixed by #188
Open

Use of predictable RNG #178

vks opened this issue Apr 30, 2022 · 13 comments · May be fixed by #188

Comments

@vks
Copy link

vks commented Apr 30, 2022

This library recently switched to non-CSPRNGs (see #141 and #162), which are inherently predictable based on observations of their output. Doesn't this make the library vulnerable to attacks listed here, because the generated filenames can be predicted?

@Stebalien
Copy link
Owner

Stebalien commented May 2, 2022

This temporary file library is equivalent to mkstemp. The relevant part of the document is:

However, mkstemp() still suffers from the use of predictable file names and can leave an application vulnerable to denial of service attacks if an attacker causes mkstemp() to fail by predicting and pre-creating the filenames to be used.

This is one of those cases where you need to understand the threat-model. There are many theoretical attacks that simply aren't worth the effort to guard against:

  1. It relies on being able to observer the creation of many temporary files (to be able to predict future ones).
  2. It relies on being able to create arbitrarily named temporary files in the same directory as the target application. If you can do this, you can likely exhaust inodes, use up all disk space, etc.
  3. It is, at worst, a DoS vector. It can't be used to actually trick the target application into doing anything.

Basically, if an attacker is in a place where they can exploit this issue, you likely have bigger issues.

Also note, python, golang, and C all behave the same way (use non-cryptographic randomness for performance).


If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

edited: My original comment made statements about security "experts" which could have been taken as a comment about security researchers in general. This was not my intention, nor was it relevant.

@vks
Copy link
Author

vks commented May 2, 2022

I disagree with 1., "many" can be three, depending on the RNG.

I agree that 2. is a problem anyway, and that ideally the temporary files should not share a namespace with other applications, but temporary files are often used like this.

Non-cryptographic randomness is not a performance advantage, CSPRNGs have comparable speed, because they are optimized for SIMD while the non-CSPRNGs are not. Non-CSPRNGs are simpler, not necessarily faster.

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

@Stebalien
Copy link
Owner

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

This is the main reason this crate exists. It handles all the platform quirks around O_CREAT and O_EXCL, and does not depend on randomness for security. This library will never re-use an existing file when creating a temporary file (and will use O_EXCL where possible). In practice:

  1. On Macos and Windows, the temporary file directory is always per-user.
  2. On Linux, O_TMPFILE will be used when creating unnamed temporary files.

The only thing affected would be named temporary files/directories on Linux where the user is using the shared /tmp directory directly.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

Complexity and dependency weight. Performance isn't really an issue given that the syscall cost to create a file will likely dominate in most cases.

@pinkforest
Copy link

pinkforest commented Aug 14, 2022

Can this crate be used in production where the filenames must be unpredictable ?

If the crate does not guarantee predictability then it perhaps should simply be documented as such.

This crate is advertising itself as:

A secure, cross-platform, temporary file library for Rust.

Thanks

@5225225
Copy link

5225225 commented Aug 14, 2022

(got sent here from the advisory-db link)

If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

In my testing, on linux at least, getrandom is fast enough to the point where it's reasonable to use it directly for entropy (a few million [u8; 32]'s per second), so the whole RNG could maybe be removed. Re-seeding / only using entropy directly as a fallback would also be fine.

Especially since you're doing a syscall and probably file IO afterwards, the perf of getting entropy is likely fine.

@Stebalien
Copy link
Owner

Can this crate be used in production where the filenames must be unpredictable ?

Yes, it absolutely can. Same as python and go which use the same randomness function for temporary files.

Worst-case, there's a DoS vector. But if you have some untrusted application that can create many files with chosen names in a shared temporary directory of temporary files, you this is the least of your concerns.

Security is all about understanding your threat model and making trade-offs (performance, complexity, etc.).

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

A call to getrandom would be fine given, as you say, we're already performing a syscall. I'm happy to accept a patch to do so (but it's not really a priority).

@pinkforest
Copy link

pinkforest commented Aug 14, 2022

Cool thanks for clarifying -

We are pretty practical at RustSec and we have recommended your crate over others -

We recently had to file advisory on temporary and previously tempdir and your crate popped up so I wanted to clarify this.

https://rustsec.org/advisories/RUSTSEC-2018-0022.html
https://rustsec.org/advisories/RUSTSEC-2018-0017.html

Thanks for working on a crate that focuses on security and portability 🦄

@vks
Copy link
Author

vks commented Aug 16, 2022

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

@Stebalien
Copy link
Owner

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

  1. Define a dictionary (A-Za-z0-9).
  2. Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

@Stebalien
Copy link
Owner

The best solution is probably to read, maybe, 1k of bytes into a thread-local buffer, then draw randomness from that until we run out.

vks added a commit to vks/tempfile that referenced this issue Aug 16, 2022
- Instead of setting up a predictable userspace RNG, we get
  unpredictable random bytes directly from the OS. This fixes Stebalien#178.
- To obtain a uniformly distributed alphanumeric string, we convert the
  the random bytes to base64 and throw away any letters we don't want
  (`+` and `/`). With a low probability, this may result in obtaining
  too few alphanumeric letters, in which case we request more randomness
  from the OS until we have enough.
- Because we cannot control the seed anymore, a test manufacturing
  collisions by setting the same seed for several threads had to be removed.
@vks vks linked a pull request Aug 16, 2022 that will close this issue
@vks
Copy link
Author

vks commented Aug 16, 2022

Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

@Stebalien
Copy link
Owner

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

Hm, that slightly biases the first 8 characters.

@5225225
Copy link

5225225 commented Aug 17, 2022

Modulo bias would be arguably Fine here since we only need a value that can't be perfectly predicted consistently, not a value that can be predicted even slightly better than average.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants