Skip to content

Windows - std::fs bug - specific file name causes thread to hang or false positive on file open. #83751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nothingbutnetcode opened this issue Apr 1, 2021 · 17 comments
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-bug Category: This is a bug. O-windows Operating system: Windows

Comments

@nothingbutnetcode
Copy link

When trying to open a file (which in this case did not exist) that starts with con. - eg: con.txt, con.toml - fs::read_to_string will hang the thread. File::open will incorrectly succeed, and subsequently panic when methods on the file handle are called.

I tried this code:

use std::fs::File;
use std::{fs};

fn main() {
    // 1 - this will hang the app
    let _ = fs::read_to_string("con.cxv").unwrap();

    // 2 - this returns a file (but it does not exist)
    match File::open("con.txt") {
        Ok(file) => {
            println!("length {}", file.metadata().unwrap().len()) // so this will panic
        }
        Err(err) => {
            println!("{:?}", err);
        }
    }
}

I expected to see this happen: File not found error returned.

Instead, this happened: Thread hang or incorrect result as per description above.

Meta

Bug exists in nightly as well as stable.

rustc --version --verbose:

rustc 1.51.0 (2fd73fabe 2021-03-23)
binary: rustc
commit-hash: 2fd73fabe469357a12c2c974c140f67e7cdd76d0
commit-date: 2021-03-23
host: x86_64-pc-windows-msvc
release: 1.51.0
LLVM version: 11.0.1
Backtrace

<backtrace>

@nothingbutnetcode nothingbutnetcode added the C-bug Category: This is a bug. label Apr 1, 2021
@nothingbutnetcode nothingbutnetcode changed the title 1.51 + Windows MSVC - std::fs bug 1.51 + Windows MSVC - std::fs bug - specific file name causes thread to hang or false positive on file open. Apr 1, 2021
@jonas-schievink
Copy link
Contributor

That's because any file named con (ignoring the extension) refers to the CONsole device: https://superuser.com/questions/86999/why-cant-i-name-a-folder-or-file-con-in-windows

Reading from it presumably will block until the console is closed, so the hang is expected.

@jonas-schievink jonas-schievink added the O-windows Operating system: Windows label Apr 1, 2021
@the8472
Copy link
Member

the8472 commented Apr 1, 2021

If you use a verbatim prefix (\\?\) then the resolution to legacy device names shouldn't happen. But then some programs then may have issues dealing with a directory containing con.txt.

@nothingbutnetcode
Copy link
Author

The main issue here is, in the case of read_to_string which hangs, you can check if it exists first with Path::new("con.txt").exists(), this gives the create answer of false, but according to the read_to_string API doc - "This function will return an error if path does not already exist." So one would expect that error to return, rather than DOS your app.

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

The exists() result is not the canonical definition of what exists from the perspective of the filesystem. Rust just tries to probe existence in one particular way (fetching metadata()) without opening the file. Opening a path may give a different result.
Not sure about windows, but on linux there can be weird cases where a filesystem doesn't support metadata but does support opening paths, e.g. some FUSE filesystems may do that.

@nothingbutnetcode
Copy link
Author

The docs rather concretely state "Returns true if the path points at an existing entity."

@ChrisDenton
Copy link
Member

ChrisDenton commented Apr 1, 2021

The docs are wrong, unfortunately. The function itself is in the process of being deprecated and replaced with try_exists that will hopefully be more clear on what's happening.

Just to be clear on what this problem is, the path in:

Path::new("con.txt")

Will be rewritten by the OS as (roughly speaking):

Path::new(r"\\.\CON")

This does actually exist. It's the CON device. Unfortunately the way Rust currently tests for existence is by calling metadata(). And the way Rust gets metadata on Windows is to open the file without any permissions. But the CON device has to be opened with read permission otherwise it will fail.

So when using exists the open fails because it's not opened for reading. When File::open is called the file is open with read access so it succeeds.

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

@nothingbutnetcode

The docs rather concretely state "Returns true if the path points at an existing entity."

That statement cannot be seen as universally true because it is then qualified by

If you cannot access the directory containing the file, e.g., because of a permission error, this will return false.

So file can exist without exists() returning true. But that caveat mentioned in the documentation is not the only edge-case that exists. Filesystems are complex. Even more so across platforms and when dealing with legacy features.

@ChrisDenton

The function itself is in the process of being deprecated and replaced with try_exists that will hopefully be more clear on what's happening.

It may be better, but I think it won't cover all edge-cases either, so it shouldn't be taken as canonical definition for existence. E.g. unix different calls are used for metadata and open. So one can return ENOENT when the other doesn't.

@nothingbutnetcode
Copy link
Author

Sure, but accepting the caveat furthers the point, the restrictions on permissions return the safer option of false!

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

I'm not sure what the point is. read_to_string will try to open something for reading and if that succeeds it will read it.

Which is a different operation then trying to check if something exists and thus may give different results.

This is a general principle with filesystems. If you want to do a check for something it's best to just perform the operation you want to check, rather than some proxy-operation, because the proxy can give different results than the actual operation due to weird differences. This is also the source of TOCTOU attacks.

If you want to avoid legacy handling use verbatim-prefixed absolute paths. If you want to detect legacy parsing you can canonicalize the path first. But both of those approaches have their own issues.

@nothingbutnetcode
Copy link
Author

I guess the point is there should be a safe API that guards against this, otherwise every single user API that accepts consumer provided file names to be attempted to be read need to manually guard against these filenames themselves (thus assuming all users know about these OS nuances - I did not before today), or face being DOS'd.

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

If you're accepting untrusted paths over the network to write into/read from the filesystem then you probably have bigger issues. E.g. they could contain absolute paths or .. and lead to arbitrary access on the system.

If you're accepting arbitrary input from a trusted user then the user can still shoot themselves in the foot. E.g. by attempting to read huge files that will cause the process to run out of memory since read_to_string doesn't stream the data. Or on unix platforms you can achieve the same kind of DoS with named pipes, /dev/tty or similar things.

std::fs provides direct filesystem access with very limited abstraction, not a foolproof, highly secure sandbox.

Take std::path::Prefix for example, nuances are not masked.

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

Note that you would probably get the same effects on cmd via type con.cxv

@nothingbutnetcode
Copy link
Author

nothingbutnetcode commented Apr 1, 2021

If you have a file server that accepts a file name to be read, knowing that it can be guarded against that with the current implementation of Path.exists() for these, clearly exceptional, windows filenames, are you saying that it is acceptable for the read_to_string method (the convenience method to read a file) to hang? If that's the case, fine - then document it as such.

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

Neither exists nor read_to_string are appropriate to when implementing some network server. On efficiency and security grounds. A hang is the most benign kind of trouble you might be inviting there.

But the problem isn't specific to those two methods. Any naive approach to interacting with the filesystem will have pitfalls.
For your particular problem fs::canonicalize may help, but might not be sufficient, depending on what you're trying to build. But I'd say that's out of scope for this ticket anyway, you need a book on securing network services.

We could add some caveats to those particular methods, but that doesn't solve the general problem because there's so much ground to cover.

@nothingbutnetcode
Copy link
Author

I fear you're in love with that hill you occupy. How about a method that understands that Windows (bizarrely) will translate con.txt to Path::new(r"\.\CON") as per @ChrisDenton's observation, causing this issue, and guard against it?

@the8472
Copy link
Member

the8472 commented Apr 1, 2021

How about a method that understands that Windows (bizarrely) will translate con.txt to Path::new(r".\CON")

I believe canonicalize already does that, which is why I suggested it. But i don't have a windows machine nearby to test on. Give it a try.

and guard against it?

The issue is that CON is just one particular case of many possible sources of trouble. We'd need a way to say "this is not a normal file and probably won't behave like one". E.g. another case is that windows recently gained AF_UNIX socket support for WSL interop. These sockets can sit anywhere in a normal the filesystem like a regular file would, not on special paths. So trying to match paths wouldn't help.

On unix systems these kinds issues are fairly common. One way to check for them is opening the file and then doing file.metadata()?.file_type() which can distinguish special files (e.g. is_fifo()) from regular files (is_file()).
I don't know how CON behaves on windows, maybe see what it's file_type() says?

Not that that is bulletproof either. If you want to deal with IO hangs the usual approaches would be interrupting hung threads from another thread, helper processes that can be killed and restarted after a timeout or using nonblocking IO APIs where requests can be canceled.

@wesleywiser wesleywiser changed the title 1.51 + Windows MSVC - std::fs bug - specific file name causes thread to hang or false positive on file open. Windows - std::fs bug - specific file name causes thread to hang or false positive on file open. Apr 2, 2021
@wesleywiser
Copy link
Member

I would agree that this can be a surprising behavior of Windows but it is the way that platform is intended to operate. Windows is hardly exceptional in this manner though as your program will hang on both macOS and Linux if a user supplies /dev/tty as a path.

Perhaps the std::fs module docs can be updated to include a section explaining that the apis are thin wrappers over the platform specific apis and don't particularly isolate you from any quirks of those underlying apis?

@workingjubilee workingjubilee added the A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` label Mar 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-bug Category: This is a bug. O-windows Operating system: Windows
Projects
None yet
Development

No branches or pull requests

6 participants