Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libs: add platform-specific APIs to extract file descriptors, SOCKETs, HANDLEs, etc from std::io #19169

Merged
merged 3 commits into from
Nov 26, 2014

Conversation

aturon
Copy link
Member

@aturon aturon commented Nov 21, 2014

This PR adds some internal infrastructure to allow the private std::sys module to access internal representation details of std::io.

It then exposes those details in two new, platform-specific API surfaces: std::os::unix and std::os::windows.

To start with, these will provide the ability to extract file descriptors, HANDLEs, SOCKETs, and so on from std::io types.

More functionality, and more specific platforms (e.g. std::os::linux) will be added over time.

Closes #18897

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

cc @cmr
cc #19116

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

r? @alexcrichton

@emberian
Copy link
Member

cc @xales

Looks good to me!

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

cc @carllerche

@l0kod
Copy link
Contributor

l0kod commented Nov 21, 2014

Great!

As discuss in #15643, it would be interesting to expose a less raw object like FileDesc instead of directly exposing fd_t-like. This way, it's possible to use a more safe and abstract FileDesc wrapper to control some resources (e.g. keep the automatic file descriptor closing feature).

Additionally, a safer FileDesc could have this properties:

  • a safe read-only object with this available functions: fstat, fd, handle, tell
  • a writable object with this available functions: read, write, seek, dataseek, fsync, truncate
  • unsafe functions: new, unwrap
  • add a safe Clone trait using dup-like (e.g. libc::fcntl(self.fd, libc::F_DUPFD_CLOEXEC))

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

@l0kod Definitely! I'm quite excited about these abstractions developing over time.

That said, I wanted to start conservatively by exposing the raw object and letting the community experiment with abstractions in external crates like the ones you describe, before we commit to anything in libstd.

@l0kod
Copy link
Contributor

l0kod commented Nov 21, 2014

FileDesc-like interface will also discourage the direct/only use of lifetime-less (and inherently unsafe) fd_t for FFI wrappers.

/// Windows-specific extensions to `std::io::net::pipe::UnixListener`.
pub trait UnixListenerExt {
/// Extract the raw handle, without taking any ownership.
fn as_raw_fd(&self) -> Handle;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/fd/handle

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a few below as well)

@alexcrichton
Copy link
Member

This is an interesting design with respect to an extension-trait-per-object rather than an extension trait per piece of functionality, but I agree that use std::os::unix::prelude::* is pretty reasonable for getting around any ergonomic concern, and this improves discoverability as you know exactly what trait to look at to see what fancy methods are available.

Over time I think we'll need to figure out how to deal with the rustdoc problem to show windows extension traits as well as unix ones, but we can deal with that later. Overall I like the direction this is going in for providing platform-specific functionality.

cc @brson, I'm curious as to your opinion with this organization as well.

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

@alexcrichton

This is an interesting design with respect to an extension-trait-per-object rather than an extension trait per piece of functionality

So, my original thought there was that these extension traits would diverge over time, and I think in the long run we probably will want type-specific extensions like this.

That said, thinking more about the functionality being proposed here, there's actually a good use case for exposing this via more generic traits, so that the os::unix etc. API can eventually work generically over them. I will revise.

@aturon
Copy link
Member Author

aturon commented Nov 21, 2014

Updated with improved design.

This commit adds a `AsInner` trait to `sys_common` and provides
implementations on many `std::io` types. This is a building block for
exposing platform-specific APIs that hook into `std::io` types.
The new `std::os::unix` module exposes several extension traits
for extracting file descriptors from `std::io` types.
fn as_raw_fd(&self) -> Fd {
self.as_fd().fd()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could mark these as "super extra experimental" because I'd love to reimplement File on top of just a HANDLE instead of relying on msvcrt for file descriptors... (fine for now of course)

The new `std::os::windows` module exposes several extension traits
for extracting file descriptors, sockets, and handles from `std::io`
types.
@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

I like this design as well, but I hope we will regain the FileDesc-like safety soon instead of the raw Fd, Handle and Socket.

Moreover, having different function names for unix and windows seems to make OS-independent code more cumbersome to write. Not sure if it's a good idea but, what if we had an OS-independent file descriptor type? Maybe something like IoHandle<T>, with T implemented for Fd, Handle and Socket? This trait could manage the file descriptor lifetime with Drop, a safe duplication with Clone and expose the raw value with Deref<T>.

Replacing as_raw_fd(&self) -> Fd with as_iohandle(&self) -> &IoHandle<T> would allow safe and OS-independent APIs. This should also replace fd(&self) and handle(&self).
Handling mutability safely is still a problem.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

Deref<Mutex<T>> for IoHandle<T> should solve the mutability problem :)

@mahkoh
Copy link
Contributor

mahkoh commented Nov 22, 2014

There is no mutability safety problem.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

There is no mutability safety problem.

Look at #15643 for background information. Basically it could be unsafe to write or even read from a file descriptor because of concurrent access or file descriptor wrappers state.

@mahkoh
Copy link
Contributor

mahkoh commented Nov 22, 2014

Basically it could be unsafe to write or even read from a file descriptor because of concurrent access or file descriptor wrappers state.

There is nothing unsafe about this. See http://doc.rust-lang.org/reference.html#behavior-considered-undefined

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

This is low level FFI unsafe behavior. Again, read #15643 for more details.

@mahkoh
Copy link
Contributor

mahkoh commented Nov 22, 2014

This is low level FFI unsafe behavior.

You should first define what this even means.

@pythonesque
Copy link
Contributor

@l0kod I don't think this is unsafe. Can you give me an example of a situation where sharing a file descriptor could cause memory unsafety? AFAIK, the kernel already synchronizes updates, refcounts, etc. In the mmap case, trying to access truncated memory will result in a SIGBUS, you can't cause memory unsafety that way. If as I suspect, there is no way to cause memory unsafety this way, then it is not in the scope of things that should be protected with an unsafe block. If someone wants to write a higher level library that makes additional guarantees, that's fine, but Rust is under no obligation to protect that library.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

One main goal of Rust is safety and safety is not only about memory but about behavior as well, especially for file descriptor (e.g. Drop trait goal). Take a look at my first comment or other, comments, describing, the, unsafe, behaviors.

Please read the entire #15643 thread before other questions/comments.

@pythonesque
Copy link
Contributor

Sorry, I don't buy that.

First off, while safety in general may be a desirable goal, Rust's "unsafe" feature is very much only about memory safety. It is not for things other than memory safety. For example, deadlock is a much more common problem than close-then-open, and everyone agrees that code that can deadlock is incorrect, but Rust does not require that all code that can potentially deadlock be marked as unsafe, nor does it require libraries to protect against deadlock. Same with leaking memory--it's entirely legal in Rust. You're encouraged to avoid it, but it is not unsafe. It is not considered unsafe to perform an overlong bit shift, or cause a stack overflow, or have races where they don't impact memory safety (it's easy to screw up by overusing Relaxed atomics). What is different about file descriptors that warrants the use of unsafe?

Secondly, saying a function might be written to rely on an assumption for memory safety, so don't violate that assumption in safe code, doesn't make sense. Unless you make the assumption explicitly part of the definition of unsafe in the document above, other code can expose the file descriptors safely, and then your code will still not work. This is not dogmatism, this is fundamentally the whole reason unsafe works--that all code is required to enforce the invariants. Functions cannot rely on file descriptors not being used by other threads, etc., for memory safety, unless that is explicitly stated in the definition of unsafe. The unsafe traits system is not here yet, but is designed to help provide additional invariants, maybe you can figure out a way to do something with that (but I doubt it).

Thirdly, none of the behaviors you listed (close, then open, execve) are memory unsafe. So I have no idea why this matters. If you want protection from this kind of abuse, you have to do it through the kernel anyway.

I read all your comments. None of them explain why this is memory unsafe or why your proposal will help make other code safe. Please stop recommending this, it doesn't make sense.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

Again, as discussed in #15643, it's not (directly) about memory safety but about safe file descriptor handling. Some examples:

  • file descriptor leak (can quickly fulfill the file descriptor pool per process)
  • Rust should provide features to cleanly manage file descriptor already opened at execution as well
  • concurrent inconsistent write(2), e.g. multiple tasks write at the same time in the same FD
  • concurrent lseek(2), e.g. with read(2), which lead to corrupt data
  • file descriptor closing, e.g. store raw FD1, close FD1, write to FD1 (which is close)
  • file descriptor replacing, e.g. store raw FD1, close FD1, open a new file (FD2), write to FD1 (which is now FD2)
  • some file descriptor (e.g. open /proc/self/mem, mmap or memfd) can be a representation of the process memory, which would then impact the memory safety
  • keep in mind that FFI can use file descriptor and manage them in your back (safe wrapper could prevent bugs)

Using a raw file descriptor is unsafe. This errors can be security bugs as well. We can prevent most of them with a good file descriptor interface for functions using them (e.g. FFI). FileDesc already have some safety with Drop.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

Here are some sketchy thoughts:

struct IoHandle<T> {
    handle: Mutex<T>,
}

impl IoHandle<T> {
    pub unsafe fn new(handle: T) {
        IoHandle{ handle: Mutex::new(handle) }
    }

    // Use select! with the file descriptor
    pub fn event(&self) -> Receiver {}
}

impl Deref<Mutex<Fd>> for IoHandle<Fd> {
    fn deref(&self) -> &Mutex<Fd> {
        &self.handle
    }
}

impl Drop for IoHandle<Fd> {
    fn drop(&mut self) {
        let _ = unsafe { libc::close(*self.lock()) };
    }
}

impl Clone for IoHandle<Fd> {
    fn clone(&self) -> IoHandle<Fd> {
        let fd = unsafe { libc::fcntl(*self.lock(), libc::F_DUPFD_CLOEXEC) };
        if fd == -1 {
            panic!(last_error());
        }
        unsafe { IoHandle::new(fd) }
    }
}

trait AsIoHandle<T, W> {
    fn as_iohandle(&self) -> &IoHandle<T>;
    fn from_iohandle(IoHandle<T>) -> W;
}

impl File {
    fn path(&'a self) -> Option<&'a Path> {}
    fn open_at(&self, path: &Path) -> IoResult<File> {}
}

impl AsIoHandle for File {
    #[cfg(windows)]
    fn from_iohandle(&self, IoHandle<Handle>) -> IoResult<File> {}
    #[cfg(unix)]
    fn from_iohandle(&self, IoHandle<Fd>) -> IoResult<File> {}

    #[cfg(windows)]
    fn as_iohandle(&self) -> &IoHandle<Handle> {}
    #[cfg(unix)]
    fn as_iohandle(&self) -> &IoHandle<Fd> {}
}

@pythonesque
Copy link
Contributor

Please reread my previous comment. It doesn't matter whether these things are unsafe by some other definition of safety, Rust's unsafe is only about memory safety and furthermore cannot work for anything other than memory safety. If you want to make raw file descriptors to be unsafe in general, please file an RFC to change the definition of unsafe.

@aturon
Copy link
Member Author

aturon commented Nov 22, 2014

@l0kod

Replacing as_raw_fd(&self) -> Fd with as_iohandle(&self) -> &IoHandle<T> would allow safe and OS-independent APIs.

I may be missing something, but I don't think this would really change things: the T would still vary by platform.

Fundamentally, these os::platform modules are intended for platform-specific APIs, and should be made as idiomatic for each platform as possible. Platform-independent APIs, on the other hand, can live in std::io.

@l0kod @mahkoh @pythonesque

I think there are a lot of interesting questions about what kind of safer/higher-level, but platform-specific abstractions we can provide here. But as I said above, the intent of this PR is just to expose the lowest-level hooks into std::io so that the broader Cargo ecosystem can explore abstractions on top of them; I'd rather that experimentation happen externally for now (that's the general migration path into std in any case).

It'd be good if we can turn the conversation back to issues about this PR in particular, and handle questions about abstractions on top elsewhere.

@l0kod
Copy link
Contributor

l0kod commented Nov 22, 2014

I may be missing something, but I don't think this would really change things: the T would still vary by platform.

You're right about the specific T by platform but the IoHandle<T> add a generic wrapper interface common to all platform that should help to add more (safety) features to this type (or trait) in the future (e.g. Drop implementation). In my example, the as_iohandle(), from_iohandle() or event() could benefit from this.
The return by reference is important for a (future) clean ownership too.

@aturon
Copy link
Member Author

aturon commented Nov 26, 2014

ping @alexcrichton -- I still feel that we should land this initial, unopinonated version that just exposes the raw types, and then bring in better abstractions after some external experimentation.

bors added a commit that referenced this pull request Nov 26, 2014
This PR adds some internal infrastructure to allow the private `std::sys` module to access internal representation details of `std::io`.

It then exposes those details in two new, platform-specific API surfaces: `std::os::unix` and `std::os::windows`.

To start with, these will provide the ability to extract file descriptors, HANDLEs, SOCKETs, and so on from `std::io` types.

More functionality, and more specific platforms (e.g. `std::os::linux`) will be added over time.

Closes #18897
@bors bors closed this Nov 26, 2014
@bors bors merged commit 1e66164 into rust-lang:master Nov 26, 2014
@l0kod
Copy link
Contributor

l0kod commented Nov 26, 2014

@aturon, could it be possible to expose the FileDesc struct alone (temporary) to be able to use it?

@l0kod
Copy link
Contributor

l0kod commented Dec 14, 2014

The MIO library from @carllerche use something similar to FileDesc with an IoDesc for auto-closing.

@l0kod
Copy link
Contributor

l0kod commented Jan 10, 2015

To conclude, using Fd could be safe as long as the AsRawFd (implementation) is not dropped before the end of the Fd use (i.e. function taking AsRawFd) except if the AsRawFd implementation need a consistent/known inner Fd state…

@l0kod
Copy link
Contributor

l0kod commented Apr 25, 2015

cc rust-lang/rfcs#1043

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

At least portions of std::sys should be public
8 participants