SockRef::from, Socket::sendfile and other functions that operate on arbitrary file descriptors or SOCKETs potentially should be unsafe #218

Closed
ghost opened this issue Apr 13, 2021 · 29 comments · Fixed by #253 or #325

Comments

@ghost

ghost commented Apr 13, 2021

See rust-lang/rust#72175 and nix-rust/nix#1421: it seems an external library is allowed to assume a file descriptor is private, for example:

#![allow(clippy::blacklisted_name)]
#![deny(unsafe_code)]

use socket2::SockRef;
use std::mem;

#[allow(unsafe_code)]
mod external_library {
    use std::{mem, os::unix::io::RawFd, ptr};

    static DATA: i32 = -1;

    pub struct Foo([RawFd; 2]);

    impl Foo {
        pub fn new() -> Self {
            let mut sockets = [-1; 2];
            assert_eq!(
                unsafe {
                    libc::socketpair(
                        libc::AF_UNIX,
                        libc::SOCK_SEQPACKET | libc::SOCK_CLOEXEC,
                        0,
                        sockets.as_mut_ptr(),
                    )
                },
                0
            );
            Self(sockets)
        }

        pub fn foo(&self) {
            let data: *const _ = &DATA;
            assert_eq!(
                unsafe {
                    libc::send(
                        self.0[0],
                        ptr::addr_of!(data).cast(),
                        mem::size_of_val(&data),
                        0,
                    )
                },
                mem::size_of_val(&data) as _
            );
        }

        pub fn bar(&self) -> i32 {
            let mut ptr: *const i32 = ptr::null();
            assert_eq!(
                unsafe {
                    libc::recv(
                        self.0[1],
                        ptr::addr_of_mut!(ptr).cast(),
                        mem::size_of_val(&ptr),
                        0,
                    )
                },
                mem::size_of_val(&ptr) as _
            );
            unsafe { *ptr }
        }
    }
}

fn main() {
    let foo = external_library::Foo::new();
    SockRef::from(&3)
        .send(&[0; mem::size_of::<*const i32>()])
        .unwrap();
    foo.foo();
    dbg!(foo.bar());
}

Here, the safe SockRef::from makes it possible to crash a (fake) "external library" by letting it dereference a null pointer using only safe code. Making functions that accept arbitrary file descriptors or SOCKETs unsafe can probably solve this problem.

@Thomasdezeeuw
Collaborator

See rust-lang/rust#72175 and nix-rust/nix#1421: it seems an external library is allowed to assume a file descriptor is private, for example:

I have no idea what you mean by private, but if you mean it shouldn't be modified externally (outside of the type's impls), then it should not implement AsRawFd/AsRawSocket. If it does implement AsRawFd/AsRawSocket, then it must be able to deal with the situation where it's actually used.

Here, the safe SockRef::from makes it possible to crash a (fake) "external library" by letting it dereference a null pointer using only safe code. Making functions that accept arbitrary file descriptors or SOCKETs unsafe can probably solve this problem.

The SockRef::from(&3) call should be unsafe, but not because SockRef::from itself should be unsafe. You're (implicitly) creating a RawFd, which should be valid but isn't in this example; that is the unsafe part.

Furthermore, you're dereferencing a raw pointer using unsafe, so you're clearly not using only safe code. Putting code in an "external library" doesn't mean it shouldn't count towards unsafe code; it still is.

So, I do agree that creating a RawFd from a literal, especially when passing it directly to a function as you did in the example, should be unsafe or at least clearer. But I don't agree that SockRef::from is unsafe, or should be marked as such.

@ghost
Author

ghost commented Apr 13, 2021

The SockRef::from(&3) call should be unsafe, but not because SockRef::from itself should be unsafe. You're (implicitly) creating a RawFd, which should be valid but isn't in this example; that is the unsafe part.

So, I do agree that creating a RawFd from a literal, especially when passing it directly to a function as you did in the example, should be unsafe or at least clearer. But I don't agree that SockRef::from is unsafe, or should be marked as such.

Fair. I think at least something in the 3_i32 -> ... -> SockRef path should be unsafe. Maybe a new wrapper type that represents a file descriptor which will be used by the SockRef that is unsafe to construct from an arbitrary number?

@sunfishcode
Member

It turns out that RawFd itself implements AsRawFd. So AsRawFd doesn't say anything about the validity or ownership of the returned RawFd value.
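A minimal sketch of that observation, assuming a Unix target. The helper `raw_of` is a hypothetical stand-in for any function with an `AsRawFd` bound, such as the `SockRef::from` discussed here; it is not part of socket2.

```rust
use std::os::unix::io::{AsRawFd, RawFd};

/// Hypothetical helper: accepts anything that implements `AsRawFd`.
fn raw_of<T: AsRawFd>(t: &T) -> RawFd {
    t.as_raw_fd()
}

fn main() {
    // A plain integer; nothing guarantees that an open descriptor
    // with this number exists in the process.
    let made_up: RawFd = 123;

    // This compiles without `unsafe`, because std implements
    // `AsRawFd` for `RawFd` itself.
    println!("{}", raw_of(&made_up));
}
```

The bound alone therefore cannot distinguish a live socket from an arbitrary number.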

rust-lang/rust#72175 is still being discussed, and looks like it may lead to an RFC. If that happens, and if the RFC is accepted, functions that operate on RawFd values should be unsafe. That includes SockRef::from in its current form, since it accepts any type that implements AsRawFd which includes RawFd itself.

That said, SockRef::from is a valuable use case, and ideally it should be possible to do what it's doing without unsafe in its API. As one possible approach, the unsafe-io crate has an OwnsRaw trait, which types that implement AsRawFd etc. can implement to additionally state that they really do own their handles, and traits like AsUnsafeSocket, which combine OwnsRaw and AsRawFd (AsRawSocket on Windows). With these, it should be possible to have fn SockRef::from(t: &T) where T: AsUnsafeSocket as a safe API. (Aside: unsafe-io is a new crate, and I'm interested if anyone has ideas for how to improve the API!)

But of course, it makes sense to wait and see how the discussion in rust-lang/rust#72175 and the possible RFC turn out before making major changes here.

@sunfishcode
Member

I'm working on a draft of a proposal to add new official wording about RawFd to Rust.

In particular, this proposal would mean functions like SockRef::from with AsRawFd arguments should be marked unsafe, or migrated to alternatives. But the upside would be clearer guarantees for many use cases. I'd be interested in any feedback from folks working on socket2!

@Thomasdezeeuw
Collaborator

I'm working on a draft of a proposal to add new official wording about RawFd to Rust.

In particular, this proposal would mean functions like SockRef::from with AsRawFd arguments should be marked unsafe, or migrated to alternatives. But the upside would be clearer guarantees for many use cases. I'd be interested in any feedback from folks working on socket2!

I don't have time to look through the entire proposal, but SockRef::from will remain a safe function. In my opinion the AsRawFd/AsRawSocket documentation should be clearer that the returned fd/socket should be valid. The current documentation for AsRawFd says the following

This method does not pass ownership of the raw file descriptor to the caller. The descriptor is only guaranteed to be valid while the original object has not yet been destroyed.

The second sentence says to me that the implementation needs to ensure the file descriptor is valid (while the object is alive). Unfortunately RawFd is just a c_int (because it needs to match C semantics) and can "just be created" without being valid, i.e. the literal 123 is a "valid" (as in type) RawFd but of course isn't a valid fd. Couple that with the AsRawFd impl for RawFd and we got a problem as shown in the initial issue report.

I think the problem is that the creation of RawFd should be an unsafe operation in which the programmer needs to ensure it's a valid descriptor. But we can't change that anymore. We can improve the documentation however.

@sunfishcode
Member

Instead of making it unsafe, what would you think about changing SockRef::from to use AsUnsafeSocket in place of AsRawFd?

AsUnsafeSocket is implemented for all the relevant std types: TcpStream, TcpListener, UdpSocket, UnixStream, UnixListener, and UnixDatagram. For other types, the only thing needed is to add an impl OwnsRaw, so types like socket2::Socket and others could easily implement it. That way, SockRef::from wouldn't be unsafe, and many use cases would continue to work as-is.

And to be sure, this isn't urgent; I'm still in the process of exploring the options.

@Thomasdezeeuw
Collaborator

Instead of making it unsafe, what would you think about changing SockRef::from to use AsUnsafeSocket in place of AsRawFd?

I don't think socket2 should have dependencies outside of libc/winapi. Furthermore, exposing those dependencies in a public API is a bad idea, as we're then tied to a fixed version of the crate (i.e. we couldn't update to unsafe-io v2 without also updating to socket2 v2).

I'm also not convinced that AsUnsafeSocket is an improvement over AsRawFd. I think the best way forward is improving the documentation of AsRawFd/AsRawSocket.

AsUnsafeSocket is implemented for all the relevant std types: TcpStream, TcpListener, UdpSocket, UnixStream, UnixListener, and UnixDatagram. For other types, the only thing needed is to add an impl OwnsRaw, so types like socket2::Socket and others could easily implement it. That way, SockRef::from wouldn't be unsafe, and many use cases would continue to work as-is.

And to be sure, this isn't urgent; I'm still in the process of exploring the options.

@sunfishcode
Member

Right now, it's possible for code in one library to use functions like SockRef::from to break the encapsulation of code in another library without itself using any unsafe. It's a small loophole in Rust, and it is fixable, so I'm interested in whether any of the possible fixes are practical, rather than just leaving the loophole open and documenting it.

One option would be to propose adding the parts of unsafe-io needed to fix this to std. The OwnsRaw trait might be sufficient. That way you could implement OwnsRaw and use AsRawFd + OwnsRaw, without any dependencies. Does something like that sound practical?

@Thomasdezeeuw
Collaborator

Right now, it's possible for code in one library to use functions like SockRef::from to break the encapsulation of code in another library without itself using any unsafe.

This is not true. If the encapsulation implements AsRawFd, which allows external access to the file descriptor, it must be able to handle changes to the file descriptor, e.g. options are set on it. If it can't deal with this it shouldn't implement AsRawFd, it's as simple as that.

It's a small loophole in Rust, and it is fixable, so I'm interested in whether any of the possible fixes are practical, rather than just leaving the loophole open and documenting it.

It's not a loophole, it's by design. It's required to deal with C/Unix/Windows I/O model of using integers/void pointers as identifiers to a kernel file description.

One option would be to propose adding the parts of unsafe-io needed to fix this to std. The OwnsRaw trait might be sufficient. That way you could implement OwnsRaw and use AsRawFd + OwnsRaw, without any dependencies. Does something like that sound practical?

I still don't get the point of OwnsRaw and the benefit over AsRawFd, could you elaborate why you think it's an improvement?

@sunfishcode
Member

Suppose library A holds a RawFd it never exposes, which happens to have the value 4. Suppose library B does this:

let s = socket2::SockRef::from(&4);
let mut buf = vec![std::mem::MaybeUninit::new(0u8); 32];
s.peek(&mut buf[..])?;

This compiles on stable Rust today. B can read A's otherwise encapsulated data, with no unsafe of its own.

Another example would be B accidentally holding a RawFd value after closing it, which then aliases a newly created file descriptor in A. POSIX considers all this defined behavior. But in Rust, it's expected that one library shouldn't be able to read another's encapsulated data without using unsafe.
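The use-after-close case can be sketched as follows, assuming a Unix target. The function name `stale_then_fresh` is hypothetical. Whether the two numbers actually coincide depends on which descriptors are free at the time, so the sketch prints them rather than asserting equality.

```rust
use std::io;
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::net::UnixStream;

/// Records a raw fd number, closes the socket behind it, then opens
/// a new socket and reports both numbers.
fn stale_then_fresh() -> io::Result<(RawFd, RawFd)> {
    let (first, _peer1) = UnixStream::pair()?;
    let stale: RawFd = first.as_raw_fd();

    // Dropping the stream closes the descriptor. `stale` is still an
    // ordinary integer, but it no longer names that socket.
    drop(first);

    // A later socket may be handed the same number, because the OS is
    // free to reuse closed descriptor numbers.
    let (second, _peer2) = UnixStream::pair()?;
    Ok((stale, second.as_raw_fd()))
}

fn main() -> io::Result<()> {
    let (stale, fresh) = stale_then_fresh()?;
    println!("stale = {}, fresh = {}", stale, fresh);
    Ok(())
}
```

If the numbers match, any safe function that accepts the stale integer would be operating on a socket that belongs to someone else.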

There's a way to fix this: from_raw_fd is unsafe. However there's a loophole: AsRawFd and IntoRawFd are not unsafe, so anything can implement them. It's tempting to just document this, and say that implementing them should imply certain guarantees, however std itself implements them for RawFd and makes no guarantees.

OwnsRaw solves this by being an unsafe trait. Types which implement it have to use the keyword unsafe, and in doing so they commit to the desired guarantees.
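A rough sketch of how such an unsafe marker trait could gate a safe constructor, assuming a Unix target. Both `OwnsRaw` and `SockRef` below are simplified local stand-ins written for illustration; they are not the unsafe-io or socket2 definitions.

```rust
use std::marker::PhantomData;
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::net::UnixStream;

/// Hypothetical marker trait modeled on the idea described above.
///
/// # Safety
/// Implementors promise that `as_raw_fd` returns a descriptor they
/// own and keep open for as long as `self` is alive.
unsafe trait OwnsRaw {}

// Opting in requires writing `unsafe`, which is where the promise is made.
unsafe impl OwnsRaw for UnixStream {}

/// Simplified stand-in for `socket2::SockRef`.
struct SockRef<'s> {
    fd: RawFd,
    _owner: PhantomData<&'s ()>,
}

impl<'s, S> From<&'s S> for SockRef<'s>
where
    S: AsRawFd + OwnsRaw,
{
    fn from(socket: &'s S) -> Self {
        SockRef {
            fd: socket.as_raw_fd(),
            _owner: PhantomData,
        }
    }
}

fn main() -> std::io::Result<()> {
    let (stream, _peer) = UnixStream::pair()?;
    let sock_ref = SockRef::from(&stream);
    println!("borrowed fd {}", sock_ref.fd);

    // let bad = SockRef::from(&4); // rejected: integers do not implement `OwnsRaw`
    Ok(())
}
```

With this bound, the integer-literal example no longer compiles, while owning socket types keep a safe conversion.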

@Thomasdezeeuw
Collaborator

Suppose library A holds a RawFd it never exposes, which happens to have the value 4. Suppose library B does this:

let s = socket2::SockRef::from(&4);
let mut buf = vec![std::mem::MaybeUninit::new(0u8); 32];
s.peek(&mut buf[..])?;

This comment by withoutboats (not pinging) points to the core issue: rust-lang/rust#76969 (comment), that RawFd is just a type alias. In your example you're creating a RawFd from nothing: that should be unsafe, but isn't. That is not a problem with socket2, but one with std lib, which due to the v1 stability guarantees can't be changed.

We should clearly document this footgun of using integer literals as RawFd in both SockRef::from and AsRawFd.

@sunfishcode
Member

Integer literals are one example. Use-after-close is another. Arithmetic and deserialization are others.

I agree, we could better document the current situation. At the same time, this problem has the same form as problems with raw pointers, such as dangling and aliasing, which Rust does more than just document. Fixing it seems desirable in principle. And it seems possible. Is it practical?

I agree with withoutboats; the underlying problem is in std. OwnsRaw is a new trait, designed to be used with the existing types and traits without changing or replacing them. If there's a consensus around an RFC, then we should be able to arrange for all the important types that implement AsRawFd to implement OwnsRaw too—it's just one line of code per type, and it doesn't break anything to add them. std itself wouldn't need any breaking changes. Functions like SockRef::from could wait until the OwnsRaw impls are in place before adding + OwnsRaw to their bound to minimize disruption.

@RalfJung
Member

RalfJung commented Apr 23, 2021

I have no idea what you mean by private, but if you mean it shouldn't be modified externally (outside of the type's impls), then it should not implement AsRawFd/AsRawSocket. If it does implement AsRawFd/AsRawSocket, then it must be able to deal with the situation where it's actually used.

I strongly disagree. AsRawFd for a file descriptor is the equivalent of casting a reference to an integer. The latter is a safe operation, and so is the former. The inverse, turning an integer to a reference, is unsafe -- not just because the integer could be dangling, but also because that memory could be owned by other parties that e.g. have &mut or Box pointers.

The situation here is entirely analogous: as_raw_fd (the "ptr-to-int cast") is safe; anything working with such potentially dangling, potentially aliased FDs ("dereferencing a raw pointer") should be unsafe.

This is not true. If the encapsulation implements AsRawFd, which allows external access to the file descriptor, it must be able to handle changes to the file descriptor, e.g. options are set on it. If it can't deal with this it shouldn't implement AsRawFd, it's as simple as that.

Again, consider the analogy with references and integers: just because &mut can be safely cast to a raw pointer, does not mean that references can handle arbitrary changes to the underlying data through this raw pointer -- it is still wrong to violate the uniqueness assumption of the &mut reference.

Taking a step back, there are two options for how the Rust ecosystem as a whole could treat raw FDs:

  1. "free-for-all", no ownership, doing arbitrary operations on arbitrary FDs is, in principle, allowed.
  2. "exclusive", with ownership, one may only operate on FDs that one owns.

Both options are self-consistent and internally coherent. The standard library quite clearly took option 2; as a consequence, as_raw_fd is safe, from_raw_fd is unsafe, and RawFd is just a type alias. This is also the option Rust took for pointers/memory. (Here, option 1 is not really realistic as that would violate memory safety.) You seem to claim that what the standard library does is somehow incoherent; that claim is wrong. It is entirely coherent to say both "AsRawFd is safe" and "working with non-owned FDs is unsafe"; this is exactly what Rust does for pointers and it works equally well for FDs. The only incoherence we have here is parts of the ecosystem picking option 2 with others picking option 1.

In your example you're creating a RawFd from nothing: that should be unsafe, but isn't.

Here you are presuming option 1 -- but that is simply not the option the standard library took. Under option 2, there is nothing wrong with creating a RawFd being a safe operation. This is in analogy with how turning a usize to a *mut i32 is a safe operation.
So, it could be unsafe (if option 1 is what we'd use), but saying it "should" be unsafe is incorrect.
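The pointer side of this analogy can be sketched in a few lines. The function name `round_trip` is hypothetical; the comments map each step onto the FD operations named above.

```rust
/// Turns a reference into an integer and back into a usable pointer.
fn round_trip(value: &mut i32) -> i32 {
    // Reference -> integer: safe, like `as_raw_fd`.
    let addr = value as *mut i32 as usize;

    // Integer -> raw pointer: also safe, like writing down a `RawFd`.
    let ptr = addr as *mut i32;

    // Actually using the pointer requires `unsafe`: the caller vouches
    // that it is neither dangling nor aliased in a forbidden way.
    unsafe { *ptr }
}

fn main() {
    let mut value = 7;
    println!("{}", round_trip(&mut value));
}
```

Only the last step carries a proof obligation, which is the step the comment compares to operating on a raw file descriptor.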

@Thomasdezeeuw
Collaborator

I strongly disagree. AsRawFd for a file descriptor is the equivalent of casting a reference to an integer. The latter is a safe operation, and so is the former. The inverse, turning an integer to a reference, is unsafe -- not just because the integer could be dangling, but also because that memory could be owned by other parties that e.g. have &mut or Box pointers.

The situation here is entirely analogous: as_raw_fd (the "ptr-to-int cast") is safe; anything working with such potentially dangling, potentially aliased FDs ("dereferencing a raw pointer") should be unsafe.

I see your point and partially agree. I don't think working with fds should be unsafe, as it can/should never cause memory unsafety: the OS always checks the validity of the file descriptor and returns an error if it's invalid. But I do see the point you're making.

Again, consider the analogy with references and integers: just because &mut can be safely cast to a raw pointer, does not mean that references can handle arbitrary changes to the underlying data through this raw pointer -- it is still wrong to violate the uniqueness assumption of the &mut reference.

I have to disagree here. If a type needs to hold some invariant, it shouldn't allow arbitrary access to the underlying data. Take NonZeroUsize for example: it needs to ensure that the underlying data (usize) is never zero, so it can't hand out mutable references to the underlying data, as they might break the invariant. Compare that to, say, Wrapping, which doesn't have any such invariant: its underlying data is public.

Taking a step back, there are two options for how the Rust ecosystem as a whole could treat raw FDs:

1. "free-for-all", no ownership, doing arbitrary operations on arbitrary FDs is, in principle, allowed.

2. "exclusive", with ownership, one may only operate on FDs that one owns.

Both options are self-consistent and internally coherent. The standard library quite clearly took option 2; as a consequence, as_raw_fd is safe, from_raw_fd is unsafe, and RawFd is just a type alias. This is also the option Rust took for pointers/memory. (Here, option 1 is not really realistic as that would violate memory safety.) You seem to claim that what the standard library does is somehow incoherent; that claim is wrong. It is entirely coherent to say both "AsRawFd is safe" and "working with non-owned FDs is unsafe"; this is exactly what Rust does for pointers and it works equally well for FDs. The only incoherence we have here is parts of the ecosystem picking option 2 with others picking option 1.

In your example you're creating a RawFd from nothing: that should be unsafe, but isn't.

Here you are presuming option 1 -- but that is simply not the option the standard library took. Under option 2, there is nothing wrong with creating a RawFd being a safe operation. This is in analogy with how turning a usize to a *mut i32 is a safe operation.
So, it could be unsafe (if option 1 is what we'd use), but saying it "should" be unsafe is incorrect.

I see your point.

What do you (not just @RalfJung) suggest the new API would look like? I have some requirements:

  • No external dependencies.
  • Preferably the operation requires no unsafe for known valid fds, e.g. going from &TcpStream to SockRef should be safe (as it is with the current API).
  • Preferably crates that expose socket types, e.g. mio::net::TcpStream, don't need to have a (public) dependency on socket2.

@sunfishcode
Member

What do you (not just @RalfJung) suggest the new API would look like? I have some requirements:

  • No external dependencies.
  • Preferably the operation requires no unsafe for known valid fds, e.g. going from &TcpStream to SockRef should be safe (as it is with the current API).
  • Preferably crates that expose socket types, e.g. mio::net::TcpStream, don't need to have a (public) dependency on socket2.

To eliminate the need for a dependency, I've now added the OwnsRaw trait to my RFC draft. So if the RFC is accepted, and once we have OwnsRaw impls for mio::net::TcpStream, socket2::Socket, and other important types, the new API could look like this:

use std::io::OwnsRaw;
...
impl<'s, S> From<&'s S> for SockRef<'s>
where
    S: AsRawFd + OwnsRaw,

That would meet all three requirements. Does that look reasonable?

@Thomasdezeeuw
Collaborator

To eliminate the need for a dependency, I've now added the OwnsRaw trait to my RFC draft. So if the RFC is accepted, and once we have OwnsRaw impls for mio::net::TcpStream, socket2::Socket, and other important types, the new API could look like this:

use std::io::OwnsRaw;
...
impl<'s, S> From<&'s S> for SockRef<'s>
where
    S: AsRawFd + OwnsRaw,

That would meet all three requirements. Does that look reasonable?

I think so, after a quick look at the RFC (https://github.com/sunfishcode/rfcs/blob/3abe7cc704decf63beee16489123e0f2c996f6f3/text/0000-io-safety.md). Also a side note: you might want to include (parts of) @RalfJung's comment #218 (comment) in the RFC.

In the meantime, any suggestions? We could simply add a trait like OwnsRaw to socket2 as a stopgap, and optionally an unsafe version of From::from for types that don't implement OwnsRaw, forcing the caller to ensure the fd is valid.

sunfishcode added a commit to sunfishcode/rfcs that referenced this issue Apr 27, 2021
@sunfishcode
Member

I think so, after a quick look at the RFC (https://github.com/sunfishcode/rfcs/blob/3abe7cc704decf63beee16489123e0f2c996f6f3/text/0000-io-safety.md). Also a side note: you might want to include (parts of) @RalfJung's comment #218 (comment) in the RFC.

Cool, and good idea, I've now incorporated more of those ideas.

In the meantime, any suggestions? We could simply add a trait like OwnsRaw to socket2 as a stopgap, and optionally an unsafe version of From::from for types that don't implement OwnsRaw, forcing the caller to ensure the fd is valid.

You could do that if you want, however I also think it'd be ok to just wait to see how the RFC goes before making any changes here.

@RalfJung
Member

RalfJung commented May 1, 2021

I have to disagree here. If a type needs to hold some invariant, it shouldn't allow arbitrary access to the underlying data. Take NonZeroUsize for example: it needs to ensure that the underlying data (usize) is never zero, so it can't hand out mutable references to the underlying data, as they might break the invariant. Compare that to, say, Wrapping, which doesn't have any such invariant: its underlying data is public.

NonZeroUsize can be converted to usize in safe code. In other words, even types that need to hold invariants can safely and reasonably provide read-only access to their internal data. Even &mut T maintains some very important invariant on the underlying data, and yet that data can be safely inspected (by casting to usize). That is exactly what as_raw_fd does as well.

Note that the file descriptor in File is not public, and you cannot safely construct a File from the raw underlying data (unlike Wrapping). So I do not understand which analogy you are making here.

To eliminate the need for a dependency, I've now added the OwnsRaw trait to my RFC draft. So if the RFC is accepted, and once we have OwnsRaw impls for mio::net::TcpStream, socket2::Socket, and other important types, the new API could look like this:

I guess your proposal uses the 's lifetime of SockRef to ensure that the ref does not outlive the owned FD? Neat.


(Moved other things to the IRLO thread.)

@Thomasdezeeuw
Collaborator

NonZeroUsize can be converted to usize in safe code. In other words, even types that need to hold invariants can safely and reasonably provide read-only access to their internal data. Even &mut T maintains some very important invariant on the underlying data, and yet that data can be safely inspected (by casting to usize). That is exactly what as_raw_fd does as well.

What I meant was that NonZeroUsize doesn't hand out &mut usize, because it couldn't ensure the invariant. In my opinion AsRawFd is similar to handing out an OsLock<FileDescription>. Yes, access to Rust's memory isn't possible with just a RawFd, but the implementer should know that mutable access is possible to the kernel's file description (the data the kernel holds for each file/socket/etc.). So, what I tried to convey was that handing out a RawFd, even when the method is safe, does allow mutable access to the file description. And types that implement AsRawFd should be able to handle this.

Note that the file descriptor in File is not public, and you cannot safely construct a File from the raw underlying data (unlike Wrapping). So I do not understand which analogy you are making here.

I was referring to mutable access to Wrapping's internal data (e.g. usize), compared to NonZeroUsize. But let's leave this; me bringing this type up sidetracks us more than anything else.

To eliminate the need for a dependency, I've now added the OwnsRaw trait to my RFC draft. So if the RFC is accepted, and once we have OwnsRaw impls for mio::net::TcpStream, socket2::Socket, and other important types, the new API could look like this:

I guess your proposal uses the 's lifetime of SockRef to ensure that the ref does not outlive the owned FD? Neat.

Exactly, that's why it only implements From<&'s AsRawFd>, so we always ensure the owned socket outlives the SockRef.

(Moved other things to the IRLO thread.)

Could you link to that? I can't find it.

@RalfJung
Member

RalfJung commented May 1, 2021

In my opinion AsRawFd is similar to handing out an OsLock.

I think here we are getting to the bottom of the problem -- the ownership associated with the return value of AsRawFd is unspecified. This is the exact problem that C has with pointers (types don't say who owns what); Rust solved that with Box and reference types.

You interpret it to be actually returning some form of ownership (even though I am not sure what the OsLock is about). However, note that as_raw_fd takes a type with a lifetime, so surely your idealized return type must have a lifetime as well? That file descriptor is definitely not valid forever.

I interpret it to be like this:

type RawInt = usize;
pub fn as_raw_integer<T>(x: &mut T) -> RawInt { x as *mut _ as usize }

IOW, I do not think any ownership is returned here.

I do think that there are some indications in std that as_raw_fd does not return ownership. Given your idealized return type, you do seem to entertain the idea that FDs can be exclusively owned -- but clearly, that is in contradiction with having safe ways to access arbitrary usize file descriptors, i.e., it is in contradiction with the APIs currently offered by this crate. Or am I misunderstanding something?

That said, I also admit that the AsRawFd trait seems kind of useless in this world.

Could you link to that? I can't find it.

Sure, sorry: https://internals.rust-lang.org/t/pre-rfc-i-o-safety/14585

@Thomasdezeeuw
Collaborator

In my opinion AsRawFd is similar to handing out an OsLock.

I think here we are getting to the bottom of the problem -- the ownership associated with the return value of AsRawFd is unspecified. This is the exact problem that C has with pointers (types don't say who owns what); Rust solved that with Box and reference types.

You interpret it to be actually returning some form of ownership (even though I am not sure what the OsLock is about).

I interpret it as being not about ownership but about access. With OsLock I meant that the OS serialises access to the file description (avoiding data races inside the kernel etc.), so it's like handing out a lock. I should've been clearer about this.

However, note that as_raw_fd takes a type with a lifetime, so surely your idealized return type must have a lifetime as well? That file descriptor is definitely not valid forever.

Indeed, that's why SockRef has a lifetime. In a sense SockRef is already the idealized return type to me. It holds the raw fd, but has a lifetime to ensure the fd remains valid.

I interpret it to be like this:

type RawInt = usize;
pub fn as_raw_integer<T>(x: &mut T) -> RawInt { x as *mut _ as usize }

IOW, I do not think any ownership is returned here.

In a sense, but file descriptors are of course special. They aren't about the value itself, but about what gives you access to the file description. Returning a file descriptor is (to me at least) also implicitly returning access to the file description.

I do think that there are some indications in std that as_raw_fd does not return ownership.

I agree AsRawFd doesn't return ownership; for that we have IntoRawFd, which passes ownership.

Given your idealized return type, you do seem to entertain the idea that FDs can be exclusively owned

I indeed think file descriptors can be exclusively owned. AsRawFd (to me) allows the user to borrow the file descriptor, meaning the type doesn't have exclusive access anymore.

-- but clearly, that is in contradiction with having safe ways to access arbitrary usize file descriptors, i.e., it is in contradiction with the APIs currently offered by this crate. Or am I misunderstanding something?

The API in question, From<&T> where T: AsRawFd, was never about accessing arbitrary usize file descriptors; in fact quite the opposite. It was about borrowing, safely, a socket type without having to go through the unsafe hoop of AsRawFd/FromRawFd (as the From implementation did that for the user). So users can do SockRef::from(&TcpStream) without having to deal with unsafe or raw file descriptors.

An ideal trait, perhaps as a safer alternative to AsRawFd, would look something like the following:

struct BorrowedFd<'fd> {
    fd: RawFd,
    _lifetime: PhantomData<&'fd RawFd>, // Lifetime of the file descriptor we're borrowing from
}

trait AsBorrowedFd {
    // Perhaps some types need mutable access, so `&'fd mut self`.
    fn as_borrow_fd<'fd>(&'fd self) -> BorrowedFd<'fd>;
}
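A runnable variant of the sketch above, assuming a Unix target. The impl for `UnixStream` is an illustrative choice and not something proposed in the thread; the trait and struct are the hypothetical ones from the comment, with the imports they need.

```rust
use std::marker::PhantomData;
use std::os::unix::io::{AsRawFd, RawFd};
use std::os::unix::net::UnixStream;

/// A raw fd tied to the lifetime of the value it was borrowed from.
struct BorrowedFd<'fd> {
    fd: RawFd,
    _lifetime: PhantomData<&'fd RawFd>,
}

trait AsBorrowedFd {
    fn as_borrow_fd<'fd>(&'fd self) -> BorrowedFd<'fd>;
}

// Illustrative impl for one owning socket type.
impl AsBorrowedFd for UnixStream {
    fn as_borrow_fd<'fd>(&'fd self) -> BorrowedFd<'fd> {
        BorrowedFd {
            fd: self.as_raw_fd(),
            _lifetime: PhantomData,
        }
    }
}

fn main() -> std::io::Result<()> {
    let (stream, _peer) = UnixStream::pair()?;
    let borrowed = stream.as_borrow_fd();

    // drop(stream); // would not compile: `stream` is still borrowed below
    println!("borrowed fd {}", borrowed.fd);
    Ok(())
}
```

Because the returned value carries the borrow, the compiler rejects closing the socket while the borrowed descriptor is still in use.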

That said, I also admit that the AsRawFd trait seems kind of useless in this world.

Could you link to that? I can't find it.

Sure, sorry: https://internals.rust-lang.org/t/pre-rfc-i-o-safety/14585

@RalfJung
Member

RalfJung commented May 1, 2021

In a sense, but file descriptors are of course special. They aren't about the value itself, but about what gives you access to the file description. Returning a file descriptor is (to me at least) also implicitly returning access to the file description.

One could say the same thing about pointers. :) You don't care about the address itself, you care about the data the ptr gives you access to. FDs might be special but not more special than pointers.

In Rust, we all agree that returning a usize that represents the underlying address of some data in memory does not implicitly give you the right to do anything with that address. You need to carefully read the docs to figure out who is allowed to do what with which piece of memory and when. We use ownership to encode those things in the type system and have the compiler help us. I think using the same strategy and terminology makes a lot of sense for FDs, as well. They are just pointers into a different address space.
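The pointer side of this analogy can be sketched in a few lines. The function names are hypothetical; the point is only which step needs `unsafe`.

```rust
/// Safe: turning a reference into an address just produces an integer.
fn address_of(value: &i32) -> usize {
    value as *const i32 as usize
}

/// Unsafe: going from an address back to data requires the caller to
/// promise the address still points at a live, correctly typed `i32`.
unsafe fn read_at(addr: usize) -> i32 {
    unsafe { *(addr as *const i32) }
}

fn read_back() -> i32 {
    let x = 42;
    let addr = address_of(&x);
    // SAFETY: `x` is alive for the whole call and `addr` was derived from it.
    unsafe { read_at(addr) }
}
```

Holding the `usize` grants nothing by itself; the obligation sits on whoever writes the `unsafe` block. Under the I/O safety view, a `RawFd` would be treated the same way.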

I indeed think file descriptors can be exclusively owned. AsRawFd (to me) allows the user to borrow the file descriptor, meaning the type doesn't have exclusive access anymore.

The thing is, without the "I/O safety" proposal, FDs cannot be exclusively owned. You need to accept that proposal to even have the notion of "owning an FD" in the vocabulary of Rust type invariants. If anyone can just take the integer "5" and treat it as an FD and write to it, all in safe code -- that is a counterexample to ownership of FDs.

The API in question, From<&T> where T: AsRawFd, was never about accessing arbitrary usize file descriptors; in fact, quite the opposite. It was about safely borrowing a socket type without having to go through the unsafe hoop of AsRawFd/FromRawFd (as the From implementation did that for the user). So they can do SockRef::from(&TcpStream) without having to deal with unsafe or raw file descriptors.

You can also do SockRef::from(&(5 as RawFd)), though. So this story never quite worked, did it?
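A dependency-free sketch of why this works: the standard library implements `AsRawFd` for `RawFd` itself, so any API bounded only by `T: AsRawFd` accepts a bare integer. `accepts_any_fd` below is a hypothetical stand-in for such an API, not socket2's actual code.

```rust
use std::os::unix::io::{AsRawFd, RawFd};

/// Hypothetical stand-in for an API bounded only by `AsRawFd`.
fn accepts_any_fd<T: AsRawFd>(source: &T) -> RawFd {
    source.as_raw_fd()
}

fn forged() -> RawFd {
    // No `unsafe` anywhere: a made-up integer satisfies the bound because
    // `RawFd` implements `AsRawFd`.
    let made_up: RawFd = 5;
    accepts_any_fd(&made_up)
}
```

Nothing here checks that descriptor 5 is open, is a socket, or belongs to the caller.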

It looks like AsRawFd is simply the wrong trait for what you want to achieve. The name already says that this is about getting "raw" FD access -- the "raw" here is in direct analogy to "raw pointer". The fact that any integer can be safely turned into an impl AsRawFd is another giveaway. So since you have been using AsRawFd I interpreted this as a deliberate choice; now it looks to me like you did this for lack of a better alternative? (You could have created your own trait for this, but I guess that has non-obvious downsides as well and anyway now is not the time to rehash the design of this crate.^^) In that case I think we have agreement, even though we are not using all the same words for the same things. :)

AsBorrowedFd looks great, yes! Crucially, BorrowedFd borrows ownership of the underlying FD and thus this type must be unsafe-to-construct from a RawFd (but it can implement AsRawFd for the conversion the other way around). Please post it in that forum thread. I was imagining something like AsOwnedFd, which would be basically the same but for IntoRawFd.

@Thomasdezeeuw
Collaborator

One could say the same thing about pointers. :) You don't care about the address itself, you care about the data the ptr gives you access to. FDs might be special but not more special than pointers.

Indeed, both pointers and FDs are special cases of integers.

In Rust, we all agree that returning a usize that represents the underlying address of some data in memory does not implicitly give you the right to do anything with that address. You need to carefully read the docs to figure out who is allowed to do what with which piece of memory and when. We use ownership to encode those things in the type system and have the compiler help us. I think using the same strategy and terminology makes a lot of sense for FDs, as well. They are just pointers into a different address space.

I indeed think file descriptors can be exclusively owned. AsRawFd (to me) allows the user to borrow the file descriptor, meaning the type doesn't have exclusive access anymore.

The thing is, without the "I/O safety" proposal, FDs cannot be exclusively owned. You need to accept that proposal to even have the notion of "owning an FD" in the vocabulary of Rust type invariants. If anyone can just take the integer "5" and treat it as an FD and write to it, all in safe code -- that is a counterexample to ownership of FDs.

But we can't write to integer 5 without unsafe code (well, we can with the current SockRef::from, but we already established that is unsound and undesired); you need to call the unsafe libc::write function or similar. But I very much agree that Rust would benefit from ownership, lifetime and borrowing systems (or equivalents) for file descriptors.

The API in question, From<&T> where T: AsRawFd, was never about accessing arbitrary usize file descriptors; in fact, quite the opposite. It was about safely borrowing a socket type without having to go through the unsafe hoop of AsRawFd/FromRawFd (as the From implementation did that for the user). So they can do SockRef::from(&TcpStream) without having to deal with unsafe or raw file descriptors.

You can also do SockRef::from(&(5 as RawFd)), though. So this story never quite worked, did it?

It looks like AsRawFd is simply the wrong trait for what you want to achieve.

Agreed.

The name already says that this is about getting "raw" FD access -- the "raw" here is in direct analogy to "raw pointer". The fact that any integer can be safely turned into an impl AsRawFd is another giveaway. So since you have been using AsRawFd I interpreted this as a deliberate choice; now it looks to me like you did this for lack of a better alternative?

Yes.

(You could have created your own trait for this, but I guess that has non-obvious downsides as well and anyway now is not the time to rehash the design of this crate.^^) In that case I think we have agreement, even though we are not using all the same words for the same things. :)

Great.

AsBorrowedFd looks great, yes! Crucially, BorrowedFd borrows ownership of the underlying FD and thus this type must be unsafe-to-construct from a RawFd (but it can implement AsRawFd for the conversion the other way around). Please post it in that forum thread.

Done.

I was imagining something like AsOwnedFd, which would be basically the same but for IntoRawFd.

@Thomasdezeeuw
Collaborator

PR #237 adds an example and some more docs around the unsafe/unsound usage of SockRef::from as a stop-gap for v0.4. Hopefully we can figure out a proper fix for v0.5.

@notgull
Contributor

notgull commented Apr 28, 2022

Could we add an Into<Socket> bound to this, since the types that can be fed into Into<Socket> are well-known to be able to be converted into sockets?

@Thomasdezeeuw
Collaborator

Could we add an Into<Socket> bound to this, since the types that can be fed into Into<Socket> are well-known to be able to be converted into sockets?

@notgull Anything that implements From<T> for Socket implies Into<Socket> for T, see https://doc.rust-lang.org/std/convert/trait.From.html#generic-implementations.
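A small sketch of that blanket implementation at work. `Socket` and `MyListener` here are hypothetical stand-ins, not socket2's real types.

```rust
/// Hypothetical stand-ins for illustration only.
struct Socket {
    fd: i32,
}

struct MyListener {
    fd: i32,
}

// Only `From` is written by hand...
impl From<MyListener> for Socket {
    fn from(listener: MyListener) -> Self {
        Socket { fd: listener.fd }
    }
}

// ...yet an `Into<Socket>` bound is satisfied through the standard
// library's blanket `impl<T, U: From<T>> Into<U> for T`.
fn takes_into<S: Into<Socket>>(source: S) -> Socket {
    source.into()
}

fn converted_fd() -> i32 {
    takes_into(MyListener { fd: 7 }).fd
}

fn reflexive_fd() -> i32 {
    // `Socket` itself also qualifies, via the reflexive `impl<T> From<T> for T`.
    takes_into(Socket { fd: 3 }).fd
}
```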

@Thomasdezeeuw
Collaborator

Looks like I/O safety might be stabilised in 1.62: rust-lang/rust#95118.

@notgull
Contributor

notgull commented Aug 11, 2022

I/O Safety has been stabilized in 1.63.0. I would be willing to write a PR that changes this crate to use the new I/O-safe traits, if the MSRV bump is acceptable.
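For reference, a minimal sketch of what the stabilized traits look like in use (Unix only, Rust 1.63 or later). The helper functions are hypothetical; `AsFd`, `BorrowedFd` and `BorrowedFd::borrow_raw` are the standard library's.

```rust
use std::os::unix::io::{AsFd, AsRawFd, BorrowedFd, RawFd};
use std::os::unix::net::UnixStream;

/// A function bounded by `AsFd` can only be handed a type that holds an
/// open descriptor; a bare integer does not implement `AsFd`.
fn fd_number<T: AsFd>(source: &T) -> RawFd {
    let borrowed: BorrowedFd<'_> = source.as_fd();
    borrowed.as_raw_fd()
}

fn safe_path() -> std::io::Result<bool> {
    let (a, _b) = UnixStream::pair()?;
    Ok(fd_number(&a) == a.as_raw_fd())
}

fn unsafe_path() -> std::io::Result<bool> {
    let (a, _b) = UnixStream::pair()?;
    let raw = a.as_raw_fd();
    // Going from an integer to a `BorrowedFd` is where `unsafe` now lives.
    // SAFETY: `a` outlives `borrowed`, so `raw` stays open while it is used.
    let borrowed = unsafe { BorrowedFd::borrow_raw(raw) };
    Ok(borrowed.as_raw_fd() == raw)
}
```

With an `AsFd` bound, a call like `fd_number(&5)` is rejected at compile time, which is the property the `AsRawFd` bound could not provide.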

@Thomasdezeeuw
Collaborator

@notgull we're going to bump to 1.63 in v0.5 for that reason, see #320.

We do have to be careful about the change in API though; we'll want to give people a nice path to upgrade. So maybe we add the owned variants of the traits, leave the current (unsound) ones in place and deprecate them, then remove them in the next version.
