Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect handling of address argument in recvfrom() #539

Open
Philonous opened this issue Mar 7, 2023 · 10 comments
Open

Incorrect handling of address argument in recvfrom() #539

Philonous opened this issue Mar 7, 2023 · 10 comments
Assignees

Comments

@Philonous
Copy link

Context:

recvfrom() stores the source address of a received package in the sockaddr structure pointed to by the address pointer and the length of the address is stored in the memory pointed to by the address_len argument.

Problem description:

However, not all protocol provide source addresses, e.g. AF_UNIX does not.
In this case, the contents of the address parameter is "unspecified" [1] and should not be inspected. I have not found it in the specification, but in practice I have observed that at least on Linux address_len is set to 0

recvBufFrom does not check address_len at all and instead tries to parse the contents of address directly [2]. This leads to an error because it interprets the zeroed memory as AF_UNSPEC which is not supported. The error is ignored and getPeerName is called, which also fails.

The solution is to peek ptr_len and check if it is 0, and if so to return an "unset" address. I'm not sure how best to represent such an address, perhaps SockAddrUnix [] would do the trick?

Compare also how Rust handles the situation [3]

I have included a minimal code example that reproduced the problem [4]

@kazu-yamamoto kazu-yamamoto self-assigned this Mar 7, 2023
@kazu-yamamoto
Copy link
Collaborator

@Philonous As you think, SockAddrUnix "" would be the best workaround.
Would you kindly send a PR?

@kazu-yamamoto
Copy link
Collaborator

@Philonous Gentle ping.

@Philonous
Copy link
Author

Philonous commented Mar 21, 2023

I'm working on it. This isn't as obvious as I first thought.

  • Since 3.0 Socket doesn't container the address family any more, so I can't check it to return the correct address type when addrlen is 0.
  • I'm thinking that just returning SockAddrUnix "" whenever addrlen is 0 regardless of the socket type might work, it's at least not worse than the situation before.

However, there's another problem:

str <- peekCAString ((#ptr struct sockaddr_un, sun_path) p)

is incorrect:

  • POSIX doesn't specify that sun_path is null-terminated (though at least on Linux and *BSD apparently it is)
  • On Linux, the terminating null byte can sometimes be truncated [1] even for correctly sized buffers (sizeof(sockaddr_un))
    • This doesn't affect the existing code because it allocates and zeroes sizeof(sockaddr_storage) = 128 bytes[2][3], so there's always a null byte after sun_path. But SocketAddress' is publicly exported and can be implemented for a new type, e.g. one that passes a buffer of sizesizeof(sockaddr_un) bytes, which should work.
  • Abstract paths can contain null characters, so the only way to handle them properly is to know the length of sun_path
    • This is handled correctly in other places[4], so it's surprising that peekSocketAddress completely fails

As far as I can see, the interface peekSocketAddress :: Ptr sa -> IO sa can't work, because it needs addrlen to correctly decode sockaddr_un. It might also be a good idea to handle the case where addlen = 0 there.

However, there is one
case where confusing behavior can result: if 108 non-null bytes
are supplied when a socket is bound, then the addition of the
null terminator takes the length of the pathname beyond
sizeof(sun_path). Consequently, when retrieving the socket
address (for example, via accept(2)), if the input addrlen
argument for the retrieving call is specified as sizeof(struct
sockaddr_un), then the returned address structure won't have a
null terminator in sun_path.

@kazu-yamamoto
Copy link
Collaborator

Since 3.0 Socket doesn't container the address family any more, so I can't check it to return the correct address type when addrlen is 0.

Why don't you use getSocketName to get its SocketAddress type?

@kazu-yamamoto
Copy link
Collaborator

As far as I can see, the interface peekSocketAddress :: Ptr sa -> IO sa can't work, because it needs addrlen to correctly decode sockaddr_un. It might also be a good idea to handle the case where addlen = 0 there.

Can't we use sizeOfSocketAddress or sizeOfSockAddr to get addrlen?

@Philonous
Copy link
Author

Why don't you use getSocketName to get its SocketAddress type?

  • AFAIK Windows doesn't allow getsocketname() on unbound sockets. [1] Windows doesn't have AF_UNIX either, so the code could be conditionally included.
  • POSIX explicitly mentions that getsocketname() on unbound sockets returns unspecified data [2]. Although on Linux it seems to at least return the correct address family.

Can't we use sizeOfSocketAddress or sizeOfSockAddr to get addrlen?

We already have addrlen, recvfrom() etc. are passing it back to us. The problem is that we need it inside peekSocketAddress to correctly decode sockaddr_un.
sizeOfSocketAddress needs an already decoded address, so we can't use it inside peekSocketAddress.

WSAEINVAL | The socket has not been bound to an address with bind, or ADDR_ANY is specified in bind but connection has not yet occurred.

If the socket has not been bound to a local name, the value stored in the object pointed to by address is unspecified.

@kazu-yamamoto
Copy link
Collaborator

Let's introduce a breaking change:

peekSocketAddress :: Ptr sa -> Int -> IO sa

And let's release it with a new major version.

@Philonous
Copy link
Author

I've been working on this, but I've been stuck on

`catchIOError` \_ -> getPeerName s

I think the idea here is that for connected sockets, recvfrom() and recvmsg() don't return the remote address, so the getpeername() is there for convenience.

This assumes that recvfrom() will only not return an address when the socket is connected, which is false for AF_UNIX (this is what this ticket is about in the first place). In general, this would be false whenever we receive a packet from an unnamed socket. So I've been trying to figure out if this can happen besides with AF_UNIX. It could in theory happen with socketpair(), but none of the platforms I checked (Linux, FreeBSD, OpenBSD, NetBSD, Solaris) support socketpair() with anything but AF_UNIX, windows doesn't have socketpair().

Next I tried to figure out if getsockname() is guaranteed to return sensible data even for unbound sockets at least on *nix, since POSIX specifically states that it is "unspecified". On Windows it seems to be forbidden anyway, so it would have to be #ifdef-ed out. Linux just returns an empty address, which is fine, FreeBSD returns the correct address family and a bunch of garbage, but it looked like it should at least be safe to read, but I'm not entirely certain. I did not check on the other platforms for lack of time and easy access

I think this is brittle. A slightly better idea (in my opinion) might be to store the address family in the Socket value like before 3.0.

But I think the morally correct approach would be to change recvFrom's etc type signature from

recvFrom :: SocketAddress sa => Socket -> Int -> IO (ByteString, sa)

to

recvFrom :: SocketAddress sa => Socket -> Int -> IO (ByteString, Maybe sa)

and remove the call to getpeername().

It would remove the possibility of failure and more closely resemble the the underlying API.
Unfortunately, it would be a breaking change and presumably affect a lot of existing code.

@kazu-yamamoto
Copy link
Collaborator

I would suggest to keep recvFrom as is and provide a new safeRecvFrom which satisfies you.

@kazu-yamamoto kazu-yamamoto mentioned this issue May 30, 2023
5 tasks
@kazu-yamamoto
Copy link
Collaborator

We started a project for version 3.2.
Breaking changes are welcome now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants