Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SocketAddr from UnixDatagram recv_from missing the last character (on OS X) #69061

Closed
zonyitoo opened this issue Feb 11, 2020 · 14 comments
Closed
Labels
C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@zonyitoo
Copy link

zonyitoo commented Feb 11, 2020

I tried this code:

use std::os::unix::net::UnixDatagram;

fn main() {
    let socket = UnixDatagram::bind("/tmp/shadowsocks-manager.sock").unwrap();

    let mut buf = [0u8; 65536];
    loop {
        let (n, src_addr) = socket.recv_from(&mut buf).unwrap();
        println!("size={} src={:?}", n, src_addr);
    }
}

Send request with a Python test script:

import socket

cli = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
cli.bind('/tmp/shadowsocks-manager-c.sock')
cli.connect('/tmp/shadowsocks-manager.sock')

cli.send(b'ping')
print(cli.recv(1506))

I expected to see this happen:

size=4 src="/tmp/shadowsocks-manager-c.sock" (pathname)

Instead, this happened:

size=4 src="/tmp/shadowsocks-manager-c.soc" (pathname)

Missing the last character.

Meta

rustc --version --verbose:

rustc 1.42.0-nightly (859764425 2020-01-07)
binary: rustc
commit-hash: 85976442558bf2d09cec3aa49c9c9ba86fb15c1f
commit-date: 2020-01-07
host: x86_64-apple-darwin
release: 1.42.0-nightly
LLVM version: 9.0

Problem

On OS X, sockaddr_un is defined as

/*
 * [XSI] Definitions for UNIX IPC domain.
 */
struct  sockaddr_un {
	unsigned char   sun_len;        /* sockaddr len including null */
	sa_family_t     sun_family;     /* [XSI] AF_UNIX */
	char            sun_path[104];  /* [XSI] path name (gag) */
};

which has one more byte sun_len before sun_path than Linux's definition:

#define	__SOCKADDR_COMMON(sa_prefix) \
  sa_family_t sa_prefix##family

/* Structure describing the address of an AF_LOCAL (aka AF_UNIX) socket.  */
struct sockaddr_un
  {
    __SOCKADDR_COMMON (sun_);
    char sun_path[108];		/* Path name.  */
  };

This issue also exists in Tokio (Mio), too. tokio-rs/tokio#2230

@zonyitoo zonyitoo added the C-bug Category: This is a bug. label Feb 11, 2020
@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Feb 11, 2020
@sfackler
Copy link
Member

Wow, can't believe this made it 3.5 years without being noticed!

@zonyitoo
Copy link
Author

zonyitoo commented Feb 11, 2020

@sfackler
Copy link
Member

I can reproduce this, but weirdly not when sending from Rust code, just Python.

@sfackler
Copy link
Member

The problem appears to be that Python sets the socklen_t to just the size up to the last byte of the addr not including the null terminator, and that number gets sent to recvfrom on OSX. I guess we should check if the last byte is a null before ignoring it.

@zonyitoo
Copy link
Author

Hmm, so it is Python's problem? Because in C, socklen_t in sendto can be calculated by SUN_LEN macro.

@zonyitoo
Copy link
Author

Test with C client:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <string.h>
#include <stddef.h>

int main(int argc, char *argv[])
{
    int ret = 0;

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
    {
        perror("socket");
        return EXIT_FAILURE;
    }

    struct sockaddr_un baddr;
    memset(&baddr, 0, sizeof(baddr));

    const char *BIND_PATH = "/tmp/shadowsocks-manager-c.sock";
    baddr.sun_family = AF_UNIX;
    strcpy(baddr.sun_path, BIND_PATH);

    unlink(BIND_PATH);

    if ((ret = bind(fd, (struct sockaddr *)&baddr, sizeof(baddr))) < 0)
    {
        perror("bind");
        close(fd);
        return EXIT_FAILURE;
    }

    const char buf[] = "ping";

    struct sockaddr_un caddr;
    memset(&caddr, 0, sizeof(caddr));
    caddr.sun_family = AF_UNIX;

    const char *SVR_PATH = "/tmp/shadowsocks-manager.sock";
    strcpy(caddr.sun_path, SVR_PATH);

    caddr.sun_len = SUN_LEN(&caddr);

    socklen_t caddr_len = offsetof(struct sockaddr_un, sun_path) + strlen(caddr.sun_path) + 1;

    ssize_t rcount = sendto(fd, buf, sizeof(buf), 0, (struct sockaddr *)&caddr, caddr_len);
    if (rcount < 0)
    {
        perror("sendto");
        close(fd);
        return EXIT_FAILURE;
    }

    return 0;
}

And Rust server output:

size=5 src="/tmp/shadowsocks-manager-c.sock\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}" (pathname)

@zonyitoo
Copy link
Author

zonyitoo commented Feb 12, 2020

socklen_t caddr_len = offsetof(struct sockaddr_un, sun_path) + strlen(caddr.sun_path) + 1;

Could be replaced by

socklen_t caddr_len = SUN_LEN(&caddr);

or

socklen_t caddr_len = caddr.sun_len;

They have the same value.

@tormol
Copy link
Contributor

tormol commented Feb 16, 2020

This is also an issue on FreeBSD and NetBSD. (See the first ~15 lines of the output of this test program ran on FreeBSD, macOS and Linux.)

Both Linux and FreeBSD supports socket paths of length ..=SUN_LEN, while macOS and NetBSD apparently supports even longer pathnames.

On Linux .as_pathname() handles paths of length SUN_LEN fine, but send_to() and similar rejects them.

I think that std should support ..=SUN_LEN whenever the OS supports it.
if the OS supports even longer addrs .as_pathname() should maybe return None when the last byte is not NUL though, as there is no way to know whether it was truncated or fit exactly.

@sfackler
Copy link
Member

I think the relevant bit of unix(7) may be this:

For example, some (but not all) implementations append a null terminator if none is present in the supplied sun_path.

So I think the correct behavior is to only chop off the last byte of sun_path if it is actually a zero byte.

@zonyitoo
Copy link
Author

It have been quite a long time, what's the status of this issue? @sfackler

@zonyitoo
Copy link
Author

zonyitoo commented Jun 13, 2021

I propose 2 simple fixes to

fn address(&self) -> AddressKind<'_> {
let len = self.len as usize - sun_path_offset(&self.addr);
let path = unsafe { mem::transmute::<&[libc::c_char], &[u8]>(&self.addr.sun_path) };
// macOS seems to return a len of 16 and a zeroed sun_path for unnamed addresses
if len == 0
|| (cfg!(not(any(target_os = "linux", target_os = "android")))
&& self.addr.sun_path[0] == 0)
{
AddressKind::Unnamed
} else if self.addr.sun_path[0] == 0 {
AddressKind::Abstract(&path[1..len])
} else {
AddressKind::Pathname(OsStr::from_bytes(&path[..len - 1]).as_ref())
}
}

  1. (recommend) Uses the sun_len in sockaddr_un for BSD-like systems
fn address(&self) -> AddressKind<'_> {
    let len = self.len as usize - sun_path_offset(&self.addr);
    let path = unsafe { mem::transmute::<&[libc::c_char], &[u8]>(&self.addr.sun_path) };

    // macOS seems to return a len of 16 and a zeroed sun_path for unnamed addresses
    if len == 0
        || (cfg!(not(any(target_os = "linux", target_os = "android")))
            && self.addr.sun_path[0] == 0)
    {
        AddressKind::Unnamed
    } else if self.addr.sun_path[0] == 0 {
        AddressKind::Abstract(&path[1..len])
    } else {
        cfg_if! {
            if #[cfg(any(target_os = "macos",
                         target_os = "ios",
                         target_os = "freebsd",
                         target_os = "dragonfly",
                         target_os = "openbsd",
                         target_os = "netbsd"))] {
                // BSD-like systems may not includes the last '\0' in length,
                // because they have a sun_len as the length of sun_path
                let sun_len = self.addr.sun_len;
            } else {
                // Trim the last '\0'
                let mut sun_len = len;
                while sun_len > 0 && path[sun_len] == 0 {
                    sun_len -= 1;
                }
            }
        }

        AddressKind::Pathname(OsStr::from_bytes(&path[..sun_len]).as_ref())
    }
}
  1. Simply trim the tailing \0s in path instead:
// Trim the last '\0'
let mut sun_len = len;
while sun_len > 0 && path[sun_len] == 0 {
    sun_len -= 1;
}

I personally prefer the first solution. What do you think? I could make a PR for this.

@tormol
Copy link
Contributor

tormol commented Jun 13, 2021

If sun_len contains the original length of a truncated address, then using it allows distinguishing between a truncated pathname and one that is exactly SUN_LEN bytes.

@zonyitoo
Copy link
Author

So @tormol prefers the 1st solution?

JohnTitor added a commit to JohnTitor/rust that referenced this issue Jun 21, 2021
SocketAddr of Unix Sockets use sun_len as path's length in BSD-like OSes

- fixes rust-lang#69061
@zonyitoo
Copy link
Author

Made a PR to cpython instead: python/cpython#26866

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants