Accepting Incoming TCP Connections fails on Android #82400

Closed
wngr opened this issue Feb 22, 2021 · 18 comments · Fixed by #82731
Labels
O-android Operating system: Android regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Milestone

Comments

@wngr

wngr commented Feb 22, 2021

Starting with Android Oreo (8), Android uses a seccomp-based filter that allows only an explicit list of syscalls; see https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html.
https://android.googlesource.com/platform/bionic.git/+/master/libc/SYSCALLS.TXT enumerates the allowed syscalls.
This means that dispatching the call through the generic syscall mechanism, as introduced in PR #78572, results in a panic.

On top of that, I found that older versions of Android, such as Android 6, will return Function not implemented (os error 38) for this syscall.
My tests showed that this only happens on x86, although I can't explain why.

As there's no way to detect the target Android API level, the safest way would probably be to always use the accept syscall and remove the special handling of the accept4 syscall. This could also be guarded for x86 only, but I would be happy to get an explanation of why it only affects that architecture (see tokio-rs/mio#1446).

I have tested this with both real devices as well as Android emulators.

Code

I tried this code:

use std::net::TcpListener;

fn main() {
    // Bind on all interfaces, then block until one client connects.
    let listener = TcpListener::bind("0.0.0.0:8080").unwrap();
    match listener.accept() {
        Ok((_socket, addr)) => println!("new client: {:?}", addr),
        Err(e) => println!("couldn't get client: {:?}", e),
    }
}

and poked it with telnet <android_host> 8080.
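To exercise the same `accept` path without telnet, here is a minimal sketch that connects to the listener from a second thread over loopback. The helper name `accept_one_loopback` and the use of an ephemeral port are assumptions made for illustration; they are not part of the original report.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper: bind to an ephemeral loopback port, connect to it
/// from a second thread, and return the peer address reported by `accept`.
fn accept_one_loopback() -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let local = listener.local_addr()?;

    // The client connects and reports the address of its own end of the socket.
    let client = thread::spawn(move || -> io::Result<SocketAddr> {
        let stream = TcpStream::connect(local)?;
        stream.local_addr()
    });

    // This is the call that failed on x86 Android with Rust 1.49.
    let (_socket, peer) = listener.accept()?;

    let client_addr = client.join().expect("client thread panicked")?;
    // The peer seen by the listener should be the client's local address.
    assert_eq!(peer, client_addr);
    Ok(peer)
}
```

Calling `accept_one_loopback()` from `main` and printing the result should give a loopback address on a working target. On an affected target the `?` after `accept` would surface the error, or the process would be killed by seccomp before returning.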

I expected to see this happen: accept returns Ok(_)

Instead, this happened:

  • Android >= 8.0: panic because of seccomp
02-22 13:14:23.287  6015  6041 F my.app.DEBUG: Build fingerprint: 'Android/sdk_phone_x86/generic_x86:10/QPP6.190730.005.B1/5775370:userdebug/test-keys'
02-22 13:14:23.287  6015  6041 F my.app.DEBUG: Revision: '0'
02-22 13:14:23.287  6015  6041 F my.app.DEBUG: ABI: 'x86'
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: Timestamp: 2021-02-22 13:14:23+0000
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: pid: 6015, tid: 6057, name: tokio-runtime-w  >>> my.app:background_services <<<
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: uid: 10103
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: Cause: seccomp prevented call to disallowed x86 system call 364
02-22 13:14:23.289  6015  6041 F my.app.DEBUG: Abort message: 'Fatal signal 31 (SIGSYS), code 1 (SYS_SECCOMP) in tid 4784 (tokio-runtime-w), pid 4735 (ground_services)'
02-22 13:14:23.289  6015  6041 F my.app.DEBUG: eax 0000016c  ebx 0000003f  ecx bfccfdb0  edx bfccfd6c
02-22 13:14:23.289  6015  6041 F my.app.DEBUG: edi eae65c34  esi 00080000
02-22 13:14:23.289  6015  6041 F my.app.DEBUG: ebp bfccfd88  esp bfccfd18  eip ee70cad9


  • Android < 8.0: strace output
[pid 10918] syscall_364(0x34, 0x9d5c9cf8, 0x9d5c9ca0, 0x80800, 0x9fda9dc8, 0x9fda9dc8 <unfinished ...>
[pid 10918] <... syscall_364 resumed> ) = -1 (errno 38)

Which translates to Function not implemented (ENOSYS).
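The two failure modes above differ in what a program can do about them. As a sketch, assuming a hypothetical helper `or_fallback_on_enosys`: errno 38 comes back as an ordinary `io::Error` and can be matched on, so a fallback path is possible on the older Android versions. Under seccomp the process is killed by SIGSYS before any error is returned, so this pattern would not help on Android >= 8.0.

```rust
use std::io;

/// ENOSYS on Linux and Android: "Function not implemented".
const ENOSYS: i32 = 38;

/// Hypothetical helper: run `primary`, and only if it fails with ENOSYS
/// run `fallback` instead. Any other result is passed through unchanged.
fn or_fallback_on_enosys<T>(
    primary: impl FnOnce() -> io::Result<T>,
    fallback: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    match primary() {
        Err(e) if e.raw_os_error() == Some(ENOSYS) => fallback(),
        other => other,
    }
}
```

In the context of this issue, `primary` would stand for the `accept4` call and `fallback` for a plain `accept`.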

Version it worked on

It most recently worked on: Rust 1.48

Version with regression

rustc --version --verbose:

 rustc --version --verbose
rustc 1.49.0 (e1884a8e3 2020-12-29)
binary: rustc
commit-hash: e1884a8e3c3e813aada8254edfa120e85bf5ffca
commit-date: 2020-12-29
host: x86_64-unknown-linux-gnu
release: 1.49.0


@estebank estebank added E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc O-android Operating system: Android regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 22, 2021
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 22, 2021
@estebank estebank added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Feb 22, 2021
@nagisa
Member

nagisa commented Feb 22, 2021

Is the x86 thing an emulator? From the strace output it would seem that an entirely wrong syscall is being invoked.

Can this issue reproduce if you call accept4 from C? If so, consider reporting a bug against Android as well.

@wngr
Author

wngr commented Feb 22, 2021

Is the x86 thing an emulator?

Happens both on an Emulator and a real x86 device (Zebra ET50).

I will try to whip up a minimal reproducer using the accept4 API.

Edit:

use std::mem;
use std::net::SocketAddr;

fn main() {
    unsafe {
        let bind_to: SocketAddr = "0.0.0.0:8080".parse().unwrap();
        println!("Trying to bind to {}", bind_to);
        let sock_addr =
            nix::sys::socket::SockAddr::new_inet(nix::sys::socket::InetAddr::from_std(&bind_to));

        let fd = libc::socket(libc::AF_INET, libc::SOCK_STREAM | libc::SOCK_CLOEXEC, 0);
        let (addrp, len) = sock_addr.as_ffi_pair();
        assert_eq!(libc::bind(fd, addrp, len as _), 0);
        assert_eq!(libc::listen(fd, 128), 0);

        let mut storage: libc::sockaddr_storage = mem::zeroed();
        let mut len = mem::size_of_val(&storage) as libc::socklen_t;

        if libc::accept4(
            fd,
            &mut storage as *mut _ as *mut _,
            &mut len,
            libc::SOCK_CLOEXEC,
        ) < 0
        {
            println!("error on accept: {}", nix::errno::errno())
        } else {
            println!("read {} from socket", len);
        }
    }
}

This works fine on my Linux machine, but running it on an Android API 23 x86 emulator yields:

root@generic_x86:/data/local # strace ./android-socket 
[..]
write(1, "Trying to bind to 0.0.0.0:8080\n", 31Trying to bind to 0.0.0.0:8080
) = 31
socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3, 128)                          = 0
syscall_364(0x3, 0xbf85b6c0, 0xbf85b698, 0x80000, 0x80, 0x10) = -1 (errno 38)
write(1, "error on accept: 38\n", 20error on accept: 38
)   = 20
futex(0xb771722c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ)   = 0
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
futex(0xb770f544, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ)   = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ)   = 0
munmap(0xb771a000, 4096)                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

This also fails when using the generic syscall with errno 38:

        libc::syscall(
            libc::SYS_accept4,
            fd,
            &mut storage as *mut _ as *mut _,
            &mut len,
            libc::SOCK_CLOEXEC,
         )

Seems my guess on why that started happening is wrong; now I wonder why that ever worked .. 😕

Just for reference:

cross build --target i686-linux-android --release && adb push target/i686-linux-android/release/android-socket /data/local

@apiraino
Contributor

@de-vri-es and @rustbot ping libs

any insights to share about this?

@rustbot
Collaborator

rustbot commented Feb 24, 2021

Error: This team (libs) cannot be pinged via this command; it may need to be added to triagebot.toml on the master branch.

Please let @rust-lang/release know if you're having trouble with this bot.

@de-vri-es
Contributor

de-vri-es commented Feb 24, 2021

Hmm, maybe this has something to do with it: https://android.googlesource.com/platform/bionic.git/+/master/libc/SYSCALLS.TXT#264

# sockets for x86. These are done as an "indexed" call to socketcall syscall.
<snip>
int           __accept4:socketcall:18(int, struct sockaddr*, socklen_t*, int)  x86
<snip>

I've never heard of the "socketcall syscall" before, but apparently, on x86 Linux (before 4.3), you have to call socketcall(SYS_ACCEPT4, &args).

A bit annoying, because there is no documentation on what args should be. It "points to a block containing the actual arguments". The Linux man page specifically says this:

User programs should call the appropriate functions by their usual names. Only standard library implementors and kernel hackers need to know about socketcall().

Linux 4.3 also added the regular syscalls for x86, but that's not something we can rely on for the android target :(

So I think there are two options:

  1. Drop accept4 on Android and accept the race condition on the CLOEXEC flag.
  2. Figure out how to use socketcall correctly for x86 Android in libc.
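Option 1 could look roughly like the sketch below: call plain `accept`, then set `FD_CLOEXEC` with `fcntl`. The helper name `accept_cloexec_fallback` is hypothetical, the bindings are declared by hand to avoid a crate dependency (the `libc` crate provides equivalents), and the constant values are the Linux ones. This is an illustration of the idea, not the code that std or libc ended up using.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};
use std::os::raw::{c_int, c_void};
use std::os::unix::io::{AsRawFd, FromRawFd};

// Hand-written bindings (the `libc` crate offers the same functions).
// `unsafe extern` blocks need Rust 1.82 or newer.
unsafe extern "C" {
    fn accept(fd: c_int, addr: *mut c_void, len: *mut u32) -> c_int;
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
    fn close(fd: c_int) -> c_int;
}

// Linux values, assumed here rather than taken from a header.
const F_GETFD: c_int = 1;
const F_SETFD: c_int = 2;
const FD_CLOEXEC: c_int = 1;

/// Hypothetical helper: accept a connection without `accept4`, then mark
/// the new descriptor close-on-exec in a second step.
fn accept_cloexec_fallback(listener: &TcpListener) -> io::Result<TcpStream> {
    unsafe {
        // Null address pointers: the peer address is not needed here.
        let fd = accept(listener.as_raw_fd(), std::ptr::null_mut(), std::ptr::null_mut());
        if fd < 0 {
            return Err(io::Error::last_os_error());
        }
        // Race window: a fork + exec on another thread between `accept`
        // and `fcntl` would leak `fd` into the child process.
        let flags = fcntl(fd, F_GETFD);
        if flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 {
            let err = io::Error::last_os_error();
            close(fd);
            return Err(err);
        }
        Ok(TcpStream::from_raw_fd(fd))
    }
}
```

The comment inside the function marks the race that option 1 accepts: `accept4` with `SOCK_CLOEXEC` sets the flag atomically, while this two-step version cannot.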

@de-vri-es
Contributor

Linux 4.3 also added the regular syscalls for x86, but that's not something we can rely on for the android target :(

Note: although the same weird syscall is required for Linux < 4.3 (which is still a tier 1 platform), it shouldn't affect the Linux target, because there Rust's libc crate just calls accept4 from the real libc.

@joshtriplett
Member

It would be nice if Android's seccomp filter allowed the direct syscalls on x86 too, not just on arm; that'd be worth trying to get added to Android.

But in the meantime, it looks like Android targets on 32-bit x86 will need to invoke socketcall for all socket syscalls, rather than making the syscalls directly.

@de-vri-es
Contributor

de-vri-es commented Feb 24, 2021

It would be nice if Android's seccomp filter allowed the direct syscalls on x86 too, not just on arm; that'd be worth trying to get added to Android.

It may not be the seccomp filter though. Older x86 kernels really do not have the accept4 syscall.

Anyway, I'm working on a PR for libc to switch to socketcall on x86 android.

@de-vri-es
Contributor

PR opened: rust-lang/libc#2079

@apiraino
Contributor

thank you @de-vri-es and @joshtriplett

@apiraino
Contributor

@rustbot label -I-prioritize -E-needs-bisection

@rustbot rustbot removed E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Feb 24, 2021
@de-vri-es
Contributor

de-vri-es commented Feb 24, 2021

Note: back when accept4 was used on more platforms, std couldn't update to the latest libc. So it currently has the same workaround. I submitted #82473 to switch to libc::accept4 so std would also receive the fix (after a libc bump).

bors added a commit to rust-lang/libc that referenced this issue Feb 25, 2021
Implement accept4 on x86 android with `socketcall` syscall.

Linux x86 kernels before 4.3 only support the `socketcall` syscall rather than individual syscalls for socket operations. Since `libc` does a raw syscall for `accept4` on Android, it doesn't work on x86 systems.

This PR instead implements `accept4` for x86 android using `socketcall`. The value for `SYS_ACCEPT4` (in contrast to `SYS_accept4` 👀) is taken from the `linux/net.h` header.

Also note that the `socketcall` syscall takes all arguments as an array of long ints. I've double-checked against `glibc` to see how it passes arguments, since the Linux man page only says this: "args points to a block containing the actual arguments" and "only standard library implementors and kernel hackers need to know about socketcall()".

This should fix rust-lang/rust#82400
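To illustrate the "array of long ints" convention described in this commit message, here is a minimal sketch of how the four `accept4` arguments might be packed for `socketcall`. The helper `pack_accept4_args` is hypothetical and is not the code from the libc PR; the value 18 is the index shown in the bionic SYSCALLS.TXT entry quoted earlier in this thread. The syscall itself is deliberately not issued, since `socketcall` only exists on some architectures such as 32-bit x86.

```rust
use std::os::raw::{c_int, c_long, c_void};

/// Index of accept4 inside the socketcall multiplexer
/// (`SYS_ACCEPT4` in linux/net.h, not to be confused with `SYS_accept4`).
const SYS_ACCEPT4: c_long = 18;

/// Hypothetical helper: widen each accept4 argument to a long and lay the
/// four values out contiguously, in declaration order.
fn pack_accept4_args(
    fd: c_int,
    addr: *mut c_void,
    addrlen: *mut u32,
    flags: c_int,
) -> [c_long; 4] {
    [fd as c_long, addr as c_long, addrlen as c_long, flags as c_long]
}

// On 32-bit x86 a pointer to this block would then be handed to the kernel,
// along the lines of `syscall(SYS_socketcall, SYS_ACCEPT4, args.as_mut_ptr())`.
// That step is an assumption about the call shape and is omitted here.
```

With `fd = 3` and `flags = 0x80000` (the `SOCK_CLOEXEC` value visible in the strace output earlier in the thread), the first and last slots hold 3 and 0x80000, and the two middle slots hold the pointer values.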
@de-vri-es
Contributor

This isn't actually fixed yet, my bad for accidentally linking the issue. Might be good to re-open until it really is solved in std too.

@apiraino
Contributor

@de-vri-es no worries, I'll reopen it


@JohnTitor JohnTitor reopened this Feb 28, 2021
@Mark-Simulacrum Mark-Simulacrum added this to the 1.49.0 milestone Mar 1, 2021
@de-vri-es
Contributor

Since the libc PR is merged, and std now uses libc::accept4, all that is left to fix this on the main branch is to have a libc version bump (and a matching bump of the libc dependency for std).

Is that going to be on time for 1.49.0, or should someone be pinged to ask for a libc release?

bors added a commit to rust-lang/libc that referenced this issue Mar 2, 2021
Bump up libc version to 0.2.87

r? `@ghost`
In order to unblock rust-lang/rust#82400.
This also closes #2065.
@JohnTitor
Member

@de-vri-es Released libc 0.2.87, you can now update the libc dependency :)

@de-vri-es
Contributor

Thanks! Opened #82731 :)

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 5, 2021
…=joshtriplett

[beta] Fix TcpListener::accept() on x86 Android on beta by disabling the use of accept4.

This is the same as rust-lang#82475, but for beta.

In a nutshell: `TcpListener::accept` is broken on Android x86 on stable and beta because it performs a raw `accept4` syscall, which doesn't exist on that platform. This was originally reported in rust-lang#82400, so you can find more details there.

`@rustbot` label +O-android
r? `@Mark-Simulacrum`
m-ou-se added a commit to m-ou-se/rust that referenced this issue Mar 9, 2021
…-Simulacrum

Bump libc dependency of std to 0.2.88.

This PR bumps the `libc` dependency of `std` to 0.2.88. This will fix `TcpListener::accept` for Android on x86 platforms (rust-lang/libc@31a2777).

This will really finally fix rust-lang#82400 for the main branch :)

r? `@JohnTitor`
@bors bors closed this as completed in ba63a84 Mar 9, 2021
aruediger pushed a commit to Actyx/wsrpc that referenced this issue Mar 12, 2021
With Rust 1.49.0, accepting incoming connections on tcp sockets failed
in different ways:
Starting with Android Oreo (8), Android started using a seccomp based
filter approach to syscalls, explicitly allowing syscalls, see
https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html.
https://android.googlesource.com/platform/bionic.git/+/master/libc/SYSCALLS.TXT
enumerates the allowed syscalls.
rust-lang/rust#78572 refactored the way
`std::net::TcpListener` accepts incoming connections on tcp sockets.
With the seccomp profile above, doing a generic syscall will result in a
panic:
```
[..]
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: signal 31 (SIGSYS), code
1 (SYS_SECCOMP), fault addr --------
02-22 13:14:23.288  6015  6041 F my.app.DEBUG: Cause: seccomp prevented
call to disallowed x86 system call 364
02-22 13:14:23.289  6015  6041 F my.app.DEBUG: Abort message: 'Fatal
signal 31 (SIGSYS), code 1 (SYS_SECCOMP) in tid 4784 (tokio-runtime-w),
pid 4735 (ground_services)'
```

On top of that, I found that older versions of Android, such as Android
6 (our Zebra ET50), will return Function not implemented (os error 38)
for this syscall.  My tests showed that this only happens on x86,
although I can't explain why. Relevant strace:
```
[pid 10918] syscall_364(0x34, 0x9d5c9cf8, 0x9d5c9ca0, 0x80800,
0x9fda9dc8, 0x9fda9dc8 <unfinished ...>
[pid 10918] <... syscall_364 resumed> ) = -1 (errno 38)
```

I have tested this with both real devices as well as Android emulators.

We have been using the `async-io` based `libp2p::tcp::TcpConfig` so far,
which used `std::net::TcpListener` under the hood. This commit also
switches to using `libp2p::tcp::TokioTcpConfig`. Now, tokio uses mio,
which doesn't use `std::net::TcpListener` but raw sockets directly.
Recently, a workaround for the erroneous behaviour described above was
merged to mio, which is still pending to be released on crates.io
(tokio-rs/mio#1462).  Once tokio uses the
updated mio version, we should move back to the crates.io provided
version.

For tracking the issue in `std::net::TcpListener`, I created
rust-lang/rust#82400.