runtime: Should respect/understand the process limit when managing threads #14835

mithro · 2016-03-16T07:10:49Z

Currently, if the Go runtime tries to create a new system thread and is unable
to do so, it fails with an error like:

18:22:18.752169 [go test -timeout 600s -v -race ./common/paniccatcher] was slow: 3m11.377s
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f54695bdcc9

One reason for this occurring is that the system has a low "process limit".
For a long time it was fairly common for systems to allow 10k or more, but with
systemd and Linux 4.3 the default limit can be as little as 512.

Most of the code which calls pthread_create in src/runtime/cgo seems to do
something like:

    err = pthread_create(&p, &attr, threadentry, ts);

    pthread_sigmask(SIG_SETMASK, &oset, nil);

    if (err != 0) {
        fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err));
        abort();
    }

This actually seems reasonable, as recovering from a failed thread creation is
pretty hard. Also, creating more threads than your system's process limit does
feel like a "just don't do that" type of thing.

However, from what I can see, the goroutine scheduler will create up to
sched.maxmcount threads, and this is initialized to 10k in proc.go at
line 425 (https://github.com/golang/go/blob/master/src/runtime/proc.go#L425).
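As a rough illustration of the 10k ceiling mentioned above: the same value can be observed from user code through runtime/debug.SetMaxThreads, which returns the previous setting. This is only a sketch; the helper name currentMaxThreads is made up for this example and is not part of the runtime.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// currentMaxThreads (hypothetical helper) reports the runtime's OS thread
// ceiling. SetMaxThreads returns the previous setting, so the helper sets a
// value and then immediately restores whatever was configured before.
func currentMaxThreads() int {
	prev := debug.SetMaxThreads(10000)
	debug.SetMaxThreads(prev)
	return prev
}

func main() {
	fmt.Println("max OS threads:", currentMaxThreads())
}
```

If a program exceeds this ceiling the runtime crashes, so the setting is a safety limit rather than a way to throttle thread creation.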

Linux provides an API for getting the current thread limit: the getrlimit call
with RLIMIT_NPROC (see http://man7.org/linux/man-pages/man2/setrlimit.2.html).
getrlimit already seems to be exposed to Go code as syscall.Getrlimit, but the
RLIMIT_NPROC constant needed to get this information is missing.
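A minimal sketch of what such a query could look like from Go, assuming Linux on x86 or ARM, where RLIMIT_NPROC has the value 6. The constant is hard-coded here only because the syscall package does not export it; the names rlimitNPROC and nprocLimit are invented for this example, and the value differs on some other architectures and operating systems.

```go
package main

import (
	"fmt"
	"syscall"
)

// rlimitNPROC is an ASSUMPTION: the value of RLIMIT_NPROC on Linux x86/ARM.
// It is defined locally because the syscall package does not provide it.
const rlimitNPROC = 6

// nprocLimit (hypothetical helper) returns the soft and hard limits on the
// number of processes/threads for the current user.
func nprocLimit() (syscall.Rlimit, error) {
	var lim syscall.Rlimit
	err := syscall.Getrlimit(rlimitNPROC, &lim)
	return lim, err
}

func main() {
	lim, err := nprocLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	// An unlimited setting shows up as a very large number (RLIM_INFINITY).
	fmt.Printf("RLIMIT_NPROC soft=%d hard=%d\n", lim.Cur, lim.Max)
}
```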

This is similar to the idea of respecting the memory limit (see
https://github.com/golang/go/blob/master/src/runtime/os1_linux.go#L270) and is
probably related to #5049.

mithro · 2016-03-16T07:12:34Z

I started trying to make this happen at mithro@7538394 but don't yet understand the go runtime code enough to know how to make the syscall.getrlimit call...

mithro · 2016-03-16T07:39:14Z

I also uploaded the patch to https://go-review.googlesource.com/20751 in case anyone wants to comment and suggest how it could be done.

minux · 2016-03-16T08:21:49Z

Why would setting sched.maxmcount help at all? I think the only benefit of that is to make the Go process fail in a different way (exceeding the thread limit).

bradfitz · 2016-03-16T14:08:41Z

@minux, I believe @mithro is proposing that Go read (not write) the max thread count and use it to avoid creating new threads when it would fail anyway.

mithro · 2016-03-16T14:21:02Z

I'm still trying to create a simple test case which reliably causes the failure I listed above. I think it should be as simple as using ulimit to set the number of processes to something small and then running Go code with loads of goroutines, but I'm having trouble making it occur.

If I understand correctly (which is a big if -- this is my first time looking at this code), the sched code seems to spawn native threads to run goroutines on? If so, it seems like it would make sense to read the max process count from the OS and not spawn more than XX% of that? I don't really understand under what conditions it decides it needs more native threads.

The other option, which feels a lot harder, is to make an unsuccessful pthread_create call not a hard failure (and deal with the consequences)?

minux · 2016-03-16T16:22:42Z

But the runtime does not have a way to limit new thread creation even if it knows the actual limit. The runtime only creates new OS threads to host goroutines when there are no other available threads. And if that happens, there is no way to continue executing Go code without deadlock. So limiting the total number of threads won't work. For example, suppose there are a bunch of goroutines that are all blocked reading from a pipe, and then another goroutine wakes up to write to the pipe. In such a scenario, it's possible to use an unbounded number of threads, and limiting the number of OS threads will just deadlock the program. That's the reason why using getrlimit will only replace the pthread_create failure with another failure (reached max thread limit). If the program requires more threads to run, it will crash regardless.
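The pipe scenario can be reproduced in miniature. The following is only a sketch, assuming a Unix-like system: it uses raw syscall.Pipe and syscall.Read so that each reader blocks a whole OS thread, and it reads the runtime's "threadcreate" profile to see how many threads exist. The function name blockedReaders and the polling timeout are choices made for this example.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"sync"
	"syscall"
	"time"
)

// blockedReaders (hypothetical helper) starts n goroutines that each block in
// a raw read(2) on an empty pipe. It returns the number of OS threads the
// runtime had created before the readers started and while they were blocked.
func blockedReaders(n int) (int, int, error) {
	var fds [2]int
	if err := syscall.Pipe(fds[:]); err != nil {
		return 0, 0, err
	}
	defer syscall.Close(fds[0])
	defer syscall.Close(fds[1])

	threads := pprof.Lookup("threadcreate")
	before := threads.Count()

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 1)
			for {
				// A raw blocking read ties up the OS thread it runs on.
				_, err := syscall.Read(fds[0], buf)
				if err != syscall.EINTR {
					return
				}
			}
		}()
	}

	// Wait until the runtime has created at least n threads, or give up
	// after two seconds.
	deadline := time.Now().Add(2 * time.Second)
	for threads.Count() < n && time.Now().Before(deadline) {
		time.Sleep(10 * time.Millisecond)
	}
	during := threads.Count()

	// The "writer" goroutine from the scenario: one byte per reader.
	if _, err := syscall.Write(fds[1], make([]byte, n)); err != nil {
		return before, during, err
	}
	wg.Wait()
	return before, during, nil
}

func main() {
	before, during, err := blockedReaders(32)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("threads before: %d, with readers blocked: %d\n", before, during)
}
```

If the runtime refused to create a thread for the writer while every existing thread was stuck in read, none of the readers could ever be released, which is the deadlock described above.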

ianlancetaylor · 2016-03-16T16:38:13Z

@mithro You should read the discussion on #4056.

mithro · 2016-03-18T02:55:21Z

Thanks @ianlancetaylor, #4056 does include a lot of discussion on this topic. The current summary seems to be that it is preferred for the runtime to abort here rather than risk a deadlock, and that users should manage their goroutines to prevent this from happening.

With that in mind, I'm going to close this bug and open a new one just about letting Go code read the current thread limit so that user code is able to use this information when managing its goroutines.

minux · 2016-03-18T02:57:55Z

I don't think exposing the thread limit could help the user manage goroutines, because the user doesn't know when the runtime will create a new thread. And Go code could already use the syscall package (or x/sys/unix) to query the thread limit.

mithro · 2016-03-18T03:15:59Z

@minux The RLIMIT_NPROC constant is currently missing from the syscall package. I believe the fix is just to update the regex in src/syscall/mkerrors.sh and rerun it. Let's move the conversation about getting that fixed to #14854.

mithro mentioned this issue Mar 18, 2016

x/sys/unix: Getrlimit should support RLIMIT_NPROC (number of process limits) #14854

Closed

mithro closed this as completed Mar 18, 2016

golang locked and limited conversation to collaborators Mar 19, 2017

gopherbot added the FrozenDueToAge label Mar 19, 2017
