child_process.fork() and read from parent process.stdin causes 100% CPU usage #6271
Comments
Confirmed, thanks. I think we're looking at two things here: a) inadvisable behavior in node (implicitly allowing fd 0 to get closed), and
A simple test case:

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>
#define E(x) do { errno = 0; { x; } if (errno) perror(#x), exit(1); } while (0)

int main(void) {
  struct epoll_event e;
  char buf[32];
  ssize_t n;
  int epfd;
  int fd;
  fd = 0;
  E(epfd = epoll_create1(0));
  e.events = EPOLLIN;
  e.data.u64 = fd;
  E(epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &e));
  E(n = epoll_wait(epfd, &e, 1, -1));
  assert(n == 1);
  assert(e.data.u64 == fd);
  do
    E(n = read(fd, buf, sizeof(buf)));
  while (n != 0);
  close(fd);
  for (;;) {
    E(n = epoll_wait(epfd, &e, 1, -1));
    assert(n == 1);
    assert(e.events == EPOLLHUP);
    assert(e.data.u64 == fd);
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &e))  // fails with EBADF
      perror("epoll_ctl(EPOLL_CTL_DEL)");
  }
  return 0;
}

The call to close() is supposed to remove the file descriptor from the epoll set but it doesn't. It makes no difference whether fds 1 and 2 are closed or redirected to something else (i.e. so they don't share a file description with fd 0).

Interestingly, the EPOLLHUP/EBADF busy loop only happens when stdin is a pipe. When it's a tty, fd 0 keeps generating EPOLLIN events after the close().

I guess I'll have to recheck with the latest mainline (my current kernel is 3.10.10) and report it to the LKML.

Working around this in node.js should be relatively straightforward: a) we shouldn't allow fds 0-2 to get closed, and
Confirmed to also happen with 3.12-rc3.
Ensure that close() system calls don't close stdio file descriptors because that is almost never the intention. This is also a partial workaround for a kernel bug that seems to affect all Linux kernels when stdin is a pipe that gets closed: fd 0 keeps signalling EPOLLHUP but a subsequent call to epoll_ctl(EPOLL_CTL_DEL) fails with EBADF. See nodejs/node-v0.x-archive#6271 for details and a test case.
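The guard described in that commit message can be sketched roughly as follows. This is only an illustration of the idea, not the actual node or libuv change; `close_unless_stdio` is a hypothetical name, and it assumes fds 0-2 are the stdio descriptors.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper (not the real node/libuv patch): close fd unless it is
 * one of the stdio descriptors 0-2, which are deliberately left open. */
static int close_unless_stdio(int fd) {
  assert(fd >= 0);
  if (fd <= STDERR_FILENO)
    return 0;  /* Report success, but keep stdin/stdout/stderr open. */
  return close(fd);
}
```

Callers that really do want to replace a stdio descriptor would need to bypass a guard like this, for example by using dup2() onto the descriptor instead of closing it first.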
After thinking it through some more, I've come to the conclusion that it's not an out-and-out kernel bug but an epoll design flaw.

epoll_wait() reports events for file descriptions, not file descriptors. Closing the file descriptor in our process doesn't necessarily close the file description because other processes may still have open file descriptors that refer to the same file description. In other words, this busy loop behavior is not unique to stdio file descriptors.

I don't think we can solve this in a general way in v0.10 but maybe we can add a special case for stdio fds because those are most prone to triggering it (and because closing fds 0-2 is usually a bad idea anyway).

In master, this is probably best mitigated (but not fully solved) by switching to edge-triggered I/O. Guess I have a motivation now for finishing up my patch from November 2012.

When I say 'not fully solved', I mean that it's still possible to get spurious wakeups (see example below) but libuv is equipped to deal with that and at least you won't get flat-out busy loops that way because the kernel will report the event only once.

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>
#define E(x) do { errno = 0; { x; } if (errno) perror(#x), exit(1); } while (0)

int main(void) {
  int pipefd[2];
  pid_t pid;
  E(pipe(pipefd));
  E(pid = fork());
  if (pid == 0) {
    struct epoll_event e;
    int epfd;
    int n;
    E(epfd = epoll_create1(EPOLL_CLOEXEC));
    e.events = EPOLLIN | EPOLLOUT | EPOLLET;
    e.data.fd = pipefd[0];
    E(epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &e));
    E(close(pipefd[0]));
    E(close(pipefd[1]));
    for (;;) {
      E(do
          n = epoll_wait(epfd, &e, 1, -1);
        while (n == -1 && errno == EINTR));
      assert(n == 1);
      printf("child wakeup %d %x\n", e.data.fd, e.events);
    }
  } else {
    char buf[1];
    ssize_t n;
    for (;;) {
      E(do
          n = write(pipefd[1], "", 1);
        while (n == -1 && errno == EINTR));
      usleep(25e4);
      E(do
          n = read(pipefd[0], buf, sizeof(buf));
        while (n == -1 && errno == EINTR));
    }
  }
  return 0;
}

When run, it keeps printing
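To make the level-triggered versus edge-triggered difference concrete, here is a minimal sketch that is not part of the original test cases. `count_wakeups` is a hypothetical helper: it makes a pipe readable, never drains it, and counts how many consecutive non-blocking epoll_wait() calls report the event.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: extra_flags is 0 for level-triggered or EPOLLET for
 * edge-triggered. Returns the number of polls that reported an event, or -1
 * if setup failed (descriptors are not cleaned up on that path). */
static int count_wakeups(uint32_t extra_flags, int polls) {
  struct epoll_event e = { 0 };
  int pipefd[2];
  int wakeups = 0;
  int epfd;
  int i;

  assert(polls >= 0);
  if (pipe(pipefd) == -1)
    return -1;
  epfd = epoll_create1(EPOLL_CLOEXEC);
  if (epfd == -1)
    return -1;

  e.events = EPOLLIN | extra_flags;
  e.data.fd = pipefd[0];
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &e) == -1)
    return -1;
  if (write(pipefd[1], "x", 1) != 1)  /* Readable from now on, never drained. */
    return -1;

  for (i = 0; i < polls; i++)
    if (epoll_wait(epfd, &e, 1, 0) == 1)  /* Timeout 0: poll without blocking. */
      wakeups++;

  close(epfd);
  close(pipefd[0]);
  close(pipefd[1]);
  return wakeups;
}
```

With these assumptions, `count_wakeups(0, 3)` should report the unread byte on every poll, while `count_wakeups(EPOLLET, 3)` should report it only once. That single report is why a stale registration under edge-triggered I/O costs a spurious wakeup rather than a busy loop.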
This is not a kernel bug. From epoll(7):
See also:
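The file description versus file descriptor distinction can be reproduced inside a single process. The sketch below is an illustration, not code from this thread: `stale_fd_demo` is a hypothetical name, and dup() stands in for "another process still holds a descriptor for the same description".

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: register a pipe's read end, duplicate it, then close the
 * registered descriptor. Stores in *nevents how many events epoll_wait() still
 * reports, and returns the errno from the follow-up EPOLL_CTL_DEL (0 if it
 * succeeded, -1 if setup failed; no cleanup on the failure path). */
static int stale_fd_demo(int *nevents) {
  struct epoll_event e = { 0 };
  int pipefd[2];
  int epfd;
  int dupfd;
  int err;

  assert(nevents != NULL);
  if (pipe(pipefd) == -1)
    return -1;
  epfd = epoll_create1(EPOLL_CLOEXEC);
  if (epfd == -1)
    return -1;

  e.events = EPOLLIN;
  e.data.fd = pipefd[0];
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &e) == -1)
    return -1;

  dupfd = dup(pipefd[0]);  /* Second descriptor, same file description. */
  if (dupfd == -1)
    return -1;
  close(pipefd[0]);        /* Descriptor is gone, description stays open. */

  if (write(pipefd[1], "x", 1) != 1)
    return -1;
  *nevents = epoll_wait(epfd, &e, 1, 0);  /* Still reports the closed fd. */

  errno = 0;
  err = epoll_ctl(epfd, EPOLL_CTL_DEL, pipefd[0], &e) == -1 ? errno : 0;

  close(dupfd);
  close(pipefd[1]);
  close(epfd);
  return err;
}
```

On Linux this should leave `*nevents` at 1 and return EBADF: the event keeps arriving for a descriptor number the process can no longer use to deregister it, which matches the EPOLLHUP/EBADF loop described earlier in the thread.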
@jwalton I agree, but why resurrect the old issue?
I didn't reopen it; it was open when I got here. :) I commented on it because we're seeing an epoll busy loop today, so I was looking for open related bugs. Although, I just realized @goffrie's fix made it into Node 0.10.27, and we're still on 0.10.25. :( So you can safely ignore me. :P
This simple script causes 100% CPU usage by parent process after reaching stdin's EOF:
Checked on v0.10.12 (arch and ubuntu), v0.10.17 (arch and ubuntu) and v0.10.19 (arch).