Commit 7da2109

native: Try hard to not malloc on a forked child
Judging by the core dumps I've managed to get, this appears to be what is causing the BSD bots to lock up. Dropping the `FileDesc` structure triggers cleanup of the `Arc` it contains, which invokes free(). This change instead just closes the file descriptor directly (the `Arc` itself is never cleaned up). I'm still not entirely sure why this is a problem, because the pthreads runtime should register hooks for fork() to prevent this sort of deadlock, but perhaps that's only done on Linux?
1 parent 9a33330 commit 7da2109
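
The idea in the commit message (close the descriptor, but never run the destructor that would call free()) can be sketched in present-day Rust. This is a minimal illustration, not the libnative code: the `FileDesc` and `Inner` types, the `DESTRUCTOR_RUNS` counter, and `close_without_drop` are hypothetical stand-ins, and it uses today's `std` (`ManuallyDrop`, `OwnedFd`) rather than the 2014 APIs touched by this commit.

```rust
use std::fs::File;
use std::mem::ManuallyDrop;
use std::os::fd::{AsRawFd, FromRawFd, OwnedFd, RawFd};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Counts how often the heap-backed part of the handle is torn down.
static DESTRUCTOR_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical inner state; its destructor stands in for the free() traffic.
struct Inner {
    file: File,
}

impl Drop for Inner {
    fn drop(&mut self) {
        DESTRUCTOR_RUNS.fetch_add(1, Ordering::SeqCst);
    }
}

// Hypothetical stand-in for the old `FileDesc`: a descriptor behind an `Arc`.
struct FileDesc {
    inner: Arc<Inner>,
}

impl FileDesc {
    fn open(path: &str) -> std::io::Result<FileDesc> {
        let file = File::open(path)?;
        Ok(FileDesc { inner: Arc::new(Inner { file }) })
    }

    fn fd(&self) -> RawFd {
        self.inner.file.as_raw_fd()
    }
}

// Closes the descriptor while deliberately leaking the `Arc`, so neither the
// `Arc` nor `Inner` is ever dropped and nothing is handed back to the allocator.
// Leaking is only acceptable when the process is about to exec() or exit.
fn close_without_drop(desc: FileDesc) -> RawFd {
    let desc = ManuallyDrop::new(desc);
    let fd = desc.fd();
    // `OwnedFd` is just an integer; dropping it performs the close().
    // Safety: nothing else will use or close `fd`, because `desc` is leaked.
    drop(unsafe { OwnedFd::from_raw_fd(fd) });
    fd
}
```

In an ordinary process this would be a leak; it is tolerable here only because, as the commit notes, the child never returns from this code path.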

1 file changed, +31 -1 lines changed

src/libnative/io/process.rs

@@ -524,7 +524,37 @@ fn spawn_process_os(config: p::ProcessConfig,
                     Ok(..) => fail!("short read on the cloexec pipe"),
                 };
             }
-            drop(input);
+            // And at this point we've reached a special time in the life of the
+            // child. The child must now be considered hamstrung and unable to
+            // do anything other than syscalls really. Consider the following
+            // scenario:
+            //
+            //      1. Thread A of process 1 grabs the malloc() mutex
+            //      2. Thread B of process 1 forks(), creating thread C
+            //      3. Thread C of process 2 then attempts to malloc()
+            //      4. The memory of process 2 is the same as the memory of
+            //         process 1, so the mutex is locked.
+            //
+            // This situation looks a lot like deadlock, right? It turns out
+            // that this is what pthread_atfork() takes care of, which is
+            // presumably implemented across platforms. The first thing that
+            // threads do *before* forking is to do things like grab the malloc
+            // mutex, and then after the fork they unlock it.
+            //
+            // Despite this information, libnative's spawn has been witnessed to
+            // deadlock on both OSX and FreeBSD. I'm not entirely sure why, but
+            // all collected backtraces point at malloc/free traffic in the
+            // child spawned process.
+            //
+            // For this reason, the block of code below should contain 0
+            // invocations of either malloc or free (or their related friends).
+            //
+            // As an example of not having malloc/free traffic, we don't close
+            // this file descriptor by dropping the FileDesc (which contains an
+            // allocation). Instead we just close it manually. This will never
+            // have the drop glue anyway because this code never returns (the
+            // child will either exec() or invoke libc::exit)
+            let _ = libc::close(input.fd());

             fn fail(output: &mut file::FileDesc) -> ! {
                 let errno = os::errno();
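
The rule stated in the comment above (between fork() and exec() the child should do little beyond plain syscalls) still applies in present-day Rust. The following is a rough sketch of the same pattern using the standard library's `CommandExt::pre_exec`, whose closure runs in the forked child before exec. It is an analogy under assumptions, not the libnative implementation: `spawn_closing_fd` is a hypothetical helper, and the program name passed to it is up to the caller.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsRawFd, FromRawFd, OwnedFd};
use std::os::unix::process::CommandExt;
use std::process::{Command, ExitStatus};

// Hypothetical helper: runs `program` and closes the child's copy of `file`
// inside the forked child, before exec().
fn spawn_closing_fd(program: &str, file: &File) -> io::Result<ExitStatus> {
    // Capture only the integer descriptor, so the closure owns no heap data.
    let raw = file.as_raw_fd();
    let mut cmd = Command::new(program);
    unsafe {
        cmd.pre_exec(move || {
            // This runs in the child after fork(). Another thread of the parent
            // may have held the allocator lock at fork time, so the body avoids
            // anything that could allocate or free: it only issues close().
            drop(OwnedFd::from_raw_fd(raw));
            Ok(())
        });
    }
    // The parent's descriptor is untouched; only the child's copy was closed.
    cmd.status()
}
```

A call such as `spawn_closing_fd("true", &file)` leaves `file` usable in the parent afterwards, since the close happens only in the child's copy of the descriptor table.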