-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIGSEGV and SIGBUS cause an exit signal of SIGILL #26458
Comments
If we can return from SIGSEGV and SIGBUS on some platforms, I support that option. For other platforms we should reset the mask as you suggest. I find it very important to be able to quickly identify a stack overflow, as to not raise confusion with way more scary cases of signals. As long as you stick with safe Rust, you should get friendly error messages. |
I agree with @Zoxc that having the error message is quite nice and I'd prefer to not lose it, but it sounds like the option of just returning from the signal handler on known platforms is a good idea! I like the idea of re-raising the exact same signal. Thanks for the investigation @geofft, and feel free to submit a PR! |
All the non-Windows ports currently in
(There are some overIaps and some of these sources attest to way more, I just wanted to confirm that at least one source attests to each current Rust target. I don't actually think this is arch-specific on any OS.) So this is safe to do unconditionally, as @alexcrichton has done in PR #27338, and I'm happy to let that PR close this. Porters to new UNIXes may need to watch for this for correctness. This is also safe on Solaris (per Breakpad and JVM); I think the only non-supporting UNIX that Rust might realistically get ported to is MINIX 3. |
Awesome, thanks for the investigation @geofft! |
This commit removes all morestack support from the compiler which entails: * Segmented stacks are no longer emitted in codegen. * We no longer build or distribute libmorestack.a * The `stack_exhausted` lang item is no longer required The only current use of the segmented stack support in LLVM is to detect stack overflow. This is no longer really required, however, because we already have guard pages for all threads and registered signal handlers watching for a segfault on those pages (to print out a stack overflow message). Additionally, major platforms (aka Windows) already don't use morestack. This means that Rust is by default less likely to catch stack overflows because if a function takes up more than one page of stack space it won't hit the guard page. This is what the purpose of morestack was (to catch this case), but it's better served with stack probes which have more cross platform support and no runtime support necessary. Until LLVM supports this for all platform it looks like morestack isn't really buying us much. cc rust-lang#16012 (still need stack probes) Closes rust-lang#26458 (a drive-by fix to help diagnostics on stack overflow)
This commit removes all morestack support from the compiler which entails: * Segmented stacks are no longer emitted in codegen. * We no longer build or distribute libmorestack.a * The `stack_exhausted` lang item is no longer required The only current use of the segmented stack support in LLVM is to detect stack overflow. This is no longer really required, however, because we already have guard pages for all threads and registered signal handlers watching for a segfault on those pages (to print out a stack overflow message). Additionally, major platforms (aka Windows) already don't use morestack. This means that Rust is by default less likely to catch stack overflows because if a function takes up more than one page of stack space it won't hit the guard page. This is what the purpose of morestack was (to catch this case), but it's better served with stack probes which have more cross platform support and no runtime support necessary. Until LLVM supports this for all platform it looks like morestack isn't really buying us much. cc #16012 (still need stack probes) Closes #26458 (a drive-by fix to help diagnostics on stack overflow) r? @brson
Is current implementation the intended behavior? On Linux, rust processes don't dump core when receiving SIGSEGV or SIGBUS using e.g. "kill -11 $(pidof process)". Instead, they dump core the second time they receive SIGSEGV/SIGBUS (after the default signal handler has been restored). This is against the Linux semantic of those signals AFAIK. Typically a Linux process dumps core when receiving SIGSEGV. |
This makes the death look exactly like it would if we didn't handle the signal at all. Coredumps will point at the right instruction, segfaults will get logged in dmesg again, etc. Technically POSIX says this is undefined, but if we get a fault, we've already done something undefined anyway ;) Link: rust-lang/rust#26458
As demonstrated on Playpen:
The reason for this is that, when a UNIX signal is being handled, UNIX blocks that signal while the handler is running, by default. However, this particular signal handler tries to re-raise the signal when it's done with processing. This doesn't do anything since the handler is still running, so execution continues with the next line,
intrinsics::abort()
, which inducesSIGILL
. (The code tries to reset the signal disposition to avoid re-entering the same handler, but that's not enough; the mask of blocked signals also needs to be cleared.)There are a couple of possible solutions to this. The straightforward one is to cause the signal not to be masked. I have a patch for this that depends on my signal-FFI-bindings refactoring in #25784; I can submit it as a PR once that lands.
Another one, which I prefer, is to just let the handler terminate instead of re-raising. There was discussion about this previously, where it was noted that glibc's manual says this is undefined for "program error signals" like
SIGSEGV
andSIGBUS
. POSIX agrees with that. However, most platforms in practice define this behavior, and allow you to return from these handlers, so that you can do things like userspace page-fault handling.For example,
SIGSEGV
on Linux and on Solaris. (On Darwin they use Mach exceptions.)SIGSEGV
handler: they use it as a trick for stop-the-world pauses with low overhead. (Periodically each thread will do a single read from a special page, which is ~one instruction. If the GC wants to stop the world, it'll unmap the page to cause each thread to fault, and once it's ready to continue the world, it'll remap the page and make all threads return from their signal handlers.) So any UNIX platform where the JVM works should support this.SIGSEGV
is possible. The PORTING file has a wide list of supported platforms, including Linux, Darwin, FreeBSD, OpenBSD, NetBSD, Solaris, and MinGW.So I think that it's merely the case that returning from a program error handler is unspecified in POSIX, but just about all actual OSes we care about support it.
This has the advantage that the exact signal is re-delivered to kill the program, with the right siginfo (indicating it died because of a memory error, not because someone manually sent
SIGSEGV
), so the last frame in a coredump is right,dmesg
prints a line, etc. We can keep the re-raise for unknown platforms, but for platforms where we know that returning from the handler works, that seems both simpler and better. (Breakpad takes this approach for the same reasons.)The final option is just to remove this code. What it does, as far as I see, is to print an error message and die if the segfault was on the guard page, and just die otherwise. I don't think there's a compelling reason to print a message for stack overflow, especially as caught by
SIGSEGV
(it makes more sense if it's caught by stack probes ormorestack
). In any other systems language, overflowing the stack just gets you killed withSIGSEGV
, and installing a special handler doesn't seem to match the runtime-removal philosophy. It made sense in thelibrustrt
world, but I'd argue it's not useful now. But if other people are finding the handler / the error message useful, that's fine.Cc @Zoxc for advice, as the original author of the segfault handling code.
The text was updated successfully, but these errors were encountered: