Add support for setjmp/longjmp on x86_64. #1216
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @gnzlbg (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
I honestly have no idea of the semantics that calling Is it even possible to use these APIs correctly from Rust?
IIUC they will be mostly used for FFI, and other uses should be heavily discouraged.
@newpavlov independently of what their intended usage is, I am wondering whether they can be used correctly at all. E.g. Is that allowed by LLVM? In other words: can LLVM perform optimizations that assume that this does not happen? LLVM has its own intrinsics for this: https://llvm.org/docs/ExceptionHandling.html#sjlj-intrinsics, it might well be that LLVM requires this to happen through these intrinsics instead of through a call to an unknown function.
Seems easy enough to answer the LLVM question: I wrote a simple C program using setjmp/longjmp and compiled it with:
clang-6.0 -emit-llvm -S setjmp.c -o setjmp.S
The resulting LLVM has:
So it looks like LLVM is capable of handling a call to setjmp/longjmp. setjmp.c:
output when run:
I also tested it for use in my FFI use case and it works as I want it to: it can catch longjmp()s coming from C, and turn them into rust panics, which then appropriately call the destructors while unwinding. Is it pretty? No, but working with C is not always pretty.
Note that just because it appears to sometimes work does not mean that it always works. C++, like Rust, does have destructors, and the C++ standard [support.runtime] states:
That is, the C++ standard includes explicit wording that states that calls to To me it is completely unclear how these "functions" should interact with Rust (e.g. how would |
But in this case clang is generating the calls to libc's setjmp/longjmp. If clang (written by the authors of LLVM) thinks it's OK to call setjmp/longjmp in LLVM, I don't see how it would be some inherent problem with LLVM.
I understand that. I have no desire to skip over destructors. I only want to use setjmp in the lowest rust frame before an FFI call to catch any longjmp. That protects me from skipping over rust frames, and there is currently no other way to protect me from that.
longjmp is a fact of life in C. If we ignore it, we are allowing those longjmps to unsafely skip over rust destructors. setjmp is the only solution that I see. I also added longjmp for two reasons:
Though I don't need longjmp() quite so much, and can work around it more easily. If you want to only add setjmp, I am fine with that for now.
Why aren't you doing this in C? That is, why aren't you calling from Rust a C wrapper over postgres that uses
Adding these functions to |
There are two functions: setjmp() and longjmp(). Which has undefined behavior, and by what reasoning is it undefined while other FFI calls are defined? What is the cost of defining this behavior? One of the things that's great about rust is how well it integrates with C or any language that can integrate with C. Adding support for setjmp() and longjmp() seems to fit that spirit.
All functions called via Rust FFI that we support either return their return value to the caller at the call site, or they never return. In particular, we forbid
To me that looks like RFC territory, it is basically adding a new inter-procedural control flow mechanism to rust that lives along return values, panics, generators, etc. but is different to all of them.
Also, C11 7.13.1.1p4 states the following about
While C11 7.13.1.1p5 states:
That is, if you wanted to assign the result of Also, functions that use
which specifically mentions An example of how easy it is to trigger undefined behavior is given by this program (playground) which works in debug but fails in release due to a mis-optimization:
extern crate libc;
use libc::c_int;
pub type jmp_buf = [i64; 8];
extern "C" {
    fn setjmp(env: &mut jmp_buf) -> c_int;
    fn longjmp(env: &jmp_buf, val: c_int);
}
unsafe fn foo() -> i32 {
    let mut buf: jmp_buf = [0; 8];
    let mut x = 42;
    if setjmp(&mut buf) != 0 {
        // this should always return 13
        return x;
    }
    x = 13;
    longjmp(&mut buf, 1);
    x // this will never be reached
}
fn main() {
    // in debug foo returns 13 correctly, but in release it returns 42
    assert_eq!(unsafe { foo() }, 13);
}
IIUC, LLVM assumes that the cc @rkruppe EDIT: basically, if we want to add these, we have to add them to
I am willing to remove
There is a concern that it's always undefined behavior, no matter how carefully used.
Using AFAIK that would make For that, AFAICT, it cannot be added as a simple extern function. We'd have to add it as a |
Yes, the LLVM IR generated for a call to setjmp needs the I assume the severe restrictions the C standard places on where And about
Note that LLVM also exposes
Rust does not guarantee that destructors run, so skipping them isn't unsound ( This is not a good example, but it has a double drop (playground):
fn foo() {
    let mut buf: jmp_buf = [0; 8];
    let a = A;
    if unsafe { setjmp(&mut buf) } != 0 {
        return; // drops a again (use-after-move/free, double-drop) => UB
    }
    std::mem::drop(a); // safe: moves and drops a
    unsafe { longjmp(&mut buf, 1) };
}
EDIT: what the example above shows is how In any case, skipping destructors is fine, running them twice (or more) isn't.
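To make the "skipping destructors is fine" half of this point concrete without touching setjmp at all, here is a minimal sketch in safe Rust. The `Noisy` type and both helper functions are hypothetical names invented for illustration; the only library call is `std::mem::forget`, which is a safe function that leaks its argument without running its destructor.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Number of destructor runs when the value is dropped normally.
fn drops_when_dropped() -> u32 {
    let count = Cell::new(0);
    drop(Noisy(&count));
    count.get()
}

/// Number of destructor runs when the value is leaked with `mem::forget`.
fn drops_when_forgotten() -> u32 {
    let count = Cell::new(0);
    mem::forget(Noisy(&count)); // safe code, yet the destructor never runs
    count.get()
}

fn main() {
    println!("dropped:   {} destructor run(s)", drops_when_dropped());
    println!("forgotten: {} destructor run(s)", drops_when_forgotten());
}
```

There is no safe counterpart for the other half: running a destructor twice on the same value requires unsafe code, which is why the double drop above counts as undefined behavior.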
Oh, right, hm. Alternatively maybe these could be in the standard library, which sounds odd because they are a C-ism, but OTOH as you note having setjmp/longjmp really does extend the language in magical, otherwise-impossible ways. In any case, since something new in rustc is needed and must be stabilized for libc to benefit, I agree that this is RFC territory.
I can't find any such intrinsics, there's only
Good point about running them twice. Wrt skipping: sorry, safe code can indeed skip destructors of values it owns, what I meant is that carefully written code that never gives up ownership of a value can be assured the value will be dropped unless other code diverges -- but notably it will still be dropped if unwinding happens, which matters because unwinding can be stopped. This assured drop is then used in a lot of libraries that run a user-specified closure to be able to "get in a word" after that closure finishes or unwinds -- e.g., to join a thread.
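The "assured drop" pattern described here can be sketched as follows. This is an illustration under stated assumptions, not code from any particular library: `Guard`, `with_cleanup` and `cleanup_after_panic` are hypothetical names, and the unwinding is stopped with the standard `std::panic::catch_unwind`. A longjmp across `with_cleanup` would bypass the guard entirely, which is the property the comment is worried about.

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical guard that records when it is dropped.
struct Guard<'a>(&'a Cell<bool>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // The library "gets in a word" here, e.g. by joining a thread.
        self.0.set(true);
    }
}

/// Runs a user-specified closure while holding a guard. The guard is
/// dropped whether `f` returns normally or unwinds.
fn with_cleanup<F: FnOnce()>(cleaned_up: &Cell<bool>, f: F) {
    let _guard = Guard(cleaned_up);
    f();
}

/// Returns (closure panicked, cleanup ran).
fn cleanup_after_panic() -> (bool, bool) {
    let cleaned_up = Cell::new(false);
    // `Cell` is not unwind-safe by default, so the closure is wrapped
    // in `AssertUnwindSafe` for this demonstration.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        with_cleanup(&cleaned_up, || {
            panic!("closure failed");
        });
    }));
    (result.is_err(), cleaned_up.get())
}

fn main() {
    let (panicked, cleaned_up) = cleanup_after_panic();
    println!("panicked: {panicked}, cleanup ran: {cleaned_up}");
}
```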
@jeff-davis I think we should definitely encourage people to "play" with these intrinsics, and for interfacing with C, Rust is going to need to provide some kind of support for these. Would you be ok with adding them behind an experimental cargo feature? (e.g. In the meantime, we should open an issue in rust-lang/rust about what to do about these. We'll probably need to write down what code that uses these is allowed to do to avoid undefined behavior and that's going to be hard. How does that sound?
So I've opened an issue in the RFC repo: rust-lang/rfcs#2625 @jeff-davis mentioned:
Even if we add The main advantage of writing a C wrapper over postgres that catches postgres exceptions and translates them to Rust error codes (where in Rust you can then raise them again as a panic if you want, or do something else) is that this approach would at least work and that there is experience from doing this same thing to C++ libraries that throw exceptions (we don't have tools in Rust to catch C++ exceptions, C++ thin wrappers that catch them and translate them to error codes need to be written here). I understand that this solution is sub-optimal, but coming up with ways to make |
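The Rust side of the wrapper approach could look roughly like the sketch below. Everything named here is hypothetical: `wrapped_operation` stands in for a C wrapper that would catch the error inside C and return a status code, and it is written in Rust only so that the example is self-contained. In a real binding it would be declared in an `extern "C"` block and implemented in C.

```rust
use std::os::raw::c_int;

/// Stand-in for a hypothetical C wrapper. Returns 0 on success and a
/// non-zero error code on failure, writing the result through `out`.
extern "C" fn wrapped_operation(input: c_int, out: *mut c_int) -> c_int {
    if input < 0 {
        return 1; // hypothetical error code
    }
    // The caller is assumed to pass a valid, writable pointer.
    unsafe { *out = input * 2 };
    0
}

/// Translates the wrapper's status code into a `Result`, so no
/// non-local jump ever crosses a Rust frame.
fn checked_operation(input: c_int) -> Result<c_int, c_int> {
    let mut out: c_int = 0;
    let status = wrapped_operation(input, &mut out);
    if status == 0 {
        Ok(out)
    } else {
        Err(status)
    }
}

fn main() {
    println!("{:?}", checked_operation(21));
    println!("{:?}", checked_operation(-1));
}
```

A caller that prefers panics can raise one from the `Err` branch, which unwinds through Rust frames in the ordinary way and runs destructors.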
Is there no way to abuse variable length arguments in C to create one C wrapper function that supports calling all the C functions?
Thank you for the detailed replies. They still leave a few puzzles: I translated your rust example to C, and I get:
So that means that either (a) the behavior of that pattern of use of setjmp is also undefined in C; or (b) both gcc and clang have the same correctness bug. I have to assume (a), because (b) seems quite unlikely. The example would be more compelling if you showed a correctness bug in rust for well-defined behavior in C. But, I agree that we at least should have the |
@gnzlbg: Yes, the experimental cargo feature sounds like a nice way to proceed for now. Thank you.
In my translations, both GCC and Clang always return 13 =/ Maybe you forgot to make
one must make Also, Rust does not have a " Basically, there is no way to write Rust code that's equivalent to the correct C code.
Oh, you're right. I forgot to mark x volatile.
Fix issue #1208