Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abort instead of unwinding past FFI functions #52652

Closed
SimonSapin opened this issue Jul 23, 2018 · 66 comments · Fixed by #55982
Closed

Abort instead of unwinding past FFI functions #52652

SimonSapin opened this issue Jul 23, 2018 · 66 comments · Fixed by #55982
Assignees
Labels
A-FFI Area: Foreign function interface (FFI) C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-high High priority relnotes Marks issues that should be documented in the release notes of the next release. S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-ffi-unwind Working group: FFI unwind

Comments

@SimonSapin
Copy link
Contributor

SimonSapin commented Jul 23, 2018

More updated description (2021-03-11)

With #76570 having landed, Rust now aborts when a panic leaves an extern "C" function, so the original soundness issue is fixed. However, there is an attribute to opt-out of this behavior, #[unwind(allowed)]. Using this attribute is currently unsound, so it should be fixed, removed, or require some form of unsafe.

Updated Description

This is a tracking issue for switching the compiler to abort the program whenever a Rust program panics across an FFI boundary, notably causing any non-Rust-ABI function to panic/unwind.

This is being implemented in #55982 and some regressions were found in the wild so this is also being repurposed as a tracking issue for any breakage that comes up and a place to discuss the breakage. If you're a crate author and you find your way here, don't hesitate to ask questions!

Original Description

Doing Rust unwinding into non-Rust stack frames is undefined behavior. This was fixed in 1.24.0, and then reverted (I think by #48380 ?) in 1.24.1 because of a regression that affected rlua.

The latter blog post said:

The solution here is to generate the abort handler, but in a way that longjmp won’t trigger it. It’s not 100% clear if this will make it into Rust 1.25; if the landing is smooth, we may backport, otherwise, this functionality will be back in 1.26.

The link PR has landed, but my understanding is that it does not change FFI functions back to aborting on unwind (though it looks like it does fix the issue that affected rlua).

This UB is not mentioned in any later release notes.

This issue has been assigned to @BatmanAoD via this comment.

@SimonSapin SimonSapin added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. relnotes Marks issues that should be documented in the release notes of the next release. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Jul 23, 2018
@SimonSapin
Copy link
Contributor Author

CC @alexcrichton, @nikomatsakis

@liigo

This comment has been minimized.

@alexcrichton
Copy link
Member

cc @diwic

I think @SimonSapin your description is accurate. The backport to stable and backport to beta were both minor patches, and the main one to nightly (at the time) was #48380 as you mentioned.

AFAIK we should be good to go again to do this.

@Mark-Simulacrum
Copy link
Member

I think I would prefer that instead of immediately landing a change to the default unwind behavior we just stabilize the unwind attribute and wait for a few cycles to let the ecosystem migrate to unwind(allowed) or unwind(abort) across extern "C" functions. It would be nice to have some docs providing guidance here as well.

@SimonSapin
Copy link
Contributor Author

Why is this attribute useful? If the issue that affected rlua is fixed, what can make the answer to "do you want to protect against some potential UB" be anything other than "yes, obviously"?

@Mark-Simulacrum
Copy link
Member

There are cases where unwinding is fine, for example if you are using a extern "C" function to talk between different Rust crates or extern "Rust" functions to do the same. There are cases where you legitimately are fine with unwinding (and IMO it shouldn't be UB). I don't think the decision whether unwinding should abort or continue should be based on the extern-ness of the function. unwind(allowed) or unwind(aborts) may be useful in normal Rust as well if you want to do -Cpanic=abort on a smaller (non-library) scale, though it's less useful to an extent in that case due to catch_unwind existing.

@diwic
Copy link
Contributor

diwic commented Jul 24, 2018

Good to see some movement here.

For me it's bikeshedding whether we go with #[unwind], #[unwind(allowed)] or even extern "C-unwind" as long as we default to aborting for every non-Rust ABI. But it does seem it would be helpful for the libjpeg / libpng bindings.

@SimonSapin
Copy link
Contributor Author

Alright, stabilizing some attribute (perhaps with unsafe in its name) to support this libjpeg use case sounds fine, but I don’t see a reason to wait several cycles to change the default back when the plan was to do it in 1.25.

@Mark-Simulacrum
Copy link
Member

The primary point we reverted back then was because of breakage due to changing the default. I'm not convinced that the breakage would not happen again; there's currently no way to explicitly document your requirements on stable. I see this as the same situation as we normally have for library features, where the unstable/deprecated solution isn't removed until the stable alternative lands in stable.

I also am unclear why we would/should rush this. People have waited, people can continue to wait. Stabilizing unwind seems uncontroversial; we can do so immediately.

@diwic
Copy link
Contributor

diwic commented Jul 25, 2018

@Mark-Simulacrum

unwind(allowed) or unwind(aborts) may be useful in normal Rust as well if you want to do -Cpanic=abort on a smaller (non-library) scale, though it's less useful to an extent in that case due to catch_unwind existing.

I'm not sure what this means, but if -Cpanic=abort is specified I don't think you should be able to override that with a #[unwind(allowed)].

I also am unclear why we would/should rush this. People have waited, people can continue to wait.

Well, why should fixing I-unsound bugs have high priority in general? I think they should have high priority, but not everyone shares that sentiment.

Stabilizing unwind seems uncontroversial; we can do so immediately.

Are there some unresolved questions here w r t:

  • Which ABIs should allow #[unwind]? Should we do this for all ABIs, or just "C"?
  • In a -Cpanic=abort scenario should #[unwind] be able to override this for Rust ABI and nevertheless allow unwinding? (I prefer not.)

@Mark-Simulacrum
Copy link
Member

I'm not sure what this means, but if -Cpanic=abort is specified I don't think you should be able to override that with a #[unwind(allowed)].

I mean the reverse; if you've specified -Cpanic=unwind but for whatever reason want a specific function to not unwind, you can today do so with catch_unwind but I could see us also supporting #[unwind(abort)] annotations as an alternative. I'm not sure there's a need, though.

Well, why should fixing I-unsound bugs have high priority in general? I think they should have high priority, but not everyone shares that sentiment.

This isn't an unsoundness bug though. It is currently UB to unwind through stack frames not generated by the same Rust compiler (and will always be, at least in the short term) but I don't see how that is unsound. I've removed that tag.

Are there some unresolved questions here w r t ...

I'd be happy stabilizing #[unwind] for all extern functions regardless of ABI; it seems like a feature orthogonal to the ABI. I don't think it's possible to make #[unwind] override -Cpanic=abort anyway, or at least not trivial. That is to say, it shouldn't be supported, but perhaps should also not be an error since there's not currently a way to cfg around -Cpanic...

@SimonSapin
Copy link
Contributor Author

This isn't an unsoundness bug though

It is indeed not rustc or std that is unsound, but the current situation is that it is very difficult to write sound FFI code. You need to carefully audit every callback either to convince yourself that it cannot ever panic, or use catch_unwind (again in every callback) and then carefully audit the code that is outside of the catch_unwind closure that wraps stuff to be UnwindSafe or deals with the returned Result. (Don’t use .unwrap() !)

So I’d say this is a soundness-related issue.

@Mark-Simulacrum
Copy link
Member

I don't disagree; writing sound FFI code is hard. I'm just trying to say that changing the default without providing a way to use the old behavior seems wrong and I'm against doing it. Furthermore, I'd prefer at least one cycle that users can explicitly opt-in to unwind(allowed) for FFI functions where unwinding isn't UB (i.e., they're unwinding into Rust code). I'm okay (though not happy) with us making a change to the default behavior so long as the opt-out is stabilized along side it.

@diwic
Copy link
Contributor

diwic commented Jul 25, 2018

@Mark-Simulacrum

This isn't an unsoundness bug though.

Oh yes it is. Please don't remove I-unsound tags without thoroughly understanding the issue.

  1. Everything that is UB is also unsound, simply because the behavior is undefined, i e, not defined as being sound.

  2. Extern C functions (as well as all other non-Rust ABIs IIRC) are - unless #[unwind] - marked as nounwind in the LLVM IR. Unwinding from a nounwind function is UB, according to LLVM docs.

@diwic
Copy link
Contributor

diwic commented Jul 25, 2018

I mean the reverse; if you've specified -Cpanic=unwind but for whatever reason want a specific function to not unwind, you can today do so with catch_unwind but I could see us also supporting #[unwind(abort)] annotations as an alternative. I'm not sure there's a need, though.

Okay. I don't think there is much need. Maybe it would make the crate take_mut use a fewer less characters.

I'd be happy stabilizing #[unwind] for all extern functions regardless of ABI; it seems like a feature orthogonal to the ABI.

I see unwinding as part of the ABI, but it's possible (I'm not sure) that for all practical purposes one can see it as an orthogonal feature.

@SimonSapin

the current situation is that it is very difficult to write sound FFI code. You need to carefully audit every callback either to convince yourself that it cannot ever panic, or use catch_unwind (again in every callback) and then carefully audit the code that is outside of the catch_unwind closure that wraps stuff to be UnwindSafe or deals with the returned Result. (Don’t use .unwrap() !)

Indeed. I even found such a mistake in std once. And just to clarify, this is the practical reason why were doing the "abort" solution rather than just stop marking C ABI functions as nounwind in LLVM.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jul 26, 2018

@diwic I would classify a "soundness bug" as a case where "safe Rust" can generate UB. Are you saying that this is the case here? (I'm legitimately unsure)

@nikomatsakis
Copy link
Contributor

But in any case I think that @Mark-Simulacrum's suggestion of "do not deprecate without at least offering some way to get back the old default" makes sense to me. I think I share these sentiments pretty much exactly:

I'm just trying to say that changing the default without providing a way to use the old behavior seems wrong and I'm against doing it. Furthermore, I'd prefer at least one cycle that users can explicitly opt-in to unwind(allowed) for FFI functions where unwinding isn't UB (i.e., they're unwinding into Rust code). I'm okay (though not happy) with us making a change to the default behavior so long as the opt-out is stabilized along side it.

@diwic
Copy link
Contributor

diwic commented Jul 26, 2018

@nikomatsakis

I would classify a "soundness bug" as a case where "safe Rust" can generate UB. Are you saying that this is the case here? (I'm legitimately unsure)

Yes, this is UB as of current Rust:

extern "C" fn bad() { panic!() }
fn main() { bad() }

Here's the LLVM IR (notice the nounwind attr) :

; playground::bad
; Function Attrs: nounwind uwtable
define internal void @_ZN10playground3bad17hd96a1f8f707dc090E() unnamed_addr #3 !dbg !676 {
start:
; call std::panicking::begin_panic
  call void @_ZN3std9panicking11begin_panic17h23bac1aac94eb1e7E([0 x i8]* noalias nonnull readonly bitcast (<{ [14 x i8] }>* @byte_str.7 to [0 x i8]*), i64 14, { [0 x i64], { [0 x i8]*, i64 }, [0 x i32], i32, [0 x i32], i32, [0 x i32] }* noalias readonly dereferenceable(24) bitcast (<{ i8*, [16 x i8] }>* @byte_str.6 to { [0 x i64], { [0 x i8]*, i64 }, [0 x i32], i32, [0 x i32], i32, [0 x i32] }*)), !dbg !678
  unreachable, !dbg !678
}

LLVM reference:

nounwind
This function attribute indicates that the function never raises an exception. If the function does raise an exception, its runtime behavior is undefined.

teor2345 pushed a commit to teor2345/tor that referenced this issue Sep 21, 2018
Until rust-lang/rust#52652 is fixed,
unwinding on panic is potentially unsound in a mixed C/Rust codebase.

The codebase is supposed to be panic-free already, but just to be safe.

This started mattering at commit d1820c1.

Fixes #27199; bugfix on tor-0.3.3.1-alpha.
teor2345 pushed a commit to teor2345/tor that referenced this issue Oct 22, 2018
Until rust-lang/rust#52652 is fixed,
unwinding on panic is potentially unsound in a mixed C/Rust codebase.

The codebase is supposed to be panic-free already, but just to be safe.

This started mattering at commit d1820c1.

Fixes #27199; bugfix on tor-0.3.3.1-alpha.
@alexcrichton
Copy link
Member

I've opened a PR for this at #55982 which does the "easy thing" of basically flipping the defaults.

I personally disagree that we need to take any further action at this time. We've got a long history of a "stable default" in Rust where the opt-out is unstable (allocators are the first that come to mind). Eventually the opt-out is stabilized in its own right, but for now we have an unstable control (the #[unwind] attribute).

Additionally I don't think this can really bake without actually testing, this will remain virtually undetected unless everyone opts-in to testing it, so the only real way to get testing is to actually flip the defaults and see what happens. That's what we did last time and it's easy to always flip the defaults back if something comes up!

@Centril Centril added the A-FFI Area: Foreign function interface (FFI) label Nov 20, 2018
@sdroege
Copy link
Contributor

sdroege commented Mar 14, 2021

extern "C" functions defined in Rust automatically get an abort-on-unwind shim, yes. For functions defined in other languages and called from Rust, it is your responsibility to ensure that they do not unwind.

Ok, so that's also implemented as part of that PR? Then I don't think anything else is needed here.

Right now in stable it's still UB to unwind across extern "C" functions. Usually it aborts, sometimes it does very interesting things.

I didn't notice anything in the PR description or the commit messages that suggested that it adds something to ensure that it always aborts, that's why I was asking.

@RalfJung
Copy link
Member

Ok, so that's also implemented as part of that PR?

Yes. This is controlled by this function.

@sdroege
Copy link
Contributor

sdroege commented Mar 15, 2021

Thanks for the confirmation, that's great news. I'm waiting for that to come back for years now :)

@RalfJung
Copy link
Member

Me too. :)

@elichai
Copy link
Contributor

elichai commented Jan 6, 2022

Is it possible to now write somewhere "official" that unwinding across extern "C" is well defined to abort?

@Amanieu
Copy link
Member

Amanieu commented Jan 12, 2022

The abort behavior is currently only enabled with #![feature(c_unwind)].

@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

(I'm assigning @BatmanAoD as owner of this issue, in accordance with T-compiler's shift towards a policy where all P-high/P-critical issues are either tagged with a WG-* label or have an explicit owner who can be contacted for status updates.)

@rustbot assign @BatmanAoD

@rustbot rustbot self-assigned this Feb 1, 2022
@jackh726 jackh726 added the WG-none Indicates that no working group is assigned to an issue, but it *does* have an active owner label Feb 3, 2022
@wesleywiser wesleywiser added wg-ffi-unwind and removed WG-none Indicates that no working group is assigned to an issue, but it *does* have an active owner labels Jun 24, 2022
@jackh726 jackh726 added WG-ffi-unwind Working group: FFI unwind and removed wg-ffi-unwind labels Jun 24, 2022
@wesleywiser wesleywiser added the S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. label Jun 24, 2022
@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2022

Maybe we should leave the issue open while #[unwind(allowed)] still exists, which is an unsound attribute.

This attribute is gone now, as far as I can see.
#74990 is the tracking issue for the c_unwind feature that will solve this.
I am not sure if this issue here is still tracking anything else than that?

@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2022

OTOH, the behavior without any feature gates is still unsound AFAIK, so there should be some I-unsound issue open to track that.

@BatmanAoD
Copy link
Member

BatmanAoD commented Jul 3, 2022 via email

@jackh726 jackh726 added the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Oct 11, 2023
@bstrie
Copy link
Contributor

bstrie commented Jul 1, 2024

Since #116088 has merged, can this issue be closed?

@RalfJung RalfJung closed this as completed Jul 2, 2024
@RalfJung
Copy link
Member

RalfJung commented Jul 2, 2024

Yes indeed :)

@SimonSapin
Copy link
Contributor Author

The description of #116088 is not super clear: does it also change extern "C" to abort if attempting to unwind across FFI?

Taking a step back: what is the guidance now for writing correct FFI code with callbacks? Where is that / should that be documented?

@RalfJung
Copy link
Member

RalfJung commented Jul 2, 2024

does it also change extern "C" to abort if attempting to unwind across FFI?

Yes.

Taking a step back: what is the guidance now for writing correct FFI code with callbacks?

If you write extern "C" functions in Rust, all outgoing panics will be turned into aborts.
If you call extern "C" functions from Rust, the compiler assumes that they cannot unwind (i.e., unwinding would be UB).

So if Rust code passes callbacks to a C library, extern "C" functions are the safe choice as they will never unwind. If the C library supports callbacks that unwind, you can define extern "C-unwind" functions instead.

Where is that / should that be documented?

Not sure, probably somewhere in the reference?

@BatmanAoD
Copy link
Member

Reference PR: rust-lang/reference#1226

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-FFI Area: Foreign function interface (FFI) C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-high High priority relnotes Marks issues that should be documented in the release notes of the next release. S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-ffi-unwind Working group: FFI unwind
Projects
None yet
Development

Successfully merging a pull request may close this issue.