Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Give the nounwind LLVM attribute to all public APIs #208

Closed
wants to merge 1 commit into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Jul 9, 2019

This PR gives all public APIs the nounwind LLVM attribute by making them extern "C". Note that extern "C" only changes the "call ABI" of the functions, it does not change name mangling. A different attribute, like #[no_mangle], would be required change name mangling to C name mangling, but that is not done here.

For all types that this library uses, the call ABI of the Rust and the C calling convention is the same, so this PR does not change that either.

@gnzlbg gnzlbg changed the title Give the LLVM attribute to all public APIs Give the nounwind LLVM attribute to all public APIs Jul 9, 2019
@alexcrichton
Copy link
Member

Is there a measurable benefit to this PR? I understand what nounwind is and its possible benefits, but given that these functions are rarely called and the expected usage of this crate I don't really understand what benefit this will bring. A downside is that it makes this crate a bit more odd to read/write and it's "one more thing" that's different from idiomatic Rust

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 10, 2019

Is there a measurable benefit to this PR?

This removed some panic code in #198 .

ut given that these functions are rarely called

I wouldn't say rarely. Calling math functions in hot loops is something a lot of code does, and this is required to replace these functions via compiler-builtins.

A downside is that it makes this crate a bit more odd to read/write and it's "one more thing" that's different from idiomatic Rust

#198 automatically enforced this, but that enforcement is not part of this PR.

@alexcrichton
Copy link
Member

Is there a measurable benefit to this PR?

This removed some panic code in #198 .

FWIW this is a somewhat frustrating sentiment to read. There's a massive number of changes in #198 and attributing one small portion of that can't really easily be verified and it could easily be mistaken for any other change in the PR. I've theorized that what's happening here is if extern causes panics to be removed then it's papering over an existing bug, but the response is "but it works"?

Calling math functions in hot loops is something a lot of code does, and this is required to replace these functions via compiler-builtins.

This is misunderstanding how these functions are called though. No one is explicitly calling these functions but it's typically through LLVM's lowering of intrinsics, which don't at all take into account nounwind since they blindly assume it. Changing this crate has zero effects on callers.

A downside is that it makes this crate a bit more odd to read/write and it's "one more thing" that's different from idiomatic Rust

#198 automatically enforced this, but that enforcement is not part of this PR.

Enforcing it isn't really addressing the issue though? It's just enforcing that it looks weird, but papers over the fact that it still looks different than normal idiomatic code. Like it's definitely better to verify it's all consistent, but I continue to not understand why this change is being made in the first place.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 10, 2019

I've theorized that what's happening here is if extern causes panics to be removed then it's papering over an existing bug, but the response is "but it works"?

I don't know how that theory fits with the fact that #[inline] also removes the panics.

My theory is that, when some functions are too large, LLVM might fail to infer nounwind as a function attribute, for the whole function (e.g. because constant propagation from the call site is missing). Since large functions are often not inlined, we end up with a function call to a "unwind" function inside another function, and LLVM needs to emit panic handling code.

This is why both, inlining, and extern "C", separately, solve the issue. One inlines the code, so LLVM does not need to emit any panic handling code "just in case", and the other tells llvm that unwinding cannot happen, so it can remove such code.


This is misunderstanding how these functions are called though.

I said that these functions are called via compiler-builtins. LLVM math intrinsics, when not optimized away, generate calls to the C libm symbols, and compiler-builtins provides these symbols, such that linking doesn't fail. These symbols need to be extern "C" and have special link names.

What I don't understand is why these functions are not made extern "C" here. What's the point ?

For example, you mentioned that making this extern "C" could be "papering over an existing bug", but what compiler-builtins does is:

pub extern "C" fn libm_api(...) -> ... { libm::libm_api(...) }

If libm::libm_api gets inlined, nounwind would apply to it, triggering UB if it indeed had a bug. If this is the only call to libm::libm_api in the binary, it would be sound for LLVM to make libm::libm_api nounwind and optimize it accordingly. This also could introduce undefined behavior in case of a bug.

So I don't see the point of marking these functions as nounwind when we actually use them, but not here, where we actually test them. That feels subtly complicated to me, and since the plan is to only call these via the llvm.intrinsics, the "Rust" ABI that we are testing these with will never be actually used.

This also complicates things if we want to meaningfully test the library, by running a libm testsuite for C. We could wrap the library in a C API here as well for these tests... but then we need to "duplicate" part of the compiler-builtins code for that.

@alexcrichton
Copy link
Member

Er ok hm so there's a lot do digest there, I'll try responding to all the pieces though:

My theory is that, when some functions are too large, LLVM might fail to infer nounwind as a function attribute

This isn't quite true. LLVM has pretty trivial inference of the nounwind attribute (it's about as naive as you might think). If a function only calls other functions that are nounwind (or it calls 0 functions) then it itself is nounwind. That's basically it AFAIK. (maybe some tiny nuance with intrinsics but "meh")

This is why both, inlining, and extern "C", separately, solve the issue.

I've been thinking about why panics are "going away", and I think it just gets down to #[no_panic] isn't a great attribute. It's as good as it can get, but it's not great in general.

The reason #[inline] seems to remove panics is it just means that each function definition doesn't actually call any other functions. It's just pure math so trivially it's nounwind as inferred by LLVM, so the attribute works.

The reason extern seems to remove panics is that rustc will emit nounwind annotations itself, causing LLVM to infer that the function only calls nounwind functions so is itself nounwind and therefore doesn't panic.

I quite strongly feel #[inline] is the wrong solution, and extern could work but it's not really buying us much because these functions all already don't panic. It seems a shame to lose this sort of annotation for not panicking, but it's already pretty unergonomic as well with lots of #[cfg_attr] nonsense. I think I'd prefer to just remove the assertion check since it's not really buying us anything other than pain on CI, and it's not exactly preventing regressions or catching them.

If libm::libm_api gets inlined, nounwind would apply to it, triggering UB if it indeed had a bug

This is not true, if an extern function panics it aborts the program, it is not UB. This is something fixed in more recent verisons of Rust, and has forever been the intention of these functions.

That feels subtly complicated to me, and since the plan is to only call these via the llvm.intrinsics, the "Rust" ABI that we are testing these with will never be actually used.

I think this isn't quite right? I merely inherited this crate and the main use case I'm aware of is that it's integrated into compiler-builtins. There's way more intrinsics in this crate than compiler-builtins uses, and there are PRs to add more. I assume that means someone is using this code for something else, such as an external crate which just calls into the Rust ABI.

This also complicates things if we want to meaningfully test the library, by running a libm testsuite for C. We could wrap the library in a C API here as well for these tests... but then we need to "duplicate" part of the compiler-builtins code for that.

As I've mentioned before, we always have to wrap APIs. No matter what we do with the ABI we can't change the symbol mangling, so an unmangled wrapper is always required.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 11, 2019

This is not true, if an extern function panics it aborts the program, it is not UB. This is something fixed in more recent verisons of Rust, and has forever been the intention of these functions.

Note that this is not fixed yet, and it is unclear when a fix will land. Right now, if an extern "C" function panics, the panic escapes from a nounwind function, and you get UB.


I quite strongly feel #[inline] is the wrong solution,

I agree. #[inline] shouldn't be necessary.


This isn't quite true. LLVM has pretty trivial inference of the nounwind attribute (it's about as naive as you might think). If a function only calls other functions that are nounwind (or it calls 0 functions) then it itself is nounwind. That's basically it AFAIK. (maybe some tiny nuance with intrinsics but "meh")

So I think I didn't properly explain what I meant. Given:

fn foo(x: f32) {
    // ... huge function  ..
    if always_false(x) { panic!() }
}

fn libm_api(x: f32) {
    // ...huge function...
    foo( map_such_that_foo_neven_panics(x) ); 
}

In foo there is a panic!, which cannot ever trigger in practice for any inputs. LLVM doesn't optimize this away, because the function is large, and the manipulations on x that cause the panic never to trigger are not visible in foo. So foo doesn't get nounwind, and therefore, libm_api doesn't get nounwind either.

When foo is inlined into libm_api, the panic branch is removed, and foo becomes nounwind.

So arguably, the error here is in the code that inserted the panic branch in foo. When porting C code to Rust, unsigned arithmetic should be Wrapping, and signed arithmetic should be Unchecked. This is what fixed the segfaults in the fma functions.

I think we should identify using no_panic the functions where this kind of situation is going on, and try to fix those. We could initially use #[inline] for those functions in which they are needed, and then rewrite those slowly to use Wrapping or similar over time to remove the #[inline].


There's way more intrinsics in this crate than compiler-builtins uses, and there are PRs to add more. I assume that means someone is using this code for something else, such as an external crate which just calls into the Rust ABI.

AFAICT, compiler-builtins doesn't expose all intrinsics required, and if we were to expose all float methods in libcore, which we don't yet do, we would run into linker errors until that is fixed.

People using this crate were using the F32Ext and F64Ext traits to add the libstd float methods to the libcore types. Those traits are gone, so this use case isn't supported anymore.

We should expose the methods in libcore, and tell people to use those directly. I think this is a much better solution, since it allows LLVM to perform higher-level optimizations on floating-point math.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 11, 2019

As I've mentioned before, we always have to wrap APIs. No matter what we do with the ABI we can't change the symbol mangling, so an unmangled wrapper is always required.

You mean that a #[no_mangle] wrapper is always required, right ?

@burrbull
Copy link
Contributor

since the plan is to only call these via the llvm.intrinsics

WTF!?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 11, 2019

@burrbull you would call f32.sqrt() in libcore, that generates a llvm.sqrt(f32) call, which LLVM can optimize. After optimizations, that might be replaced with a call to the _sqrtf symbol, which libm provides by linking it properly (*).

Were you to instead call libm::sqrtf directly instead, LLVM does not know that this function does a sqrtf, so you lose LLVM high-level math optimizations.

I think this is the main use case for the crate.

(*) right now this is done in compiler-builtins for this libm, but I wouldn't mind being able to link whatever libm you want, e.g., there are better libms available than the system one, and other compilers like clang let you pick them. I feel its a bit weird that this particular libm is special.

@alexcrichton
Copy link
Member

Er sorry there's still a lot to respond to, so I'm going to try to continue to go piece by piece:

Right now, if an extern "C" function panics, the panic escapes from a nounwind function, and you get UB.

This is not the case on nightly, and the patch has been perpetually backed out for "reasons" on stable/beta for like 5 releases. I think there's a PR to actually revert the nightly behavior and leave it as UB. In any case this really feels like an extreme case of lawyering that isn't super productive for this conversation. The fact of the matter is that this is a bug in the language/compiler and we shouldn't be designing around it (especially as there's no practical impact right now).

I think we should identify using no_panic the functions where this kind of situation is going on, and try to fix those

My point boils down to no_panic has false positives. If you slap no_panic on something and it says it might panic, that's all it means is that it might panic, not that it will panic for some input. The attribute is just super brittle and probably not worth it to keep in this repository due to the great lengths one has to go to to actually use it.

Those traits are gone, so this use case isn't supported anymore.

I don't think this is true, the functions are still public. The functions were always a different style of using these methods and presumably someone is using this crate because APIs keep getting added.

We should expose the methods in libcore, and tell people to use those directly. I think this is a much better solution, since it allows LLVM to perform higher-level optimizations on floating-point math.

No, we have intentionally not done this. Many routines here have more optimized versions in libm and we do not want to regress performance. We can't move functions to libcore unless the performance is on parity with known implementations.


As a note, I've literally spent at least half an hour every day this week dealing with this PR and related issues. This is way too much time to be spending on whether or not we should tag APIs with extern. There's so many other higher value things to be doing in Rust...

I'm quite exhausted and don't really want to have to keep debating this. I'm very seriously considering just removing no-panic entirely as a dependency and forgetting about this.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 12, 2019

This is not the case on nightly,

Which nightly? This is with today's nightly: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3cf9bf07128c54c92daee6ffd6537b92

If you slap no_panic on something and it says it might panic, that's all it means is that it might panic, not that it will panic for some input.

The only thing no_panic does is check whether a function is nounwind. If it isn't nounwind, it fails. This does not mean, that the function can panic. It does however mean, that if the function cannot panic, LLVM has failed to infer nounwind.

We can't move functions to libcore unless the performance is on parity with known implementations.

I didn't suggest this ? Where are you getting this ?

@gnzlbg gnzlbg closed this Jul 12, 2019
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 12, 2019

I'll just use a fork, I agree that this is costing too much time.

I might need to fork compiler-builtins as well, but such is life.

@alexcrichton
Copy link
Member

The "command dumped core" when you run that is the llvm.trap() running which forcibly aborts the process if an extern function panics. If you run it on stable it panics as usual.

I interpreted "We should expose the methods in libcore" as "we should move the functions into compiler-builtins so the intrinsic calls can be moved from libstd to libcore so we can access all this functionality from libcore". That leads to the performance issues I mentioned.

I'll just use a fork, I agree that this is costing too much time.

I might need to fork compiler-builtins as well, but such is life.

To be clear I think it won't be productive to have entire forks of this. It'll just end up causing confusion for users about which to use. AFAIK this is all just internal implementation details of this crate and isn't anything fundamental, which is why I'm not super interested in litigating this further. If you see a reason for a full fork to proceed independently that's different and I would prefer to avoid that if possible since it doesn't really end up benefitting many in the long run

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 12, 2019

The "command dumped core" when you run that is the llvm.trap() running which forcibly aborts the process if an extern function panics. If you run it on stable it panics as usual.

That makes sense.

I interpreted "We should expose the methods in libcore" as "we should move the functions into compiler-builtins so the intrinsic calls can be moved from libstd to libcore so we can access all this functionality from libcore". That leads to the performance issues I mentioned.

I did mean that we should move the intrinsics from libstd to libcore. What causes the performance issues you mention is using this libm unconditionally for all targets and binaries, but I did not suggest doing that.

We should have an option to denote that a crate links libm, and if no crate in the dependency graph links libm, then and only then we should link a libm. For most targets, if libstd is linked, that would be the system libm, but for some targets (e.g. wasm) or #![no_std] binaries, that could be this libm. Sometimes, the user wants to specify which libm should be linked, and it should be as easy as adding a crate that links libm to the dependency graph.

Clang and gcc both allow doing this for C (and much more, like working around ABI differences across libm implementations - we could work around those in the crate that links libm).

I don't know if this makes sense, but I need to be able to do floating-point math from libcore, and I want to properly do so by calling LLVM intrinsics, and then resolving these symbols at link time, like all other toolchains do. I am currently manually calling this (and other) libm functions manually in #![no_std], and that's just wrong. I'm missing basic LLVM optimizations, I can't dynamically override the libm because the symbols differ, if a system libm is available it is not picked, I can't query from a Rust crate whether libstd will be linked into the final binary easily to decide whether a library should link a libm or not, etc.

@Schultzer
Copy link
Contributor

I actually thought the whole point was to provide math functions to libcore and, if these are not the most optimized functions then at least it should link to those there are.

It's kinda weird that libcore don't have any math functions, I would argue thats an essential part of any programming language.

@alexcrichton
Copy link
Member

We can basically design w/e system we want for linking libm, but that'll likely require new features/support from rustc itself. Today we're stuck with "if the symbols are in compiler-builtins then libcore unconditionally uses them" which isn't always what we want.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 12, 2019

@alexcrichton This might be a dumb question, but how does memcpy work ?

memcpy is available in libcore, and that lowers to llvm.memcpy. There are memcpy symbols in compiler-builtins, but each C library provides its own version, which should be preferred if available.

This is very similar to the float intrinsics. So how does that work? Is the compiler-builtins version used unconditionally, or is the libc version used when available ?

@alexcrichton
Copy link
Member

The memcpy symbols in compiler-builtins are conditionally compiled in and aren't present on platforms where libc typically has them. They're only there for platforms like embedded arm/wasm.

(to be specific they're actually always compiled in apparently checking locally, but they're only #[no_mangle] conditionally on these platforms).

Float intrinsics are sort of the same way but much newer in some regards. On platforms like wasm the intrinsics are always compiled in, on other platforms they're omitted (but this time entirely I think).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants