Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only emit #[link_name] attribute if necessary. #1558

Merged
merged 3 commits into from
May 14, 2019

Conversation

michaelwoerister
Copy link
Member

@michaelwoerister michaelwoerister commented May 7, 2019

This PR makes bindgen not emit a #[link_name] attribute if it detects that it is not necessary.

Why do we want to do this?

Because when ThinLTO is performed across Rust/C/C++ language boundaries, the linker plugin will see both sides of the code at the LLVM IR level (as opposed to the machine code level). The #[link_name] attribute causes calls from Rust to C/C++ code to reference symbol names with backend-mangling applied to them (e.g. _foo@4 for a stdcall function) while the definition of the function on the C/C++ side has the name before this kind of mangling is applied (i.e. just foo for that same function). The linker (or respectively, the LLVM linker plugin) will thus not find any function named _foo@4 and report an undefined symbol error. The changes in this PR try to keep symbol names in sync at the LLVM IR level too (in more cases than before).

How does it work?

Before emitting an #[link_name] attribute, bindgen now checks if the Rust name will end up as the intended name anyway and in that case just not emit the attribute. It does so by looking at the Rust name (canonical_name), the proposed link_name, and the calling convention of a function. If the Rust name, with calling convention specific mangling applied to it, is equal to the link_name, then the attribute is not needed.

What are the downsides of this approach?

This approach does not work for C++ manglings with additional backend mangling applied, like for example __ZN3bar3FOOE (Itanium mangled C++ global with additional macOS-specific leading underscore). I think this applies to anything C++ on macOS :/

Could we do better?

Ideally, we'd add a #[link_name] attribute with the mangled name without the backend-specific mangling and without the leading \0 char. Then LLVM could just do the right thing. Unfortunately it seems that libClang does not provide this.

We could also try to actively remove backend part of mangled names; but doing so robustly on all platforms might be tricky.

@emilio, what do you think?

@michaelwoerister michaelwoerister force-pushed the link-name branch 5 times, most recently from a4cd80a to d466754 Compare May 8, 2019 10:27
@michaelwoerister michaelwoerister changed the title [WIP] Only emit #[link_name] attribute if necessary. Only emit #[link_name] attribute if necessary. May 10, 2019
@emilio
Copy link
Contributor

emilio commented May 12, 2019

This looks reasonable to me. As far as I can tell, regarding this bit:

I think this applies to anything C++ on macOS :/

This wouldn't be a regression, right? I agree that trying to un-apply the mangling depending on current target would be a bit brittle. It's what we were doing before doing we switched to #[link_name]...

We could try to add an API to LLVM for this, if somebody ends up needing it. But all the high-level APIs that libclang exposes also don't expose any options for this.


// This is something we don't recognize, stay on the safe side
// by emitting the `#[link_name]` attribute
Some(_) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth logging a debug! message or such? Otherwise no braces.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't think of a good message, so I removed the braces.


// Check that the suffix starts with '@' and is all ASCII decimals
// after that.
if suffix[0] != b'@' || !suffix[1..].iter().all(u8::is_ascii_digit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to have tests for this. Do we have them? I see call-conv-field has a change for @0, do we have one for longer suffixes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add one with a longer suffix.

@michaelwoerister
Copy link
Member Author

I think this applies to anything C++ on macOS :/

This wouldn't be a regression, right?

Apparently (according to https://bugzilla.mozilla.org/show_bug.cgi?id=1486042#c42) things already work on macOS (that is things seem to already work on macOs without this patch). That is a bit surprising to me. I need to double check if Clang includes the leading underscore there already in LLVM IR (instead of it being added by the backend).

I agree that trying to un-apply the mangling depending on current target would be a bit brittle. It's what we were doing before doing we switched to #[link_name]...

We could try to add an API to LLVM for this, if somebody ends up needing it. But all the high-level APIs that libclang exposes also don't expose any options for this.

LLVM has a few useful methods for this on DataLayout, e.g.:

  • hasMicrosoftFastStdCallMangling()
  • doNotMangleLeadingQuestionMark()
  • hasLinkerPrivateGlobalPrefix()
  • getGlobalPrefix()

Replicating the logic would still be complicated but at least the actual per-target settings would be defined in LLVM. Unfortunately, LLVM's C interface does not expose any of these methods.

@michaelwoerister
Copy link
Member Author

I've added some fastcall and stdcall tests but they seem to fail for LLVM 3.8. Looking at other tests, it seems like it is expected for LLVM 3.8 to behave differently. Is there a way I can make a test apply to all LLVM versions except one? Or do I have to duplicate the test case for all LLVM versions?

@michaelwoerister
Copy link
Member Author

For reference: I just verified that the current version of this PR solves the cross-language ThinLTO problem for Firefox on x86 Windows.

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, for some reason I didn't get the notification from Github that you had pushed.

@emilio emilio merged commit 01a3e62 into rust-lang:master May 14, 2019
@michaelwoerister
Copy link
Member Author

Sorry, for some reason I didn't get the notification from Github that you had pushed.

Yeah, GH doesn't notify about that. I planned to ping you but wanted to wait until travis went green.

Can I help with publishing this on crates.io somehow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants