Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep codegen units unmerged when building compiler builtins #70846

Merged
merged 1 commit into from
Apr 7, 2020

Conversation

tmiasko
Copy link
Contributor

@tmiasko tmiasko commented Apr 6, 2020

Make it possible to control how mono items are partitioned into code generation
units, when compiling the compiler builtins, by retaining the original partitioning.

Helps with #48625, #61063, #67960, #70489.

r? @alexcrichton

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 6, 2020
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems plausible to me! I think though this isn't exactly what we want, but we can still work with it as-is. Ideally we want one-symbol-per-object file in compiler-builtins, but this is only preventing merging codegen units. In the ideal world the compiler's partitioning algorithm would place all #[no_mangle] symbols in compiler-builtins in their own CGU.

That being said I think this is by far the easiest way to implement this in the compiler, and I'd probably say we should keep this. To fully fix the referenced issues, though, we'll need to make changes to compiler-builtins too. I think we'll want to ensure that we have a module-per-symbol in compiler-builtins to ensure that we have a CGU-per-symbol.

Does that make sense?

if tcx.is_compiler_builtins(LOCAL_CRATE) {
// Compiler builtins require some degree of control over how mono items
// are partitioned into compilation units. Provide it by keeping the
// original partitioning when compiling the compiler builtins crate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this comment be updated to basically indicate that for the compiler-builtins crate specifically we want one-symbol-per-object-file like most C implementations do? And comment as well about how that appeases linkers where the goal is that we're trying to avoid duplicate symbol errors when other compiler-rt libraries probably have the same symbols but highly likely the same API too

@tmiasko
Copy link
Contributor Author

tmiasko commented Apr 6, 2020

The goal would be to obtain similar partitioning as other libraries we are likely to link-in (libgcc primarily). Usually an individual object file for a builtin, but not always. The approach here provides enough flexibility for that, although it does require additional changes in compiler builtins crate.

Placing each symbol individually would create similar issues in other direction.

@alexcrichton
Copy link
Member

Ah yeah that's a good point! All-in-all I'm ok with this change as-is for rustc, but I think we'll likely need to edit some things on the compiler-builtins side of things after this lands as well (or in parallel, either way!)

@petrochenkov
Copy link
Contributor

@tmiasko
Could you explain the background here a bit, why does keeping symbols in compiler-builtins in separate object files help with the linked issues?

I've seen these conflicts between compiler-builtins and libgcc myself and mitigated them with --allow-multiple-definition, and I've also seen this this code

// Many of the symbols defined in compiler-rt are also defined in libgcc.
// Android's linker doesn't like that by default.
base.pre_link_args
.get_mut(&LinkerFlavor::Gcc)
.unwrap()
.push("-Wl,--allow-multiple-definition".to_string());

applying that workaround globally on Android, so I kind of thought the conflicts are unavoidable and "by design".

@alexcrichton
Copy link
Member

Oh wow I had no idea we did that. Turns out the inclusion of that flag dates back quite awhile... (and I even r+'d it)

In any case the reason for this change has to do with how linkers work with conflicting symbols. The compiler-builtins crate defines symbols that are highly likely to clash with other implementations (think memcpy). One way the clash can arise is if you generate a staticlib (which includes compiler-builtins) and then you link that with your native compiler which brings in something like libgcc/libcompiler-rt. This basically sets up a situation for a symbol clash.

The way the linker works is that it only loads an object file from an archive if it actually needs a symbol in the object file. The linker then generates an error (by default) if the same symbol is defined twice. The reason we don't hit this error all over the place today is mostly by luck rather than anything else. One common way this can arise is a scenario like this:

  • First the native code you're compiling with references an intrinsic Rust doesn't need, so it's loaded from the system libgcc. This system libgcc object file which defines the symbol may define another symbol as well.
  • Next you load an unrelated intrinsic which is defined with Rust (say, for example, a 128-bit arithmetic intrinsic).
  • An error happens if the Rust object file happens to define one of the same symbols of the original C object file. Even if you don't use the symbol it still generates an error.

The goal of this PR is to allow compiler-builtins to work similarly to other libgcc/libcompiler-rt implementations which is to isolate symbols to their own object file (most of the time, not always). This means that if the linker loads a symbol from compiler-builtins in Rust it'll only get that one symbol, no extras. That way it's guaranteed to not clash with other objects.

I believe some "uber portable" C libraries often do this where they'll define a function-per-object file. Some of this is for symbol visibility, but historically some of it has also been for smaller linked binaries when you aren't otherwise passing -ffunction-sections and -Wl,--gc-sections.

tl;dr; this'll give fine-grained control over object file-to-symbol allocation for compiler-builtins to allow isolating symbols to their own object file as necessary to avoid clashes with system libraries which define the same intrinsics.


My personal preference is to not pass flags like --allow-multiple-definition since it requires all consumers to pass it. If we can solve this ourselves without that flag then no one has to pass it, including rustc. (we might even be able to remove that ancient block for android)

@petrochenkov
Copy link
Contributor

@alexcrichton
Thanks!
This part

linker works is that it only loads an object file from an archive if it actually needs a symbol in the object file.

was new to me, I assumed that if you link libfoo.a and libbar.a together they will conflict if they have symbols with the same name in any of their object files.

I agree that --allow-multiple-definition is rather a workaround than a proper solution and should be avoided if it's possible.

@alexcrichton
Copy link
Member

Yeah if all it took was symbols in an archive to overlap with another archive, we would have had major problems long ago with this issue (compiler-builtins basically always duplicates at least one symbol in libgcc). I sort of suspect that "feature" was actually an optimization to linkers at some point which ended up becoming a defining feature, but hey it's what we've got to work with.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Apr 6, 2020

📌 Commit c8b83ba has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 6, 2020
Centril added a commit to Centril/rust that referenced this pull request Apr 7, 2020
…nits, r=alexcrichton

Keep codegen units unmerged when building compiler builtins

Make it possible to control how mono items are partitioned into code generation
units, when compiling the compiler builtins, by retaining the original partitioning.

Helps with rust-lang#48625, rust-lang#61063, rust-lang#67960, rust-lang#70489.

r? @alexcrichton
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 7, 2020
…ievink

Rollup of 5 pull requests

Successful merges:

 - rust-lang#70201 (Small tweaks in ToOwned::clone_into)
 - rust-lang#70762 (Miri leak check: memory reachable through globals is not leaked)
 - rust-lang#70846 (Keep codegen units unmerged when building compiler builtins)
 - rust-lang#70854 (Use assoc int submodules)
 - rust-lang#70857 (Don't import integer and float modules, use assoc consts 2)

Failed merges:

r? @ghost
@bors bors merged commit f56fb51 into rust-lang:master Apr 7, 2020
@tmiasko tmiasko deleted the compiler-builtins-codegen-units branch April 7, 2020 17:07
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Apr 23, 2020
… r=cramertj

Ignore -Zprofile when building compiler_builtins

rust-lang#70846 made the `compiler_builtins` crate ignore the default codegen-units setting and instead always split each function into a different codegen unit.

This unfortunately breaks `-Zprofile` which requires a single codegen unit per crate (see rust-lang#71283). You can notice this when building with `cargo -Zbuild-std` and `RUSTFLAGS` containing `-Zprofile`.

This PR works around this issue by just ignoring `-Zprofile` for the `compiler-builtins` crate.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Apr 25, 2020
… r=cramertj

Ignore -Zprofile when building compiler_builtins

rust-lang#70846 made the `compiler_builtins` crate ignore the default codegen-units setting and instead always split each function into a different codegen unit.

This unfortunately breaks `-Zprofile` which requires a single codegen unit per crate (see rust-lang#71283). You can notice this when building with `cargo -Zbuild-std` and `RUSTFLAGS` containing `-Zprofile`.

This PR works around this issue by just ignoring `-Zprofile` for the `compiler-builtins` crate.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jun 15, 2020
This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 16, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 16, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 16, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
RalfJung added a commit to RalfJung/rust that referenced this pull request Jun 19, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Mark-Simulacrum pushed a commit to Mark-Simulacrum/rust that referenced this pull request Jul 10, 2020
This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants