Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using LTO while building as static library causes conflicts #44322

Open
CryZe opened this issue Sep 4, 2017 · 18 comments
Open

Using LTO while building as static library causes conflicts #44322

CryZe opened this issue Sep 4, 2017 · 18 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@CryZe
Copy link
Contributor

CryZe commented Sep 4, 2017

If you build your Rust project as a static library to link it into a C codebase, you usually want to crank up the optimizations, as also indicated by the librsvg blogpost (towards the end):

https://people.gnome.org/~federico/blog/librsvg-build-infrastructure.html

This is all fine, until you want to link in another library built with Rust, which is very likely if you want to incrementally introduce Rust into your codebase. At that point the linker will start complaining about collisions caused by symbols defined by the standard library:

./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `std::heap::__default_lib_allocator::__rdl_alloc':
/checkout/src/libstd/heap.rs:31: multiple definition of `__rdl_alloc'
./liblib1.a(lib1-25b44f601e0ef586.0.o):/checkout/src/libstd/heap.rs:31: first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_alloc_excess':
lib2.cgu-0.rs:(.text.__rdl_alloc_excess+0x0): multiple definition of `__rdl_alloc_excess'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_alloc_excess+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_alloc_zeroed':
lib2.cgu-0.rs:(.text.__rdl_alloc_zeroed+0x0): multiple definition of `__rdl_alloc_zeroed'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_alloc_zeroed+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_dealloc':
lib2.cgu-0.rs:(.text.__rdl_dealloc+0x0): multiple definition of `__rdl_dealloc'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_dealloc+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_grow_in_place':
lib2.cgu-0.rs:(.text.__rdl_grow_in_place+0x0): multiple definition of `__rdl_grow_in_place'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_grow_in_place+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_oom':
lib2.cgu-0.rs:(.text.__rdl_oom+0x0): multiple definition of `__rdl_oom'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_oom+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_realloc':
lib2.cgu-0.rs:(.text.__rdl_realloc+0x0): multiple definition of `__rdl_realloc'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_realloc+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_realloc_excess':
lib2.cgu-0.rs:(.text.__rdl_realloc_excess+0x0): multiple definition of `__rdl_realloc_excess'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_realloc_excess+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_shrink_in_place':
lib2.cgu-0.rs:(.text.__rdl_shrink_in_place+0x0): multiple definition of `__rdl_shrink_in_place'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_shrink_in_place+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `__rdl_usable_size':
lib2.cgu-0.rs:(.text.__rdl_usable_size+0x0): multiple definition of `__rdl_usable_size'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.__rdl_usable_size+0x0): first defined here
./liblib2.a(lib2-91c10be1e28e3b68.0.o): In function `rust_eh_personality':
lib2.cgu-0.rs:(.text.rust_eh_personality+0x0): multiple definition of `rust_eh_personality'
./liblib1.a(lib1-25b44f601e0ef586.0.o):lib1.cgu-0.rs:(.text.rust_eh_personality+0x0): first defined here

This only happens if you set lto = true.

@TimNN
Copy link
Contributor

TimNN commented Sep 17, 2017

I don't think linking two rust static libraries into the same binary is actively supported today, see also @alexcrichton's #44283 (comment) in another issue.

Still marking as a bug for the moment since it works without LTO.

@TimNN TimNN added C-bug Category: This is a bug. A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 17, 2017
@alexcrichton
Copy link
Member

@CryZe can you provide a script or repository to reproduce this? In theory linking together two staticlibs with LTO should work, although there may be bugs preventing it from doing so. In the long run you probably don't want to do this, though, as each Rust library you link in will bring in a new copy of the standard library, which ideally everyone would share.

@steveklabnik
Copy link
Member

Triage: Hey @CryZe , are you still seeing this? Alex asked for a reproduction a while ago, if we can't reproduce, then we can't fix this :)

@CryZe
Copy link
Contributor Author

CryZe commented Sep 12, 2019

Alright I made a repository here: https://github.com/CryZe/multi-lib-lto-rust-bug-repro
Just run make and it should fail:

gcc main.c -Llib1/target/release/ -Llib2/target/release/ -llib1 -llib2 -lpthread -ldl
lib2/target/release//liblib2.a(lib2-929bb870274e5eac.lib2.8t8q9ckg-cgu.0.rcgu.o): In function `rust_eh_personality':
/rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b//src/libpanic_abort/lib.rs:106: multiple definition of `rust_eh_personality'
lib1/target/release//liblib1.a(lib1-d10e44ca50aa3d7b.lib1.eqtft63h-cgu.0.rcgu.o):/rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b//src/libpanic_abort/lib.rs:106: first defined here
collect2: error: ld returned 1 exit status
Makefile:2: recipe for target 'a.out' failed
make: *** [a.out] Error 1

@awakecoding
Copy link

I am also investigating this issue. I did a bunch of tests, and it looks like lto = true is the only way to avoid having the __rdl_alloc symbols defined without name mangling. lto = false or lto = "off" both get them defined and cause the conflict. While my primary concern is with iOS, I did notice that the multi-lib-lto-rust-bug-repro doesn't link on macOS because of a symbol conflict on _rust_eh_personality. This symbol specifically doesn't seem to be affected by the lto option :/

@awakecoding
Copy link

@CryZe did you get any further on this? We are desperate for a workaround to make it work on iOS, where using shared libraries is not an option :(

@awakecoding
Copy link

I came up with a "solution" that is wrong on so many levels, but at least makes the @CryZe reproduction sample link:

PKG_NAME=lib1
cd $PKG_NAME
cargo clean
cargo build --release
cd target/release
ar x lib${PKG_NAME}.a
LANG= sed -i ‘’ “s/rust_eh_personality/rust_eh_personaliti/g” $(ls ${PKG_NAME}-*.rcgu.o)
ar cr lib${PKG_NAME}.a *.o

This effectively builds the first library, extracts the object files, does a search & replace on "rust_eh_personality" to rename it to "rust_eh_personaliti", and rebuilds the static library from the object files. By forcing the symbol change on one of the two libraries, the symbol conflict is avoided.

Now this is obviously a very ugly hack, and I don't know the impact of forcing a symbol name change on rust_eh_personality. Technically, does it need to keep this name, or could it be name mangled like the rest?

@awakecoding
Copy link

After digging through some older tickets, I found #43415 and it mentioned a different result when building in debug vs release. I modified the reproduction sample to build in debug, and there is no linker issue. It does significantly change the layout of internal object files and where the symbols are (extracting the .a with 'ar' to see where symbols are).

One thing I noticed is that the linker doesn't complain about duplicated symbols if they are identical, and inside an object file of the same name (and possibly identical in totality I'm guessing). When building in debug, rust_eh_personality is defined in the same object file for both lib1 and lib2:

panic_unwind-bf192f19d9586a96.panic_unwind.67c9yyht-cgu.0.rcgu.o

When built in release, rust_eh_personality is defined in lib1-93605d7bc2291864.lib1.1hx46jnd-cgu.0.rcgu.o and lib2-3edec0329c3e4919.lib2.64tkgp8c-cgu.0.rcgu.o respectively. We decompiled the symbols to see if they were identical, and they are in our case (empty function stubs) but the linker sees them as different. I'm pretty sure that if rust_eh_personality was compiled inside an object file of the same name like what happens when you build in debug, the linker problem would go away.

This would still mean that in order to get things to build and link correctly, one should use exactly the same version of Rust to avoid issues, but it would still be a much better option that being unable to link multiple static libraries together.

@awakecoding
Copy link

Nevermind about building as debug as a way to get it working, as it fails the same way as in release mode when using panic="abort":

[profile.dev]
panic = "abort"
lto = true

@awakecoding
Copy link

@alexcrichton @TimNN any idea what affects the compilation of the rust_eh_personality symbol specifically? There is this: https://github.com/rust-lang/rust/blob/master/src/librustc_codegen_ssa/back/symbol_export.rs#L118

I'm trying to figure out if there is a way to get this symbol definition built as part of a separate object file, like what happens when it gets built as part of panic_unwind.o (+ the name mangling). I am pretty sure this would get rid of the duplicated symbol issue once and for all.

It may seem like a hack, but it would actually work quite well. The recommendation would be to build with the same version of Rust to avoid potential issues with symbols that have slight differences, but this is acceptable in my mind. It is very similar to building static libraries against a compatible MSVC static runtime on Windows, except we're talking about the Rust core libraries here.

This type of issue needs to be fixed for cases where shared libraries are not a suitable option, like iOS. It is only going to become more common as more C libraries get rewritten to use Rust under the hood, causing conflicts when two of such libraries get linked together.

@kleisauke
Copy link
Contributor

kleisauke commented Jan 1, 2021

I also experience this bug when linking against two Rust staticlibs (i.e. librsvg and rav1e) which was built with "fat" LTO enabled. I could workaround this with:

LDFLAGS="-Wl,--allow-multiple-definition"

But it's something I'd rather avoid.

fwiw, the above mentioned link is now 404. It was probably linking to:

@kleisauke
Copy link
Contributor

FWIW, I had hoped that PR #92845 would fix this, but it seems that this issue is still reproducible with the latest nightly.

$ git clone https://github.com/CryZe/multi-lib-lto-rust-bug-repro.git
$ cd multi-lib-lto-rust-bug-repro/
$ make
gcc main.c -Llib1/target/release/ -Llib2/target/release/ -llib1 -llib2 -lpthread -ldl
lib2/target/release//liblib2.a(lib2-f2e7454a5f282da7.lib2.2398f8e6-cgu.1.rcgu.o): In function `rust_eh_personality':
/rustc/bc4b39c271bbd36736cbf1c0a1ac23d5df38d365/library/std/src/personality/gcc.rs:244: multiple definition of `rust_eh_personality'
lib1/target/release//liblib1.a(lib1-bfa64102e7d38c18.lib1.c0e19170-cgu.1.rcgu.o):/rustc/bc4b39c271bbd36736cbf1c0a1ac23d5df38d365/library/std/src/personality/gcc.rs:244: first defined here
lib2/target/release//liblib2.a(lib2-f2e7454a5f282da7.lib2.2398f8e6-cgu.1.rcgu.o):(.init_array.00099+0x0): multiple definition of `std::sys::unix::args::imp::ARGV_INIT_ARRAY'
lib1/target/release//liblib1.a(lib1-bfa64102e7d38c18.lib1.c0e19170-cgu.1.rcgu.o):(.init_array.00099+0x0): first defined here

@sdroege
Copy link
Contributor

sdroege commented Nov 10, 2022

I was just going to report the same thing after running into it and making a testcase. Funny enough, my testcase is almost the same as @CryZe's. This definitely still happens with latest stable and nightly as of today.

Depending on what the Rust staticlibs are doing, there are also a couple of other symbols that are conflicting:

  • ___rdl_alloc
  • ___rdl_alloc_zeroed
  • ___rdl_dealloc
  • ___rdl_oom
  • ___rdl_realloc
  • ___rg_oom
  • ___rust_alloc
  • ___rust_alloc_error_handler
  • ___rust_alloc_error_handler_should_panic
  • ___rust_alloc_zeroed
  • ___rust_dealloc
  • ___rust_drop_panic
  • ___rust_foreign_exception
  • ___rust_realloc
  • ___rust_start_panic
  • _rust_begin_unwind
  • _rust_eh_personality
  • _rust_oom
  • _rust_panic

This is a bit of a problem for GStreamer now that we started shipping plugins that are written in Rust. Both shared (cdylib) and static (staticlib) plugins are shipped, and both are built with LTO. For the cdylib ones there is obviously no problem here.

The staticlib ones however will cause exactly this symbol conflict here if linked in an application that also links to more than a single plugin, or probably also if such a staticlib (one or multiple) is linked into a Rust application.

Using LDFLAGS="-Wl,--allow-multiple-definition" would be a workaround for this but this only works reliable if all the Rust staticlibs are using the same version of the toolchain. Otherwise this would rely on the symbols to be 100% compatible between releases, which I don't think is guaranteed at all.

Alternatively, disabling LTO for the staticlibs also works but I'm not entirely sure why: the same symbols are still in each library so why don't they conflict when linking?

azat added a commit to azat/ClickHouse that referenced this issue May 7, 2023
Results are the same:

  upstream: SELECT hex(BLAKE3('bar')): F2E897EED7D206CD855D441598FA521ABC75AA96953E97C030C9612C30C1293D
  llvm:     SELECT hex(BLAKE3('bar')): F2E897EED7D206CD855D441598FA521ABC75AA96953E97C030C9612C30C1293D

Query for benchmark:

    SELECT ignore(BLAKE3(materialize('Lorem ipsum dolor sit amet, consectetur adipiscing elit'))) FROM numbers(1000000000) FORMAT `Null`

  upstream           : Elapsed: 2.494 sec. Processed 31.13 million rows, 249.08 MB (12.48 million rows/s., 99.86 MB/s.)
  upstream + rust lto:                                                             ~14.00 million rows/s
  llvm               : Elapsed: 3.053 sec. Processed 43.24 million rows, 345.88 MB (14.16 million rows/s., 113.28 MB/s.)

And note, that now, integrating_rust_libraries.md became deprecated.

P.S. LLVM implementation had been choosen over Rust + LTO, because there
are issues with linking multiple Rust libraries together with LTO:
- https://alanwu.space/post/symbol-hygiene/
- rust-lang/rust#44322
@Lupus
Copy link

Lupus commented Jun 26, 2023

Leveraging Rust in OCaml community is a growing trend nowadays, and we're running into the same problem. Individual OCaml libraries are dragging statically compiled lto-enabled .a Rust libs into the final linking phase, which then fails with ld complaints about multiple definition of. In my case the following symbols are reported:

  • rust_eh_personality
  • std::sys::unix::args::imp::ARGV_INIT_ARRAY

@bjorn3
Copy link
Member

bjorn3 commented Jun 26, 2023

See also #104707. If you are able to use dynamically linked libraries, the cdylib crate type doesn't have this issue.

@Lupus
Copy link

Lupus commented Jun 26, 2023

Unfortunately OCaml supports only C stubs as .a archives for compilation to native code, .so options is used for bytecode compilation, which is not really pratcial for production use as it's slow.

Actually after setting lto to off, I still see the same linking conflicts. Probably I'm running into the same issues, as described here: #73632 (comment)

@bjorn3
Copy link
Member

bjorn3 commented Jun 26, 2023

I see. It is really unfortunate that .a archives don't have a way to hide symbols.

@nirbheek
Copy link

This issue seems to be taking a while to get fixed, so I thought I'd share an ugly+hacky workaround that worked for us at GStreamer: rename the symbols. The proper way to do that involves using something like goblin, but in our case we just used ... sed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants