macos: linking errors with split-debuginfo and switching toolchains #9353

Closed
ehuss opened this issue Apr 12, 2021 · 7 comments · Fixed by #9418
Labels
C-bug Category: bug

Comments

@ehuss
Contributor

ehuss commented Apr 12, 2021

I am getting linking errors when switching between different toolchains on macos using split-debuginfo.

Reproduction

Create a new binary cargo project foo and add the following to Cargo.toml:

[profile.dev]
split-debuginfo = "unpacked"

Run in this order:

  1. cargo +stable build
  2. cargo +nightly build
  3. cargo +stable build

Here stable is 1.51 and nightly is 1.53. This also happens with the inverse order (nightly/stable/nightly) or when using beta 1.52. The last step results in the error:

Compiling foo v0.1.0 (/Users/eric/Temp/foo)
error: linking with cc failed: exit code: 1
|
= note: "cc" "-m64" "-arch" "x86_64" "-L" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/eric/Temp/foo/target/debug/deps/foo.133wdq044u90e7l6.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.14w9ua3dropu18oz.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.1hu7398ixp3j8hr7.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.2w3g7gdejx58fp4w.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.400r4iosyqzwmm34.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.4ew8o84rdo0pqe67.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.4pil1rvk4hlgrvxn.rcgu.o" "/Users/eric/Temp/foo/target/debug/deps/foo.59wg0cocv7vr0nej.rcgu.o" "-o" "/Users/eric/Temp/foo/target/debug/deps/foo" "/Users/eric/Temp/foo/target/debug/deps/foo.2f3bbrl2dzwum699.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/eric/Temp/foo/target/debug/deps" "-L" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-349f286494d73b18.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-0c9fcc24a503d489.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libobject-70419d92d1ba4b1d.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libaddr2line-65e88774cb68bd46.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libgimli-3849b3781a19a398.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_demangle-0dbb03fa66ca6d84.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libhashbrown-65edff8661311c85.rlib" 
"/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_alloc-599e707cd7ee7216.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-40cb05f6c516791a.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcfg_if-7a0a923a4d37a048.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-7e047938e88325ef.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-02542d835be27c0f.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_core-63712b18a1365082.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-1196a2a060497e71.rlib" "/Users/eric/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-10db70d883838cbc.rlib" "-lSystem" "-lresolv" "-lc" "-lm"
= note: Undefined symbols for architecture x86_64:
"std::rt::lang_start::h101df5f7d98767d0", referenced from:
_main in foo.4pil1rvk4hlgrvxn.rcgu.o
"core::fmt::Arguments::new_v1::h8105d8d713b7c82d", referenced from:
foo::main::hb50b1ea1f0945408 in foo.4pil1rvk4hlgrvxn.rcgu.o
"std::io::stdio::_print::h0aab2456a28edb0d", referenced from:
foo::main::hb50b1ea1f0945408 in foo.4pil1rvk4hlgrvxn.rcgu.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

I expected the 3rd build to rebuild from scratch and replace the output binary with success.

One curious thing I noticed is that the different toolchains produce mostly different .o filenames, but two of the filenames are the same. foo.4pil1rvk4hlgrvxn.o is one of those files. I am a bit confused, as I would assume rustc would have completely replaced any overlapping files, so I'm uncertain how conflicting filenames could be a problem here. I could imagine other scenarios involving rlibs where this would be a problem, but not with just a single binary.

Out of curiosity, I collected the unmangled names and here is the comparison:

| nightly | stable |
| --- | --- |
| foo.core.alx3lauo-in-foo.7k7dwitq-fmt.rcgu.o | foo.core.3y4jtb65-in-foo.7k7dwitq-fmt.rcgu.o |
| foo.core.alx3lauo-in-foo.7k7dwitq-hint.volatile.rcgu.o | foo.core.3y4jtb65-in-foo.7k7dwitq-hint.volatile.rcgu.o |
| foo.foo.7k7dwitq.rcgu.o | foo.foo.7k7dwitq.rcgu.o |
| foo.foo.7k7dwitq-fallback.cgu.rcgu.o | foo.foo.7k7dwitq-fallback.cgu.rcgu.o |
| foo.std.83sbzbw6-in-foo.7k7dwitq-process.rcgu.o | foo.std.76rm7o7m-in-foo.7k7dwitq-process.rcgu.o |
| foo.std.83sbzbw6-in-foo.7k7dwitq-rt.volatile.rcgu.o | foo.std.76rm7o7m-in-foo.7k7dwitq-rt.volatile.rcgu.o |
| foo.std.83sbzbw6-in-foo.7k7dwitq-sys-unix-process-process_common.rcgu.o | foo.std.76rm7o7m-in-foo.7k7dwitq-sys-unix-process-process_common.rcgu.o |
| foo.std.83sbzbw6-in-foo.7k7dwitq-sys_common-backtrace.volatile.rcgu.o | foo.std.76rm7o7m-in-foo.7k7dwitq-sys_common-backtrace.volatile.rcgu.o |
| foo.foo.7k7dwitq-crate.allocator.rcgu.o | foo.foo.7k7dwitq-crate.allocator.rcgu.o |

I think the underlying problem is that the -C metadata flag is the same across toolchains for binaries on macos. This causes the crate disambiguator to be the same, causing the cgu hash to be the same. This is exposed here. I think the solution will be to include the rustc version in the target_short_hash, though I don't remember if that needs to be stable across versions.
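
To illustrate the idea of folding the rustc version into the hash, here is a minimal sketch. It is not Cargo's actual implementation: `metadata_hash` and its parameters are hypothetical names, and `DefaultHasher` is only a stand-in for whatever hasher Cargo really uses. Passing `None` models the situation described above, where the toolchain does not influence the value.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for computing a `-C metadata` value.
/// `rustc_version = None` models a hash that ignores the toolchain.
fn metadata_hash(name: &str, version: &str, rustc_version: Option<&str>) -> String {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    version.hash(&mut hasher);
    // Including the toolchain version makes the result toolchain-specific.
    rustc_version.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn main() {
    // Without the toolchain in the hash, both toolchains get the same value.
    let shared = metadata_hash("foo", "0.1.0", None);
    println!("toolchain ignored:  {}", shared);

    // With the toolchain included, the values diverge.
    let stable = metadata_hash("foo", "0.1.0", Some("1.51.0"));
    let nightly = metadata_hash("foo", "0.1.0", Some("1.53.0-nightly"));
    println!("stable  (included): {}", stable);
    println!("nightly (included): {}", nightly);
}
```

The version strings are placeholders. Whether such a hash has to stay stable across rustc versions is the open question raised above, and this sketch does not answer it.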

There are a few other scenarios where Cargo reuses metadata hashes (described here) that will need to be investigated if they also suffer from this issue.

I'm still a bit confused why rustc doesn't just overwrite the files.

@ehuss ehuss added the C-bug Category: bug label Apr 12, 2021
@alexcrichton
Member

This looks like an odd interaction with incremental, if you set CARGO_INCREMENTAL=0 it works ok (or if you blow away target/debug/incremental between compiles). I think rustc is using the same incremental directory and with the same format on stable/nightly so it's copying out the stale object file?

I agree though that the fix here is to probably ensure that -Cmetadata is different on nightly/stable where it currently isn't.

@ehuss
Contributor Author

ehuss commented Apr 20, 2021

> I think rustc is using the same incremental directory and with the same format on stable/nightly so it's copying out the stale object file?

This has me worried that there might be a bigger underlying problem, but I don't see how it is possible. My understanding of the incremental cache is that it validates the rust version and if it doesn't match, it deletes the incremental cache. Any ideas how stale cache data could infect the new build?

@alexcrichton
Member

Oh I was just assuming that was the case since I think object files are stored separately from other metadata in the cache and so it may bypass other forms of version-checking logic? (I'm not sure if a version check mismatch invalidates just one piece of data or the entire incremental directory that was specified)

@ehuss
Contributor Author

ehuss commented Apr 20, 2021

Oh, I think I see what is happening! The .o file is hard-linked, and the hard link ends up forcing the .o file to be the same file between the two incremental directories. That is:

target/debug/incremental/foo-STABLE/.../HASH.o
and
target/debug/incremental/foo-NIGHTLY/.../HASH.o
and
target/debug/deps/foo.HASH.o

all end up being the same inode. Running the second command ends up overwriting the contents of another incremental directory, and then running the third command doesn't know its cache got modified.
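
The shared-inode claim can be checked with a small Unix-only sketch. The directory layout mirrors the three paths above, but the helper names (`link_three_ways`, `inode`) and the temporary directory are assumptions made for illustration; this is not rustc's code.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Creates `deps/foo.HASH.o` and hard-links it into two hypothetical
/// incremental directories. Returns the three paths.
fn link_three_ways(root: &Path) -> io::Result<[PathBuf; 3]> {
    let deps = root.join("deps");
    let stable = root.join("incremental/foo-STABLE");
    let nightly = root.join("incremental/foo-NIGHTLY");
    for dir in [&deps, &stable, &nightly] {
        fs::create_dir_all(dir)?;
    }
    let output = deps.join("foo.HASH.o");
    let stable_copy = stable.join("HASH.o");
    let nightly_copy = nightly.join("HASH.o");
    fs::write(&output, b"object code")?;
    // A hard link adds another name for the same inode; no data is copied.
    fs::hard_link(&output, &stable_copy)?;
    fs::hard_link(&output, &nightly_copy)?;
    Ok([output, stable_copy, nightly_copy])
}

/// Returns the inode number behind `path`.
fn inode(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.ino())
}

fn main() -> io::Result<()> {
    let root = std::env::temp_dir().join(format!("hardlink-demo-{}", std::process::id()));
    let paths = link_three_ways(&root)?;
    for path in &paths {
        println!("{} -> inode {}", path.display(), inode(path)?);
    }
    fs::remove_dir_all(&root)
}
```

All three lines should print the same inode number, which is why a write through any one of the paths is visible through the other two.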

@alexcrichton
Member

Oh interesting! I suppose this is a case where rustc is not removing a file before opening it for writing?

@joshtriplett
Member

I'm confused why those files would be hardlinked. I don't think we should be sharing object files between different versions of Rust; a build with cargo +nightly build should make sure all objects were built with that nightly toolchain.

@ehuss
Contributor Author

ehuss commented Apr 22, 2021

To clarify, the sequence is:

  1. first build with rustc stable:
    1. rustc stable writes .o files in the output directory (target/debug/deps)
    2. the .o files are hard-linked into the stable incremental directory (see copy_cgu_workproduct_to_incr_comp_cache_dir)
  2. second build with rustc nightly:
    1. rustc nightly writes .o files in the output directory (target/debug/deps). This does not unlink the existing file, but instead opens it and truncates it (see LLVMRustWriteOutputFile). Since this file is hard-linked into the stable incremental directory, the write also overwrites (corrupts) the stable incremental cache.
    2. the .o files are hard-linked into the nightly incremental directory
  3. third build with rustc stable:
    1. rustc stable loads and reuses the data from its incremental cache, but the cache contains the wrong data, so the linking fails.

I haven't finished my investigation on reworking the hash behavior. I think using the correct -C metadata flags should solve the problem. Another option is to change rustc to unlink the .o file before writing to it instead of truncating, but I'm not sure if that is necessary.
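
The difference between truncating in place and unlinking first can be modeled with a small sketch. This is a simulation under stated assumptions, not rustc's code: `build`, `WriteMode`, and `stable_cache_after_nightly_build` are hypothetical names, and the "object file" is just the toolchain name written as text.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// How the simulated compiler replaces an existing output file.
#[derive(Clone, Copy)]
enum WriteMode {
    /// Open the existing file and truncate it; the inode is reused.
    Truncate,
    /// Remove the path first, so the write lands in a fresh inode.
    UnlinkFirst,
}

/// Removes `path`, treating "not found" as success.
fn remove_if_exists(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
        _ => Ok(()),
    }
}

/// Models one build: write the object file, then hard-link it into the
/// incremental directory of `toolchain`.
fn build(root: &Path, toolchain: &str, mode: WriteMode) -> io::Result<()> {
    let output = root.join("deps/foo.HASH.o");
    let cache = root.join("incremental").join(toolchain).join("HASH.o");
    fs::create_dir_all(root.join("deps"))?;
    fs::create_dir_all(root.join("incremental").join(toolchain))?;
    if let WriteMode::UnlinkFirst = mode {
        remove_if_exists(&output)?;
    }
    // File::create truncates an existing file in place; it does not unlink it.
    File::create(&output)?.write_all(toolchain.as_bytes())?;
    fs::hard_link(&output, &cache)
}

/// Runs the stable build, then the nightly build, and returns what the
/// stable incremental cache holds afterwards.
fn stable_cache_after_nightly_build(root: &Path, mode: WriteMode) -> io::Result<String> {
    build(root, "foo-STABLE", mode)?;
    build(root, "foo-NIGHTLY", mode)?;
    fs::read_to_string(root.join("incremental/foo-STABLE/HASH.o"))
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir().join(format!("incr-demo-{}", std::process::id()));
    let truncated = stable_cache_after_nightly_build(&base.join("truncate"), WriteMode::Truncate)?;
    let unlinked = stable_cache_after_nightly_build(&base.join("unlink"), WriteMode::UnlinkFirst)?;
    println!("truncate in place: stable cache holds {:?}", truncated);
    println!("unlink first:      stable cache holds {:?}", unlinked);
    fs::remove_dir_all(&base)
}
```

With `Truncate`, the stable cache ends up holding the nightly contents, which matches the corruption in step 2.1. With `UnlinkFirst`, the stable cache keeps its original contents because the nightly write goes to a new inode.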
