Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to link regex crate with "-Z instrument-coverage" on x86_64-pc-windows-msvc #85461

Closed
ririsoft opened this issue May 19, 2021 · 20 comments · Fixed by #91470
Closed

Unable to link regex crate with "-Z instrument-coverage" on x86_64-pc-windows-msvc #85461

ririsoft opened this issue May 19, 2021 · 20 comments · Fixed by #91470
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) C-bug Category: This is a bug. O-windows-msvc Toolchain: MSVC, Operating system: Windows

Comments

@ririsoft
Copy link

Hello,

Linking regex crate on x86_64-pc-windows-msvc fails when using -Z instrument-coverage.
The link succeeds when using -Z instrument-coverage=except-unused-functions

# Cargo.toml
[package]
name = "foobar"
version = "0.1.0"
authors = ["foobar"]
edition = "2018"

[dependencies]
regex = "1.5.4"
// main.rs
fn main() {
    use regex::Regex;
    let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();
    assert!(re.is_match("2014-01-01"));
}

I expected to see this happen: linking successful.

Instead, this happened:

error: linking with `link.exe` failed: exit code: 1227

note: libmemchr-40eebd4a74e50761.rlib(memchr-40eebd4a74e50761.memchr.2b5ledse-cgu.14.rcgu.o) : fatal error LNK1227: conflicting weak extern definition for '_RINvNtNtNtCs838Isxeq56n_6memchr6memmem9prefilter11genericsimd14find_in_chunk3pEB8_'.  New default '.weak._RINvNtNtNtCs838Isxeq56n_6memchr6memmem9prefilter11genericsimd14find_in_chunk3pEB8_.default._RINvMNtCs93I3MPx9COt_4core5sliceSh13get_uncheckedINtNtNtB5_3ops5range7RangeTojEECs838Isxeq56n_6memchr' conflicts with old default '.weak._RINvNtNtNtCs838Isxeq56n_6memchr6memmem9prefilter11genericsimd14find_in_chunk3pEB8_.default._RINvMNtNtCs93I3MPx9COt_4core3ptr9const_ptrPh4castuECs838Isxeq56n_6memchr' in libmemchr-40eebd4a74e50761.rlib(memchr-40eebd4a74e50761.memchr.2b5ledse-cgu.3.rcgu.o)

Meta

rustc --version --verbose:

rustc +nightly --version --verbose       
rustc 1.54.0-nightly (3e99439f4 2021-05-17)
binary: rustc
commit-hash: 3e99439f4dacc8ba0d2ca48d221694362d587927
commit-date: 2021-05-17
host: x86_64-pc-windows-msvc
release: 1.54.0-nightly
LLVM version: 12.0.1
@ririsoft ririsoft added the C-bug Category: This is a bug. label May 19, 2021
@jonas-schievink jonas-schievink added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label May 19, 2021
@richkadel
Copy link
Contributor

I'm adding a note here because my brain gets confused by github's bold-text "mentioned this issue" notes. I see the "No counters provided the source-hash" and jump to the wrong conclusion that this is related. (I think I've done this more than once.)

So, NOTE TO SELF, this issue is NOT related to the "No counters provided the source hash" bug...

Thanks and sorry for the churn... 😄

@ririsoft
Copy link
Author

ririsoft commented Jun 1, 2021

I'm adding a note here because my brain gets confused by github's bold-text "mentioned this issue" notes. I see the "No counters provided the source-hash" and jump to the wrong conclusion that this is related. (I think I've done this more than once.)

So, NOTE TO SELF, this issue is NOT related to the "No counters provided the source hash" bug...

Thanks and sorry for the churn... smile

Even worth, I suspect you deleted your previous comment 😋. In my mail box I got from github on your behalf, I have the message "This should be fixed. There was a brief window in the nightly builds (less than a week I think) where this was a known bug. It was resolved."), but I have no trace of it in the conversation, so I am even more lost with just above message. If you wish you can edit messages instead of deleting. Users can see that your message was edited and they can access edition history to see what changed.

But considering all the effort you are giving with instrument coverage, and how important this is to Rust (in my humble opinion), you are fully forgiven 😄

@EFanZh
Copy link
Contributor

EFanZh commented Oct 24, 2021

I see instrument-coverage is about to be stabilized: #90132. Will this issue be solved before stabilization?

@richkadel
Copy link
Contributor

@wesleywiser I rarely have a useable windows configuration for debugging coverage. Any chance you can look into this?

Also note the theory and workaround suggested in microsoft/windows-rs#1006 (comment)

I think stabilizing the flag does not imply every edge case is understood and resolved so it may not block stabilization, but I haven't had any time to look into this one since it was filed. I suspect it has to do with generating function symbols for unused functions in the covmap generation code but not sure if something else is involved (macros were mentioned in the linked comment).

Since I'm out for 3 weeks I'm not going to have a chance to look at this for at least that long. Seems very rare too.

Thanks!

Cc: @tmandry

@wesleywiser
Copy link
Member

@richkadel Yep, I'll take a look this week!

@BlackHoleFox
Copy link
Contributor

BlackHoleFox commented Nov 10, 2021

Seems very rare too.

I hit this with the webpki crate today as well, so any project with rustls in its tree (ureq for me) may hit it. That said, only three (known) cases isn't a lot of evidence towards rarity or against it...

xd009642 added a commit to xd009642/tarpaulin that referenced this issue Nov 10, 2021
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
bors bot added a commit to taiki-e/cargo-llvm-cov that referenced this issue Nov 13, 2021
101: Force 1 codegen unit r=taiki-e a=taiki-e

Workaround for rust-lang/rust#85461
See also microsoft/windows-rs#1006 (comment)

Co-authored-by: Taiki Endo <te316e89@gmail.com>
bors bot added a commit to taiki-e/cargo-llvm-cov that referenced this issue Nov 13, 2021
101: Force 1 codegen unit on windows r=taiki-e a=taiki-e

Workaround for rust-lang/rust#85461
See also microsoft/windows-rs#1006 (comment)

Co-authored-by: Taiki Endo <te316e89@gmail.com>
@richkadel
Copy link
Contributor

@wesleywiser - Did you have any opportunity to look into this?

@wesleywiser
Copy link
Member

wesleywiser commented Dec 1, 2021

I'm still looking into this but I have managed to construct a minimal (I believe) repro:

# Cargo.toml
[package]
name = "memchr"
version = "2.4.1"
edition = "2018"
// src/main.rs
use memchr::memchr;

fn main() {
    memchr();
}
// src/lib.rs
mod cow {
  #[derive(Debug)]
  struct CowBytes<'a>(&'a ());

  #[derive(Debug)]
  enum Imp<'a> {
      Owned(&'a [u8]),
  }
}

pub fn memchr() {
  static FN: std::sync::Once = std::sync::Once::new();

  let _f = &FN.is_completed();
}

mod memmem;
// src/memmem/mod.rs
mod genericsimd {
    use crate::memmem::PrefilterFnTy;
    
    #[inline(always)]
    pub(crate) fn find(
        haystack: &(),
        fallback: PrefilterFnTy,
    ) { }
    
    fn find_in_chunk2() { }    
}

mod x86 {
    pub(crate) mod avx {
        use core::arch::x86_64::__m256i;
    
        use crate::memmem::PrefilterFnTy;
        
        const _: PrefilterFnTy = find;
        
        pub(crate) fn find(
            haystack: &(),
        ) {
            super::super::genericsimd::find(
                haystack,
                super::sse::find2,
            )
        }
    }
    
    pub(crate) mod sse {
        use core::arch::x86_64::__m128i;

        pub(crate) fn find2(
            haystack: &(),
        ) {
            super::super::genericsimd::find(
                haystack,
                |_| { },
            )
        }   
    }    
}

#[derive(Debug)]
struct NeedleInfo { }

type PrefilterFnTy = fn(
    haystack: &(),
);

Which, with the appropriate flags, will then trigger the link error:

> $env::RUSTFLAGS="-Zinstrument-coverage -Ccodegen-units=16"
> cargo +nightly build
Compiling memchr v2.4.1 (D:\code\temp\memchr)
error: linking with link.exe failed: exit code: 1227
|
= note: libmemchr-c0a7af6bade6d9a3.rlib(memchr-c0a7af6bade6d9a3.58nah2n3aa285jtv.rcgu.o) : fatal error LNK1227: conflicting weak extern definition for '_RNvNtNtCsgYqfNtRWHVc_6memchr6memmem11genericsimd14find_in_chunk2'. New default '.weak._RNvNtNtCsgYqfNtRWHVc_6memchr6memmem11genericsimd14find_in_chunk2.default._RINvNtCsjJUIXhzXbX_4core3ptr13drop_in_placeRRShECsgYqfNtRWHVc_6memchr' conflicts with old default '.weak._RNvNtNtCsgYqfNtRWHVc_6memchr6memmem11genericsimd14find_in_chunk2.default._RNvCsgYqfNtRWHVc_6memchr6memchr' in libmemchr-c0a7af6bade6d9a3.rlib(memchr-c0a7af6bade6d9a3.4b1up16ivtuwqgl4.rcgu.o)

There's quite a lot of dead code in the "minimal" repro but removing any of it or even moving it around causes the error to go away. We need basically this exact structure to get the codegen units to be placed into the state which triggers this bug.

@richkadel
Copy link
Contributor

Thank you! Having a reproducible example like this is still very helpful.

It's not a stretch to assume the trigger relates to the CGU-specific logic in

fn add_unused_functions<'ll, 'tcx>(cx: &CodegenCx<'ll, 'tcx>) {

My educated guess (still a guess though) is that we're trying to define_unused-fn for the same unused def_id from multiple CGUs.

I will try your example on Linux as well, just to be certain this reproducible case really is Windows specific.

If it is the add_unused_function() CGU logic, and is Windows-specific, the question I have is, what's different between Windows and other platforms like Linux?

Some half-baked theories:

  • Codegen on Linux is not actually using multiple codegen units? (Or not as commonly/extensively as on Windows?) I assume this is not the case.
  • Some subtle difference in the parallelism between Windows and Linux is making this situation more likely on Windows? I can't picture how it would be that deterministic on Windows.
  • Differences between the linkers and/or linker parameters, and how they handle duplicate symbols. If so, is this a "bug" in rustc that we can "fix" to make linking on Windows more like linking on the other platforms?

@richkadel
Copy link
Contributor

It might also be worth logging the def_ids in that define_unused_fn call on line 374, maybe along with identifying the CGU, to confirm that more than one CGU is trying to define the same def_id (unless you already know this to be the case).

@richkadel
Copy link
Contributor

@wesleywiser - What is the filename for the last chunk of rust code? It doesn't appear to be referenced from any of the other files, unless I'm missing something. I wonder why it gets pulled into the compilation?

@wesleywiser
Copy link
Member

@richkadel Woops, sorry about that! I've updated the comment to include the file name. It's pulled in by the mod memmem; statement in src/lib.rs.

Codegen on Linux is not actually using multiple codegen units? (Or not as commonly/extensively as on Windows?) I assume this is not the case.

Codegen on Linux does use multiple cgus (especially when -Cincremental is passed). At least from my tests, -Ccodegen-units=16 is the minimum number of CGUs required to trigger this.

Some subtle difference in the parallelism between Windows and Linux is making this situation more likely on Windows? I can't picture how it would be that deterministic on Windows.

I haven't had any issue reproducing with this source (ie, it's not intermittent when the source code is not changed). So it doesn't feel like a parallelism/determinism issue to me.

Differences between the linkers and/or linker parameters, and how they handle duplicate symbols. If so, is this a "bug" in rustc that we can "fix" to make linking on Windows more like linking on the other platforms?

This seems like the most likely issue to me. I'm digging into the symbol table and LLVM IR ...

@richkadel
Copy link
Contributor

FYI, I just compiled this on my Linux machine, using -Ccodegen-units=16, and I added a debug message. It does indeed appear to define the same def_id from different CGUs, but it links successfully.

This log is printed twice, with two different cgu names:

define_unused_fn(DefId(0:12 ~ memchr[f04d]::memmem::genericsimd::find_in_chunk2)) in CGU: <cgu name>

@richkadel
Copy link
Contributor

(In case you'd like to try the same, I added this line right before the call to define_unused_fn(...):

            debug!("define_unused_fn({:?}) in CGU: {}", def_id, cx.codegen_unit.name());

@richkadel
Copy link
Contributor

richkadel commented Dec 1, 2021

Here's another (surprising? to me anyway) clue:

I added a debug log here:

    let mut first_covered_def_id_by_file: FxHashMap<Symbol, DefId> = FxHashMap::default();
    for &def_id in sorted_codegenned_def_ids.iter() {
        if let Some(covered_file_name) = tcx.covered_file_name(def_id) {
            // Only add files known to have unused functions
            if unused_def_ids_by_file.contains_key(covered_file_name) {
                debug!(
                    "CGU {} compiles {:?} from {}", 
                    cx.codegen_unit.name(),
                    def_id,
                    covered_file_name
                );
                first_covered_def_id_by_file.entry(*covered_file_name).or_insert(def_id);
            }
        }
    }

And two CGU's (4e9rgm7ajlxolwev shown below and a different CGU) printed identical results:

160:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev compiles DefId(0:11 ~ memchr[f04d]::memmem::genericsimd::find) from src/memmem/mod.rs
161:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev compiles DefId(0:22 ~ memchr[f04d]::memmem::x86::avx::find) from src/memmem/mod.rs
162:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev compiles DefId(0:27 ~ memchr[f04d]::memmem::x86::sse::find2) from src/memmem/mod.rs
163:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev compiles DefId(0:28 ~ memchr[f04d]::memmem::x86::sse::find2::{closure#0}) from src/memmem/mod.rs
164:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev compiles DefId(0:47 ~ memchr[f04d]::memmem::{impl#0}::fmt) from src/memmem/mod.rs
165:DEBUG rustc_codegen_llvm::coverageinfo::mapgen CGU 4e9rgm7ajlxolwev: define_unused_fn(DefId(0:12 ~ memchr[f04d]::memmem::genericsimd::find_in_chunk2)) from src/memmem/mod.rs

Doesn't this mean that the two CGUs are redundantly generating code for the same 5 functions (6 including the undefined function coverage is adding)? Is there something about the symbols generated for the "used" functions shown here that makes them unique within their CGU, so Windows is not getting linker errors on the used functions as well?

Wishful thinking, maybe this identifies an opportunity to improve compiler performance, but more likely I just misunderstood the relationship between the CGUs and the results from the call to tcx.codegened_and_inlined_items(()).

@wesleywiser
Copy link
Member

Ok, that was super helpful and pointed me in the direction of the actual minimal repro. Thanks @richkadel!

// lib.rs
mod foo {
    #[inline(always)]
    pub fn called() { }

    fn uncalled() { }
}

pub mod bar {
    pub fn call_me() {
        super::foo::called();
    }
}

pub mod baz {
    pub fn call_me() {
        super::foo::called();
    }
}
// repro.rs
use lib::{bar, baz};

fn main() {
    bar::call_me();
    baz::call_me();
}
> rustc +nightly --crate-type lib -Zinstrument-coverage --crate-name lib lib.rs
> rustc +nightly --edition 2021 --extern lib=liblib.rlib repro.rs
error: linking with `link.exe` failed: exit code: 1227
  |
  = note: liblib.rlib(lib.lib.f2c86790-cgu.0.rcgu.o) : fatal error LNK1227: conflicting weak extern definition for '_RNvNtCslAVHmXkNn7E_3lib3foo8uncalled'.  New default '.weak._RNvNtCslAVHmXkNn7E_3lib3foo8uncalled.default._RNvNtCslAVHmXkNn7E_3lib3baz7call_me' conflicts with old default '.weak._RNvNtCslAVHmXkNn7E_3lib3foo8uncalled.default._RNvNtCslAVHmXkNn7E_3lib3bar7call_me' in liblib.rlib(lib.lib.f2c86790-cgu.1.rcgu.o)

Since called is marked #[inline(always)], it's included in the CGU for lib::bar and the CGU for lib::baz. Therefore, both CGUs think they should be responsible for generating the unused function and there are two copies of it generated. While the functions are generated as weak symbols, the MSVC linker seems to treat weak symbols differently than unix-y linkers do.

I'm a bit curious why we use weak/hidden instead of private/default for these functions?

llvm::set_linkage(llfn, llvm::Linkage::WeakAnyLinkage);
llvm::set_visibility(llfn, llvm::Visibility::Hidden);

Using private/default seems to produce the correct results (at least in this simple test):

PS D:\code\temp> D:\code\rust4\build\x86_64-pc-windows-msvc\llvm\bin\llvm-cov.exe show -Xdemangler=rustfilt .\repro.exe -instr-profile="default.profdata"
lib.rs:
    1|       |mod foo {
    2|       |    #[inline(always)]
    3|      2|    pub fn called() { }
    4|       |
    5|      0|    fn uncalled() { }
    6|       |}
    7|       |
    8|       |pub mod bar {
    9|      1|    pub fn call_me() {
   10|      1|        super::foo::called();
   11|      1|    }
   12|       |}
   13|       |
   14|       |pub mod baz {
   15|      1|    pub fn call_me() {
   16|      1|        super::foo::called();
   17|      1|    }
   18|       |}
   19|       |

repro.rs:
    1|       |use lib::{bar, baz};
    2|       |
    3|      1|fn main() {
    4|      1|    bar::call_me();
    5|      1|    baz::call_me();
    6|      1|}

Changing to use those attributes instead seems like a quick, easy fix to resolve this for now. A better solution might be to reserve one CGU and put all of the dead code in that CGU which would sidestep the entire issue by ensuring there is only one CGU with these functions in it.

@richkadel
Copy link
Contributor

Changing to use those attributes instead seems like a quick, easy fix to resolve this for now.

Let's try it! Are you planning to submit the PR or do you want me to.

A better solution might be to reserve one CGU and put all of the dead code in that CGU which would sidestep the entire issue by ensuring there is only one CGU with these functions in it.

I think this is harder than it might seem on the surface. If I recall correctly, this was something I looked into, but the information required was not where I needed it, after CGU partitioning. Someone may know a way to do it, but I just want to caution that it might not be that trivial.

@wesleywiser
Copy link
Member

Let's see if CI passes 🤞 #91470

@richkadel
Copy link
Contributor

Historical context: See #83307 (comment) ...

My depth of understanding of all of the different linkage options is still more shallow than I'd like. I read the LLVM documentation descriptions several times, but the number of combinations, and their effects, are still hard for me to understand. Some choices I had to make were an "educated guess" on my part. As you can see, MOST of the time, the current linkage works, but if your suggested changes resolve these issues without breaking anything else, that's fantastic.

@wesleywiser
Copy link
Member

Ah, thanks! I won't claim my understanding of the linkage options is complete either, but I think private/default will cause the dead code functions' symbols to be made private to the CGU they're declared in and prevent any name collisions by renaming if necessary.

@wesleywiser wesleywiser added the O-windows-msvc Toolchain: MSVC, Operating system: Windows label Dec 3, 2021
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 10, 2021
…r, r=tmandry

code-cov: generate dead functions with private/default linkage

As discovered in rust-lang#85461, the MSVC linker treats weak symbols slightly
differently than unix-y linkers do. This causes link.exe to fail with
LNK1227 "conflicting weak extern definition" where as other targets are
able to link successfully.

This changes the dead functions from being generated as weak/hidden to
private/default which, as the LLVM reference says:

> Global values with “private” linkage are only directly accessible by
objects in the current module. In particular, linking code into a module
with a private global value may cause the private to be renamed as
necessary to avoid collisions. Because the symbol is private to the
module, all references can be updated. This doesn’t show up in any
symbol table in the object file.

This fixes the conflicting weak symbols but doesn't address the reason
*why* we have conflicting symbols for these dead functions. The test
cases added in this commit contain a minimal repro of the fundamental
issue which is that the logic used to decide what dead code functions
should be codegen'd in the current CGU doesn't take into account that
functions can be duplicated across multiple CGUs (for instance, in the
case of `#[inline(always)]` functions).

Fixing that is likely to be a more complex change (see
rust-lang#85461 (comment)).

Fixes rust-lang#85461
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 10, 2021
…r, r=tmandry

code-cov: generate dead functions with private/default linkage

As discovered in rust-lang#85461, the MSVC linker treats weak symbols slightly
differently than unix-y linkers do. This causes link.exe to fail with
LNK1227 "conflicting weak extern definition" where as other targets are
able to link successfully.

This changes the dead functions from being generated as weak/hidden to
private/default which, as the LLVM reference says:

> Global values with “private” linkage are only directly accessible by
objects in the current module. In particular, linking code into a module
with a private global value may cause the private to be renamed as
necessary to avoid collisions. Because the symbol is private to the
module, all references can be updated. This doesn’t show up in any
symbol table in the object file.

This fixes the conflicting weak symbols but doesn't address the reason
*why* we have conflicting symbols for these dead functions. The test
cases added in this commit contain a minimal repro of the fundamental
issue which is that the logic used to decide what dead code functions
should be codegen'd in the current CGU doesn't take into account that
functions can be duplicated across multiple CGUs (for instance, in the
case of `#[inline(always)]` functions).

Fixing that is likely to be a more complex change (see
rust-lang#85461 (comment)).

Fixes rust-lang#85461
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 10, 2021
…r, r=tmandry

code-cov: generate dead functions with private/default linkage

As discovered in rust-lang#85461, the MSVC linker treats weak symbols slightly
differently than unix-y linkers do. This causes link.exe to fail with
LNK1227 "conflicting weak extern definition" where as other targets are
able to link successfully.

This changes the dead functions from being generated as weak/hidden to
private/default which, as the LLVM reference says:

> Global values with “private” linkage are only directly accessible by
objects in the current module. In particular, linking code into a module
with a private global value may cause the private to be renamed as
necessary to avoid collisions. Because the symbol is private to the
module, all references can be updated. This doesn’t show up in any
symbol table in the object file.

This fixes the conflicting weak symbols but doesn't address the reason
*why* we have conflicting symbols for these dead functions. The test
cases added in this commit contain a minimal repro of the fundamental
issue which is that the logic used to decide what dead code functions
should be codegen'd in the current CGU doesn't take into account that
functions can be duplicated across multiple CGUs (for instance, in the
case of `#[inline(always)]` functions).

Fixing that is likely to be a more complex change (see
rust-lang#85461 (comment)).

Fixes rust-lang#85461
@bors bors closed this as completed in d5f6b9c Dec 11, 2021
wesleywiser added a commit to wesleywiser/rust that referenced this issue Dec 20, 2021
The issue here is that the logic used to determine which CGU to put the
dead function stubs in doesn't handle cases where a module is never
assigned to a CGU.

The partitioning logic also caused issues in rust-lang#85461 where inline
functions were duplicated into multiple CGUs resulting in duplicate
symbols.

This commit fixes the issue by removing the complex logic used to assign
dead code stubs to CGUs and replaces it with a much simplier model: we
pick one CGU to hold all the dead code stubs. We pick a CGU which has
exported items which increases the likelihood the linker won't throw
away our dead functions and we pick the smallest to minimize the impact
on compilation times for crates with very large CGUs.

Fixes rust-lang#86177
Fixes rust-lang#85718
Fixes rust-lang#79622
BlackHoleFox added a commit to BlackHoleFox/tarpaulin that referenced this issue Jan 3, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 5, 2022
…ioning, r=tmandry

[code coverage] Fix missing dead code in modules that are never called

The issue here is that the logic used to determine which CGU to put the dead function stubs in doesn't handle cases where a module is never assigned to a CGU (which is what happens when all of the code in the module is dead).

The partitioning logic also caused issues in rust-lang#85461 where inline functions were duplicated into multiple CGUs resulting in duplicate symbols.

This commit fixes the issue by removing the complex logic used to assign dead code stubs to CGUs and replaces it with a much simpler model: we pick one CGU to hold all the dead code stubs. We pick a CGU which has exported items which increases the likelihood the linker won't throw away our dead functions and we pick the smallest to minimize the impact on compilation times for crates with very large CGUs.

Fixes rust-lang#91661
Fixes rust-lang#86177
Fixes rust-lang#85718
Fixes rust-lang#79622

r? `@tmandry`
cc `@richkadel`

This PR is not urgent so please don't let it interrupt your holidays! 🎄 🎁
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 12, 2022
…ioning, r=tmandry

[code coverage] Fix missing dead code in modules that are never called

The issue here is that the logic used to determine which CGU to put the dead function stubs in doesn't handle cases where a module is never assigned to a CGU (which is what happens when all of the code in the module is dead).

The partitioning logic also caused issues in rust-lang#85461 where inline functions were duplicated into multiple CGUs resulting in duplicate symbols.

This commit fixes the issue by removing the complex logic used to assign dead code stubs to CGUs and replaces it with a much simpler model: we pick one CGU to hold all the dead code stubs. We pick a CGU which has exported items which increases the likelihood the linker won't throw away our dead functions and we pick the smallest to minimize the impact on compilation times for crates with very large CGUs.

Fixes rust-lang#91661
Fixes rust-lang#86177
Fixes rust-lang#85718
Fixes rust-lang#79622

r? `@tmandry`
cc `@richkadel`

This PR is not urgent so please don't let it interrupt your holidays! 🎄 🎁
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 13, 2022
…ioning, r=tmandry

[code coverage] Fix missing dead code in modules that are never called

The issue here is that the logic used to determine which CGU to put the dead function stubs in doesn't handle cases where a module is never assigned to a CGU (which is what happens when all of the code in the module is dead).

The partitioning logic also caused issues in rust-lang#85461 where inline functions were duplicated into multiple CGUs resulting in duplicate symbols.

This commit fixes the issue by removing the complex logic used to assign dead code stubs to CGUs and replaces it with a much simpler model: we pick one CGU to hold all the dead code stubs. We pick a CGU which has exported items which increases the likelihood the linker won't throw away our dead functions and we pick the smallest to minimize the impact on compilation times for crates with very large CGUs.

Fixes rust-lang#91661
Fixes rust-lang#86177
Fixes rust-lang#85718
Fixes rust-lang#79622

r? ``@tmandry``
cc ``@richkadel``

This PR is not urgent so please don't let it interrupt your holidays! 🎄 🎁
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 13, 2022
…ioning, r=tmandry

[code coverage] Fix missing dead code in modules that are never called

The issue here is that the logic used to determine which CGU to put the dead function stubs in doesn't handle cases where a module is never assigned to a CGU (which is what happens when all of the code in the module is dead).

The partitioning logic also caused issues in rust-lang#85461 where inline functions were duplicated into multiple CGUs resulting in duplicate symbols.

This commit fixes the issue by removing the complex logic used to assign dead code stubs to CGUs and replaces it with a much simpler model: we pick one CGU to hold all the dead code stubs. We pick a CGU which has exported items which increases the likelihood the linker won't throw away our dead functions and we pick the smallest to minimize the impact on compilation times for crates with very large CGUs.

Fixes rust-lang#91661
Fixes rust-lang#86177
Fixes rust-lang#85718
Fixes rust-lang#79622

r? ```@tmandry```
cc ```@richkadel```

This PR is not urgent so please don't let it interrupt your holidays! 🎄 🎁
xd009642 added a commit to xd009642/tarpaulin that referenced this issue Jan 16, 2022
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
xd009642 added a commit to xd009642/tarpaulin that referenced this issue Apr 9, 2022
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
xd009642 added a commit to xd009642/tarpaulin that referenced this issue May 31, 2022
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
bors bot added a commit to taiki-e/cargo-llvm-cov that referenced this issue Jul 24, 2022
199: Don't emit `codegen-units=1` when `stable_coverage` is set r=taiki-e a=ldm0

Do the same thing in <https://github.com/taiki-e/cargo-llvm-cov/pull/130/files#diff-42cb6807ad74b3e201c5a7ca98b911c5fa08380e942be6e4ac5807f8377f87fcR204> since rust-lang/rust#85461 is already closed.

Co-authored-by: Liu Dingming <liudingming@bytedance.com>
xd009642 added a commit to xd009642/tarpaulin that referenced this issue Aug 21, 2022
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
xd009642 added a commit to xd009642/tarpaulin that referenced this issue Sep 11, 2022
* -C codegen-units=1 rust-lang/rust#85461
* Look for profraws in workspace folders
* Upgrade llvm-profparser to one that handles merging no files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) C-bug Category: This is a bug. O-windows-msvc Toolchain: MSVC, Operating system: Windows
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants