Skip to content

Commit

Permalink
Auto merge of #83307 - richkadel:cov-unused-functions-1.1, r=tmandry
Browse files Browse the repository at this point in the history
coverage bug fixes and optimization support

Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.

Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.

Fixes: #82144

Invalid LLVM coverage data produced when compiled with -C opt-level=1

Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.

The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:

1. Rust's coverage map does not include any overlapping code regions,
   making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
   functions, including generics. (Clang does not generate coverage for
   uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
   sufficient for including in the coverage results; while Clang must
   generate the complete LLVM IR for each unused function, even though
   it will never be called.

This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.

Fixes: #79651

Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates

Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.

Fixes: #82875

Invalid LLVM coverage data produced with crate brotli_decompressor

Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.

FYI: `@wesleywiser`

r? `@tmandry`
  • Loading branch information
bors committed Mar 25, 2021
2 parents 26c7e55 + 0859cec commit dbc37a9
Show file tree
Hide file tree
Showing 62 changed files with 3,065 additions and 342 deletions.
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_llvm/src/base.rs
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@ pub fn compile_codegen_unit(

// Finalize code coverage by injecting the coverage map. Note, the coverage map will
// also be added to the `llvm.used` variable, created next.
if cx.sess().opts.debugging_opts.instrument_coverage {
if cx.sess().instrument_coverage() {
cx.coverageinfo_finalize();
}

Expand Down
8 changes: 4 additions & 4 deletions compiler/rustc_codegen_llvm/src/context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ pub struct CodegenCx<'ll, 'tcx> {
pub pointee_infos: RefCell<FxHashMap<(Ty<'tcx>, Size), Option<PointeeInfo>>>,
pub isize_ty: &'ll Type,

pub coverage_cx: Option<coverageinfo::CrateCoverageContext<'tcx>>,
pub coverage_cx: Option<coverageinfo::CrateCoverageContext<'ll, 'tcx>>,
pub dbg_cx: Option<debuginfo::CrateDebugContext<'ll, 'tcx>>,

eh_personality: Cell<Option<&'ll Value>>,
Expand Down Expand Up @@ -280,7 +280,7 @@ impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {

let (llcx, llmod) = (&*llvm_module.llcx, llvm_module.llmod());

let coverage_cx = if tcx.sess.opts.debugging_opts.instrument_coverage {
let coverage_cx = if tcx.sess.instrument_coverage() {
let covctx = coverageinfo::CrateCoverageContext::new();
Some(covctx)
} else {
Expand Down Expand Up @@ -331,7 +331,7 @@ impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {
}

#[inline]
pub fn coverage_context(&'a self) -> Option<&'a coverageinfo::CrateCoverageContext<'tcx>> {
pub fn coverage_context(&'a self) -> Option<&'a coverageinfo::CrateCoverageContext<'ll, 'tcx>> {
self.coverage_cx.as_ref()
}
}
Expand Down Expand Up @@ -712,7 +712,7 @@ impl CodegenCx<'b, 'tcx> {
ifn!("llvm.va_end", fn(i8p) -> void);
ifn!("llvm.va_copy", fn(i8p, i8p) -> void);

if self.sess().opts.debugging_opts.instrument_coverage {
if self.sess().instrument_coverage() {
ifn!("llvm.instrprof.increment", fn(i8p, t_i64, t_i32, t_i32) -> void);
}

Expand Down
228 changes: 97 additions & 131 deletions compiler/rustc_codegen_llvm/src/coverageinfo/mapgen.rs

Large diffs are not rendered by default.

168 changes: 153 additions & 15 deletions compiler/rustc_codegen_llvm/src/coverageinfo/mod.rs
Original file line number Diff line number Diff line change
@@ -1,63 +1,102 @@
use crate::llvm;

use crate::abi::{Abi, FnAbi};
use crate::builder::Builder;
use crate::common::CodegenCx;

use libc::c_uint;
use llvm::coverageinfo::CounterMappingRegion;
use rustc_codegen_ssa::coverageinfo::map::{CounterExpression, FunctionCoverage};
use rustc_codegen_ssa::traits::{
BaseTypeMethods, CoverageInfoBuilderMethods, CoverageInfoMethods, MiscMethods, StaticMethods,
BaseTypeMethods, BuilderMethods, ConstMethods, CoverageInfoBuilderMethods, CoverageInfoMethods,
MiscMethods, StaticMethods,
};
use rustc_data_structures::fx::FxHashMap;
use rustc_hir as hir;
use rustc_hir::def_id::DefId;
use rustc_llvm::RustString;
use rustc_middle::bug;
use rustc_middle::mir::coverage::{
CodeRegion, CounterValueReference, ExpressionOperandId, InjectedExpressionId, Op,
};
use rustc_middle::ty;
use rustc_middle::ty::layout::FnAbiExt;
use rustc_middle::ty::subst::InternalSubsts;
use rustc_middle::ty::Instance;

use std::cell::RefCell;
use std::ffi::CString;

use std::iter;
use tracing::debug;

pub mod mapgen;

const UNUSED_FUNCTION_COUNTER_ID: CounterValueReference = CounterValueReference::START;

const VAR_ALIGN_BYTES: usize = 8;

/// A context object for maintaining all state needed by the coverageinfo module.
pub struct CrateCoverageContext<'tcx> {
pub struct CrateCoverageContext<'ll, 'tcx> {
// Coverage data for each instrumented function identified by DefId.
pub(crate) function_coverage_map: RefCell<FxHashMap<Instance<'tcx>, FunctionCoverage<'tcx>>>,
pub(crate) pgo_func_name_var_map: RefCell<FxHashMap<Instance<'tcx>, &'ll llvm::Value>>,
}

impl<'tcx> CrateCoverageContext<'tcx> {
impl<'ll, 'tcx> CrateCoverageContext<'ll, 'tcx> {
pub fn new() -> Self {
Self { function_coverage_map: Default::default() }
Self {
function_coverage_map: Default::default(),
pgo_func_name_var_map: Default::default(),
}
}

pub fn take_function_coverage_map(&self) -> FxHashMap<Instance<'tcx>, FunctionCoverage<'tcx>> {
self.function_coverage_map.replace(FxHashMap::default())
}
}

impl CoverageInfoMethods for CodegenCx<'ll, 'tcx> {
impl CoverageInfoMethods<'tcx> for CodegenCx<'ll, 'tcx> {
fn coverageinfo_finalize(&self) {
mapgen::finalize(self)
}
}

impl CoverageInfoBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> {
/// Calls llvm::createPGOFuncNameVar() with the given function instance's mangled function name.
/// The LLVM API returns an llvm::GlobalVariable containing the function name, with the specific
/// variable name and linkage required by LLVM InstrProf source-based coverage instrumentation.
fn create_pgo_func_name_var(&self, instance: Instance<'tcx>) -> Self::Value {
let llfn = self.cx.get_fn(instance);
let mangled_fn_name = CString::new(self.tcx.symbol_name(instance).name)
.expect("error converting function name to C string");
unsafe { llvm::LLVMRustCoverageCreatePGOFuncNameVar(llfn, mangled_fn_name.as_ptr()) }
fn get_pgo_func_name_var(&self, instance: Instance<'tcx>) -> &'ll llvm::Value {
if let Some(coverage_context) = self.coverage_context() {
debug!("getting pgo_func_name_var for instance={:?}", instance);
let mut pgo_func_name_var_map = coverage_context.pgo_func_name_var_map.borrow_mut();
pgo_func_name_var_map
.entry(instance)
.or_insert_with(|| create_pgo_func_name_var(self, instance))
} else {
bug!("Could not get the `coverage_context`");
}
}

/// Functions with MIR-based coverage are normally codegenned _only_ if
/// called. LLVM coverage tools typically expect every function to be
/// defined (even if unused), with at least one call to LLVM intrinsic
/// `instrprof.increment`.
///
/// Codegen a small function that will never be called, with one counter
/// that will never be incremented.
///
/// For used/called functions, the coverageinfo was already added to the
/// `function_coverage_map` (keyed by function `Instance`) during codegen.
/// But in this case, since the unused function was _not_ previously
/// codegenned, collect the coverage `CodeRegion`s from the MIR and add
/// them. The first `CodeRegion` is used to add a single counter, with the
/// same counter ID used in the injected `instrprof.increment` intrinsic
/// call. Since the function is never called, all other `CodeRegion`s can be
/// added as `unreachable_region`s.
fn define_unused_fn(&self, def_id: DefId) {
let instance = declare_unused_fn(self, &def_id);
codegen_unused_fn_and_counter(self, instance);
add_unused_function_coverage(self, instance, def_id);
}
}

impl CoverageInfoBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> {
fn set_function_source_hash(
&mut self,
instance: Instance<'tcx>,
Expand Down Expand Up @@ -145,6 +184,104 @@ impl CoverageInfoBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> {
}
}

fn declare_unused_fn(cx: &CodegenCx<'ll, 'tcx>, def_id: &DefId) -> Instance<'tcx> {
let tcx = cx.tcx;

let instance = Instance::new(
*def_id,
InternalSubsts::for_item(tcx, *def_id, |param, _| {
if let ty::GenericParamDefKind::Lifetime = param.kind {
tcx.lifetimes.re_erased.into()
} else {
tcx.mk_param_from_def(param)
}
}),
);

let llfn = cx.declare_fn(
&tcx.symbol_name(instance).name,
&FnAbi::of_fn_ptr(
cx,
ty::Binder::dummy(tcx.mk_fn_sig(
iter::once(tcx.mk_unit()),
tcx.mk_unit(),
false,
hir::Unsafety::Unsafe,
Abi::Rust,
)),
&[],
),
);

llvm::set_linkage(llfn, llvm::Linkage::WeakAnyLinkage);
llvm::set_visibility(llfn, llvm::Visibility::Hidden);

assert!(cx.instances.borrow_mut().insert(instance, llfn).is_none());

instance
}

fn codegen_unused_fn_and_counter(cx: &CodegenCx<'ll, 'tcx>, instance: Instance<'tcx>) {
let llfn = cx.get_fn(instance);
let mut bx = Builder::new_block(cx, llfn, "unused_function");
let fn_name = bx.get_pgo_func_name_var(instance);
let hash = bx.const_u64(0);
let num_counters = bx.const_u32(1);
let index = bx.const_u32(u32::from(UNUSED_FUNCTION_COUNTER_ID));
debug!(
"codegen intrinsic instrprof.increment(fn_name={:?}, hash={:?}, num_counters={:?},
index={:?}) for unused function: {:?}",
fn_name, hash, num_counters, index, instance
);
bx.instrprof_increment(fn_name, hash, num_counters, index);
bx.ret_void();
}

fn add_unused_function_coverage(
cx: &CodegenCx<'ll, 'tcx>,
instance: Instance<'tcx>,
def_id: DefId,
) {
let tcx = cx.tcx;

let mut function_coverage = FunctionCoverage::unused(tcx, instance);
for (index, &code_region) in tcx.covered_code_regions(def_id).iter().enumerate() {
if index == 0 {
// Insert at least one real counter so the LLVM CoverageMappingReader will find expected
// definitions.
function_coverage.add_counter(UNUSED_FUNCTION_COUNTER_ID, code_region.clone());
}
// Add a Zero Counter for every code region.
//
// Even though the first coverage region already has an actual Counter, `llvm-cov` will not
// always report it. Re-adding an unreachable region (zero counter) for the same region
// seems to help produce the expected coverage.
function_coverage.add_unreachable_region(code_region.clone());
}

if let Some(coverage_context) = cx.coverage_context() {
coverage_context.function_coverage_map.borrow_mut().insert(instance, function_coverage);
} else {
bug!("Could not get the `coverage_context`");
}
}

/// Calls llvm::createPGOFuncNameVar() with the given function instance's
/// mangled function name. The LLVM API returns an llvm::GlobalVariable
/// containing the function name, with the specific variable name and linkage
/// required by LLVM InstrProf source-based coverage instrumentation. Use
/// `bx.get_pgo_func_name_var()` to ensure the variable is only created once per
/// `Instance`.
fn create_pgo_func_name_var(
cx: &CodegenCx<'ll, 'tcx>,
instance: Instance<'tcx>,
) -> &'ll llvm::Value {
let mangled_fn_name = CString::new(cx.tcx.symbol_name(instance).name)
.expect("error converting function name to C string");
let llfn = cx.get_fn(instance);
unsafe { llvm::LLVMRustCoverageCreatePGOFuncNameVar(llfn, mangled_fn_name.as_ptr()) }
}

pub(crate) fn write_filenames_section_to_buffer<'a>(
filenames: impl IntoIterator<Item = &'a CString>,
buffer: &RustString,
Expand Down Expand Up @@ -177,6 +314,7 @@ pub(crate) fn write_mapping_to_buffer(
);
}
}

pub(crate) fn hash_str(strval: &str) -> u64 {
let strval = CString::new(strval).expect("null error converting hashable str to C string");
unsafe { llvm::LLVMRustCoverageHashCString(strval.as_ptr()) }
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_ssa/src/back/link.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1746,7 +1746,7 @@ fn linker_with_args<'a, B: ArchiveBuilder<'a>>(
);

// OBJECT-FILES-NO, AUDIT-ORDER
if sess.opts.cg.profile_generate.enabled() || sess.opts.debugging_opts.instrument_coverage {
if sess.opts.cg.profile_generate.enabled() || sess.instrument_coverage() {
cmd.pgo_gen();
}

Expand Down
4 changes: 1 addition & 3 deletions compiler/rustc_codegen_ssa/src/back/symbol_export.rs
Original file line number Diff line number Diff line change
Expand Up @@ -188,9 +188,7 @@ fn exported_symbols_provider_local(
}
}

if tcx.sess.opts.debugging_opts.instrument_coverage
|| tcx.sess.opts.cg.profile_generate.enabled()
{
if tcx.sess.instrument_coverage() || tcx.sess.opts.cg.profile_generate.enabled() {
// These are weak symbols that point to the profile version and the
// profile name, which need to be treated as exported so LTO doesn't nix
// them.
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_ssa/src/back/write.rs
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,7 @@ impl ModuleConfig {

// The rustc option `-Zinstrument_coverage` injects intrinsic calls to
// `llvm.instrprof.increment()`, which requires the LLVM `instrprof` pass.
if sess.opts.debugging_opts.instrument_coverage {
if sess.instrument_coverage() {
passes.push("instrprof".to_owned());
}
passes
Expand Down
25 changes: 21 additions & 4 deletions compiler/rustc_codegen_ssa/src/coverageinfo/map.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,27 +31,44 @@ pub struct Expression {
pub struct FunctionCoverage<'tcx> {
instance: Instance<'tcx>,
source_hash: u64,
is_used: bool,
counters: IndexVec<CounterValueReference, Option<CodeRegion>>,
expressions: IndexVec<InjectedExpressionIndex, Option<Expression>>,
unreachable_regions: Vec<CodeRegion>,
}

impl<'tcx> FunctionCoverage<'tcx> {
/// Creates a new set of coverage data for a used (called) function.
pub fn new(tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> Self {
Self::create(tcx, instance, true)
}

/// Creates a new set of coverage data for an unused (never called) function.
pub fn unused(tcx: TyCtxt<'tcx>, instance: Instance<'tcx>) -> Self {
Self::create(tcx, instance, false)
}

fn create(tcx: TyCtxt<'tcx>, instance: Instance<'tcx>, is_used: bool) -> Self {
let coverageinfo = tcx.coverageinfo(instance.def_id());
debug!(
"FunctionCoverage::new(instance={:?}) has coverageinfo={:?}",
instance, coverageinfo
"FunctionCoverage::new(instance={:?}) has coverageinfo={:?}. is_used={}",
instance, coverageinfo, is_used
);
Self {
instance,
source_hash: 0, // will be set with the first `add_counter()`
is_used,
counters: IndexVec::from_elem_n(None, coverageinfo.num_counters as usize),
expressions: IndexVec::from_elem_n(None, coverageinfo.num_expressions as usize),
unreachable_regions: Vec::new(),
}
}

/// Returns true for a used (called) function, and false for an unused function.
pub fn is_used(&self) -> bool {
self.is_used
}

/// Sets the function source hash value. If called multiple times for the same function, all
/// calls should have the same hash value.
pub fn set_function_source_hash(&mut self, source_hash: u64) {
Expand Down Expand Up @@ -128,8 +145,8 @@ impl<'tcx> FunctionCoverage<'tcx> {
&'a self,
) -> (Vec<CounterExpression>, impl Iterator<Item = (Counter, &'a CodeRegion)>) {
assert!(
self.source_hash != 0,
"No counters provided the source_hash for function: {:?}",
self.source_hash != 0 || !self.is_used,
"No counters provided the source_hash for used function: {:?}",
self.instance
);

Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_ssa/src/mir/coverageinfo.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {

let coverageinfo = bx.tcx().coverageinfo(instance.def_id());

let fn_name = bx.create_pgo_func_name_var(instance);
let fn_name = bx.get_pgo_func_name_var(instance);
let hash = bx.const_u64(function_source_hash);
let num_counters = bx.const_u32(coverageinfo.num_counters);
let index = bx.const_u32(u32::from(id));
Expand Down
Loading

0 comments on commit dbc37a9

Please sign in to comment.