Skip to content

Encode only MIR reachable from other crates #115306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 31 additions & 12 deletions compiler/rustc_metadata/src/rmeta/encoder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,8 @@ use rustc_data_structures::temp_dir::MaybeTempDir;
use rustc_hir as hir;
use rustc_hir::def::DefKind;
use rustc_hir::def_id::{
CrateNum, DefId, DefIndex, LocalDefId, CRATE_DEF_ID, CRATE_DEF_INDEX, LOCAL_CRATE,
CrateNum, DefId, DefIndex, LocalDefId, LocalDefIdSet, CRATE_DEF_ID, CRATE_DEF_INDEX,
LOCAL_CRATE,
};
use rustc_hir::definitions::DefPathData;
use rustc_hir::lang_items::LangItem;
Expand Down Expand Up @@ -50,7 +51,6 @@ pub(super) struct EncodeContext<'a, 'tcx> {
opaque: opaque::FileEncoder,
tcx: TyCtxt<'tcx>,
feat: &'tcx rustc_feature::Features,

tables: TableBuilders,

lazy_state: LazyState,
Expand Down Expand Up @@ -1002,15 +1002,31 @@ fn should_encode_stability(def_kind: DefKind) -> bool {
}
}

/// Whether we should encode MIR.
/// Whether we should encode MIR. Return a pair, resp. for CTFE and for LLVM.
///
/// Computing, optimizing and encoding the MIR is a relatively expensive operation.
/// We want to avoid this work when not required. Therefore:
/// - we only compute `mir_for_ctfe` on items with const-eval semantics;
/// - we skip `optimized_mir` for check runs.
/// - we only encode `optimized_mir` that could be generated in other crates, that is, a code that
/// is either generic or has inline hint, and is reachable from the other crates (contained
/// in reachable set).
///
/// Note: Reachable set describes definitions that might be generated or referenced from other
/// crates and it can be used to limit optimized MIR that needs to be encoded. On the other hand,
/// the reachable set doesn't have much to say about which definitions might be evaluated at compile
/// time in other crates, so it cannot be used to omit CTFE MIR. For example, `f` below is
/// unreachable and yet it can be evaluated in other crates:
///
/// Return a pair, resp. for CTFE and for LLVM.
fn should_encode_mir(tcx: TyCtxt<'_>, def_id: LocalDefId) -> (bool, bool) {
/// ```
/// const fn f() -> usize { 0 }
/// pub struct S { pub a: [usize; f()] }
/// ```
fn should_encode_mir(
tcx: TyCtxt<'_>,
reachable_set: &LocalDefIdSet,
def_id: LocalDefId,
) -> (bool, bool) {
match tcx.def_kind(def_id) {
// Constructors
DefKind::Ctor(_, _) => {
Expand All @@ -1027,14 +1043,15 @@ fn should_encode_mir(tcx: TyCtxt<'_>, def_id: LocalDefId) -> (bool, bool) {
// Full-fledged functions + closures
DefKind::AssocFn | DefKind::Fn | DefKind::Closure => {
let generics = tcx.generics_of(def_id);
let needs_inline = (generics.requires_monomorphization(tcx)
|| tcx.codegen_fn_attrs(def_id).requests_inline())
&& tcx.sess.opts.output_types.should_codegen();
let opt = tcx.sess.opts.unstable_opts.always_encode_mir
|| (tcx.sess.opts.output_types.should_codegen()
&& reachable_set.contains(&def_id)
&& (generics.requires_monomorphization(tcx)
|| tcx.codegen_fn_attrs(def_id).requests_inline()));
// The function has a `const` modifier or is in a `#[const_trait]`.
let is_const_fn = tcx.is_const_fn_raw(def_id.to_def_id())
|| tcx.is_const_default_method(def_id.to_def_id());
let always_encode_mir = tcx.sess.opts.unstable_opts.always_encode_mir;
(is_const_fn, needs_inline || always_encode_mir)
(is_const_fn, opt)
}
// Generators require optimized MIR to compute layout.
DefKind::Generator => (false, true),
Expand Down Expand Up @@ -1580,9 +1597,10 @@ impl<'a, 'tcx> EncodeContext<'a, 'tcx> {
}

let tcx = self.tcx;
let reachable_set = tcx.reachable_set(());

let keys_and_jobs = tcx.mir_keys(()).iter().filter_map(|&def_id| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we filter more aggressively using a .filter(|def_id| reachable_set.contains(def_id))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reachable_set is used to filter the optimized MIR only. It might still be necessary to encode MIR for CTFE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my question: should we filter MIR for CTFE too according to the same rule?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reachable set cannot be used for the purpose. It contains items externally reachable in the linkage sense, or that could be code generated while building other crates, but it doesn't have much to say about items which might evaluated in other crates. Implementing this would require a separate analysis (I assume we don't want to include those in reachable set since that would be pessimization).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a line explaining the logic around reachable_set and this exception to the doc-comment on should_encode_mir?
r=me then

let (encode_const, encode_opt) = should_encode_mir(tcx, def_id);
let (encode_const, encode_opt) = should_encode_mir(tcx, reachable_set, def_id);
if encode_const || encode_opt { Some((def_id, encode_const, encode_opt)) } else { None }
});
for (def_id, encode_const, encode_opt) in keys_and_jobs {
Expand Down Expand Up @@ -2067,8 +2085,9 @@ fn prefetch_mir(tcx: TyCtxt<'_>) {
return;
}

let reachable_set = tcx.reachable_set(());
par_for_each_in(tcx.mir_keys(()), |&def_id| {
let (encode_const, encode_opt) = should_encode_mir(tcx, def_id);
let (encode_const, encode_opt) = should_encode_mir(tcx, reachable_set, def_id);

if encode_const {
tcx.ensure_with_value().mir_for_ctfe(def_id);
Expand Down
4 changes: 4 additions & 0 deletions compiler/rustc_passes/src/reachable.rs
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,10 @@ impl<'tcx> Visitor<'tcx> for ReachableContext<'tcx> {
.typeck_results()
.type_dependent_def(expr.hir_id)
.map(|(kind, def_id)| Res::Def(kind, def_id)),
hir::ExprKind::Closure(&hir::Closure { def_id, .. }) => {
self.reachable_symbols.insert(def_id);
None
}
_ => None,
};

Expand Down
2 changes: 1 addition & 1 deletion tests/mir-opt/funky_arms.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use core::num::flt2dec;
use std::fmt::{Formatter, Result};

// EMIT_MIR funky_arms.float_to_exponential_common.ConstProp.diff
fn float_to_exponential_common<T>(fmt: &mut Formatter<'_>, num: &T, upper: bool) -> Result
pub fn float_to_exponential_common<T>(fmt: &mut Formatter<'_>, num: &T, upper: bool) -> Result
where
T: flt2dec::DecodableFloat,
{
Expand Down