Improve fast rejection. #97136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Closed

nnethercote wants to merge 2 commits into rust-lang:master from nnethercote:improve-fast-rejection

Contributor

nnethercote commented May 18, 2022

In particular, for bitmaps-3.1.0 which does a huge number of
comparisons between BitsImpl<M> and BitsImpl<N> where M and N
are const integers.

rust-highfive assigned lcnr

rustbot added the T-compiler label

rust-highfive added the S-waiting-on-review label

Contributor Author

nnethercote commented May 18, 2022

Currently, this has the smallest possible change that helps bitmaps-3.1.0, to the tune of a 40% reduction in instruction counts on a check full build. Something less ad hoc would be nice.

nalgebra stresses the compiler in similar ways to bitmaps and I was hoping that it might be improved too, but local tests indicated not. I will look into nalgebra more tomorrow to see if there are similar cases that can be easily captured.

Contributor Author

nnethercote commented May 18, 2022

@bors try @rust-timer queue

Collaborator

rust-timer commented May 18, 2022

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rustbot added the S-waiting-on-perf label

Collaborator

bors commented May 18, 2022

⌛ Trying commit 0374729b9ba6b4f7f054d3adeb9c62ed5634c0d5 with merge 75978414bb55092c36addd0a84ead8b1c19d82f6...

Collaborator

bors commented May 18, 2022

☀️ Try build successful - checks-actions
Build commit: 75978414bb55092c36addd0a84ead8b1c19d82f6 (75978414bb55092c36addd0a84ead8b1c19d82f6)

Collaborator

rust-timer commented May 18, 2022

Queued 75978414bb55092c36addd0a84ead8b1c19d82f6 with parent 77972d2, future comparison URL.

Collaborator

rust-timer commented May 18, 2022

Finished benchmarking commit (75978414bb55092c36addd0a84ead8b1c19d82f6): comparison url.

Summary:

Primary benchmarks: 🎉 relevant improvements found
Secondary benchmarks: no relevant changes found

	Regressions 😿 (primary)	Regressions 😿 (secondary)	Improvements 🎉 (primary)	Improvements 🎉 (secondary)	All 😿 🎉 (primary)
count¹	0	0	13	0	13
mean²	N/A	N/A	-22.6%	N/A	-22.6%
max	N/A	N/A	-39.5%	N/A	-39.5%

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

number of relevant changes ↩
the arithmetic mean of the percent change ↩

rustbot removed the S-waiting-on-perf label


          Improve fast rejection.

af68005

In particular, for `bitmaps-3.1.0` which does a huge number of
comparisons between `BitsImpl<M>` and `BitsImpl<N>` where `M` and `N`
are const integers.

nnethercote force-pushed the improve-fast-rejection branch from 0374729 to af68005 Compare

May 23, 2022 06:55

Contributor Author

nnethercote commented May 23, 2022

@lcnr: I have updated the code to avoid SimplifiedType, as you requested. I tried a couple of different ways of writing this code. I settled on this way because the matches are exhaustive, which seems important. That way, if new variants are added to Ty, people can't forget to update this code.

As for the unreachable parts: I originally didn't have them, and then I was curious about whether every variant could be reached. So I did testing and was surprised to see how many weren't executed. I've left them unreachable because I don't want to merge code that hasn't been tested, which is what we'd have if they weren't unreachable.

lcnr reviewed

View reviewed changes

Contributor

lcnr left a comment

I realized that I have some pretty strong opinions here 😅

I hope that you don't mind, otherwise I would be also fine with implementing this myself and leaving you review it.

compiler/rustc_trait_selection/src/traits/select/mod.rs

@@ @@ -214,6 +213,122 @@ enum BuiltinImplConditions<'tcx> { @@
                   Ambiguous,
               }
+              fn tys_may_be_equivalent<'tcx>(oblig_ty: Ty<'tcx>, impl_ty: Ty<'tcx>) -> bool {

Contributor

lcnr May 23, 2022

please move this into the fast_reject module

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                  // First, look for obvious equivalences, e.g. the types are equal. The most
+                  // common case is to find such an equivalence trivially, e.g. `ty::Bool`
+                  // and `ty::Bool`.
+                  let mut maybe_equiv = match oblig_ty.kind() {

Contributor

lcnr May 23, 2022

I prefer one large match (oblig_ty.kind(), impl_ty.kind()) here, starting with

(_, ty::Param(_) | ty::Projection(_) | ty::Error(_)) => true,

Contributor Author

nnethercote May 23, 2022

This is what I was referring to when I said I chose to structure it with exhaustive matches. Using _ here makes it easy to overlook updating this when new variants are added to Ty.

compiler/rustc_trait_selection/src/traits/select/mod.rs

Comment on lines +234 to +240

+                                      substs1.len() == 1
+                                      && substs2.len() == 1
+                                      && let ty::subst::GenericArgKind::Const(c1) = substs1[0].unpack()
+                                      && let ty::subst::GenericArgKind::Const(c2) = substs2[0].unpack()
+                                      && let Some(s1) = c1.val().try_to_scalar_int()
+                                      && let Some(s2) = c2.val().try_to_scalar_int()
+                                      && s1 != s2

Contributor

lcnr May 23, 2022

please iterate over all generic arguments here.

For types, you can again call tys_may_be_equalivalent while for constant you can also return false for ConstKind::Param on the lhs and anything apart from Param | Unevaluated on the rhs.

I think it makes sense to add fn generic_arg_may_be_equivalent for that

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                          _ => false,
+                      },
+                      ty::Str => matches!(k, ty::Str),
+                      ty::Array(..) => matches!(k, ty::Array(..)),

Contributor

lcnr May 23, 2022

you can recurse into the element type and length here

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                      },
+                      ty::Str => matches!(k, ty::Str),
+                      ty::Array(..) => matches!(k, ty::Array(..)),
+                      ty::Slice(..) => matches!(k, ty::Slice(..)),

Contributor

lcnr May 23, 2022

elem type

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                              trait_info.principal_def_id() == trait_info2.principal_def_id()
+                          )
+                      }
+                      ty::Ref(_, _, mutbl) => matches!(k, ty::Ref(_, _, mutbl2) if mutbl == mutbl2),

Contributor

lcnr May 23, 2022

target ty

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                          )
+                      }
+                      ty::Ref(_, _, mutbl) => matches!(k, ty::Ref(_, _, mutbl2) if mutbl == mutbl2),
+                      ty::FnDef(def_id, _) => matches!(k, ty::FnDef(def_id2, _) if def_id == def_id2),

Contributor

lcnr May 23, 2022

... for all other types

compiler/rustc_trait_selection/src/traits/select/mod.rs

Comment on lines +277 to +282

+                      ty::Param(_) => {
+                          // Note, we simplify parameters for the obligation but not the impl
+                          // so that we do not reject a blanket impl but do reject more
+                          // concrete impls if we're searching for `T: Trait`.
+                          matches!(k, ty::Placeholder(..))
+                      }

Contributor

lcnr May 23, 2022

this can just be

(ty::Param(_), _) => false,

parameters only unify with themselves (or inference vars/projections which get normalized/inferred to that param)

compiler/rustc_trait_selection/src/traits/select/mod.rs

Comment on lines +283 to +289

+                      ty::Projection(_) => {
+                          // When treating `ty::Param` as a placeholder, projections also
+                          // don't unify with anything else as long as they are fully normalized.
+                          //
+                          // We will have to be careful with lazy normalization here.
+                          oblig_ty.has_infer_types_or_consts() || matches!(k, ty::Placeholder(..))
+                      }

Contributor

lcnr May 23, 2022

same here

(ty::Projection(_), _) => oblig_ty.has_infer_types_or_consts(),

compiler/rustc_trait_selection/src/traits/select/mod.rs

+                      }
+                      ty::Infer(_) => true,
+                      ty::GeneratorWitness(..) | ty::Placeholder(..) | ty::Bound(..) | ty::Error(_) => {

Contributor

lcnr May 23, 2022

surprised that ty::Placeholder is unreachable here, that variant should be reachable afaik


          Move fast reject test out of SelectionContext::match_impl.

bf47503

`match_impl` has two call sites. For one of them (within `rematch_impl`)
the fast reject test isn't necessary, because any rejection would
represent a compiler bug.

This commit moves the fast reject test to the other `match_impl` call
site, in `assemble_candidates_from_impls`. This lets us move the fast
reject test outside the `probe` call in that function. This avoids the
taking of useless snapshots when the fast reject test succeeds, which
gives a performance win when compiling the `bitmaps` and `nalgebra`
crates.

Contributor Author

nnethercote commented May 23, 2022

@lcnr: thanks for the comments. It's late here, I will get to them tomorrow. But note that I've added a new commit that gives an additional performance boost.

Contributor Author

nnethercote commented May 23, 2022

I hope that you don't mind, otherwise I would be also fine with implementing this myself and leaving you review it.

It'll likely be faster that way, I'm fine if you want to take over.

lcnr mentioned this pull request

add a deep fast_reject routine #97345

Merged

Contributor Author

nnethercote commented May 25, 2022

This has been superseded by #97166.

nnethercote closed this

nnethercote deleted the improve-fast-rejection branch

May 25, 2022 00:53

bors added a commit to rust-lang-ci/rust that referenced this pull request


          Auto merge of rust-lang#97345 - lcnr:fast_reject, r=nnethercote

4a99c5f

add a deep fast_reject routine

continues the work on rust-lang#97136.

r? `@nnethercote`

Actually agree with you on the match structure 😆 let's see how that impacted perf 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review T-compiler