Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve fast rejection. #97136

Closed

Conversation

nnethercote
Copy link
Contributor

In particular, for bitmaps-3.1.0 which does a huge number of
comparisons between BitsImpl<M> and BitsImpl<N> where M and N
are const integers.

r? @lcnr

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 18, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 18, 2022
@nnethercote
Copy link
Contributor Author

Currently, this has the smallest possible change that helps bitmaps-3.1.0, to the tune of a 40% reduction in instruction counts on a check full build. Something less ad hoc would be nice.

nalgebra stresses the compiler in similar ways to bitmaps and I was hoping that it might be improved too, but local tests indicated not. I will look into nalgebra more tomorrow to see if there are similar cases that can be easily captured.

@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 18, 2022
@bors
Copy link
Contributor

bors commented May 18, 2022

⌛ Trying commit 0374729b9ba6b4f7f054d3adeb9c62ed5634c0d5 with merge 75978414bb55092c36addd0a84ead8b1c19d82f6...

@bors
Copy link
Contributor

bors commented May 18, 2022

☀️ Try build successful - checks-actions
Build commit: 75978414bb55092c36addd0a84ead8b1c19d82f6 (75978414bb55092c36addd0a84ead8b1c19d82f6)

@rust-timer
Copy link
Collaborator

Queued 75978414bb55092c36addd0a84ead8b1c19d82f6 with parent 77972d2, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (75978414bb55092c36addd0a84ead8b1c19d82f6): comparison url.

Summary:

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: no relevant changes found
Regressions 😿
(primary)
Regressions 😿
(secondary)
Improvements 🎉
(primary)
Improvements 🎉
(secondary)
All 😿 🎉
(primary)
count1 0 0 13 0 13
mean2 N/A N/A -22.6% N/A -22.6%
max N/A N/A -39.5% N/A -39.5%

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. number of relevant changes

  2. the arithmetic mean of the percent change

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 18, 2022
In particular, for `bitmaps-3.1.0` which does a huge number of
comparisons between `BitsImpl<M>` and `BitsImpl<N>` where `M` and `N`
are const integers.
@nnethercote nnethercote force-pushed the improve-fast-rejection branch from 0374729 to af68005 Compare May 23, 2022 06:55
@nnethercote
Copy link
Contributor Author

@lcnr: I have updated the code to avoid SimplifiedType, as you requested. I tried a couple of different ways of writing this code. I settled on this way because the matches are exhaustive, which seems important. That way, if new variants are added to Ty, people can't forget to update this code.

As for the unreachable parts: I originally didn't have them, and then I was curious about whether every variant could be reached. So I did testing and was surprised to see how many weren't executed. I've left them unreachable because I don't want to merge code that hasn't been tested, which is what we'd have if they weren't unreachable.

Copy link
Contributor

@lcnr lcnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that I have some pretty strong opinions here 😅

I hope that you don't mind, otherwise I would be also fine with implementing this myself and leaving you review it.

@@ -214,6 +213,122 @@ enum BuiltinImplConditions<'tcx> {
Ambiguous,
}

fn tys_may_be_equivalent<'tcx>(oblig_ty: Ty<'tcx>, impl_ty: Ty<'tcx>) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move this into the fast_reject module

// First, look for obvious equivalences, e.g. the types are equal. The most
// common case is to find such an equivalence trivially, e.g. `ty::Bool`
// and `ty::Bool`.
let mut maybe_equiv = match oblig_ty.kind() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer one large match (oblig_ty.kind(), impl_ty.kind()) here, starting with

(_, ty::Param(_) | ty::Projection(_) | ty::Error(_)) => true,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I was referring to when I said I chose to structure it with exhaustive matches. Using _ here makes it easy to overlook updating this when new variants are added to Ty.

Comment on lines +234 to +240
substs1.len() == 1
&& substs2.len() == 1
&& let ty::subst::GenericArgKind::Const(c1) = substs1[0].unpack()
&& let ty::subst::GenericArgKind::Const(c2) = substs2[0].unpack()
&& let Some(s1) = c1.val().try_to_scalar_int()
&& let Some(s2) = c2.val().try_to_scalar_int()
&& s1 != s2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please iterate over all generic arguments here.

For types, you can again call tys_may_be_equalivalent while for constant you can also return false for ConstKind::Param on the lhs and anything apart from Param | Unevaluated on the rhs.

I think it makes sense to add fn generic_arg_may_be_equivalent for that

_ => false,
},
ty::Str => matches!(k, ty::Str),
ty::Array(..) => matches!(k, ty::Array(..)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can recurse into the element type and length here

},
ty::Str => matches!(k, ty::Str),
ty::Array(..) => matches!(k, ty::Array(..)),
ty::Slice(..) => matches!(k, ty::Slice(..)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elem type

trait_info.principal_def_id() == trait_info2.principal_def_id()
)
}
ty::Ref(_, _, mutbl) => matches!(k, ty::Ref(_, _, mutbl2) if mutbl == mutbl2),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

target ty

)
}
ty::Ref(_, _, mutbl) => matches!(k, ty::Ref(_, _, mutbl2) if mutbl == mutbl2),
ty::FnDef(def_id, _) => matches!(k, ty::FnDef(def_id2, _) if def_id == def_id2),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... for all other types

Comment on lines +277 to +282
ty::Param(_) => {
// Note, we simplify parameters for the obligation but not the impl
// so that we do not reject a blanket impl but do reject more
// concrete impls if we're searching for `T: Trait`.
matches!(k, ty::Placeholder(..))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can just be

(ty::Param(_), _) => false,

parameters only unify with themselves (or inference vars/projections which get normalized/inferred to that param)

Comment on lines +283 to +289
ty::Projection(_) => {
// When treating `ty::Param` as a placeholder, projections also
// don't unify with anything else as long as they are fully normalized.
//
// We will have to be careful with lazy normalization here.
oblig_ty.has_infer_types_or_consts() || matches!(k, ty::Placeholder(..))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

(ty::Projection(_), _) => oblig_ty.has_infer_types_or_consts(),

}
ty::Infer(_) => true,

ty::GeneratorWitness(..) | ty::Placeholder(..) | ty::Bound(..) | ty::Error(_) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised that ty::Placeholder is unreachable here, that variant should be reachable afaik

`match_impl` has two call sites. For one of them (within `rematch_impl`)
the fast reject test isn't necessary, because any rejection would
represent a compiler bug.

This commit moves the fast reject test to the other `match_impl` call
site, in `assemble_candidates_from_impls`. This lets us move the fast
reject test outside the `probe` call in that function. This avoids the
taking of useless snapshots when the fast reject test succeeds, which
gives a performance win when compiling the `bitmaps` and `nalgebra`
crates.
@nnethercote
Copy link
Contributor Author

@lcnr: thanks for the comments. It's late here, I will get to them tomorrow. But note that I've added a new commit that gives an additional performance boost.

@nnethercote
Copy link
Contributor Author

I hope that you don't mind, otherwise I would be also fine with implementing this myself and leaving you review it.

It'll likely be faster that way, I'm fine if you want to take over.

@nnethercote
Copy link
Contributor Author

This has been superseded by #97166.

@nnethercote nnethercote deleted the improve-fast-rejection branch May 25, 2022 00:53
bors added a commit to rust-lang-ci/rust that referenced this pull request May 25, 2022
add a deep fast_reject routine

continues the work on rust-lang#97136.

r? `@nnethercote`

Actually agree with you on the match structure 😆 let's see how that impacted perf 😅
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants