Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf regression in nightly when compiling massive matches #29227

Closed
swgillespie opened this issue Oct 22, 2015 · 12 comments
Closed

Perf regression in nightly when compiling massive matches #29227

swgillespie opened this issue Oct 22, 2015 · 12 comments
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@swgillespie
Copy link
Contributor

The file located here: https://gist.github.com/swgillespie/37b32f7b09ae536df8dc when compiled using rustc rustc_abuse.rs -o rustc_abuse -Z time-passes takes approximately 9 seconds and 40MB of memory to compile on stable and beta, while taking almost a minute and 4GB of memory on nightly.

The Bad Thing being done here is that there are several massive matches. As expected, match checking takes a few seconds, but the real culprit here seems to be "MIR dump", which is where the memory usage peaks. The memory usage is high enough to get my Travis CI build killed that has some code similar to this, but not as extreme.

These numbers were gathered with:

  • rustc 1.5.0-nightly (4826f9625 2015-10-21)
  • rustc 1.4.0-beta.3 (20eba406f 2015-10-16)
  • rustc 1.3.0 (9a92aaf19 2015-09-15).
  • Mac OS X 10.10.5
@sfackler sfackler added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 22, 2015
@alexcrichton
Copy link
Member

cc @rust-lang/compiler, seems like some low hanging fruit perhaps?

@nikomatsakis
Copy link
Contributor

Perhaps, yeah. Thanks for the report!

@nikomatsakis
Copy link
Contributor

Well, it probably requires tweaking MIR somewhat to have a more general "switch" style statement, as @Aatch suggested at some point.

@eddyb
Copy link
Member

eddyb commented Oct 23, 2015

@nikomatsakis I believe I've seen MIR building turn constant patterns not into plain x == constant but PartialEq::eq(&x, &constant) - wouldn't that be a bug, given that patterns need to be pure?

@nikomatsakis
Copy link
Contributor

@eddyb ah yes, I meant to writeup an RFC on this point. I believe the
current pattern behavior (not what MIR does) is incorrect (or at least
undesirable). Certainly it's not something that I fully understood we were
doing. In particular, I think that

match v { CONSTANT => true, _ => false }

and

v == CONSTANT

should always be equivalent, if both of them compile. (I could imagine
that the match version might not always compile). The current behavior of
converting constants into patterns obviously doesn't preserve this
equivalence, which seems pretty wrong to me, but it also doesn't work very
well with things like associated constants.

On Fri, Oct 23, 2015 at 3:21 AM, Eduard-Mihai Burtescu <
notifications@github.com> wrote:

@nikomatsakis https://github.com/nikomatsakis I believe I've seen MIR
building turn constant patterns not into plain x == constant but PartialEq::eq(&x,
&constant) - wouldn't that be a bug, given that patterns need to be pure?


Reply to this email directly or view it on GitHub
#29227 (comment).

@eddyb
Copy link
Member

eddyb commented Oct 23, 2015

I don't think there's a way to change that and not break backwards compat, as currently the purity of PartialEq impls is not publically reflected in APIs, and matching always works, even without a PartialEq impl or an impure one.

OTOH, expanding associated constants may require delaying match MIR generation post-monomorphization, at least for the associated constant pattern, not necessarily the whole match.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Oct 23, 2015 via email

@arielb1
Copy link
Contributor

arielb1 commented Oct 23, 2015

@nikomatsakis

I am not sure we want to support non-resolvable associated constants in match, exactly because of this unclarity. However, I am not sure match x { C => {}, _ => {}} into match x { a if a == C => {}, _ => {} } is so terrible.

@eddyb
Copy link
Member

eddyb commented Oct 24, 2015

@nikomatsakis Exhaustivity-wise, we can just be conservative, which might indeed be too much of a restriction for associated constants to be useful at all in matches.

As for the box patterns, we need restrictions there for soundness, and I believe the best way to go about it is to reuse the DST support: the CoerceUnsized<Rc<U>> for Rc<T> impl lets the compiler know where the actual pointer is.
Coupled with a Deref<Target=T> for Rc<T> impl, the compiler can assume...

Actually, scratch that, it was for a different feature (self: Rc<T>) - the pointers the compiler knows about is to an RcBox<T>, not as useful here.
To get box patterns to work, we can just require an unsafe impl<T> PureDeref for Rc<T> {} and be done with it.
Although I wouldn't stabilize that because it could interfere with a potential const impl<T> Deref<Target=T> for Rc<T> {...}.

@nikomatsakis
Copy link
Contributor

This is pretty off topic for this issue. I plan to open a discussion thread
in the next day or so and we can hash it out there. :) Alternatively,
#20489 has related discussion.

On Sat, Oct 24, 2015 at 3:50 AM, Eduard-Mihai Burtescu <
notifications@github.com> wrote:

@nikomatsakis https://github.com/nikomatsakis Exhaustivity-wise, we can
just be conservative, which might indeed be too much of a restriction for
associated constants to be useful at all in matches.

As for the box patterns, we need restrictions there for soundness, and I
believe the best way to go about it is to reuse the DST support: the CoerceUnsized<Rc>
for Rc impl lets the compiler know where the actual pointer is.
Coupled with a Deref<Target=T> for Rc impl, the compiler can assume...

Actually, scratch that, it was for a different feature (self: Rc) -
the pointers the compiler knows about is to an RcBox, not as useful
here.
To get box patterns to work, we can just require an unsafe impl
PureDeref for Rc {} and be done with it.
Although I wouldn't stabilize that because it could interfere with a
potential const impl Deref<Target=T> for Rc {...}.


Reply to this email directly or view it on GitHub
#29227 (comment).

@nikomatsakis nikomatsakis reopened this Oct 26, 2015
@nikomatsakis
Copy link
Contributor

This was accidentally closed. The issue is not solved, though #29384 is a temporary workaround to prevent regressing the stable release.

@nikomatsakis
Copy link
Contributor

(I did start a branch that I THINK will resolve this issue, however.)

bors added a commit that referenced this issue Nov 6, 2015
Introduce a `SwitchInt` and restructure pattern matching to collect integers and characters into one master switch. This is aimed at #29227, but is not a complete fix. Whereas before we generated an if-else-if chain and, at least on my machine, just failed to compile, we now spend ~9sec compiling `rustc_abuse`. AFAICT this is basically just due to a need for more micro-optimization of the matching process: perf shows a fair amount of time just spent iterating over the candidate list. Still, it seemed worth opening a PR with this step alone, since it's a big step forward.
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 11, 2015
bors added a commit that referenced this issue Nov 11, 2015
The older algorithm was pretty inefficient for big matches. Fixes #29227. (On my computer, MIR construction on this test case goes from 9.9s to 0.025s.) Whereas before we had a loop like:

- for all outcomes of the test we are performing
    - for all candidates
        - check whether candidate is relevant to outcome

We now do:

- for all candidates
    - determine which outcomes the candidate is relevant to

Since the number of outcomes in this case is proportional to the number of candidates, the original algorithm turned out to be O(n^2), and the newer one is just O(n).

This PR also does some minor speedups by eagerly mirroring all patterns, so that we can just pass around `&Pattern<'tcx>`, which makes cloning cheaper. We could probably go a bit further in this direction.

r? @Aatch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants