Skip to content

NLL: dropck_vec_cycle_checked.rs very slow to compile #49998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pnkfelix opened this issue Apr 16, 2018 · 5 comments
Closed

NLL: dropck_vec_cycle_checked.rs very slow to compile #49998

pnkfelix opened this issue Apr 16, 2018 · 5 comments
Labels
NLL-performant Working towards the "performance is good" goal

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Apr 16, 2018

As noted on #49900 (comment) it appears that dropck_vec_cycle_checked.rs under -Znll requires more than an hour to compile (not yet clear how much time is actually required.)

Here is a reduced version of that test case that exhibits a 50x slowdown under -Znll. (Namely, it takes 0.163s to compile without -Znll, and takes 8.424s to compile with -Znll

// Reduced version of compile-fail/dropck_vec_cycle_checked.rs that
// still exhibits unacceptable time blow-up (but not to the point of
// being unrunnable).

use std::cell::Cell;

trait HasId { }
struct CheckId<T:HasId> { v: T }
impl<T:HasId> Drop for CheckId<T> { fn drop(&mut self) { } }
struct C<'a> { v: Vec<CheckId<Cell<Option<&'a C<'a>>>>> }
impl<'a> HasId for Cell<Option<&'a C<'a>>> { }
impl<'a> C<'a> { fn new() -> C<'a> { C { v: Vec::new() } } }

fn main() {
    let (mut c1, mut c2);
    c1 = C::new();
    c2 = C::new();

    c1.v[0].v.set(Some(&c2));
    //~^ ERROR `c2` does not live long enough

    c1.v[0].v.set(Some(&c2));
    //~^ ERROR `c2` does not live long enough
}
@pnkfelix pnkfelix added NLL-performant Working towards the "performance is good" goal WG-compiler-nll labels Apr 16, 2018
@pnkfelix
Copy link
Member Author

pnkfelix commented Apr 17, 2018

It seems the problem, at least in this case, is this line of code

 if cause < **old_cause {

After some inspection with gdb and perf, my guess is that these region_infer::Causes can build up into (very?) long linked-list chains. I would think that the automatically-derived PartialOrd implementation would only fall into bad performance if those long chains have matching prefixes, but I haven't actually confirmed that (i.e., is it possible that the derived PartialOrd does not short cut when an early point in the linked list is different? Does it need to reach the end for some reason?)

Anyway, from what I can tell this code is just trying to reduce the cause to something simpler, but maybe once we are looking at two instances of Cause::Outlives we should not be wasting time trying to figure out which one is "less" ?

So some experiments to try:

  1. Find out how long these chains actually are. Perhaps even store the length in the Outlives variant and use that as a way to eagerly exit the comparison when the two lengths are not the same. I haven't done this yet because I was more interested in trying the next experiment.
    • Update: adding the length field and assuming it would short-cut traversal did not seem to work. (I.e. the running time is still bad). The length for the error being reported is 14; I have not yet gathered data on the lengths during the various comparison operations that are taking place.
  2. Make a variant comparison method that does not recur, and instead just always returns false when comparing two Cause::Outlives. I have implemented this and it seems to address the slow-down for this test case. (Of course this approach does not preserve the prior behavior. My theory though is that wasting a lot of time to find the minimal cause is not going to pay off in these cases.)

@pnkfelix
Copy link
Member Author

pnkfelix commented Apr 17, 2018

Based on some rough debug! instrumentation, the maximum length is 25, the average is about 5, and that cause < **old_cause expression is being evaluated 5,194 times. To be honest I'm a little surprised that the time is blowing up that badly if it really is due solely to the comparison operation alone. (Maybe the perf data is misleading me.)

  • When I use a filter that skips considering the comparison and the replacement when both sides are Cause::Outlives (i.e. experiment 2 written in previous comment), then add_internal only adds new elements 3,738 times. I briefly skimmed the code to see where the returned boolean here flows to, and it didn't seem like it was affecting anything (i.e. it seemed like it ended up being discarded without affecting control flow) but I certainly could have overlooked something.

Another reasonable theory is that the reason that it speeds things up to always return false when comparing two Cause::Outlives is that then we will spend far less time replacing the entry (a linked list structure) held in the cause map. It should be relatively cheap to move the new cause (which we already have constructed regardless of the outcome of the comparison operation) into a newly allocated Rc, but we also must deallocate the old cause (and/or the causes it links to) if there are no other owners sharing it?

@pnkfelix
Copy link
Member Author

A variation on outright skipping comparison when given two Outlives: always reduce the two inputs to root_cause() before comparing them. In my initial experiments, this seems like it yields a similar performance boost in error reporting to what I reported above, while I think it has a smaller (though still non-zero) impact on the actual errors we currently produce on the ui test suite.

pnkfelix added a commit to pnkfelix/rust that referenced this issue Apr 18, 2018
…ating of cause map.

This seems to avoid poor scaling on src/test/ui/span/dropck_vec_cycle_checked.rs
pnkfelix added a commit to pnkfelix/rust that referenced this issue Apr 18, 2018

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…imal-causes` flag

This commit only applies the flag to the one test case,
ui/span/dropck_vec_cycle_checked.rs, that absolutely needs it. Without
the flag, that test takes an unknown amount of time (greater than 1
minute) to compile. But its possible that other tests would also
benefit from the flag, and we may want to make it the default (after
evaluating its impact on other tests).

In terms of its known impact on other tests, I have only evaluated the
ui tests, and the *only* ui test I have found that the flag impacts
(running under NLL mode, of course), is src/test/ui/nll/issue-31567.rs

In particular:

```
% ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc ../src/test/ui/nll/issue-31567.rs
error[E0597]: `*v.0` does not live long enough
  --> ../src/test/ui/nll/issue-31567.rs:22:26
   |
22 |     let s_inner: &'a S = &*v.0; //~ ERROR `*v.0` does not live long enough
   |                          ^^^^^ borrowed value does not live long enough
23 |     &s_inner.0
24 | }
   | - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 21:1...
  --> ../src/test/ui/nll/issue-31567.rs:21:1
   |
21 | fn get_dangling<'a>(v: VecWrapper<'a>) -> &'a u32 {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0597`.
% ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc ../src/test/ui/nll/issue-31567.rs  -Z nll-subminimal-causes
error[E0597]: `*v.0` does not live long enough
  --> ../src/test/ui/nll/issue-31567.rs:22:26
   |
22 |     let s_inner: &'a S = &*v.0; //~ ERROR `*v.0` does not live long enough
   |                          ^^^^^ borrowed value does not live long enough
23 |     &s_inner.0
24 | }
   | -
   | |
   | borrowed value only lives until here
   | borrow later used here, when `v` is dropped

error: aborting due to previous error

For more information about this error, try `rustc --explain E0597`.
%
```
@ishitatsuyuki
Copy link
Contributor

Please revisit the status of this issue, as the PartialOrd bug is fixed.

@nikomatsakis
Copy link
Contributor

Testing with nightly now, I see no difference at all between them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NLL-performant Working towards the "performance is good" goal
Projects
None yet
Development

No branches or pull requests

3 participants