Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathological performance in resolve with lots of unused imports in style crate #43572

Closed
alexcrichton opened this issue Jul 31, 2017 · 10 comments · Fixed by #43584
Closed

Pathological performance in resolve with lots of unused imports in style crate #43572

alexcrichton opened this issue Jul 31, 2017 · 10 comments · Fixed by #43584
Labels
C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Copy link
Member

alexcrichton commented Jul 31, 2017

A local build of Servo's style crate shows a huge amount of time (46 seconds!) in name resolution. It turns out that basically all of it is calling span_to_snippet, specifically this one. That's a lot of unused imports!

Sure enough if we create a synthetic file locally with this script:

N=20000

echo '#![allow(unused)]' > lots-of-unused.rs
for i in `seq 1 $N`; do
  echo "use foo as foo$i;" >> lots-of-unused.rs
done
echo 'fn foo() {}' >> lots-of-unused.rs
echo 'fn main() {}' >> lots-of-unused.rs


echo > lots-of-used.rs
for i in `seq 1 $N`; do
  echo "use test::foo$i;" >> lots-of-used.rs
done
echo 'mod test {' >> lots-of-used.rs
for i in `seq 1 $N`; do
  echo "pub fn foo$i() {}" >> lots-of-used.rs
done
echo '}' >> lots-of-used.rs
echo 'fn main() {' >> lots-of-used.rs
for i in `seq 1 $N`; do
  echo "foo$i();" >> lots-of-used.rs
done
echo '}' >> lots-of-used.rs

we find:

$ rustc +nightly -V
rustc 1.21.0-nightly (aac223f4f 2017-07-30)
$ sh foo.sh
$ rustc +nightly lots-of-used.rs -Z time-passes 2>&1 | grep 'name resolution'
time: 0.033; rss: 164MB name resolution
$ rustc +nightly lots-of-unused.rs -Z time-passes 2>&1 | grep 'name resolution'
time: 1.139; rss: 136MB name resolution

That's a lot of time spent processing unused imports!

The slowdown here isn't quite the same as the style crate, unfortunately. The style crate only has ~7k unused imports, which apparently takes it 48s in that profile. In any case the intent here is clear, lots of unused imports seem to slow down compilation, and the style crate looks to have a lot of unused imports!

@alexcrichton alexcrichton added I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Jul 31, 2017
@alexcrichton
Copy link
Member Author

cc @rust-lang/compiler

@alexcrichton alexcrichton changed the title Pathological performance in resolve with lots of unused imports Pathological performance in resolve with lots of unused imports in style crate Jul 31, 2017
@jseyfried
Copy link
Contributor

We could address this quickly by just emitting the first 10 unused imports warnings and then warning that there are more unused imports not shown.

@alexcrichton
Copy link
Member Author

I can never deal with more than 10 lints at a time anyway myself, so seems like a reasonable solution to me!

@sfackler
Copy link
Member

sfackler commented Aug 1, 2017

Dumping out a ton of warnings is useful when you're hooked into an IDE and you can see them as e.g. green squiggles under the offending text as opposed to 10k lines on a terminal.

@arielb1
Copy link
Contributor

arielb1 commented Aug 1, 2017

I think the reason span_to_snippet is slow is because it calls ensure_filemap_source_present, which does this:
https://github.com/rust-lang/rust/blob/master/src/libsyntax/codemap.rs#L564

    fn ensure_filemap_source_present(&self, file_map: Rc<FileMap>) -> bool {
        let src = self.file_loader.read_file(Path::new(&file_map.name)).ok();
        return file_map.add_external_src(src)
    }

I don't think we want to read source files multiple times like that.

@SimonSapin
Copy link
Contributor

SimonSapin commented Aug 1, 2017

Thank you for looking into Servo compile times Alex, and great find here! Could you CC me or someone from the Servo team for Servo-related issues? Thanks.

@alexcrichton
Copy link
Member Author

Can do!

@Manishearth
Copy link
Member

Manishearth commented Aug 1, 2017

In general I've always wanted the lint system to avoid even running in #[allow] blocks.

Currently the lint system runs all lints in parallel on the tree. This is great since it means we only traverse it once, and the cost of the traversal is consolidated. We don't need to reduce that cost itself (which is nice because then we don't have to worry about #[warn] inside #[allow]).

Most lints do not maintain state. The lint framework has the capability to do so, and we use this a couple times in Clippy, but most lints are struct FooPass; impl LateLintPass for FooPass {...}, so they won't notice if check_expr or whatever isn't called in a region of #[allow].

The "allowness" of a lint is tied to where in the tree the lint traversal is, not the span emitted by the lint. It's perfectly possible to #[allow] a block of code and have the lint still be emit there because a different block of code (where it is #[warn]) hit the lint but emitted a span it got from elsewhere. We've had trouble with this before in clipyp iirc.

All of this enables this proposal: If we had a specialized StatelessLateLintPass trait for this, then when we have an #[allow] we can turn off calling lint methods entirely, and when there's a #[warn] turn it on again.

It doesn't even have to be a trait, really, it could just be a defaulted fake-static method (until rust-lang/rfcs#2027 happens) on LateLintPass.

I suspect this will get significant speedups for crates like Style. I may be able to take a crack at this if y'all think this is a good approach.

@Manishearth
Copy link
Member

It occurs to me that a boolean check per lint each tree node might actually end up being more expensive? I'll have look at this code closer. Might be possible to make this fast by having a second array of trait objects which only contains the active lint passes.

@Manishearth
Copy link
Member

It seems like we do a large enough amount of work that the whole "vector of active lints" stuff won't really be necessary.

However, I just realized that this bug is not during the lint pass but during the resolution phase (which defers linting to the actual lint pass). This won't be able to help, though we can fix it by doing the span_to_snippet in the actual lint and then making this optimization.

So in that case I suspect this optimization is unnecessary. In general we've rarely had performance issues with lints in clippy (which has way more lints than rustc), and we've been pretty careful to measure impact there.

Still, if folks think this optimization can help, let me know.

bors added a commit that referenced this issue Aug 2, 2017
Fix quadratic performance with lots of use statements

This fixes 2 problems that caused quadratic performance when lots of use-statements were present. After this patch, performance is linear (and very fast) even with 1M uses.

Fixes #43572.
Fixes #43573.

r? @eddyb
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants