[PERF] Don't spawn so many compilers (3/2) (19m -> 250k) #15030
c26ab0d to af51bb8
//
// Also, as we only check for attribute names and don't do macro expansion,
// we can check only for #[test]
if !(text.contains("fn main") || text.contains("#[test]")) {
Hm, if someone writes `fn  main()` (with extra whitespace between `fn` and `main`) instead of `fn main()`, will this cause false negatives?
Fixed (and improved filtering!) Now we check for instances of fn (two times) and main in general.
y21
left a comment
That's a nice optimisation, but I have one question
// Also, as we only check for attribute names and don't do macro expansion,
// we can check only for #[test]

if !((text.contains(" main") && text.splitn(2, "fn ").nth(2).is_none()) || text.contains("#[test]")) {
The splitn condition is a bit confusing; I can't tell what it's checking for. splitn(2) limits the iterator to 2 items, so isn't nth(2) always None?
Is this meant to be splitn(3, ..).nth(2).is_none() to check that there are at most two occurrences of fn? But even then, I don't see why this is needed.
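To make the question concrete, here is a small sketch of how the two limits behave. The helper name `third_piece_missing` is made up for illustration and is not part of the PR. With a limit of 2 there is never a third piece, so `nth(2)` is always `None`; with a limit of 3, a third piece appears as soon as `"fn "` occurs twice in the text.

```rust
/// Illustrative helper (not from the PR): true when
/// `text.splitn(limit, "fn ").nth(2)` yields nothing.
fn third_piece_missing(text: &str, limit: usize) -> bool {
    text.splitn(limit, "fn ").nth(2).is_none()
}

fn main() {
    let one_fn = "fn main() {}";
    let two_fns = "fn a() {} fn main() {}";

    // Limit 2: at most two pieces exist, so index 2 is never present.
    assert!(third_piece_missing(one_fn, 2));
    assert!(third_piece_missing(two_fns, 2));

    // Limit 3: one occurrence of "fn " gives two pieces, two occurrences give three.
    assert!(third_piece_missing(one_fn, 3));
    assert!(!third_piece_missing(two_fns, 3));
}
```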
Oops, what an oversight: that splitn(2) should actually be splitn(3). I'll add tests with 3 and 4 functions.
This is necessary because in the actual check_code_sample we only care whether the code block has one function and that function is main(). (This was the behaviour even before this PR.)
If there's more than one function, fn main would be relevant as a separate entity, so we cannot report it as useless.
I've simplified the filtering again, at the risk of it being over-scoped. This is still a pretty heavy optimization, just with some more flexibility. It also accounts for any amount of whitespace between fn and main.
75e694d to 3c60c42
y21
left a comment
Thanks, simplifying indeed made it easier to understand, and I think that's already good enough, though I have two small questions (nothing major).
4fbc9ab to e222faf
Avoid creating so many SessionGlobals
Improve filtering and account for spacing
Actually return early
e222faf to 3745a3f
y21
left a comment
LGTM
/// Additional function, shouldn't trigger
/// ```rust
/// fn additional_function() {
///     let _ = 0;
///     // Thus `fn main` is actually relevant!
/// }
/// fn main() {
///     let _ = 0;
/// }
/// ```
(Non blocking question) Out of curiosity, do you know why we don't want to lint this case (when there is another function)? Nested functions are allowed and so the function would instead just be part of the implicit main fn, so it could still be removed as far as I can tell?
Not really, I suppose it's just a sensible thing to do. With more than one function, omitting fn main makes it look like some pseudo-language for scripting or something.
fn a() -> bool {
return true
}
let _ = 0;
let _ = a();
// ^^^^ What's the scope for this??
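For reference, the point raised in the question can be sketched as follows: rustdoc wraps a doctest that has no `fn main` in an implicit one, so a function declared at the top of the snippet becomes a nested item of that wrapper. The wrapper name `wrapped_doctest` is a stand-in for illustration, not something rustdoc generates under that name.

```rust
/// Stand-in for the implicit wrapper around a doctest body (hypothetical name).
fn wrapped_doctest() -> bool {
    // A function item declared inside another function body is legal Rust
    // and is visible throughout the enclosing block.
    fn a() -> bool {
        return true;
    }
    let _ = 0;
    a()
}

fn main() {
    assert!(wrapped_doctest());
}
```

So the snippet still compiles without an explicit `fn main`; whether it reads well that way is the stylistic question discussed above.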
Weird CI error 🤔 html5ever fails to build?
This is rust-lang/rust#142785
Optimize `needless_doctest_main`: make it short-circuit, so that we don't spin up a new compiler on EVERY code block.

The old implementation was creating a new compiler, new parser, new thread, new `SessionGlobals`, new everything for each code block, even when the block didn't contain `fn main()` or anything else relevant.

On callgrind, this reduces the cycle count by about 6.7242% (which turns out to be a 38 million instruction difference, great!). Benchmarked on `bumpalo-3.16.0`. Also, on bumpalo we spawn 78 fewer threads. This moves `SessionGlobals::new` from being the top time-consuming function by itself in some benchmarks to not even being in the top 500.

Also, populate the test files.

changelog: [`needless_doctest_main`]: Avoid spawning so many threads in unnecessary circumstances