Grab bag of runtime optimizations #8734
Closed
Removing approval because @bors is going crazy right now
Force line ending of '.in' files in jemalloc to LF
Naturally, and sadly, turning off sanity checks in the runtime is a noticeable performance win. The particular test I'm running goes from ~1.5 s to ~1.3 s. Sanity checks are turned *on* when not optimizing, or when cfg includes `rtdebug` or `rtassert`.
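As a rough sketch of the gating described in that commit message, written in present-day Rust rather than the 2013 dialect: the `rtopt`, `rtdebug` and `rtassert` cfg names and the `ENFORCE_SANITY` flag come from this PR, while `queue_is_sorted` and `push_sorted` are hypothetical stand-ins for an expensive runtime invariant check and are not taken from the runtime itself.

```rust
// Custom cfg flags are passed on the command line, e.g. `rustc --cfg rtopt`.
// With no flags set, checks stay enabled; `rtopt` alone disables them.
#[allow(unexpected_cfgs)]
pub static ENFORCE_SANITY: bool = !cfg!(rtopt) || cfg!(rtdebug) || cfg!(rtassert);

/// Hypothetical expensive invariant: the queue must stay sorted.
fn queue_is_sorted(queue: &[u32]) -> bool {
    queue.windows(2).all(|w| w[0] <= w[1])
}

/// Inserts `item` at its sorted position, verifying the invariant
/// only when sanity checks are enabled.
fn push_sorted(queue: &mut Vec<u32>, item: u32) {
    let pos = queue.partition_point(|&x| x <= item);
    queue.insert(pos, item);
    if ENFORCE_SANITY {
        assert!(queue_is_sorted(queue), "queue invariant violated");
    }
}

fn main() {
    let mut q = Vec::new();
    for item in [3, 1, 2] {
        push_sorted(&mut q, item);
    }
    println!("sanity checks enabled: {}, queue: {:?}", ENFORCE_SANITY, q);
}
```

Because `cfg!` expands to a literal `true` or `false`, the compiler can drop the whole `if ENFORCE_SANITY` branch in builds where the flag is false, so the check costs nothing there.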
This makes the lock much less contended. In the test I'm running the number of times it's contended goes from ~100000 down to ~1000.
It's not a huge win, but it does reduce the amount of time spent contending for the message queue when the schedulers are under load.
These aren't used for anything at the moment and cause some TLS hits on some perf-critical code paths. Will need to put better thought into it in the future.
vec::unshift uses this to add elements, scheduler queues use unshift, and this was causing a lot of reallocation
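The commit's actual change is not shown here, so the following is only an illustration of the general effect, under the assumption that the fix was to grow the buffer in amortized steps instead of one slot at a time. It uses present-day `Vec::insert(0, x)` in place of the old `vec::unshift`, and treats a change in `capacity()` as a proxy for a reallocation. The function names are hypothetical.

```rust
/// Front-inserts `n` items, reserving exactly one extra slot each time.
/// Returns how many times the capacity changed.
fn front_insert_exact(n: usize) -> usize {
    let mut v: Vec<usize> = Vec::new();
    let mut reallocations = 0;
    for i in 0..n {
        let before = v.capacity();
        v.reserve_exact(1); // ask for just enough room for the next element
        v.insert(0, i);
        if v.capacity() != before {
            reallocations += 1;
        }
    }
    reallocations
}

/// Same workload, but `reserve` is allowed to over-allocate,
/// so later inserts can reuse the spare capacity.
fn front_insert_amortized(n: usize) -> usize {
    let mut v: Vec<usize> = Vec::new();
    let mut reallocations = 0;
    for i in 0..n {
        let before = v.capacity();
        v.reserve(1); // may grow by more than one slot
        v.insert(0, i);
        if v.capacity() != before {
            reallocations += 1;
        }
    }
    reallocations
}

fn main() {
    let n = 1_000;
    println!("exact growth:     {} capacity changes", front_insert_exact(n));
    println!("amortized growth: {} capacity changes", front_insert_amortized(n));
}
```

For a queue that is mostly pushed at the front, a ring buffer such as `std::collections::VecDeque` would also avoid shifting every element on each insert, though that is a different design from the one this commit describes.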
I'm not comfortable turning off rtassert! yet
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jun 4, 2022
Add some testcases for recent rustfix update

changelog: none

This adds a testcase for a bugfix that has been fixed by https://github.com/rust-lang/rustfix/tree/v0.6.1

`rustfix` is pulled in by `compiletest_rs`. So to test that the correct rustfix version is used, I added one (and a half) testcase.

I tried to add a testcase for rust-lang#8734 as well, but interestingly enough the rustfix is wrong:

```diff
 fn issue8734() {
     let _ = [0u8, 1, 2, 3]
         .into_iter()
-        .and_then(|n| match n {
+        .flat_map(|n| match n {
+            1 => [n
+                .saturating_add(1)
             1 => [n
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)
                 .saturating_add(1)],
             n => [n],
         });
 }
```

This needs some investigation, and then this testcase needs to be enabled by uncommenting it.

closes rust-lang#8878
related to rust-lang#8734
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jul 18, 2022
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jul 18, 2022
Uncomment test for rust-lang#8734

I believe the issue was an interaction between rustfix and `span_lint_and_sugg_for_edges`, so this would've been fixed by rust-lang#98261 (Thanks, @WaffleLapkin!)

Closes rust-lang#8734

changelog: none
Here are a bunch of small optimizations that add up to a 36% improvement on one particular message passing benchmark.
After this and @toddaaro's optimizations from #8566, the next biggest wins are probably going to be avoiding the event loop, which is another 25%, and using a less-allocating channel implementation (not sure how much this wins but it should be a lot). Beyond that, there are still important optimizations to make: using the stack pointer for TLS instead of the TLS API; reducing lock contention; identifying and reducing other syscalls, page faults, context switches, and allocations; and recycling tasks. Codegen improvements may help as well, as there appears to be some nonsense in the assembly that one wouldn't write by hand.
There are a couple of notable changes here:

- `rtassert!` off for optimized builds, adding a new constant that can be queried before running expensive sanity checks: `pub static ENFORCE_SANITY: bool = !cfg!(rtopt) || cfg!(rtdebug) || cfg!(rtassert)`
- `cfg(rtopt)` is turned on by makefiles.

Before (just with #8566):
After (these opts + #8566):
And here is how Go does on the same benchmark:
Here's what the profile looks like after these optimizations, those in #8566, and a hacked up optimization to not hit epoll (not included in this PR):