-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazily allocate TypedArena's first chunk #36592
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
Some more details. For hyper.0.5.0 this reduces cumulative heap allocations
(These measurements are from Valgrind's DHAT tool, which I used to identify This is due to rustc's frequent use of
The stage2 improvements are smaller, presumably because jemalloc is faster at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an official Rust reviewer, but some general thoughts I had when looking through the code.
let prev_capacity = chunks.last().unwrap().storage.cap(); | ||
let new_capacity = prev_capacity.checked_mul(2).unwrap(); | ||
if chunks.last_mut().unwrap().storage.double_in_place() { | ||
if chunks.len() == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer Vec::is_empty()
for mut chunk in chunks_borrow.drain(..last_idx) { | ||
let cap = chunk.storage.cap(); | ||
chunk.destroy(cap); | ||
if chunks_borrow.len() > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer !chunks_borrow.is_empty()
.
chunk.destroy(cap); | ||
if chunks_borrow.len() > 0 { | ||
let last_idx = chunks_borrow.len() - 1; | ||
self.clear_last_chunk(&mut chunks_borrow[last_idx]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not chunks_borrow.last_mut()
? It might conflict with the drain below, in which case you can use split_at_mut
I think.
for chunk in chunks_borrow.iter_mut() { | ||
let cap = chunk.storage.cap(); | ||
chunk.destroy(cap); | ||
if chunks_borrow.len() > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer is_empty
.
The code (both preexisting and current) also duplicates the pop last element, then drain/mutably iterate and destroy the rest of the chunks. Can it be extracted into a helper function? I've discussed this with @nnethercote, I think they believe that this would be best done as a follow up PR; leaving this comment here so this idea doesn't get lost. |
49ef1cf
to
177ac54
Compare
I replaced the |
In the libcollections convention, Alternatively, Either way, doc comments need updates to not say "preallocated" for new and with_capacity. |
The is_empty tests seem to be inverted |
177ac54
to
d6ed815
Compare
Thank you for the comments, @bluss. I fixed the inverted is_empty tests and updated the comments for |
I did a quick grep over the rust source code and I don't think |
let prev_capacity = chunks.last().unwrap().storage.cap(); | ||
let new_capacity = prev_capacity.checked_mul(2).unwrap(); | ||
if chunks.last_mut().unwrap().storage.double_in_place() { | ||
if chunks.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would too like to rewrite this to use the .last_mut() option for control flow. (None is the empty case). But it needs some wrangling to be able to call chunks.push
at the end.
let cap = chunk.storage.cap(); | ||
chunk.destroy(cap); | ||
if !chunks_borrow.is_empty() { | ||
let mut last_chunk = chunks_borrow.pop().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pop's Option can be used for control flow
for mut chunk in chunks_borrow.drain(..last_idx) { | ||
let cap = chunk.storage.cap(); | ||
chunk.destroy(cap); | ||
if !chunks_borrow.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use .pop() here too, drain all other chunks, then put the last chunk back (seems like the simplest way to keep the borrow checker happy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not more work than what drain already does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the Options for control flow will end up with prettier Rust code
(I haven't ever used the review feature before. I haven't heard any news on how we want to use it in the project.) |
Currently `TypedArena` allocates its first chunk, which is usually 4096 bytes, as soon as it is created. If no allocations are ever made from the arena then this allocation (and the corresponding deallocation) is wasted effort. This commit changes `TypedArena` so it doesn't allocate the first chunk until the first allocation is made. This change speeds up rustc by a non-trivial amount because rustc uses `TypedArena` heavily: compilation speed (producing debug builds) on several of the rustc-benchmarks increases by 1.02--1.06x. The change should never cause a slow-down because the hot `alloc` function is unchanged. It does increase the size of `TypedArena` by one `usize` field, however. The commit also fixes some out-of-date comments.
d6ed815
to
80a4477
Compare
I made the requested control flow changes. I haven't changed |
Note that |
@bors r+ |
📌 Commit 80a4477 has been approved by |
Nice catch @nnethercote! The redundant arenas used to not matter because we had 1 |
Now that I have a better idea of how rustc-benchmarks works, here are some stage 1 (uses glibc malloc)
stage2 (uses jemalloc)
With glibc malloc they're mostly in the range 1.03--1.06x faster. With jemalloc |
Lazily allocate TypedArena's first chunk Currently `TypedArena` allocates its first chunk, which is usually 4096 bytes, as soon as it is created. If no allocations are ever made from the arena then this allocation (and the corresponding deallocation) is wasted effort. This commit changes `TypedArena` so it doesn't allocate the first chunk until the first allocation is made. This change speeds up rustc by a non-trivial amount because rustc uses `TypedArena` heavily: compilation speed (producing debug builds) on several of the rustc-benchmarks increases by 1.02--1.06x. The change should never cause a slow-down because the hot `alloc` function is unchanged. It does increase the size of `TypedArena` by one `usize` field, however. The commit also fixes some out-of-date comments.
Lazily allocate TypedArena's first chunk Currently `TypedArena` allocates its first chunk, which is usually 4096 bytes, as soon as it is created. If no allocations are ever made from the arena then this allocation (and the corresponding deallocation) is wasted effort. This commit changes `TypedArena` so it doesn't allocate the first chunk until the first allocation is made. This change speeds up rustc by a non-trivial amount because rustc uses `TypedArena` heavily: compilation speed (producing debug builds) on several of the rustc-benchmarks increases by 1.02--1.06x. The change should never cause a slow-down because the hot `alloc` function is unchanged. It does increase the size of `TypedArena` by one `usize` field, however. The commit also fixes some out-of-date comments.
@nnethercote FWIW I've been meaning to eventually move to a single common drop-less arena (instead of a dozen typed ones), but there were things to rework first to make that even possible - we're almost there, in fact |
_own: PhantomData, | ||
} | ||
TypedArena { | ||
first_chunk_capacity: cmp::max(1, capacity), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If with_capacity
isn't used, I think it'd be worth just not having first_chunk_capacity
around at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I'll file a follow-up PR to remove with_capacity
once this one lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this PR would be simpler if it also did that change, I'm saying. I'd r+ it immediately and this PR will have to wait at least half a day more before getting merged, so you have time now.
Lazily allocate TypedArena's first chunk Currently `TypedArena` allocates its first chunk, which is usually 4096 bytes, as soon as it is created. If no allocations are ever made from the arena then this allocation (and the corresponding deallocation) is wasted effort. This commit changes `TypedArena` so it doesn't allocate the first chunk until the first allocation is made. This change speeds up rustc by a non-trivial amount because rustc uses `TypedArena` heavily: compilation speed (producing debug builds) on several of the rustc-benchmarks increases by 1.02--1.06x. The change should never cause a slow-down because the hot `alloc` function is unchanged. It does increase the size of `TypedArena` by one `usize` field, however. The commit also fixes some out-of-date comments.
[breaking-change] Remove TypedArena::with_capacity This is a follow-up to #36592. The function is unused by rustc. Also, it doesn't really follow the usual meaning of a `with_capacity` function because the first chunk allocation is now delayed until the first `alloc` call. This change reduces the size of `TypedArena` by one `usize`. @eddyb: we discussed this on IRC. Would you like to review it?
Currently
TypedArena
allocates its first chunk, which is usually 4096bytes, as soon as it is created. If no allocations are ever made from
the arena then this allocation (and the corresponding deallocation) is
wasted effort.
This commit changes
TypedArena
so it doesn't allocate the first chunkuntil the first allocation is made.
This change speeds up rustc by a non-trivial amount because rustc uses
TypedArena
heavily: compilation speed (producing debug builds) onseveral of the rustc-benchmarks increases by 1.02--1.06x. The change
should never cause a slow-down because the hot
alloc
function isunchanged. It does increase the size of
TypedArena
by oneusize
field, however.
The commit also fixes some out-of-date comments.