-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
alloc: make RawVec round up allocation sizes to next power-of-two #29848
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Looking at the Would it be sufficient to just return |
Is it actually useful? |
You are right. The other possibility is to make sure that Depending on your usage pattern, this might actually be a win. |
I remember hearing that the "extra capacity" in jemalloc was always basically "round up to a multiple of the requested alignment", but since all non- Note that e.g. Facebook's FBVector doesn't bother requesting usable_size, and it's specifically designed to work really well jemalloc. CC #27627 |
The "may have more cap than requested" stuff is IIRC a reservation for when we get custom local allocators, which strongly favour much more restricted allocation patterns. |
It's worth confirming these sorts of things, because this isn't true at all. From the jemalloc docs:
If you request a size that isn't one of these, jemalloc will round the request up to the next size class. The exact sizes depend on the jemalloc configuration, but this kind of steadily increasing spaces between size classes will always be present with jemalloc. FWIW, we have found it worthwhile to modify multiple vector-like classes in Firefox to maximize capacity in the way suggested by this PR. |
Awesome, thanks! (I did try to verify it but couldn't find anything) |
@nnethercote are you familiar with the reallocation strategies used by FBVector? They make claims of explicitly optimizing for jemalloc. When I looked into their impl it didn't seem to check usable size (but was optimized to work with the size classes to some extent). |
It sounds like we don't need exactly capacity == len, as long as jemalloc gets a capacity in the right size class, but we can't special case for only jemalloc in RawVec, we support several allocators. |
I don't know anything about FBVector beyond what https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md says. It's vague about how it integrates with jemalloc. |
It's probably best to keep discussion about this topic centralized, so if we're talking about whether or not the change at hand should be made let's do that on the issue and if we're talking about the technical details then let's go here instead. Just trying to reduce the number of threads we need to follow! |
I remember there being a claim that this doesn’t (or shouldn’t) really matter if allocation is a power-of-two, however e.g. |
Before actually coming up with this PR (which lazy me should update already!), I made a little testcase here: https://gist.github.com/Swatinem/34ac667b170a069b326d The results show that for strings the with smaller alignment, you get totally different usable sizes (npot) than for u32 for example. |
* adds a note to String::shrink_to_fit to match the same note on Vec regarding additional capacity * change asserts in docs to treat capacity as a lower bound
I just updated the PR as follows: I also removed the usage from |
The system allocator may give more memory than requested, usually rounding up to the next power-of-two. This PR uses a similar logic inside RawVec to be able to make use of this excess memory. NB: this is *not* used for `with_capacity` and `shrink_to_fit` pending discussion in rust-lang#29931
ff6e3e5
to
352ff77
Compare
Rounding up to power of two sounds like it defeats the purpose of the change: rust programs will use more memory. Using |
☔ The latest upstream changes (presumably #30148) made this pull request unmergeable. Please resolve the merge conflicts. |
This still seems to be relevant but has stalled. |
I would be glad to update the patch once again. I would just need someone with the authority to make the decision which approach (using usable_size or going with next pot) to go forward with. As a first step, just for the automatic growing. As I said, I would happily implement all that with a little guidance/mentoring. But to be honest, I don’t really have the patience for the bikeshedding involved, since this seems to be quite controversial. |
I don't think we can move forward without some sample workloads and empirical results. |
Closing due to inactivity, but feel free to resubmit with some numbers as @gankro requested! |
The system allocator may give more memory than requested, usually rounding up to the next power-of-two.
This PR uses a similar logic inside RawVec to be able to make use of this excess memory.
NB: this is not used for
with_capacity
andshrink_to_fit
pending discussion in #29931r? @gankro