More efficient Vec::reserve strategy #14264
Before this commit, the new capacity was always chosen as the next power of two at or above the requested capacity. That is inefficient when several elements are pushed to the vector at once and that push is the final operation on it. Example:

```
let mut v = Vec::new();
v.grow(10, false);
// do something with vector
```

With the patch applied, the new capacity is chosen as the maximum of the current capacity doubled and the requested capacity. Before this patch, the resulting capacity of the vector in the example above is 16. With the patch applied, the capacity is 10.

A similar strategy is used, for instance, to reserve capacity of a vector in libc++ (1) (`vector::__recommend()`) or of an ArrayList in OpenJDK (2) (`ArrayList.grow()`).

The patch also includes a minor adjustment that does not affect reallocations: in `reserve()`, the capacity is compared to `self.cap`, not to `self.len`.

(1) http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/vector?view=markup
(2) http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/ArrayList.java
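The two capacity rules described above can be modeled as plain functions. This is only a sketch under stated assumptions: `old_capacity` and `patched_capacity` are hypothetical names rather than functions from the patch, and the model ignores element size and allocator behavior.

```rust
// Hypothetical model of the strategy described as "before this commit":
// round the requested capacity up to the next power of two.
fn old_capacity(current: usize, requested: usize) -> usize {
    if requested <= current {
        return current; // enough room already, no reallocation
    }
    requested.next_power_of_two()
}

// Hypothetical model of the patched strategy:
// the larger of (current capacity doubled) and (requested capacity).
fn patched_capacity(current: usize, requested: usize) -> usize {
    if requested <= current {
        return current; // enough room already, no reallocation
    }
    std::cmp::max(current.saturating_mul(2), requested)
}

fn main() {
    // Empty vector, 10 elements requested, as in the example above.
    println!("old:     {}", old_capacity(0, 10)); // 16
    println!("patched: {}", patched_capacity(0, 10)); // 10
}
```

For an empty vector and a request of 10, the model reproduces the capacities quoted in the description: 16 under the old rule and 10 under the patched one.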
This is only more efficient when you're doing a single mutation to the vector. If you push anything else to the vector afterwards, then it's markedly less efficient:

```
let mut v = Vec::new();
v.grow(10, false);
v.grow(6, true);
```

You cannot assume that any given operation that lengthens a vector is the only operation that's going to be applied. In fact, it's relatively common to append several collections to a vector in a row. If you know you're only going to be doing the one mutation operation, you can use |
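The cost of the two-step example can be checked with a small simulation. It is a sketch only: `simulate`, `next_pow2`, and `doubled_or_requested` are hypothetical helpers that model the two rules from the PR description, and every capacity change (including the first allocation) is counted as one allocation.

```rust
// Replays a sequence of `grow` sizes against a capacity strategy and
// returns (number of allocations, final capacity).
fn simulate(strategy: fn(usize, usize) -> usize, grows: &[usize]) -> (usize, usize) {
    let (mut len, mut cap, mut allocs) = (0usize, 0usize, 0usize);
    for &n in grows {
        len += n;
        if len > cap {
            cap = strategy(cap, len);
            allocs += 1;
        }
    }
    (allocs, cap)
}

// Assumed model of the old rule: next power of two for the requested capacity.
fn next_pow2(_cap: usize, requested: usize) -> usize {
    requested.next_power_of_two()
}

// Assumed model of the patched rule: max of doubled capacity and request.
fn doubled_or_requested(cap: usize, requested: usize) -> usize {
    std::cmp::max(cap.saturating_mul(2), requested)
}

fn main() {
    let grows = [10, 6]; // grow by 10, then by 6
    println!("old:     {:?}", simulate(next_pow2, &grows)); // (1, 16)
    println!("patched: {:?}", simulate(doubled_or_requested, &grows)); // (2, 20)
}
```

Under these assumptions the old rule allocates once and ends at capacity 16, while the patched rule allocates twice and ends at capacity 20, which is the extra cost the comment describes.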
@kballard Consider
Since probability of case 2 is greater than zero, patched
|
Asymptotically? Vectors don't grow asymptotically. And the patched I also disagree that the Which is to say, once again, the patched But of course we're also missing the obvious reason why this is bad, which is that it's disregarding proper allocator behavior. Allocating powers of 2 is generally the best way to handle allocators (although I believe @thestinger has indicated that, for jemalloc, we should actually be using powers of 2 only once the allocation size is >=4096, and should be using a different scheme with IIRC a 1.5x multiplier below that). |
@kballard I meant statistically.
If the vector continues to grow, it is a tie (statistically).
No, it is not. It has the same number of reallocations and the same average capacity overhead. Patched
I guess it is quite the contrary: allocators use powers of two for small allocations. But it is not important right now, and you made a good point. BTW, if Rust is going to rely on implementation details of a concrete allocator (jemalloc), I suppose it should query the allocator for the actual allocated memory size (something like, Anyway, I think this patch is not very important, so I'd better close this PR. |
@kballard: The crossover point is actually far larger than 4096 bytes, but otherwise you're remembering correctly. Using two as the multiplier is the worst case at small sizes because it misses any chance to do in-place growth. |