More efficient Vec::reserve strategy #14264

stepancheg · 2014-05-17T20:23:27Z

Before this commit new capacity was always chosen as the next power of
two for requested capacity.

It is inefficient in scenarios where several elements are pushed to the
vector at once, and it is final operations. Example:

let v = Vec::new();
v.grow(10, false);
// do something with vector

With patch applied, new capacity is chosen as max of current capacity
doubled and requested capacity.

Before this patch resulting capacity of the vector from the example
above is 16. With patch applied, capacity is 10.

Similar strategy is used, for instance, to reserve capacity of a vector in
libc++ (1) (vector::__recommend()) or to reserve capacity of ArrayList
in OpenJDK (2) (ArrayList.grow()).

Patch also includes minor adjustment that does not affect reallocations:
capacity is compared to self.cap, not to self.len in reserve().

(1) http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/vector?view=markup
(2) http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/ArrayList.java

Before this commit new capacity was always chosen as the next power of two for requested capacity. It is inefficient in scenarios where several elements are pushed to the vector at once, and it is final operations. Example: ``` let v = Vec::new(); v.grow(10, false); // do something with vector ``` With patch applied, new capacity is chosen as max of current capacity doubled and requested capacity. Before this patch resulting capacity of the vector from the example above is 16. With patch applied, capacity is 10. Similar strategy is used, for instance, to reserve capacity of a vector in libc++ (1) (`vector::__recommend()`) or to reserve capacity of ArrayList in OpenJDK (2) (`ArrayList.grow()`). Patch also includes minor adjustment that does not affect reallocations: capacity is compared to `self.cap`, not to `self.len` in `reserve()`. (1) http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/vector?view=markup (2) http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/ArrayList.java

lilyball · 2014-05-17T21:45:58Z

This is only more efficient when you're doing a single mutation to the vector. If you push anything else to the vector afterwards, then it's markedly less efficient:

let v = Vec::new();
v.grow(10, false);
v.grow(6, true);

You cannot assume that any given operation that lengthens a vector is the only operation that's going to be applied. In fact, it's relatively common to append several collections to a vector in a row.

If you know you're only going to be doing the one mutation operation, you can use .reserve_exact() ahead of time to avoid the excess capacity. But libstd doesn't have enough knowledge to know when this is appropriate. Only clients of the library can make that call.

stepancheg · 2014-05-18T02:47:26Z

@kballard Consider a.push_all(b) Basically, there are two mutually exclusive cases:

a.len() <= b.len(). In this case patched reserve is as efficient as original reserve
statistically: for random sets of lengths of a and b. For example, patched reserve loses in grow(10); grow(6); and wins in grow(10); grow(7);
a.len() < b.len(). In this case patched reserve is clear winner.

Since probablility of case 2 is greater than zero, patched reserve is statistically better.

reserve_exact is not panacea, because reserve* are rarely used directly, but rather implicitly in push_all, push_all_move, append and similar operations.

lilyball · 2014-05-18T02:56:00Z

@stepancheg

Asymptotically? Vectors don't grow asymptotically. And the patched reserve doesn't win in grow(10); grow(7);. Both the original and the patched behavior has to reallocate on the grow(7), which means that it's at best a tie. Although the patched reserve will end up with a capacity of 20 and the original with a capacity of 32. In the general case, it's hard to say which is better (it largely depends on whether the vector will continue to grow).

I also disagree that the a.len() < b.len() case is a "clear winner". This is equivalent to the grow(7) case, in that both approaches will have to reallocate, but the patched reserve will end up with a smaller capacity. In fact, the patched reserve is very likely to be worse-performing here, because even appending a single element will require another reallocation (as it did an exact reserve).

Which is to say, once again, the patched reserve definitively loses if you have to grow the vector again.

But of course we're also missing the obvious reason why this is bad, which is that it's disregarding proper allocator behavior. Allocating powers of 2 is generally the best way to handle allocators (although I believe @thestinger has indicated that, for jemalloc, we should actually be using powers of 2 only once the allocation size is >=4096, and should be using a different scheme with IIRC a 1.5x multiplier below that).

stepancheg · 2014-05-18T03:29:35Z

@kballard I meant statistically.

Although the patched reserve will end up with a capacity of 20 and the original with a capacity of 32. In the general case, it's hard to say which is better (it largely depends on whether the vector will continue to grow).

I also disagree that the a.len() < b.len() case is a "clear winner". This is equivalent to the grow(7) case, in that both approaches will have to reallocate, but the patched reserve will end up with a smaller capacity. In fact, the patched reserve is very likely to be worse-performing here, because even appending a single element will require another reallocation (as it did an exact reserve).

If vector continues to grow, it is a tie (statistically).

Which is to say, once again, the patched reserve definitively loses if you have to grow the vector again.

No, it is not. It has the same number of reallocations and the same average capacity overhead. Patched realloc wins if last push pushes more than vec already contains.

But of course we're also missing the obvious reason why this is bad, which is that it's disregarding proper allocator behavior. Allocating powers of 2 is generally the best way to handle allocators (although I believe @thestinger has indicated that, for jemalloc, we should actually be using powers of 2 only once the allocation size is >=4096, and should be using a different scheme with IIRC a 1.5x multiplier below that).

I guess, it is quite the contrary: allocators use powers of two for small allocations. But it is not important right now, and you made a good point.

BTW, if rust's going to rely on implementation details of concrete allocator (jemalloc), I suppose it should query allocator for actual allocated memory size (something like, je_sallocx function) and store it in vector cap field. In this case, patched realloc will be even better.

Anyway, I think this patch is not very important, so I better close this PR.

thestinger · 2014-05-18T05:05:24Z

@kballard: The crossover point is actually far larger than 4096 bytes, but otherwise you're remembering correctly. Using two as the multiplier is the worst case at small sizes because it misses any chance to do in-place growth.

The `IoBufRead` diagnostic has been added during the latest rustup. changelog: none

stepancheg closed this May 18, 2014

flip1995 pushed a commit to flip1995/rust that referenced this pull request Feb 27, 2025

Remove obsolete comment and simplify code (rust-lang#14264)

732e329

The `IoBufRead` diagnostic has been added during the latest rustup. changelog: none

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More efficient Vec::reserve strategy #14264

More efficient Vec::reserve strategy #14264

stepancheg commented May 17, 2014

lilyball commented May 17, 2014

stepancheg commented May 18, 2014

lilyball commented May 18, 2014

stepancheg commented May 18, 2014

thestinger commented May 18, 2014

More efficient Vec::reserve strategy #14264

More efficient Vec::reserve strategy #14264

Conversation

stepancheg commented May 17, 2014

lilyball commented May 17, 2014

stepancheg commented May 18, 2014

lilyball commented May 18, 2014

stepancheg commented May 18, 2014

thestinger commented May 18, 2014