You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Auto merge of #37270 - Mark-Simulacrum:smallvec-optimized-arenas, r=eddyb
Add ArrayVec and AccumulateVec to reduce heap allocations during interning of slices
Updates `mk_tup`, `mk_type_list`, and `mk_substs` to allow interning directly from iterators. The previous PR, #37220, changed some of the calls to pass a borrowed slice from `Vec` instead of directly passing the iterator, and these changes further optimize that to avoid the allocation entirely.
This change yields 50% less malloc calls in [some cases](https://pastebin.mozilla.org/8921686). It also yields decent, though not amazing, performance improvements:
```
futures-rs-test 4.091s vs 4.021s --> 1.017x faster (variance: 1.004x, 1.004x)
helloworld 0.219s vs 0.220s --> 0.993x faster (variance: 1.010x, 1.018x)
html5ever-2016- 3.805s vs 3.736s --> 1.018x faster (variance: 1.003x, 1.009x)
hyper.0.5.0 4.609s vs 4.571s --> 1.008x faster (variance: 1.015x, 1.017x)
inflate-0.1.0 3.864s vs 3.883s --> 0.995x faster (variance: 1.232x, 1.005x)
issue-32062-equ 0.309s vs 0.299s --> 1.033x faster (variance: 1.014x, 1.003x)
issue-32278-big 1.614s vs 1.594s --> 1.013x faster (variance: 1.007x, 1.004x)
jld-day15-parse 1.390s vs 1.326s --> 1.049x faster (variance: 1.006x, 1.009x)
piston-image-0. 10.930s vs 10.675s --> 1.024x faster (variance: 1.006x, 1.010x)
reddit-stress 2.302s vs 2.261s --> 1.019x faster (variance: 1.010x, 1.026x)
regex.0.1.30 2.250s vs 2.240s --> 1.005x faster (variance: 1.087x, 1.011x)
rust-encoding-0 1.895s vs 1.887s --> 1.005x faster (variance: 1.005x, 1.018x)
syntex-0.42.2 29.045s vs 28.663s --> 1.013x faster (variance: 1.004x, 1.006x)
syntex-0.42.2-i 13.925s vs 13.868s --> 1.004x faster (variance: 1.022x, 1.007x)
```
We implement a small-size optimized vector, intended to be used primarily for collection of presumed to be short iterators. This vector cannot be "upsized/reallocated" into a heap-allocated vector, since that would require (slow) branching logic, but during the initial collection from an iterator heap-allocation is possible.
We make the new `AccumulateVec` and `ArrayVec` generic over implementors of the `Array` trait, of which there is currently one, `[T; 8]`. In the future, this is likely to expand to other values of N.
Huge thanks to @nnethercote for collecting the performance and other statistics mentioned above.
0 commit comments