You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It's sometimes useful to convert a flattened Vec<T> into a chunked Vec<[T; N]>. There's Vec::into_flattened to go the other direction, but currently no function in Vec to efficiently un-flatten a vector.
Motivating examples or use cases
In graphics programming, Vec<[u16; 3]> is a natural representation of an index buffer of triangles (just like Vec<[f32; 3]> for 3D vertex buffers and Vec<[f32; 2]> for UVs), but data might get decoded from a file or received from an API as a flat Vec<u16>. It would be convenient to easily convert between flat and chunked forms, especially if the data needs to be passed to another API that requires an owned Vec in chunked form and doesn't accept iterators or slices.
Iterators and slices already have array_chunks, along with a bunch of other chunk-related functions, so a new Vec method would fit any use case where those chunked sequences need to be owned as well (or growable, unlike Box<[T]>). Any use case for Vec::into_flattened that may need reversing applies here too.
Solution sketch
Here is a potential Vec::into_chunks method that complements Vec::into_flattened. This version accepts any valid length and capacity and drops/reallocates as needed, but it could alternatively panic if the length is not divisible by N.
impl<T,A:Allocator>Vec<T,A>{/// Groups every `N` elements in the `Vec<T>` into chunks to produce a `Vec<[T; N]>`, dropping/// elements in the remainder. `N` must be greater than zero.////// If the capacity is not a multiple of the chunk size, the buffer will shrink down to the/// nearest multiple with a reallocation or deallocation.////// This function can be used to reverse [`Vec::into_flattened`].////// # Examples////// ```/// let vec = vec![0, 1, 2, 3, 4, 5, 6, 7];/// assert_eq!(vec.into_chunks::<3>(), [[0, 1, 2], [3, 4, 5]]);////// let vec = vec![0, 1, 2, 3];/// let chunks: Vec<[u8; 10]> = vec.into_chunks();/// assert!(chunks.is_empty());////// let flat = vec![0; 8 * 8 * 8];/// let reshaped: Vec<[[[u8; 8]; 8]; 8]> = flat.into_chunks().into_chunks().into_chunks();/// assert_eq!(reshaped.len(), 1);/// ```pubfninto_chunks<constN:usize>(mutself) -> Vec<[T;N],A>{const{assert!(N != 0,"chunk size should be greater than zero");}let(len, cap) = (self.len(),self.capacity());let len_remainder = len % N;if len_remainder != 0{self.truncate(len - len_remainder);}let cap_remainder = cap % N;if !T::IS_ZST && cap_remainder != 0{self.buf.shrink_to_fit(cap - cap_remainder);}let(ptr, _, _, alloc) = self.into_raw_parts_with_alloc();// SAFETY:// - `ptr` and `alloc` were just returned from `self.into_raw_parts_with_alloc()`// - `[T; N]` has the same alignment as `T`// - `size_of::<[T; N]>() * cap / N == size_of::<T>() * cap`// - `len / N <= cap / N` because `len <= cap`// - the allocated memory consists of `len / N` valid values of type `[T; N]`// - `cap / N` fits the size of the allocated memory after shrinkingunsafe{Vec::from_raw_parts_in(ptr.cast(), len / N, cap / N, alloc)}}}
I've seen conflicting information on whether align_of::<T>() is guaranteed to equal align_of::<[T; N]>()--see the unsafe code guidelines, also discussed here, vs the Rust reference. Other standard library functions assume it is though, so I think this is okay. And even though the from_raw_parts documentation says "capacity needs to be the capacity that the pointer was allocated with," from_raw_parts_in has more permissive language, and existing methods like into_flattened do the same kind of thing into_chunks is doing.
Alternatives
I think the best safe way to achieve this at the moment is Iterator::array_chunks:
vec.into_iter().array_chunks::<N>().collect()
This is admittedly pretty simple, and in many cases the optimized output is similar to what into_chunks generates, at least in avoiding unnecessary allocation or memcpy calls. However, there are still subtle performance downsides. The generated code is rather long and messy for non powers of 2: if you assert, for example, that vec.len() % 3 == 0 and vec.capacity() % 3 == 0 before calling array_chunks::<3>(), LLVM is apparently unable to recognize that there's no remainder and still does the extra checks for reallocation. And there's a loop in the optimized asm that counts chunks one by one to find where the remainder starts.
I did some rough benchmarking with Vec<f32>s of random capacities and lengths. Unsurprisingly, in debug mode into_chunks is usually something like 40x faster. The improvement does become negligible with heavier optimization (opt-level = 3, codegen-units = 1, etc), hovering consistently around 1.01-1.04x with my testing setup. Still, it's nice to have a guaranteed safe and efficient API that doesn't need to rely on uncertain optimizations.
Other alternatives include unsafe code or a crate like bytemuck.
Links and related work
The tracking issues for slice_flatten and array_chunks include discussion on chunking and flattening methods for iterators and slices. This issue has more on array_chunks.
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
We think this problem seems worth solving, and the standard library might be the right place to solve it.
We think that this probably doesn't belong in the standard library.
Second, if there's a concrete solution:
We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
The text was updated successfully, but these errors were encountered:
Proposal
Problem statement
It's sometimes useful to convert a flattened
Vec<T>
into a chunkedVec<[T; N]>
. There'sVec::into_flattened
to go the other direction, but currently no function inVec
to efficiently un-flatten a vector.Motivating examples or use cases
In graphics programming,
Vec<[u16; 3]>
is a natural representation of an index buffer of triangles (just likeVec<[f32; 3]>
for 3D vertex buffers andVec<[f32; 2]>
for UVs), but data might get decoded from a file or received from an API as a flatVec<u16>
. It would be convenient to easily convert between flat and chunked forms, especially if the data needs to be passed to another API that requires an ownedVec
in chunked form and doesn't accept iterators or slices.Iterators and slices already have
array_chunks
, along with a bunch of other chunk-related functions, so a newVec
method would fit any use case where those chunked sequences need to be owned as well (or growable, unlikeBox<[T]>
). Any use case forVec::into_flattened
that may need reversing applies here too.Solution sketch
Here is a potential
Vec::into_chunks
method that complementsVec::into_flattened
. This version accepts any valid length and capacity and drops/reallocates as needed, but it could alternatively panic if the length is not divisible byN
.I've seen conflicting information on whether
align_of::<T>()
is guaranteed to equalalign_of::<[T; N]>()
--see the unsafe code guidelines, also discussed here, vs the Rust reference. Other standard library functions assume it is though, so I think this is okay. And even though thefrom_raw_parts
documentation says "capacity
needs to be the capacity that the pointer was allocated with,"from_raw_parts_in
has more permissive language, and existing methods likeinto_flattened
do the same kind of thinginto_chunks
is doing.Alternatives
I think the best safe way to achieve this at the moment is
Iterator::array_chunks
:This is admittedly pretty simple, and in many cases the optimized output is similar to what
into_chunks
generates, at least in avoiding unnecessary allocation or memcpy calls. However, there are still subtle performance downsides. The generated code is rather long and messy for non powers of 2: if you assert, for example, thatvec.len() % 3 == 0
andvec.capacity() % 3 == 0
before callingarray_chunks::<3>()
, LLVM is apparently unable to recognize that there's no remainder and still does the extra checks for reallocation. And there's a loop in the optimized asm that counts chunks one by one to find where the remainder starts.I did some rough benchmarking with
Vec<f32>
s of random capacities and lengths. Unsurprisingly, in debug modeinto_chunks
is usually something like 40x faster. The improvement does become negligible with heavier optimization (opt-level = 3
,codegen-units = 1
, etc), hovering consistently around 1.01-1.04x with my testing setup. Still, it's nice to have a guaranteed safe and efficient API that doesn't need to rely on uncertain optimizations.Other alternatives include
unsafe
code or a crate likebytemuck
.Links and related work
The tracking issues for
slice_flatten
andarray_chunks
include discussion on chunking and flattening methods for iterators and slices. This issue has more onarray_chunks
.What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution:
The text was updated successfully, but these errors were encountered: