Provide a means of turning iterators into fixed-size arrays #81615
Comments
Existing PRs #69985, #75644 and #79659. But it's great to have a place to consolidate discussion, since there are so many ways to skin this cat. Additional options.
|
This is an interesting alternative, and has precedent with
This alternative is mentioned above. The difficulty of handling the error case means that I wouldn't want this to be the only way to convert an iterator into an array, but I wouldn't be opposed to it existing alongside of another approach. Thinking about it, I suppose this approach could technically be considered equivalent to the previously-mentioned "let the user provide a [value]" approach, assuming that the user does something like
Can you elaborate on what this is? |
Something like

use std::mem::MaybeUninit;

struct ArrayVec<T, const N: usize> {
    len: usize,
    data: [MaybeUninit<T>; N],
}

impl<T, const N: usize> ArrayVec<T, N> {
    /// May panic.
    fn to_ary(self) -> [T; N] {
        // ...
        todo!()
    }
}

This has been proposed in other conversations as it would help in several places in the standard library where an

It's similar to returning a tuple of a partially initialized array and a usize that tells you how many members have been initialized, except that it's safe to use. With this we could do |
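To make the idea concrete, here is a minimal sketch of such a type on current stable Rust. It is not the design from the ArrayVec RFC or from any existing crate; the method names `try_push` and `into_array`, and the choice to silently ignore items beyond `N` in `from_iter`, are assumptions made for illustration.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Hypothetical fixed-capacity vector; only the first `len` slots are initialized.
struct ArrayVec<T, const N: usize> {
    len: usize,
    data: [MaybeUninit<T>; N],
}

impl<T, const N: usize> ArrayVec<T, N> {
    fn new() -> Self {
        ArrayVec { len: 0, data: std::array::from_fn(|_| MaybeUninit::uninit()) }
    }

    fn len(&self) -> usize {
        self.len
    }

    /// Hands the item back if the vector is already full.
    fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.len == N {
            return Err(item);
        }
        self.data[self.len].write(item);
        self.len += 1;
        Ok(())
    }

    /// Returns the array if all `N` slots are filled, otherwise gives `self` back.
    fn into_array(self) -> Result<[T; N], Self> {
        if self.len != N {
            return Err(self);
        }
        let this = ManuallyDrop::new(self);
        // SAFETY: `this` is never dropped, so the elements are moved out exactly once.
        let data = unsafe { std::ptr::read(&this.data) };
        // SAFETY: `len == N`, so every slot has been initialized.
        Ok(data.map(|slot| unsafe { slot.assume_init() }))
    }
}

impl<T, const N: usize> Drop for ArrayVec<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.data[..self.len] {
            // SAFETY: the first `len` slots are initialized.
            unsafe { slot.assume_init_drop() };
        }
    }
}

impl<T, const N: usize> FromIterator<T> for ArrayVec<T, N> {
    /// Infallible: takes at most `N` items (an assumption of this sketch).
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut vec = ArrayVec::new();
        for item in iter.into_iter().take(N) {
            // Cannot fail because of `take(N)`.
            let _ = vec.try_push(item);
        }
        vec
    }
}
```

With this shape, `.collect::<ArrayVec<T, N>>()` never fails, and the "was the iterator long enough?" question is deferred to `into_array`, which returns the partially filled vector on failure instead of dropping the items.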
#81382 (comment) suggests That is actually more general than using #79659 (comment) also suggests something similar, albeit as an iterator method instead of an iterator adapter.
I'm not sure if that's possible. Iterators are stateful: they can be partially exhausted without changing their type, so the remaining number of elements can't be determined at compile time. |
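A small illustration of that point, using `std::vec::IntoIter` as an arbitrary example (the helper name `advance` is made up for this sketch): the function's input and output have the same type, yet the number of remaining elements depends on a runtime value.

```rust
/// Hypothetical helper: consumes `steps` items and hands the iterator back.
/// The return type is identical to the argument type regardless of `steps`.
fn advance(mut iter: std::vec::IntoIter<i32>, steps: usize) -> std::vec::IntoIter<i32> {
    for _ in 0..steps {
        iter.next();
    }
    iter
}
```

Because `steps` is only known at runtime, nothing in the type `std::vec::IntoIter<i32>` can say how many items are left.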
Hm, I'm very intrigued by the suggestion of I've also added |
What about As for the return type, it is worth noting: Whereas in an array, if some elements are "gone" (e.g. when a None is returned from |
Probably
Well, that's just the return type question again, but this time for the associated type, i.e. If Or it could have any of the other discussed result types, e.g.
They're actually a bit different, both having advantages and disadvantages. If
So those should be considered as distinct options. |
I was mistaken about something in the original post; while it is impossible to impl |
That's a good point! Getting the rest from the adapter definitely avoids the problem of the complicated signature. The downside is that you can forget to look at it, as opposed to a Result, and also that this pattern does not play well with a for loop. |
There were some ideas in #80094, not sure how/if that would fit in with the current Iterators though... @leonardo-m 's comment:
|
I'm not entirely clear what that's suggesting, but it looks like they're asking for an iterator where the length is encoded in the type somehow, which means that the type of the iterator would need to change with every call to |
A somewhat tangential option would be to have a separate iterator source method on the collections. This has the benefit of guaranteed access to the size, so the iterator will know in advance how large the array can be, and the remainder of the elements will remain in the collection. Similar to
This doesn't solve the problem of turning arbitrary iterators into an array, but it directly lets you get arrays out of collections and process them. And it avoids the impedance mismatch of iterator methods such as |
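As a rough sketch of what a collection-side source could look like, here is a free function over `Vec`. The name `drain_array` is hypothetical and this is only one possible reading of the suggestion; it relies on the stable `Vec::drain` and `std::array::from_fn`.

```rust
/// Hypothetical helper: removes the first `N` elements of `vec` as an array.
/// Returns `None` and leaves `vec` untouched if it holds fewer than `N` elements.
fn drain_array<T, const N: usize>(vec: &mut Vec<T>) -> Option<[T; N]> {
    // The collection knows its own length, so the check happens up front.
    if vec.len() < N {
        return None;
    }
    let mut drained = vec.drain(..N);
    Some(std::array::from_fn(|_| {
        drained.next().expect("length was checked above")
    }))
}
```

Since the length is checked before anything is removed, there is no partially consumed state to hand back: either the array is produced, or the collection is unchanged.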
#82098 seeks to add a private helper to the stdlib and might serve as the building block for whatever API is decided upon. Some notes on its design:
|
In my codebase I've been using similar things for many months. So far I've settled on two functions:
The design space is wide enough that there are probably other reasonable designs (like returning Options instead of asserting, or truncating the iterable and ignoring extra items, etc.). But those two cover nearly all my use cases. (I am also using two more functions, a |
…ray, r=dtolnay

Add internal `collect_into_array[_unchecked]` to remove duplicate code

Unlike the similar PRs rust-lang#69985, rust-lang#75644 and rust-lang#79659, this PR only adds private functions and does not propose any new public API. The change is just for the purpose of avoiding duplicate code.

Many array methods already contained the same kind of code and there are still many array related methods to come (e.g. `Iterator::{chunks, map_windows, next_n, ...}`, `[T; N]::{cloned, copied, ...}`, ...) which all basically need this functionality. Writing custom `unsafe` code for each of those doesn't seem like a good idea. I added two functions in this PR (and not just the `unsafe` version) because I already know that I need the `Option`-returning version for `Iterator::map_windows`.

This is closely related to rust-lang#81615. I think that all options listed in that issue can be implemented using the function added in this PR. The only instance where `collect_into_array` might not be general enough is when the caller wants to handle incomplete arrays manually. Currently, if `iter` yields fewer than `N` items, `None` is returned and the already yielded items are dropped. But as this is just a private function, it can be made more general in future PRs.

And while this was not the goal, this seems to lead to better assembly for `array::map`: https://rust.godbolt.org/z/75qKTa (CC `@JulianKnodt`)

Let me know what you think :)

CC `@matklad` `@bstrie`
What about some kind of wrapper type, say My very rough attempt: playground |
...which would allow things like

let res: [u32; 2] = [1u32, 2, 3, 4]
    .into_iter_fixed()
    .zip([4, 3, 2, 1])
    .map(|(a, b)| a + b)
    .skip::<1>()
    .take::<2>()
    .collect();

keeping the length in the type system and removing the need to assert the length at runtime. Of course this cannot support things like

Are there many real use cases where those limitations would be problematic when collecting into collections of a fixed size? :) |
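A stripped-down sketch of such a wrapper is shown below. It is not the code from the linked playground; the names `IterFixed` and `from_array` are assumptions. It only covers `zip` and `map`, which leave the length unchanged. `skip::<K>()` and `take::<K>()` are left out because, as far as I know, expressing `N - K` in a return type is not possible with stable const generics.

```rust
/// Hypothetical wrapper: `inner` is guaranteed to yield at least `N` items.
/// The field is private so that the guarantee can only be established here.
struct IterFixed<I, const N: usize> {
    inner: I,
}

/// The only entry point: an array of length `N` yields exactly `N` items.
fn from_array<T, const N: usize>(arr: [T; N]) -> IterFixed<std::array::IntoIter<T, N>, N> {
    IterFixed { inner: IntoIterator::into_iter(arr) }
}

impl<I: Iterator, const N: usize> IterFixed<I, N> {
    /// Mapping does not change the number of items.
    fn map<U, F: FnMut(I::Item) -> U>(self, f: F) -> IterFixed<std::iter::Map<I, F>, N> {
        IterFixed { inner: self.inner.map(f) }
    }

    /// Both sides must have the same length `N`, enforced by the type.
    fn zip<J: Iterator>(self, other: IterFixed<J, N>) -> IterFixed<std::iter::Zip<I, J>, N> {
        IterFixed { inner: self.inner.zip(other.inner) }
    }

    /// No length check is surfaced to the caller: `N` is part of the type.
    fn collect(mut self) -> [I::Item; N] {
        std::array::from_fn(|_| {
            self.inner.next().expect("length is tracked in the type")
        })
    }
}
```

For example, `from_array([1u32, 2, 3, 4]).zip(from_array([4u32, 3, 2, 1])).map(|(a, b)| a + b).collect()` has type `[u32; 4]`, and zipping arrays of different lengths is a compile error rather than a runtime truncation.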
Hm, that's an intriguing possibility. On the other hand it doesn't actually solve the issue for users who didn't start out with a fixed-size array in the first place. Which is to say, even if we had such a size-aware array-only iterator, we'd probably still want one of the other solutions in this thread to fill in the gaps. I'd say your proposal is orthogonal to this issue and is worth being considered separately.
Agreed that having real-world use cases would be a big help here. |
There already is a backlink, but just to make it explicit: There's an RFC PR for ArrayVec. If that gets blessed this would at least settle a good chunk of the return type questions. |
Stabilize `impl From<[(K, V); N]> for HashMap` (and friends)

In addition to allowing HashMap to participate in Into/From conversion, this adds the long-requested ability to use constructor-like syntax for initializing a HashMap:

```rust
let map = HashMap::from([
    (1, 2),
    (3, 4),
    (5, 6)
]);
```

This addition is highly motivated by existing precedent, e.g. it is already possible to similarly construct a Vec from a fixed-size array:

```rust
let vec = Vec::from([1, 2, 3]);
```

...and it is already possible to collect a Vec of tuples into a HashMap (and vice-versa):

```rust
let vec = Vec::from([(1, 2)]);
let map: HashMap<_, _> = vec.into_iter().collect();
let vec: Vec<(_, _)> = map.into_iter().collect();
```

...and of course it is likewise possible to collect a fixed-size array of tuples into a HashMap ([but not vice-versa just yet](rust-lang#81615)):

```rust
let arr = [(1, 2)];
let map: HashMap<_, _> = std::array::IntoIter::new(arr).collect();
```

Therefore this addition seems like a no-brainer. As for any impl, this would be insta-stable.
I made a potential implementation of approach 4 that passes data back with any type that implements Of course, this is probably completely unnecessary once |
Now that min_const_generics is approaching there is a great deal of new support for working with fixed-sized arrays, such as `std::array::IntoIter`. But while converting from an array to an iterator is now well-supported, the reverse is lacking:

It is roundaboutly possible to do the conversion by going through Vec:

But that isn't no_std compatible, and even with std there should be no need to allocate here.
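The Vec round-trip described here can be sketched roughly as follows. The helper name `array_via_vec` is made up for illustration; the conversion itself uses the standard `TryFrom<Vec<T>> for [T; N]` impl, which succeeds only when the Vec holds exactly `N` elements.

```rust
use std::convert::TryInto;

/// Hypothetical helper: collects into a heap-allocated `Vec` first,
/// then converts the `Vec` into a fixed-size array.
fn array_via_vec<T, I, const N: usize>(iter: I) -> Option<[T; N]>
where
    I: IntoIterator<Item = T>,
{
    // This allocation is what the non-allocating approaches try to avoid.
    let items: Vec<T> = iter.into_iter().collect();
    // `try_into` returns the `Vec` as the error if its length is not exactly `N`.
    items.try_into().ok()
}
```

Note that this variant treats a too-long iterator as a failure as well, since the length has to match `N` exactly.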
The non-allocating way of converting the array presupposes a fair bit of familiarity with the stdlib, unsafe code, and unstable features:
...which suggests this is a prime candidate for stdlib inclusion.
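For a sense of what the non-allocating route involves, here is a sketch that works on current stable Rust. It is not the snippet from the original post: it relies on `std::array::from_fn`, `MaybeUninit::assume_init_drop` and `array::map`, which were stabilized later, and the names `Guard` and `array_from_iter` are assumptions.

```rust
use std::mem::MaybeUninit;

/// Drops the initialized prefix of `buf` if collection stops early or panics.
struct Guard<T, const N: usize> {
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> Drop for Guard<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.buf[..self.len] {
            // SAFETY: the first `len` slots were written and not yet moved out.
            unsafe { slot.assume_init_drop() };
        }
    }
}

/// Hypothetical helper: takes the first `N` items, or returns `None`
/// (dropping whatever was collected) if the iterator is too short.
fn array_from_iter<T, I, const N: usize>(iter: I) -> Option<[T; N]>
where
    I: IntoIterator<Item = T>,
{
    let mut guard: Guard<T, N> = Guard {
        buf: std::array::from_fn(|_| MaybeUninit::uninit()),
        len: 0,
    };
    let mut iter = iter.into_iter();
    while guard.len < N {
        // On `None`, `?` returns early and `guard` cleans up the partial result.
        let item = iter.next()?;
        guard.buf[guard.len].write(item);
        guard.len += 1;
    }
    // SAFETY: `guard` is forgotten right below, so the elements are moved out once.
    let buf = unsafe { std::ptr::read(&guard.buf) };
    std::mem::forget(guard);
    // SAFETY: the loop above initialized all `N` slots.
    Some(buf.map(|slot| unsafe { slot.assume_init() }))
}
```

Even in this reduced form the caller has to reason about partial initialization and panic safety, which supports the argument that this belongs in the standard library rather than in user code.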
The first problem is what the signature should be. There's no way to statically guarantee that an iterator is any given length, so (assuming that you don't want to panic) what is the return type when the iterator might be too short?

1. `-> [T; N]`: straightforward, but you'd have to have `T: Default`, which limits its usefulness. Furthermore this uses in-band signaling to mask what is probably an error (passing in a too-small iterator), which feels user-hostile. This seems little better than panicking.
2. `-> [Option<T>; N]`: the obvious solution to indicate potentially-missing data, but likely annoying to work with in the success case. And sadly the existing blanket impl that ordinarily allows you to `.collect` from a collection of options into an option of a collection can't be leveraged here because you still don't have an impl of `FromIterator<T> for [T; N]`. So if you actually want a `[T; N]` you're left with manually iterating over what was returned, which is to say, you're no better off than having the iterator you started out with!
3. `-> Option<[T; N]>`: the simplest solution that doesn't totally ignore errors. This would be consistent with `std::slice::array_windows` for producing None when a function cannot construct a fixed-size array due to a too-small iterator. However, it unfortunately seems quite a bit less recoverable in the failure case than the previous approach.
4. `-> Result<[T; N], U>`: same as the previous, though it's possible you could pass something useful back in the error slot. However, as long as the iterator is by-value and unless it's limited to ExactSizeIterators, it might be tricky to pass the data back.
5. `-> ArrayVec<T, N>`: this would be a new type designed to have infallible conversion via `FromIterator`. Actually extracting the fixed-size array would be done through APIs on this type, which avoids some problems in the next section.

IMO approaches 1 and 2 are non-starters.
The second problem is how to perform the conversion.

1. `FromIterator`: the obvious approach, however, it cannot be used with return type 3 from the prior section. This is because of the aforementioned blanket impl for collecting a collection-of-options into an option-of-collections, which conflicts with any attempt to impl `FromIterator<T> for Option<[T; N]>`. I think even specialization can't solve this?
2. `TryFrom`: theoretically you could forgo an impl of `FromIterator<T>` and instead impl `TryFrom<I: IntoIterator<Item=T>>`; hacky, but at least you're still using some standard conversion type. Sadly, #50133 ("Invalid collision with TryFrom implementation?") makes it impossible to actually write this impl; people claim that specialization could theoretically address that, but I don't know enough about the current implementation to know if it's sufficient for this case.
3. `TryFromIterator` trait. This would work, but this also implies a new `TryIntoIterator` trait and `Iterator::try_collect`. Likely the most principled approach.
4. `std::array::TryCollect` trait, impl'd for `I: IntoIterator`. Less principled than the prior approach but less machinery, if you happen to think `TryFromIterator` and `TryIntoIterator` wouldn't be broadly useful.
5. `std::array::from_iter` function. The simplest and least general approach. Less consistent with `.collect` than the previous approach, but broadly consistent with `std::array::IntoIter` (although that itself is considered a temporary stopgap until `.into_iter` works on arrays). Similarly, could be an inherent associated function impl'd on `[T; N]`.
6. `FromIterator<T> for [T; N]` to Just Work, possibly via introducing a new const-generics aware version of ExactSizeIterator. If this could be done it would unquestionably be the most promising way to proceed, but I don't have the faintest idea where to begin.
7. `Iterator::collect_array` as a hardcoded alternative to collect. Similarly, an Iterator-specific equivalent of `slice::array_chunks` could fill the same role while also being potentially more flexible.

Any other suggestions?
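To show how a dedicated iterator method in the spirit of `Iterator::collect_array` could feel at the call site, here is a sketch as an extension trait. The trait and method names (`TryCollectArray`, `try_collect_array`) are hypothetical and chosen so they do not collide with anything in the standard library; the `Option<[T; N]>` return type is just one of the candidates discussed above.

```rust
/// Hypothetical extension trait; not part of the standard library.
trait TryCollectArray: Iterator + Sized {
    /// Takes the first `N` items, or returns `None` if the iterator is too short.
    fn try_collect_array<const N: usize>(mut self) -> Option<[Self::Item; N]> {
        // Safe but not optimal: stage the items in an array of `Option`s.
        let mut slots: [Option<Self::Item>; N] = std::array::from_fn(|_| None);
        for slot in slots.iter_mut() {
            *slot = Some(self.next()?);
        }
        Some(slots.map(|slot| slot.expect("every slot was filled above")))
    }
}

// Blanket impl so the method is available on every iterator.
impl<I: Iterator> TryCollectArray for I {}
```

A call then reads `iter.try_collect_array::<3>()`, with the length given by turbofish or inferred from the binding's type, and the by-value receiver means leftover items are dropped together with the iterator.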