Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impl<T> IntoIterator for [T; $N] #32871

Closed
wants to merge 2 commits into from

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Apr 10, 2016

This allows an array to move its values out through iteration. I find
this especially handy for flat_map, to expand one item into several
without having to allocate a Vec, like one of the new tests:

fn test_iterator_flat_map() {
    assert!((0..5).flat_map(|i| [2 * i, 2 * i + 1]).eq(0..10));
}

Note the array must be moved for the iterator to own it, so you probably
don't want this for large T or very many items. But for small arrays,
it should be faster than bothering with a vector and the heap.

This allows an array to move its values out through iteration.  I find
this especially handy for `flat_map`, to expand one item into several
without having to allocate a `Vec`, like one of the new tests:

    fn test_iterator_flat_map() {
        assert!((0..5).flat_map(|i| [2 * i, 2 * i + 1]).eq(0..10));
    }

Note the array must be moved for the iterator to own it, so you probably
don't want this for large `T` or very many items.  But for small arrays,
it should be faster than bothering with a vector and the heap.
@rust-highfive
Copy link
Collaborator

r? @brson

(rust_highfive has picked a reviewer for you, use r? to override)

@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

In case it's questioned, I don't know why the PhantomData<T> is necessary. I thought the bound for FixedSizeArray<T> would suffice, but the compiler complained that T was unused.

Also, I'd love a more direct way than Option to hide the array from dropping, but this is the best way according to the nomicon.

@brson brson added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 11, 2016
@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

Hmm, I had only run libcoretest, sorry. The CI failure is from rustdoc, in a snippet like this:

["main", "search", /*...*/]].into_iter().map(|id| (String::from(*id), 1)).collect()

Before it got something like IntoIterator::into_iter(&[&str; N]), automatically referencing the array. Iterator::Item = &&str, so *id becomes &str and String::from is fine.

Now it gets to my new IntoIterator::into_iter([&str; N]), with Iterator::Item = &str, so we arrive at the error that *id (a str) is unsized for the call to String::from.

So, I suppose my addition is a breaking change. There are easy ways around it, in this case either change to .iter() or just ditch the *id dereference in favor of String::from(id). I'll make sure nothing else is directly broken, at least, but maybe we can't add this at all... :(

@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

Now some iter doctests are failing, all cases of array a.into_iter() expecting to have hit the slice's IntoIterator rather than on the array itself. All where a.iter() would do just fine.

So... I'll await a decision whether this break is even tenable before I try to fix any more, however trivial.

@huonw
Copy link
Member

huonw commented Apr 11, 2016

cc @rust-lang/libs and #25725.

Being able to move out of fixed sized arrays would definitely be nice. However, the libs team has not been keen on adding functionality to fixed sized arrays via the up-to-32 macro hack we have now, preferring to wait until we have a more structured way to handle all fixed sized arrays.

(Also, being a breaking change is unfortunate, although, in its favour, it is likely considered a "minor change" per the API evolution RFC.)

@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

Thanks. FWIW, the iterator itself is not in the macro hack here, just the initial IntoIterator. Actually, that's based on FixedSizeArray -- I wonder if specialization could allow a broad IntoIterator implementation for that trait?

let ndrop = cmp::min(n, len);
let slice = array.as_mut_slice();
for p in &mut slice[self.index..self.index + ndrop] {
unsafe { ptr::drop_in_place(p); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if one of these destructors panics? self.index probably needs to be updated either before the loop ("leak amplification") or inside the loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point! I think I prefer inside the loop, just to leak as little as possible, but I could go either way.

@huonw
Copy link
Member

huonw commented Apr 11, 2016

Yeah, I agree that the Iterator type itself seems good (modulo possibly needing some tweaks to ensure panic-safety), but the only way to create it is via IntoIterator hence the concern.

The existing specialisation scheme probably isn't quite enough, as I suspect we'd need the lattice version that allows handling overlapping blanket implementations (the impl for A: FixedSizedArray could theoretically overlap with I: Iterator).

}

// Prevent the array as a whole from dropping.
unsafe { ptr::write(&mut self.array, None); }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually has the same hazard you raised for nth -- if there's a panic in one of the drop_in_place calls, we won't get to clobber self.array. And AFAICT a panic in a drop function will still try to drop the struct members, or even locals if I were to take() the array out early.

I think this will need drop-prevention to be an additional wrapper on the array, to be unwound on its own.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arrayvec has experienced this hazard too, and it uses two drop flags for this reason (nested types).

@bluss
Copy link
Member

bluss commented Apr 11, 2016

I'll raise the one issue I know with Option (and the reason this doesn't use Option).

In Option<[X; N]>, the fields of X may be used for the enum layout optimization. It's not super problematic in this code, but it means that it will read from the 0th array element even after that location was moved out from. Is this problem free here? I guess it can be, as long as the compiler doesn't change its mind about what a moved-from location contains.

@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

@bluss interesting - so the concern is that once the 0th element is moved (or dropped in place), then that memory is sort of undefined, and the tag of the Option may be at risk. I'd be really surprised to see anything actually happen here, just from a pragmatic view, but I don't know enough to make any warranties. Was this a theoretical concern in your code, or did you actually encounter issues?

@cuviper
Copy link
Member Author

cuviper commented Apr 11, 2016

I feel less concerned about Option corruption from plain "moving" via ptr::read, because that uses a *const. From the compiler's perspective, the source should be unchanged and still valid memory. Dropping in place seems potentially dangerous though, so if we can't be sure of it then I'd just get rid of the nth and last overrides -- let them read out like a default Iterator.

@bluss
Copy link
Member

bluss commented Apr 11, 2016

In arrayvec it was a practical concern, since it used uninitialized or bitwise zeroed elements when empty.

Something that uses unsafe_no_drop_flag and sets itself to 0 even though it's "NonZero" would in fact be problematic here when dropped in place. (This is what zeroing drop used to do until it became filling drop).

@alexcrichton
Copy link
Member

Thanks for the PR @cuviper! The libs team discussed this during triage yesterday and out conclusion was that we don't want to merge this for now. This unfortunately needs to mention the type in the iterator rather than just the N (number of items), meaning the type of the returned iterator is IntoIter<T, [T; $N]>. In the "ideal world" where we have integer generics, the returned iterator would likely be IntoIter<T, N>, but it would be a breaking change to alter the type definition (if this were stable).

As a result this seems like something which would perhaps want to bake on crates.io first, so we decided to close this for now. We also thought that if impl Iterator were stable we could certainly do that as well, but unfortunately we don't quite have that just yet either!

@cuviper
Copy link
Member Author

cuviper commented Apr 15, 2016

That's a good point that the type signature would want to change between this and the ideal integer generics! So I understand holding off for now. I do have safety fixes for the issues raised so far, which I'll go push to my branch later just to have available. When integer generics finally come around, hopefully we can get back on this quickly. :)

I'll see about publishing something in a crate too. ArrayVec already serves well as an external solution, actually, but this would be a bit more minimal if moving iteration is all you need. Although it would have to reinvent or steal FixedSizeArray or arrayvec::Array too, so maybe not.

Is your impl Iterator comment regarding blanket specialization? Otherwise I lost you on that one...

@sfackler
Copy link
Member

impl Iterator refers to the Impl trait RFC which would allow us to provide these implementations without actually naming the Iterator type allowing it to be changed in the future.

@cuviper
Copy link
Member Author

cuviper commented Apr 15, 2016

I see, thanks!

@alexcrichton
Copy link
Member

Ah yeah sorry, the jargon is pretty cryptic :(

cuviper added a commit to cuviper/generic-array that referenced this pull request May 25, 2016
This allows an array to move its values out through iteration. I find
this especially handy for flat_map, to expand one item into several
without having to allocate a Vec, like the new test:

```rust
fn test_iter_flat_map() {
    assert!((0..5).flat_map(|i| arr![i32; 2 * i, 2 * i + 1]).eq(0..10));
}
```

This is a spiritual successor to rust-lang/rust#32871, which was
declined at least until it could use integer generics.  GenericArray
fills that gap, though, and could use IntoIterator today.
@cuviper
Copy link
Member Author

cuviper commented Sep 16, 2017

FWIW, I have rebased and updated my branch (comparison), and it should be simple to change it to IntoIter<T, N> once const generics are implemented (cc #44580).

@bluss
Copy link
Member

bluss commented Sep 16, 2017

@cuviper: Nice. But wow, those .into_iter() usages in the examples are worrying.

@cuviper
Copy link
Member Author

cuviper commented Sep 16, 2017

I'm tempted to go ahead and submit those s/into_iter()/iter()/ changes where slice iteration is intended, just to encourage better practice in the examples. But yeah, we'll surely need a crater run to evaluate how bad it is in the wild. At least I only had to make one change in non-example code.

@cuviper
Copy link
Member Author

cuviper commented Mar 13, 2018

@alexcrichton

Thanks for the PR @cuviper! The libs team discussed this during triage yesterday and out conclusion was that we don't want to merge this for now. This unfortunately needs to mention the type in the iterator rather than just the N (number of items), meaning the type of the returned iterator is IntoIter<T, [T; $N]>. In the "ideal world" where we have integer generics, the returned iterator would likely be IntoIter<T, N>, but it would be a breaking change to alter the type definition (if this were stable).

With RFC 2000 on the horizon, do you think we could try this again? We would still have to start with the less-satisfactory IntoIter<T, [T; $N]> type, but we can leave that type unstable with the clear plan to transition to IntoIter<T, N> before stabilization.

I'm tempted to go ahead and submit those s/into_iter()/iter()/ changes where slice iteration is intended, just to encourage better practice in the examples.

I want to put this together now, at least.

bors added a commit that referenced this pull request Mar 14, 2018
impl IntoIterator for arrays

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the `IntoIter<T, [T; $N]>` type is not something we would want to stabilize.  However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to `IntoIter<T, const N: usize>` before stabilization.

Adding the `impl IntoIterator` now will allows folks to go ahead and iterate arrays in stable code.  They just won't be able to name the `array::IntoIter` type or use its inherent `as_slice`/`as_mut_slice` methods until they've stabilized.

Quite a few iterator examples were already using `.into_iter()` on arrays, getting auto-deref'ed to the slice iterator.  These were easily fixed by calling `.iter()` instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this.  Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton
@RReverser
Copy link
Contributor

This unfortunately needs to mention the type in the iterator rather than just the N (number of items), meaning the type of the returned iterator is IntoIter<T, [T; $N]>. In the "ideal world" where we have integer generics, the returned iterator would likely be IntoIter<T, N>, but it would be a breaking change to alter the type definition (if this were stable).

@alexcrichton Re-reading this again, I wonder why do you think this is a problem? When const generics arrive, the IntoIter type could be simply generalised to IntoIter<T, [T; N]>. Such approach would allow implementing IntoIterator today without waiting for any new features, and then, when const generics arrive, upgrading it to the arbitrarily-sized version without breaking changes.

@alexcrichton
Copy link
Member

@RReverser we didn't want to commit to that type definition at the time. The type definition will want to change with const generics, and it didn't seem pressing enough to take the future ergonomic hit at the time. Times change though!

(also this PR is over 2 years old, naturally it's pretty far out of cache)

@cuviper
Copy link
Member Author

cuviper commented Oct 23, 2018

My newer PR was #49000, where I still wasn't able to update the type definition to const generics, but my hope was that we could just leave that part unstable for the time being. We could have usable array iterators, just not be able to name the iterator type in stable code.

We're now blocked on having a good migration story to fix conflicting "legacy" use of into_iter(), which could be answered by issue rust-lang/rust-clippy#1565. @kennytm now has PR rust-lang/rust-clippy#3344 for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants