Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impl<A, B> IntoIterator for (A, B) as Zip #78204

Closed
wants to merge 4 commits into from

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Oct 21, 2020

This makes it a little easier to zip iterators:

for (x, y) in (xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}

You can iterate (&mut xs, &ys) for the conventional iter_mut() and
iter(), respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary zip chains:

for ((x, y), z) in ((xs, ys), zs) {}
for (x, (y, z)) in (xs, (ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}

Only tuple pairs are implemented for now. It's possible to implement
longer tuples, but that would require a new iterator type to produce
those tuple items. See itertools' Zip and rayon's MultiZip for
prior art there.

See also rust-lang/rfcs#870 and rust-lang/rfcs#930.

This makes it a little easier to `zip` iterators:

```rust
for (x, y) in (xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```

You can iterate `(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:

```rust
for ((x, y), z) in ((xs, ys), zs) {}
for (x, (y, z)) in (xs, (ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```

Only tuple pairs are implemented for now. It's possible to implement
longer tuples, but that would require a new iterator type to produce
those tuple items. See itertools' [`Zip`] and rayon's [`MultiZip`] for
prior art there.

[`Zip`]: https://docs.rs/itertools/0.9.0/itertools/structs/struct.Zip.html
[`MultiZip`]: https://docs.rs/rayon/1.4.1/rayon/iter/struct.MultiZip.html

See also rust-lang/rfcs#870 and rust-lang/rfcs#930.
@rust-highfive
Copy link
Collaborator

r? @estebank

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 21, 2020
@LeSeulArtichaut LeSeulArtichaut added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Oct 21, 2020
@cuviper
Copy link
Member Author

cuviper commented Oct 21, 2020

I may have gone overboard with example conversions throughout the compiler, but I can back any of that out if so.

cc @yaahc @scottmcm @nikomatsakis from various conversations.

@cuviper
Copy link
Member Author

cuviper commented Oct 21, 2020

One other note: in rayon, we also did the parallel equivalent of IntoIterator for &(A, B) and &mut (A, B), which just convert references to A and B as IntoIterator. I tried adding that here, and while it compiles in core, the test crate fails:

error[E0275]: overflow evaluating the requirement `&_: IntoIterator`
   --> library/test/src/lib.rs:266:21
    |
266 |         for test in &timed_out {
    |                     ^^^^^^^^^^
    |
    = help: consider adding a `#![recursion_limit="256"]` attribute to your crate (`test`)
    = note: required because of the requirements on the impl of `IntoIterator` for `&(_, _)`
    = note: required because of the requirements on the impl of `IntoIterator` for `&((_, _), _)`
    = note: required because of the requirements on the impl of `IntoIterator` for `&(((_, _), _), _)`
[...many more...]

That timed_out is just a Vec, but that relies on inference from the return type, so I suppose the search for IntoIterator types is getting lost in tuple candidates. Maybe that's a compiler bug?

fn get_timed_out_tests(running_tests: &mut TestMap) -> Vec<TestDesc> {
let now = Instant::now();
let timed_out = running_tests
.iter()
.filter_map(|(desc, timeout)| if &now >= timeout { Some(desc.clone()) } else { None })
.collect();
for test in &timed_out {
running_tests.remove(test);
}
timed_out
};

I hope that won't ever be an issue for consuming tuples by value, per this PR, but I don't know...

@scottmcm scottmcm added the needs-fcp This change is insta-stable, so needs a completed FCP to proceed. label Oct 21, 2020
@scottmcm
Copy link
Member

I may have gone overboard with example conversions throughout the compiler, but I can back any of that out if so.

Consider restricting it to ones where it's particularly good at showing its value, because this will need an FCP and you probably want to avoid the 10+ days of merge conflicts.

Nit: these ones distracted me as I was scrolling,
image
thinking "wait, if it was .iter() before, shouldn't it be (param_tys, &args) now?" I'm sure that with CI passing the changes work fine, but I don't know if that distraction or potential weird perf impacts (like carrying a vector in the iterator instead of 2 pointers) are worth having those additional examples.

@cuviper
Copy link
Member Author

cuviper commented Oct 21, 2020

Consider restricting it to ones where it's particularly good at showing its value,

Do you see any that are particularly compelling in your opinion?

thinking "wait, if it was .iter() before, shouldn't it be (param_tys, &args) now?" I'm sure that with CI passing the changes work fine, but I don't know if that distraction or potential weird perf impacts (like carrying a vector in the iterator instead of 2 pointers) are worth having those additional examples.

It's actually no difference there because args has type &[&Value] -- it could have been .zip(args) already.
I tried not to make any semantic changes, so the perf should be unchanged overall.

@cuviper
Copy link
Member Author

cuviper commented Oct 21, 2020

Maybe I should back it up to just the library/ changes for now, as those are relatively few, and then I can try to land the broader compiler changes separately once this is accepted.

@scottmcm
Copy link
Member

Do you see any that are particularly compelling in your opinion?

No idea 🙃 There's also some value in volume to show prevalence...

I guess one I really liked was
image
since rustfmt seems to really accentuate the asymmetry of the zip form.

That said, the niceness there does have the objection of "well, use Iterator::zip(modules.iter(), names.iter()) then".

it could have been .zip(args) already.

Ah, cool. I didn't look into what the types actually were.

@nikomatsakis
Copy link
Contributor

As I expressed before, I'm a fan of this change. I really like being able to write for (a, b) in (as, bs) and I am always sad that I cannot.

@estebank
Copy link
Contributor

I really like this change, but am slightly concerned about the insta-stabilization. r=me on the code, but defer to @rust-lang/libs

@rfcbot fcp merge

@nagisa
Copy link
Member

nagisa commented Oct 23, 2020

While I'm a fan of this change, I'm worried that there's another interpretation to (a, b).into_iter() – iterate-yield a and then b. This may end up being confusing to some people who are not already aware with how this construct operates in Rust. This is especially relevant because that's the behaviour you get in e.g. Python.

Further consideration on programmer's part may reveal that the interpretation is not actually valid in general, because Rust iterators are not polymorphic like they are in e.g. Python. Nevertheless I think the overlap is big between the people who may initially assume the Python-like behaviour and people who are unlikely to come up to a similar conclusion.

@SimonSapin
Copy link
Contributor

I'm worried that there's another interpretation to (a, b).into_iter() – iterate-yield a and then b.

That would only be possible if a and b have the same type, which is not necessarily the case for tuples.

This may end up being confusing to some people who are not already aware with how this construct operates in Rust. This is especially relevant because that's the behaviour you get in e.g. Python.

Dynamic typing + iteration means that Python tuples are pretty much interchangeable with lists (except for being shallow-immutable). I feel that’s not particularly a good thing we should try to emulate, especially with static typing.

@leonardo-m
Copy link

One one hand I have found lot of places in my code where this idea improves my code, shortening it and making it less noisy.

On the other hand I agree with the Python Zen point that explicit is better than implicit. This idea introduces implicit parallel iteration, while a "zip" makes it explicit that we're having a parallel iteration. Implicit operations are handy but make the code less readable, so better to keep implicitness to only the most common operations and only where it doesn't cost too much (see Scala Implicits. Very powerful but they also cause troubles).

So in a cost/advantages table I don't know if it's a good idea to introduce this idea.

@LukasKalbertodt
Copy link
Member

While I'm a fan of this change, I'm worried that there's another interpretation to (a, b).into_iter()

Another potential interpretation is the Cartesian product of the two iterators. Something like for (x, y) in (0..3, 0..3) { ... } being equivalent to two nested loops seems plausible and is certainly something people regularly need.

So while my initial reaction was ":tada: that's a lot nicer than a.zip(b)", given that the syntax could easily be interpreted to do something else, I think I am against this change.

@clarfonthey
Copy link
Contributor

Maybe there could be a method (xs, ys).zip() instead? That way it's still easier to write than xs.zip(ys) or Iterator::zip(xs, ys).

@cuviper
Copy link
Member Author

cuviper commented Oct 25, 2020

itertools has a free function zip too, calling into_iter on both and then zipping. In some ways that's better because zip(xs, ys) immediately gives an iterator, vs. (xs, ys).into_iter(), which helps when followed by something like enumerate().

So if the consensus is for something more explicit, we could instead add std::iter::zip. It's not as convenient since you have to import it, unless we also add it to the prelude, but it's an option.

An inherent tuple method zip() is interesting too, and wouldn't have to mess with imports or the prelude.

@bors
Copy link
Contributor

bors commented Oct 25, 2020

☔ The latest upstream changes (presumably #78334) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@nikomatsakis
Copy link
Contributor

I wasn't aware of Python's interpretation. That seems like a negative.

The cartesian product interpretation doesn't worry me as much because (ime) it is used far less frequently than zip.

That said I agree that a free function or inherent tuple method might make for an interesting choice, although the need to import it is certainly a bit of a drag. But writing std::iter::zip(a, b) in place is also not that terrible, actually. (I've found that ever since the module changes in Rust 2018, I am much less reluctant to write out full paths...)

@withoutboats
Copy link
Contributor

A free function named zip might be a better alternative, but I'll also note that if a user expects (a, b) to iterate over these two iterators in turn, its very unlikely their program will not type error, because they will not write code expecting an iterator of (Item, Item). We could also try to provide a diagnostic hint to explain that type error maybe.

@cuviper
Copy link
Member Author

cuviper commented Oct 26, 2020

Just to add a little more context, I am also interested in adding the reverse, an unzip-like operation with FromIterator and Extend for (A, B), as discussed in this zulip thread. We've already done this in rayon for ParallelExtend, and we will probably go ahead with FromParallelIterator too. I think there is a stronger case for these in rayon, since parallel implementations are more complicated to do otherwise, but they would still be handy in std.

I thought IntoIterator would be the easier part to start with here, being a little less "magical" than effects on collect types, but maybe this potential zip/unzip duality for (A, B) will sway people one way or the other.

@nagisa
Copy link
Member

nagisa commented Oct 26, 2020

Yeah, I just wanted to point out a potential difference with how this is approached by other languages, but I'm otherwise hoping to see this change happen regardless.

@SimonSapin
Copy link
Contributor

@cuviper
Copy link
Member Author

cuviper commented Oct 26, 2020

@SimonSapin Yes we have unzip, but FromIterator would let you compose with other kinds of collecting, like a short-circuiting unzip via collect::<Result<(A, B), E>>(). Extend would let you effect an unzip into existing collections, and you could also nest tuples to approximate wider unzips, like let ((a, b), c) = iter.unzip() -- the (a, b) part would use Extend.

@cuviper
Copy link
Member Author

cuviper commented Mar 8, 2021

RE: #78204 (comment)

A free zip function might be a good alternative.

See #82917 -- just zipping a simple pair, no fancy variadics...

@rfcbot
Copy link

rfcbot commented Mar 12, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Mar 12, 2021
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Mar 22, 2021
@rfcbot
Copy link

rfcbot commented Mar 22, 2021

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

@rfcbot rfcbot added the to-announce Announce this issue on triage meeting label Mar 22, 2021
@m-ou-se m-ou-se closed this Mar 22, 2021
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Mar 25, 2021
cuviper added a commit to cuviper/rust that referenced this pull request Mar 26, 2021
This makes it a little easier to `zip` iterators:

```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```

You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:

```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```

It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:

```rust
    iter::zip(
        trait_ref.substs.types().skip(1),
        impl_trait_ref.substs.types().skip(1),
    )
    // vs.
    trait_ref
        .substs
        .types()
        .skip(1)
        .zip(impl_trait_ref.substs.types().skip(1))
```

This replaces the tuple-pair `IntoIterator` in rust-lang#78204.
There is prior art for the utility of this in [`itertools::zip`].

[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Mar 27, 2021
Add function core::iter::zip

This makes it a little easier to `zip` iterators:

```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```

You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:

```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```

It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:

```rust
    iter::zip(
        trait_ref.substs.types().skip(1),
        impl_trait_ref.substs.types().skip(1),
    )
    // vs.
    trait_ref
        .substs
        .types()
        .skip(1)
        .zip(impl_trait_ref.substs.types().skip(1))
```

This replaces the tuple-pair `IntoIterator` in rust-lang#78204.
There is prior art for the utility of this in [`itertools::zip`].

[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
flip1995 pushed a commit to flip1995/rust that referenced this pull request Apr 8, 2021
Add function core::iter::zip

This makes it a little easier to `zip` iterators:

```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```

You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:

```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```

It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:

```rust
    iter::zip(
        trait_ref.substs.types().skip(1),
        impl_trait_ref.substs.types().skip(1),
    )
    // vs.
    trait_ref
        .substs
        .types()
        .skip(1)
        .zip(impl_trait_ref.substs.types().skip(1))
```

This replaces the tuple-pair `IntoIterator` in rust-lang#78204.
There is prior art for the utility of this in [`itertools::zip`].

[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
@cuviper cuviper deleted the iter-pairs branch October 15, 2022 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. finished-final-comment-period The final comment period is finished for this PR / Issue. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.