Skip to content
This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

Rationale for "Passing the protocol" #122

Closed
conartist6 opened this issue Dec 22, 2020 · 21 comments · Fixed by #194
Closed

Rationale for "Passing the protocol" #122

conartist6 opened this issue Dec 22, 2020 · 21 comments · Fixed by #194

Comments

@conartist6
Copy link

conartist6 commented Dec 22, 2020

The details page says "There are number of decisions which could be made differently. This document attempts to catalog them along with the rationales for the choices currently made." It does not however offer any rationales.

I'm particularly interested in the rationale behind the "passing the protocol" choice made for this API. I think of the iterator spec as having been designed for separate usages: coroutines and data transformation. In data transformation being able to give arguments to next is useless. Control flows only from the data.

So why are we transforming coroutines? If I understand we're basically talking about being able to write this line in a generator: const arg = yield result. Taken this way it seems to indicate some feedback from the consumer about the result yielded. But this only really makes sense to me if you know the consumer got the result you gave. The generator won't know if it's being mapped, so how could it be sensible for the consumer (who won't know what the generator yielded prior to the map operation) to give feedback? filter is even more of a head-scratcher as the generator of values won't know whether the feedback given was about the value it just emitted or about the last value it emitted that passed the filterer predicate (which it doesn't know exists). In addition, these operations (and drop and take) can be combined, with the effect that while a value input to the iterator chain will come out somewhere, it would seem impossible to reason about where, and especially not without knowing the whole chain of transformations inside the coroutine, in which case they should just be expressed using normal control flow logic: function calls instead of map and if statements instead of filter.

I propose removing this functionality, particularly the updated definition of IteratorNext

@conartist6
Copy link
Author

I'm also unsure about the meaning of throw. In iter-tools I propagate return up the iterator chain (without its argument), but never throw -- throw essentially acts as an alias for return. Honestly I've considered removing throw many times, but since it's present on all generators it feels like an unnecessary "gotcha" for it not to exist. But what would propagating throw up a transformation chain mean? If the underlying source is data, there will be no alternative way to proceed. The data doesn't change because a consumer doesn't understand it. So again we're talking about inserting transformations between coroutines, which has the problems I've mentioned above...

@jorendorff
Copy link
Collaborator

jorendorff commented Jun 16, 2021

(6 months later) Hi! Thanks for this comment. The broad strokes of this decision were agreed on at committee. Some details may change.

Let's try and puzzle out what the rationale really is. If it seems good, or at least better than nothing, we can add it to the details page—maybe even a NOTE in the specification.

Before I start, let me just say: I 100% sympathize with your points here. To me, like you, iterators are simple and they're about data, while generators are... well, it's a complex and subtle API, and it's not super clear what it's about. The coroutine protocol requires some extra cooperation between the coroutine and its user, to be useful—unlike iteration, where the iterator and the loop that consumes it are often completely independent.

For now I only have time to dispense with the easiest parts of what you're raising here. I'll have to continue later.

  • The distinction we both see between iterator methods and generator methods does not exist in the ECMAScript spec. .next with an argument, .return, and .throw are all features of the Iterator interface. They are optional, but that doesn't mean this proposal can simply decide not to support them. Rather, when there is a sensible way to support them, the standard committee will insist on supporting them. This is true throughout the language, and consistent support of already-established features does seem like a good policy (when I set aside whether I like those features).

  • The case for propagating .return(), without the argument, the way iter-tools does, is quite strong. The return method makes sense as part of the Iterator interface even outside the context of generators/coroutines.

    return is a cleanup method. When .return() is called on an iterator helper, I think it's clear what should happen: it should clean up any iterators it's using. For us, this means propagating .return() to the source iterator.

    It also means that the iter.map(f) iterator helper should call iter.return() if the function f throws.

    It also means flatMap should call .return() on both the inner and the outer iterator.

    I think all of this follows from a careful examination of how the spec already handles iterators in other contexts. For example: when does the language itself call .return() on an iterator? It happens when you break or return from a for-of loop, or at the end of array destructuring assignment ([a, b] = iter;). Certainly it is unreasonable for the language to silently drop this cleanup signal only with iterator helpers.

  • I think map and flatMap actually do make sense as coroutine transformations. If I think of a coroutine as a conversation with two participants that strictly take turns, map transforms the messages in one direction; and flatMap does the same but stringing together a series of participants on that end of the conversation. To the extent that coroutines are useful at all, this seems reasonable to me. I think a case could be made for take and asIndexedPairs, too.

    But that is actually 4 of the 6 helpers that return iterators! The only ones that are definitely nonsense, to me, are filter and drop.

I hope this helps. It has been a while since I looked at this proposal, but I think it's good and @ystartsev and I hope to get it moving again.

@jorendorff
Copy link
Collaborator

I should have started with this: I think the proposal has been largely rewritten since that section of DETAILS.md was written. I'm not sure the proposal still forwards throw, although I'm sure helper.throw(err) throws err, at least, rather than behaving exactly like helper.return().

I'll have to check the details.

@devsnek
Copy link
Member

devsnek commented Jun 16, 2021

The proposal is being rewritten on top of generator primitives, so it still "passes the protocol" or so to speak, the same way generators do.

@conartist6
Copy link
Author

conartist6 commented Jun 16, 2021

Thanks for the reply. I pretty much agree with what you're saying. iter-tools guarantees (as a library-wide invariant) that return calls will propagate back to the source through all possible transformations. And generally that behavior comes for free, just so long as I'm using generators and for .. of loops.

You also note that the spec may be ahead of the markdown, so I'm trying to piece together what the spec really says.

All it really says is that throw will be defined by the individual iterator, and that each helper whose implementation is defined is "a built-in generator function". As near as I can tell that implementation is compliant with the spec. I note that the spec sort of defines throw by also saying e.g. "%Iterator.prototype%.map is a built-in generator function".

@conartist6
Copy link
Author

conartist6 commented Jun 16, 2021

The other part of this is return values from yield expressions. The spec for map uses lastValue to ensure that next(yieldReturn) propagates the specified value to yield. But my argument is this: if map is to forward such yield-return values, it must logically provide an API for the yield return data to be mapped, otherwise it is a leaky abstraction.

Here's my simple example: You write a coroutine generator, which is to say a generator whose progression may be influenced by return values. Here's a simple one that yields numbers but lets you skip some:

function * range() {
  for (let i = 0;; i++) {
    const nexti = yield i;
    if (nexti !== undefined) i = nexti;
  }
}

Now I map and use our coroutine powers to tweak iteration as we go along.

function *() {
  const iter = range().map(i => -i);
  
  let i = iter.next().value;
  while(true) yield i = iter.next(i - 1).value;
}

But of course this code is broken. Instead of causing the range generator to skip every other value, we instead cause it to always yield its initial value.

In my mind there are two options. First you can just not try to pass yield returns through map. The code can then be changed to respect the API as defined by the real coroutine, and the map step just becomes inlined.

function *() {
  const iter = range();
  
  let i = -iter.next().value;
  while(true) yield i = iter.next(i + 1).value;
}

The second option would be to modify the map api to allow specification of a reciprocal mapper, i.e. a transformation for yield return values:

function *() {
  const iter = range().map(i => -i, i => -i); // note the second callback: the yield return transformer
  
  let i = iter.next().value;
  while(true) yield i = iter.next(i - 1).value;
}

@conartist6
Copy link
Author

Actually better I think would be to combine both suggestions. map would take one argument and discard yield return values. mapCoroutine could be a new helper that would take two arguments and apply the specified mapper to yield return values. This I think has several benefits:

  • The implementation of map is simpler and thus faster.
  • The map function follows the principle of least surprise. 99% of the time when a programmer sees .map it means a unidirectional data transformation function of one argument. This expectation is met.
  • In corollary mapCoroutine calls out that something particularly interesting is happening, which is good because it is. It also helps novice programmers who may never have learned about coroutines find information about the kinds of patterns they are looking at.
  • Consistency is enhanced. Only some of the methods described really support passing yield return values back through a chain of operations to the source. A user who sees a mapCoroutine method but no filterCoroutine method should understand this right away, or if they don't they're a simple "why is there no filterCoroutine" search away from understanding.
  • Better support for making coroutine changes. Say that after this proposal is accepted someone makes a new method. Initially it is not thought to need to support coroutine usage, but then the community makes a case that coroutine support is necessary. If coroutine support is overloaded into a single method, now feature detection would be needed to identify environments with partial support and polyfill.

@jorendorff
Copy link
Collaborator

Regarding .throw() and .return(), I think we pretty much agree, and I just have a few cleanup tasks for myself.

  • The test property you linked -- that all iterators are closed -- is valuable. I want to check two things:

    • The standard should have a note or rationale mentioning this property: all ES language and standard library uses of iterators are meant to comply, but it's not obvious, and user code can of course forget to call .return().

    • Check that the proposal's flatMap satisfies this property, when there are two iterators to close, even if one of the .return() methods throws.

  • Now that I've had time to remember it, I'm about 98% sure .throw(exc) on an iterator helper should call .return() on any upstream iterators it's using, then throw exc (unless a return method already threw). I think the proposal implements this: its use of "Yield" implies this behavior.

    • Check that, by reading up on "Yield" in the current standard.

    • See if we can/should add a NOTE to the proposal.

@domenic
Copy link
Member

domenic commented Aug 31, 2021

Hey, I don't fully understand the issue here about passing return/throw, but I wanted to make sure my use case was supported even if that was dropped.

The use case is stream async iterators, where return() on the iterator will cancel the stream. That is,

for await (const chunk of stream) {
  break;
}

// now stream is canceled

Can we be assured that this property will be preserved even when iterator helpers are involved? I.e. the following works?

for await (const chunk of stream.values().map(...).filter(...)) {
  break;
}

// now stream is canceled

@codehag
Copy link
Collaborator

codehag commented Aug 31, 2021

from @bakkot :

as a concrete example, calling .return on the iterator for a whatwg ReadableStream will close it, which can release e.g. network resources. if for await (let chunk of stream[Symbol.asyncIterator]().map(x => x) break; does not close the stream, that seems bad.

Effectively the same concern as @domenic but the change in wording made the concern more concrete for me.

@codehag
Copy link
Collaborator

codehag commented Aug 31, 2021

Another concern here from @hax

deiter (double ended iterators) rely on param in next(param). So if iterator helpers pass the param then let [a, ..., b] = [1,2,3].values().map(x => x*x) just works (a will be 1, b will be 9)

@conartist6
Copy link
Author

conartist6 commented Aug 31, 2021

Yes, absolutely.

This is about something different. You might not know this but generators allow you to do tricky things like value = yield because the underlying iterator protocol is bidirectional, i.e. it allows you to pass such a value as next(value). This issue is about how helpers handle that rare case, usually used when writing coroutines.

Also if you haven't, check out iter-tools because it's the only library I'm aware of that can transform sync or async iterables and guarantees (100% test coverage) that return() will be called in a semantically correct way. But yes, the helpers do and will continue to do that as well.

@jorendorff
Copy link
Collaborator

@domenic Your use case will definitely continue to work. Everyone agrees that propagating .return() is important.

@conartist6
Copy link
Author

@codehag @hax can you say more about deiter? It sounds like a conversation that is relevant to this issue, but I can't tell what you're saying. Why would double ended iterators rely on inverted control?

@jorendorff
Copy link
Collaborator

jorendorff commented Aug 31, 2021

@conartist6 This refers to a new proposal that is not yet accepted. See @tc39/proposal-deiter.

The proposal uses iter.next('back') as syntax for getting the next value from the back of the iterated sequence.

Passing this "option" argument through iterator helpers would make sense for .map and even .filter. (The existence of double-ended iterators would also suggest additional helper methods—offhand, .reverse(), .takeFromBack(n), .dropFromBack(n).)

@codehag
Copy link
Collaborator

codehag commented Aug 31, 2021

The proposal uses iter.next('back') as syntax for getting the next value from the back of the iterated sequence.

I think that in this case, the better solution would not be a value passed to next, but possibly a new method?

@conartist6
Copy link
Author

conartist6 commented Aug 31, 2021

Yes I already found that but I can find no corroboration for the statement "deiter (double ended iterators) rely on param in next(param)". Edit: OK found it here

Reversing an iterator is not the same thing as reversing the direction of data flow, that is to say that even if you take the values in your iterable in reverse, you're still going from data => iterator/transform/consumer.

@conartist6
Copy link
Author

Also I am strongly opposed to double ended iterators, but I guess I better go over there and explain why, or what I mean by that.

@hax
Copy link
Member

hax commented Jan 22, 2022

the better solution would not be a value passed to next, but possibly a new method?

@codehag It's possible to use a new method, but consider how generator works, if we want to support writing deiter via generators and keep compatibility with single direction iterator, we need to pass param to next() anyway, so new method would just a wrapper of next('back').

PS. Rust deiter actually use a separate method named next_back() but they have the trouble of how implement a deiter via generators because rust next() don't have extra param. There are some ideas but I feel none better than next(param) + function.sent-like (aka. "passing the protocol" as the title of this issue 😊 ) solution.

@hax
Copy link
Member

hax commented Jul 4, 2022

Note, as tc39/proposal-deiter#11 , if deiter proposal move to separate nextLast() method instead of next("last") (the term was changed from "back" to "last" because "back" make people easy confused with bidirectional), deiter proposal will not rely on "passing protocol" semantic. But that also means all iterator helpers need to upgraded in the deiter proposal or future proposals.

Actually it's trivial to make iterator helpers work for deiter, but I found that speccing them by "Abstract Closure" are even harder than the traditional ways.

@bakkot
Copy link
Collaborator

bakkot commented Jul 7, 2022

@michaelficarra and I spent a good while thinking about this and ultimately came to basically the same conclusion as the OP: iterator helpers are necessarily going to break many uses of the non-iteration parts of the generator protocol, so they shouldn't even try to preserve those parts. In #194 we removed the implementations of .throw and updated all the helpers so that they do not forward arguments from .next and .return.

.return still exists and will close any open iterators held by the helpers (there may be two in the case of flatMap), since that's part of the iterator protocol as used by for of. So e.g. #122 (comment) will still work as expected.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants