-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spec: add range over int, range over func #61405
Comments
Discussion Summary / FAQThis comment summarizes the discussion so far and answers frequently asked questions. Last update: July 23, 2023 Can I try running my own programs?Absolutely. The easiest way right now is to use
We will look into making range-over-int and range-over-func usable in a special mode on the Go playground as well. Can you provide more motivation for range over functions?The most recent motivation is the addition of generics, which we expect will lead to custom containers such as ordered maps, and it would be good for those custom containers to work well with range loops. Another equally good motivation is to provide a better answer for the many functions in the standard library that collect a sequence of results and return the whole thing as a slice. If the results can be generated one at a time, then a representation that allows iterating over them scales better than returning an entire slice. We do not have a standard signature for functions that represent this iteration. Adding support for functions in range would both define a standard signature and provide a real benefit that would encourage its use. For example, here are a few functions from the standard library that return slices but probably merit forms that return iterators instead:
There are also functions we were reluctant to provide in slices form that probably deserve to be added in iterator form. For example, there should be a strings.Lines(text) that iterates over the lines in a text. Similarly, iteration over lines in a bufio.Reader or bufio.Scanner is possible, but you have to know the pattern, and the pattern is different for those two and tends to be different for each type. Establishing a standard way to express iteration will help converge the many different approaches that exist today. For additional motivation for iterators, see #54245. For additional motivation specifically for range over functions, see #56413. Can you provide more motivation for range over integers?3-clause for loops have a long and venerable history, and they are very flexible, but the majority of the time, they are used to count from 0 through N-1 with no complications. When you stop and think about that, a 3-clause for loop requires a lot of machinery and ceremony just to count from 0 through N-1. As @seebs noted, there are many ways to complicate a 3-clause for loop, some of them not terribly obvious. Here are a few:
It helps readability of programs if the trivial counting form is conventionally written for i := range N instead of having to examine the loop header (and body!) carefully to make sure it really is the trivial counting form. Later, @danderson made the point perfectly:
Why not use a library function for range over ints?That would be another approach, but counting from 0 through N-1 is incredibly common. If it required importing a separate package to get the range form, probably most code would keep using the 3-clause form. We want to do better. See also the quoted comment from @danderson in the previous answer. Why not add a new syntax for more sophisticated integer ranges?More sophisticated integer ranges are much less common than 0 through N-1. For those, the balance tips more easily toward custom functions that can be imported and called when necessary. For example, a very common 3-clause loop in some code bases is counting from N-1 down through 0, to iterate backward over a slice. We could provide that countdown as a function, or we could specialize to slices, with something like
The ability to do this customization for these cases makes custom functions much more attractive. But lots of code just needs to count up, and that should be as easy as possible. Why not use range over an interface instead of range over functions?Today, range makes the decision of how to iterate based only on the underlying type of the range variable, not its methods. In fact, no syntax in Go ever makes use of specific methods. The closest exception is the predefined error interface, but even that never has its Error method called by any Go syntax. People often ask for operator methods, and we have avoided doing that for the same reason: we don't want to privilege specific methods. It is of course possible to create a language structured around such methods (Python is one example), but Go is not designed that way. Nothing in the Go language invokes specific method names. A second design reason for the decision is that it's more difficult to construct implementations of an interface than it is to just write a function literal. A function like slices.Backward above would have to define a special backwardIterator object holding no state just to have a special method. Or some package would need to define a func-based implementation like http.HandlerFunc. It's much easier just to use functions directly and not worry about needing a different type for each kind of iterator. There is also a technical incompatibility in using interfaces. If a custom map type (say) defined the special iteration method, then ranging over it in current Go would do a regular map iteration, while ranging over it in a future Go would invoke the special iteration method. That would be a breaking change. Perhaps a desired change, but still a breaking one. That concern is secondary to the basic design principles though. There is more discussion about the downsides of range over interfaces in #54245. The switch to range over functions was meant to address that directly. What is a simple example of how range over function runs?Sure. Consider the Backwards function in the previous answer. It can be invoked as:
This program would translate inside the compiler to a program more like:
The "return true" at the end of the body is the implicit "continue" at the end of the loop body. An explicit continue would translate to "return true" as well. A break statement would translate to "return false" instead. Other control structures are more complicated but still possible. How are more complicated loops implemented?Beyond simple break and continue, other control flow (labeled break, continue, goto out of the loop, return) requires setting a variable that the code outside the loop can consult when the loop breaks. For example a "return" might turn into something like "doReturn = true; return false" where the "return false" is the "break" implementation, and then when the loop finishes, other generated code would do "if(doReturn) return". The full rewrite is explained at the top of cmd/compile/internal/rangefunc/rewrite.go in the implementation. What if the iterator function ignores yield returning false?We'll get there. Keep scrolling down. Why are yield functions limited to at most two arguments?There has to be a limit; otherwise people file bugs against the compiler when it rejects ridiculous programs. If we were designing in a vacuum, perhaps we would say it was unlimited but that implementations only had to allow up to 1000, or something like that. However, we are not designing in a vacuum: go/ast and go/parser exist, and they can only represent and parse up to two range values. We clearly need to support two values to simulate existing range usages. If it were important to support three or more values, we could change those packages, but there does not appear to be a terribly strong reason to support three or more, so the simplest choice is to stop at two and leave those packages unchanged. If we find a strong reason for more in the future, we can revisit that limit. Another reason to stop at two is to have a more limited number of function signatures for general code to define. Standard library changes are explicitly out of scope for this proposal, but having only three signatures means a package can easily define names for all three, like perhaps:
Why are standard library changes out of scope for this proposal?This proposal is limited to the language change. If we add standard library changes to the scope, it will be that much harder to converge. There will be other proposals for the standard library. Probably they will include some kind of useful iterator definitions, as well as iterator-returning forms of the functions mentioned earlier, like strings.Split and regexp.Regexp.FindAll. For now, though, please limit discussion to the language change. We will comment on this issue with pointers to library change proposals when they are posted. We are still working through some details in the draft. What will idiomatic APIs with range functions look like?We don't know yet, and that's really part of the eventual standard library proposal. However, you could imagine a container like a binary tree implementing an All method that returns an iterator:
We would like to establish a name like that, probably All, as the default "all items" iterator. Specific containers might provide others as well. Maybe a list would provide backward iteration too:
These examples are meant to show that the library can be written in a way that should make these kinds of functions readable and understandable. Please don't focus on the exact details. Again, that will be the focus of other proposals, which we will link here when they are ready. Will Go programs using range over functions be readable?We think they can be. For example using slices.Backward instead of the explicit count-down loop should be easier to understand, especially for developers who don't see count-down loops every day and have to think carefully through the boundary conditions to make sure they are correct. It is true that the possibility of range over a function means that when you see range x, if you don't know what x is, you don't know exactly what code it will run or how efficient it will be. But slice and map iteration are already fairly different as far as the code that runs and how fast it is, not to mention channels. Ordinary function calls have this problem too - in general we have no idea what the called function will do - and yet we find ways to write readable, understandable code, and even to build an intuition for performance. The same will certainly happen with range over functions. We will build up useful patterns over time, and people will recognize the most common iterators and know what they do. What is the bool result from the iterator function for?(TL;DR: It is a mistake that hasn't been deleted from the proposal yet.) The bool result from the iterator function indicates whether the iterator finished. If yield returns false, the iterator should stop and return false. Otherwise it should return true. The intent was that the bool result would make iterator composition easier, but it looks like the only operation it helps is concatenating iterators, as in:
But this function could just as easily be written:
So the bool result does not seen entirely justified. A second example of composition is iterating a binary tree:
Having the boolean result avoids writing a second helper function, because All is the iterator (as opposed to returning the iterator). However, writing programs using these functions, it quickly becomes confusing that sometimes parens are needed to obtain the iterator (range slices.Backward(x)) and sometimes they are not (range t.All). It seems less confusing if every method used in a range statement returns iterators instead of being iterators. It's also more readable, since these methods can explicitly return iter.Seq[V] in their public API instead of spelling out the iterator function signature. That means t.All should return an iterator, not be an iterator, which requires the helper function anyway, and the helper can use a bool even if the standard signature does not:
When All was itself an iterator (as opposed to returning an iterator), there was a useful pun on the name All in that it could equivalently be defined as:
This is like Python's All. However, if All returns an iterator, the opportunity for the useful pun goes away, and so does whatever small nudge it provided for a bool result. Since all three reasons for a bool result from the iterator don't look terribly compelling, we should probably remove it. The proposal is not yet updated to reflect that. What if the iterator function ignores yield returning false?The current prototype does not notice, because the added check was causing problems with inlining, and we wanted to demonstrate that the performance of these range-over-func loops could be competitive with hand-written loops. For the real implementation, we will make sure the check happens, with a panic on misuse, without noticeably hurting performance. What if the iterator function saves yield and calls it after returning?The current prototype does not notice. Again, the real implementation will. What if the iterator function calls yield on a different goroutine?This is an open question that is not yet decided. We have gone back and forth about whether this should be allowed. In general Go tries very hard to avoid exposing any notion of goroutine identity, and it would be strange to define that yield must be called on same goroutine that the function iterator was invoked on. It would be stranger still to notice and panic, since that would provide a way to create functions that only run on a single goroutine. Even so, maybe it is worth doing. What if the iterator function calls yield on multiple goroutines simultaneously?This is an open question that is not yet decided. In general that would cause a race, and running the race detector would catch the problem. However, if the loop body does not write to any shared state and does not execute any special control flow other than continue (no break, no return, no goto), that code probably would not have any races. It is unclear whether we should disallow that kind of use (probably) and whether we can catch it efficiently (perhaps not). If we did decide that yield should only be called on its original goroutine, that would also eliminate the possibility of use on multiple goroutines simultaneously. Can we write a vet check for misuse of a yield function?We have not looked into this, but almost certainly yes. With the runtime checks, however, it may not be necessary. The initial version probably will not include this check. But maybe. What do stack traces look like in the loop body?The loop body is called from the iterator function, which is called from the function in which the loop body appears. The stack trace will show that reality. This will be important for debugging iterators, aligning with stack traces in debuggers, and so on. Why are the semantics not exactly like if the iterator function ran in a coroutine or goroutine?Running the iterator in a separate coroutine or goroutine is more expensive and harder to debug than having everything on one stack. Since we're going to have everything on one stack, that fact will change certain visible details. We just saw the first: stack traces show the calling function and the iterator function interleaved, as well as showing the explicit yield function that does not exist on the page in the program. It can be helpful to think about running the iterator function in its own coroutine or gorotine as an analogy or mental model, but in some cases the mental model doesn't give the best answer, because it uses two stacks, and the real implementation is defined to use one. What happens if the loop body defers a call? Or if the iterator function defers a call?If a range-over-func loop body defers a call, it runs when the outer function containing the loop returns, just as it would for any other kind of range loop. That is, the semantics of defer do not depend on what kind of value is being ranged over. It would be quite confusing if they did. That kind of dependence seems unworkable from a design perspective. Some people have proposed disallowing defer in a range-over-func loop body, but this would be a semantic change based on the kind of value being ranged over and seems similarly unworkable. The loop body's defer runs exactly when it looks like it would if you didn't know anything special was happening in range-over-func. If an iterator function defers a call, the call runs when the iterator function returns. The iterator function returns when it runs out of values or is told to stop by the loop body (because the loop body hit a "break" statement which translated to "return false"). This is exactly what you want for most iterator functions. For example an iterator that returns lines from a file can open the file, defer closing the file, and then yield lines. The iterator function's defer runs exactly when it looks like it would if you didn't know the function was being used in a range loop at all. This pair of answers can mean the calls run in a different time order than the defer statements executed, and here the goroutine analogy is useful. Think of the main function running in one goroutine and the iterator running in another, sending values over a channel. In that case, the defers can run in a different order than they were created because the iterator returns before the outer function does, even if the outer function loop body defers a call after the iterator does. What happens if the loop body panics? Or if the iterator function panics?The deferred calls run in the same order for panic that they would in an ordinary return: first the calls deferred by the iterator, then the calls deferred by the loop body and attached to the outer function. It would be very surprising if ordinary returns and panics ran deferred calls in different orders. Again there is an analogy to having the iterator in its own goroutine. If before the loop started the main function deferred a cleanup of the iterator, then a panic in the loop body would run the deferred cleanup call, which would switch over to the iterator, run its deferred calls, and then switch back to continue the panic on the main goroutine. That's the same order the deferred calls run in an ordinary iterator, even without the extra goroutine. See this comment for a more detailed rationale for these defer and panic semantics. What happens if the iterator function recovers a panic in the loop body?This is an open question that is not yet decided. If an iterator recovers a panic from the loop body, the current prototype allows it to invoke yield again and have the loop keep executing. This is a difference from the goroutine analogy. Perhaps it is a mistake, but if so it is a difficult one to correct efficiently. We continue to look into this. Can range over a function perform as well as hand-written loops?Yes. Consider the slices.Backward example again, which first translates to:
The compiler can recognize that slices.Backward is trivial and inline it, producing:
Then it can recognize a function literal being immediately called and inline that:
Then it can devirtualize yield:
Then it can inline that func literal:
From that point the SSA backend can see through all the unnecessary variables and treats that code the same as
This looks like a fair amount of work, but it only runs for simple bodies and simple iterators, below the inlining threshold, so the work involved is small. For more complex bodies or iterators, the overhead of the function calls is insignificant. |
Change https://go.dev/cl/510540 mentions this issue: |
Change https://go.dev/cl/510541 mentions this issue: |
Change https://go.dev/cl/510539 mentions this issue: |
Change https://go.dev/cl/510538 mentions this issue: |
Change https://go.dev/cl/510536 mentions this issue: |
Change https://go.dev/cl/510537 mentions this issue: |
Change https://go.dev/cl/510535 mentions this issue: |
The func part seems like it'd be easier to reason about with a usage example showing the body of a function that would work here. If I've understood this correctly:
would print 0, 1, 2, 3, 4, and 5, then the call to yield(6) would return false and the loop would terminate? Is there a particular reason to cap functions at two arguments, rather than just taking arbitrary functions which return bool? |
@seebs, yes, you're right about the rangeTen example. If we cap the number of variables at two, then no changes are required in go/ast or go/parser, and libraries that want to support iteration functions know the 3 signatures they need to support. |
Hmm. So, basically,
So... does the Basically, what happens if the user is holding it wrong? Thinking about it more:
Does this print |
The proposed spec change says I read that to mean: if I'm writing a library that provides a map-ish data structure, and I want to allow the consumer to do all of Is that correct and the way you'd expect library authors to work with this language feature? Or are we going to end up having to provide |
I, too, would find it easier to understand the proposal with more examples. Perhaps you can give some examples of the code you're referring to with:
and
Do you think that the implementation of this feature will be able to be faster than 1 full function call per iteration? |
@seebs In your example: func rangeTen(yield func(int) bool) {
for i := range 10 {
if !yield(i) { return }
}
fmt.Println("got here")
}
...
for i := range rangeTen {
if i == 6 { break }
fmt.Println(i)
} Did you mean to use |
... yes. |
I was just writing some custom collection types last week and had no good solutions for dealing with iteration over members, so this proposal is timely! The ability to use a function with
Or, more concretely, what exactly would the For the sake of example I'm thinking about a hypothetical "set of T" type; let's call it As a method for name := range s.Range {
// ...
} ...or as a function in for name := range sets.Range(s) {
// ...
} Is Edit: when I first wrote this I had a thinko and for some reason wrote Rust-style Edit edit: A later rsc comment implicitly answers this question without directly answering it. 😀 From that comment I infer that:
for name := range s.All {
// ...
} That does seem pretty reasonable to me, though intentionally not including anything in the name representing "iterator" or "range" does mean that someone looking at the godoc for a package will presumably just need to be familiar with the pattern that any function which takes a yield function as its only argument is intended for use with |
In that case, this prints 0, 1, 2, 3, 4, 5, "got here". (You can test this with the draft implementation. Although that seems to require The "got here" will execute if the |
It would be great if there were a playground for this. Building a custom tool chain isn't too hard but it's fairly involved if it's not something you've done before and even if you have it's a bit much if you want to just try 2 or 3 small things. |
This seems like a possibly-surprising thing, because I think my current intuition is that if I'm doing a loop, and i execute But then I'm uncertain what I expect to happen in the |
Agreed a playground would be nice, but FWIW, it is pretty straightforward to try out a CL just by using the gotip wrapper utility, which will install a parallel Go toolchain in $HOME/sdk/gotip (without impacting your main Go toolchain). To try out these changes with the CL Russ suggested:
|
Yes. The |
Yes. |
There are two different loops here, and they should be considered separately.
What is new here is that if the And, yes, if we |
Here is a possibly illustrative (but untested) example. // Tree is a binary tree.
type Tree[E any] struct {
val E
left, right *Tree
}
// All may be used in a for/range loop to iterate
// through all the values of the tree.
// This implementation does an in-order traversal.
func (t *Tree[E]) All(yield func(E) bool) {
t.doAll(yield)
}
// doAll is a helper for All, to make it easier
// to know when the iteration stopped in a subtree.
func (t *Tree[E]) doAll(yield func(E) bool) bool {
if t == nil {
return true
}
return t.left.doAll(yield) && yield(t.val) && t.right.doAll(yield)
} Here is an example of using that method to sum all the values in a tree. // SumTree returns the sum of all the elements in t.
func SumTree(t *Tree[int]) int {
s := 0
for v := range t.All {
s += v
}
return s
} |
Hmm. Okay, so, the actual implementation of Perhaps it should be specified what, if anything, happens if you call |
The proposal should specify what happens if the yield function escapes and then is called. |
@rsc I built the toolchain from your branch as per your instructions (using make.bash) but it doesn't work.
range.go package main
import (
"fmt"
)
func main() {
for i := range 5 {
fmt.Println(i)
}
} What am I missing? |
I successfully compiled using gotip download 510541 and output the expected results. package main
import (
"fmt"
)
func main() {
for i := range 5 {
fmt.Println(i)
}
} My go version output go version devel go1.21-3bf145578f Mon Jul 17 17:18:06 2023 -0400 windows/amd64. |
Is this code https://go.dev/play/p/ikPyKMASpKv?v=gotip spec compliant? Is not using |
@DmitriyMV Looking at the code, I think it should fail, but when I run it with all the checks enabled (in a CL that has not yet landed) it does not, nor does it tweak the race detector when I try that. As a general rule, an iterator that runs loop bodies in parallel is likely to fail, because the misuse detector assumes serial execution. There are shared state and checking variables that both trigger races and probably have "wrong values" because of concurrent updates. The reason for this restrictive assumption is that supporting parallel execution would generate questions that we don't want to figure out answers to, and would almost add so much overhead that the normal, serial, case would be come uninteresting. Parallel execution is simply not the use case for this. (In a language with a much more exotic type system I can imagine this happening, but Go is not that language, and I'm not sure one like that even exists outside of research.) |
Modify rangefunc #next protocol to make it more robust Extra-terrible nests of rangefunc iterators caused the prior implementation to misbehave non-locally (in outer loops). Add more rangefunc exit flag tests, parallel and tricky This tests the assertion that a rangefunc iterator running in parallel can trigger the race detector if any of the parallel goroutines attempts an early exit. It also verifies that if everything else is carefully written, that it does NOT trigger the race detector if all the parts run time completion. Another test tries to rerun a yield function within a loop, so that any per-line shared checking would be fooled. Added all the use-of-body/yield-function checking. These checks handle pathological cases that would cause rangefunc for loops to behave in surprising ways (compared to "regular" for loops). For example, a rangefunc iterator might defer-recover a panic thrown in the syntactic body of a loop; this notices the fault and panics with an explanation Modified closure naming to ID rangefunc bodies Add a "-range<N>" suffix to the name of any closure generated for a rangefunc loop body, as provided in Alessandro Arzilli's CL (which is merged into this one). Fix return values for panicky range functions This removes the delayed implementation of "return x" by ensuring that return values (in rangefunc-return-containing functions) always have names and translating the "return x" into "#rv1 = x" where #rv1 is the synthesized name of the first result. Updates #61405. Change-Id: I933299ecce04ceabcf1c0c2de8e610b2ecd1cfd8 Reviewed-on: https://go-review.googlesource.com/c/go/+/584596 Reviewed-by: Matthew Dempsky <mdempsky@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Tim King <taking@google.com>
@dr2chase the reason I'm asking this is that calls to trasformed I understand that this approach is also quite fragile because if any iterator in the chain between |
Change https://go.dev/cl/589397 mentions this issue: |
For #61405. Fixes #65237. Change-Id: Ia7820c0ef089c828ea7ed3d2802c5185c945290e Reviewed-on: https://go-review.googlesource.com/c/go/+/589397 TryBot-Bypass: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://go.dev/cl/590756 mentions this issue: |
The language change for the accepted range-over-func proposal #61405 was documented in CL 590616. Remove the corresponding 'TODO' entry. Also improve formatting slightly, and switch to preferred relative links. They'll work better in the long term and in more contexts. While here, also simplify the suggested line to preview release notes locally: setting the -content='' flag explicitly is no longer required as of CL 589936. For #65614. Change-Id: I6cee951b9ede33900bca48c9f709e3b2c5e87337 Reviewed-on: https://go-review.googlesource.com/c/go/+/590756 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Robert Griesemer <gri@google.com>
I have a late-night suggestion: if the iterator function has an optional error pointer argument, then an error occurring during iteration can be conveniently collected by the enclosing function outside the for loop. That’s a headache with pull-iterators — people like Harald that spend their days iterating over SQL and locked-up nfs thinks this would be great An enclosing function returning the error would look like:
|
Change https://go.dev/cl/592935 mentions this issue: |
must be a different Harald, as I managed to avoid SQL alsmost completely so far. I'm the one doing ugly Linux kernel namespace acrobatics, locking go routines all the time 😀 |
The "Changes to the language" section at the top of the release notes will likely ultimately include more explanation about iterators, or at least, the Go project will likely publish additional introductory material on iterators on the blog and so on. As a perhaps temporary step given current interest, this CL updates the release notes with two additional links for details and motivation. The new package documentation for the iter package is up-to-date, precise, and also more accessible than the language spec, while the 2022 pre-proposal GitHub discussion starts with perhaps the most compelling motivation writeup so far. (We purposefully include "2022" in the text to help illustrate this was not the result of an overly hasty process). We also update the target of the existing language spec reference to be closer to the new material. For #61405. Change-Id: I4bc0f99c40f31edfc5c0e635dca5f844b26b6eeb Reviewed-on: https://go-review.googlesource.com/c/go/+/592935 Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Will function iterators work with spread operator? |
My understanding is that they work with the range operator, but the spread syntax doesn't support it. What would be the semantics for two-return value iterators, considering not only the index,v and k,v and v,err use cases? |
Also, what about this? func main() {
var x = []int{1,2,3,4,5}
F(slices.Values(x)...)
}
func F(x ...int) {
x[0] = 42
} Or what about this? func main() {
var x = func(yield func(int) bool) {
for i := 0; ; i++ {
if !yield(i) {
return
}
}
}
F(x...)
}
func F(x ...int) {
fmt.Println(len(x))
} |
@kodeyeen |
Ok, get it
In the first one x would be a slice . So you're just changing first element of the slice. But I agree with thediveo it's unclear what to do with iter.Seq2 functions. Maybe allow only iter.Seq. |
@rsc Sorry for being very much too late to the party, but could you or someone please point me to the motivation, why it has been decided that push iterator functions are the standard interface, and pull iterator functions are realized by a (rather complicated) stdlib implementation? Given the fact that pull iterator functions are trivially wrapped into push iterator functions, I naively assumed the opposite (making pull iterator functions the standard) would be the more natural way to go. I'm sure there are good reasons, but there is quite some discussion spread everywhere and I couldn't find it in a reasonable time. Thanks! |
According to #61897: "The vast majority of the time, push iterators are more convenient to implement and to use, because setup and teardown can be done around the yield calls rather than having to implement those as separate operations and then expose them to the caller. Direct use (including with a range loop) of the push iterator requires giving up storing any data in control flow, so individual clients may occasionally want a pull iterator instead. Any such code can trivially call Pull and defer stop." In other words, if you're implementing a pull iterator, it has to have a 'next()' method and some place to keep its data and keep track of the iterator's progress in between calls to (Interestingly, or not, Rust uses pull iterators, but is now in the process of adding coroutines and push iterators.) |
Following discussion on #56413, I propose to add two new types that a for-range statement can range over: integers and functions.
In the spec, the table that begins the section would have a few more rows added:
Range over integer
If n is an integer type, then
for x := range n { ... }
would be completely equivalent tofor x := T(0); x < n; x++ { ... }
, whereT
is the type ofn
(assuming x is not modified in the loop body).The additional spec text for range over integer is:
In the Go code I have examined from the main repo, approximately half of existing 3-clause for loops can be converted to range over integer. Loops that don't care about the index variable become even shorter. For example, the canonical benchmark iteration becomes:
3-clause for loops certainly have their place in Go programs, but something as simple as counting to
n
should be easier. The combination of range over integer and range over functions should make range the for loop form of choice for almost all iteration. When you see a 3-clause loop, you'll know it is doing something unusual and know to examine it more carefully.Range over function
If
f
is a function type of the formfunc(yield func(T1, T2)bool) bool
, thenfor x, y := range f { ... }
is similar tof(func(x T1, y T2) bool { ... })
, where the loop body has been moved into the function literal, which is passed tof
asyield
. The boolean result fromyield
indicates tof
whether to keep iterating. The boolean result fromf
itself is ignored in this usage but present to allow easier composition of iterators.I say "similar to" and not "completely equivalent to" because all the control flow statements in the loop body continue to have their original meaning. In particular, break, continue, defer, goto, and return all do exactly what they would do in range over a non-function.
Both T1 and T2 are optional:
func(yield func(T1)bool) bool
andfunc(yield func()bool) bool
can both be iterated over as well. Just like any other range loops, it is permitted to omit unwanted variables. For example if f has typefunc(yield func(T1, T2) bool) bool
any of these are valid:The additional spec text for range over function is:
Being able to range over a function enables writing range over user-specified iteration logic,
which should simplify many programs. It should also help make use of custom collection types nicer.
Not all iteration a program wants to do fits into a single for loop. To convert one of these functions to a "next-based" or "pull-based" iterator, we plan to add a function to the standard library that runs the yield-based ("push-based") iterator in a coroutine, along the lines of the
coro.Pull
function in my recent blog post. The specific name of those functions will be decided in a future proposal; this proposal is limited to language changes, not library changes.One of the problems identified in the discussion of #56413 was having too many different possible functions to range over. For that reason, this proposal drops the possibility of funcs that do not return bool as well as funcs of the form
func() (T, bool)
("pull-iterators"). Push iterators are the standard form with language support and that packages should provide, and something likecoro.Pull
will allow conversion to pull iterators in code that needs that form.Implementation
I have implemented this proposal so that we can explore it and understand it better. To run that prototype implementation:
and then use the Go toolchain you just built. If you already have Go checked out and use the codereview plugin, you can:
(Or git codereview change if you don't have the standard aliases set up.)
The text was updated successfully, but these errors were encountered: