iterators that don't know they're done until they've tried #8149
Comments
I hesitate to say this, but this is a perfect use case for nullable types. |
You could have an interface consisting of start() and next(), where each This also allows iterators over empty ranges (start returns "done"), and
-erik (Erik Schnetter) |
see also #6125 |
Thanks for the reference, Jeff – I must be terrible at keyword search on GitHub. Lots of relevant issues referenced from there. @eschnett, the problem is that you still need to return a value even if it isn't used – where do you get such a value? |
The reason an exception is not so crazy is that at least it only happens once per loop, and doesn't affect code that runs on every iteration. However for nested numerical loops it becomes crazy again. I'm skeptical of whether there is any way to make passing an extra bit out of |
@johnmyleswhite 's suggestion of nullable types might be just the ticket though. For value types, the value field can be uninitialized, and for reference types the value field can be left as an undefined reference. This is actually a case where immutable types with uninitialized fields are useful. I have found that LLVM is actually very good at handling structs that contain some value plus a |
The Nullable type is a good possibility. One question is whether it's the |
I think the Nullable version should definitely be tried to kick the tires for feel and performance. A big problem will be manually using |
Well, I have to confess that if we take the nullable option (see what I did there), then I'd like to get rid of
state = start(itr)
while true
    value, state = next(itr, state)
    isnull(value) && break
    # loop body
end |
Yes, not thinking about |
The problem I see with that is that it might not be possible to construct a valid |
We can do whatever |
Fair enough. |
Solved problem: https://github.com/johnmyleswhite/NullableTypes.jl/blob/master/src/01_typedef.jl:
immutable Nullable{T}
    isnull::Bool
    value::T
    Nullable() = new(true)
    Nullable(value::T) = new(false, value)
end |
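As a rough illustration of how the nullable-returning loop proposed earlier in the thread could fit together with a type like this, here is a minimal sketch in current Julia syntax (`struct` instead of `immutable`). The names `Maybe`, `trynext`, `Countdown`, and `collect_all` are hypothetical and chosen only for this example; they are not part of Base or of NullableTypes.jl.

```julia
# Hypothetical stand-in for the Nullable{T} type discussed in this thread.
struct Maybe{T}
    isnull::Bool
    value::T
    Maybe{T}() where {T} = new{T}(true)               # value field left uninitialized
    Maybe{T}(value) where {T} = new{T}(false, value)
end

isnull(m::Maybe) = m.isnull

# Hypothetical iterator that counts down from `from` to 1.
struct Countdown
    from::Int
end

start(c::Countdown) = c.from

# Hypothetical replacement for next/done: returns (maybe_value, new_state).
function trynext(c::Countdown, state::Int)
    state <= 0 && return Maybe{Int}(), state   # exhausted: null value, state unchanged
    return Maybe{Int}(state), state - 1
end

# The loop shape from the thread, written out for a concrete element type.
function collect_all(itr)
    out = Int[]
    state = start(itr)
    while true
        value, state = trynext(itr, state)
        isnull(value) && break
        push!(out, value.value)
    end
    return out
end
```

With these definitions, `collect_all(Countdown(3))` walks 3, 2, 1, and an empty range such as `Countdown(0)` exits on the first `trynext` call without ever needing a real value.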
Maybe it's time I make a pull request to bring NullableTypes.jl into Base? |
Yeah, let's do that. It's the sort of type that is far more useful if part of Base. |
Ok, I'll do that once I get home from work. |
Won't using option types incur additional overhead for reference types, since immutables with references are allocated on the heap? |
Yes, but in that case we're already allocating a tuple if |
Nullable PR: #8152 |
Question: if you don't have The other aspect to consider with iterators is whether you want to allow lazy functional composition. I am going to write the examples in Elixir (sorry!). In Elixir, we have two modules: Enum (for eager computations) and Stream (for lazy computations):
1..3
|> Enum.map(fn x -> IO.inspect(x) end)
|> Enum.map(fn x -> x * 2 end)
|> Enum.map(fn x -> IO.inspect(x) end)
Streams, on the other hand, allow us to lazily compose:
stream = 1..3
|> Stream.map(fn x -> IO.inspect(x) end)
|> Stream.map(fn x -> x * 2 end)
|> Stream.map(fn x -> IO.inspect(x) end)
At this point, no computation was done. We need to eagerly convert the stream to a list or force it somehow. If we call
Now the list was traversed just once. Laziness allows all kinds of interesting composition, working with infinite collections, lazy resources and so on. I am just bringing this up because choosing to support (or not support) those features considerably affects the design of iterators. The classic start/next iterator, described here, cannot efficiently support laziness. You would need to at least make |
Just one disclaimer: the solutions linked above aim to be purely functional. Another option is to make the iterator mutable in terms of its state (so it returns only |
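For readers more at home in Julia, the eager versus lazy contrast in the Elixir examples can be approximated with `map` and generator expressions. This is only a sketch of the idea, not a proposal from the thread; `events` and `trace` are hypothetical helpers that stand in for `IO.inspect` so the order of evaluation can be checked.

```julia
events = String[]

# Record a tagged event and pass the value through, like IO.inspect above.
trace(tag) = x -> (push!(events, "$tag $x"); x)

# Eager: every map call walks the whole collection before the next one starts.
eager = map(trace("out"), map(x -> 2x, map(trace("in"), 1:3)))
eager_events = copy(events)

empty!(events)

# Lazy: generators only describe the computation; nothing runs yet.
inputs  = (trace("in")(x) for x in 1:3)
doubled = (2x for x in inputs)
outputs = (trace("out")(x) for x in doubled)
nothing_ran_yet = isempty(events)

# Forcing the pipeline traverses the range once, interleaving the steps.
lazy = collect(outputs)
lazy_events = copy(events)
```

In the eager version all "in" events are recorded before any "out" event; in the lazy version each element flows through the whole pipeline before the next one is touched.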
The design pattern I use in this situation is to introduce a 1 step phase offset between the state and the data produced. This means that between loop start and the first item being delivered there are at most two tries made. But after that everything proceeds as normal. In the context of tasks the pattern would be
The idea being to make "done" as light as possible, and to focus it on strictly Boolean determinations. As a design pattern putting a lot of branching and assignment inside of a function that is called implicitly on a branch, the "while", is very risky. |
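One possible reading of this phase-offset pattern, sketched in current Julia: the wrapper's state always holds the item that was fetched one step ahead, so the "done" check is a plain Boolean test with no side effects. The names `Lookahead`, `lstart`, `ldone`, `lnext`, and `drain` are hypothetical and are not Base functions; the sketch assumes the wrapped source supports Julia's `iterate`.

```julia
# Hypothetical wrapper that keeps the source one step ahead of the consumer.
struct Lookahead{I}
    source::I
end

# The state is the already-fetched next step of the source: either
# `nothing` (exhausted) or a (value, inner_state) tuple.
lstart(it::Lookahead) = iterate(it.source)

# "done" is a pure Boolean check; it never touches the source.
ldone(it::Lookahead, state) = state === nothing

# Hand out the prefetched value and fetch the one after it.
function lnext(it::Lookahead, state)
    value, inner = state
    return value, iterate(it.source, inner)
end

function drain(it::Lookahead)
    out = Any[]
    state = lstart(it)
    while !ldone(it, state)
        value, state = lnext(it, state)
        push!(out, value)
    end
    return out
end

# A producer task that cannot say in advance how many items it will yield.
ch = Channel{Int}(4) do c
    for i in 1:3
        put!(c, i * 10)
    end
end

items = drain(Lookahead(ch))
```

The trade-off is that the wrapper always reads one item ahead of what the loop body has seen, which can block on a slow producer before the current item is processed.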
Subsumed by #18823. |
There's a fundamental awkwardness to using the start/next/done protocol to iterate things like streams or tasks that can't know if they're done or not until they've tried to get the next item. Currently, we work around this mismatch by advancing the iterator in done instead of in next. This kind of works, but it's pretty awkward and unintuitive. It also makes composition of these kinds of iterators fail very badly in some cases (zipping multiple streams, etc.).
The only way around this is for next to have two exit routes: normal return of the next value and "abnormal" return when the stream has gone dry. Python handles this with exceptions: when a generator is done, it throws one. This does not have sufficiently good performance for Julia, however. Another option would be to use a special sentinel return value to indicate that iteration has finished and there's no value to be had. To use this approach, we'd need to make sure the compiler is clever enough to optimize this type instability away and produce fast code.
Another option would be to use exceptions like Python does, but not always use exceptions. It's worth noting that when you're iterating over something like a stream or a task, it is often the case that you want to do some kind of cleanup – e.g. closing a file handle – if the iteration exits via some error.
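For comparison, Julia 1.0 and later use a single `iterate` function that returns `nothing` when there is no value to be had, which is one form of the sentinel idea described here. The sketch below shows what a stream-like iterator might look like under that protocol; `LineReader` is a hypothetical type written for this example, and the in-memory `IOBuffer` is only a stand-in for a real stream.

```julia
# Hypothetical iterator over the lines of a stream; it only learns that the
# stream is exhausted when it tries to read.
struct LineReader{S<:IO}
    io::S
end

function Base.iterate(r::LineReader, state=nothing)
    eof(r.io) && return nothing      # sentinel: no value to be had
    return readline(r.io), nothing   # normal return: (value, state)
end

# The length is not known up front, so say so for collect and zip.
Base.IteratorSize(::Type{<:LineReader}) = Base.SizeUnknown()
Base.eltype(::Type{<:LineReader}) = String

lines = collect(LineReader(IOBuffer("a\nb\nc\n")))

# Composition with another iterator stops when the stream runs dry.
zipped = collect(zip(LineReader(IOBuffer("x\ny\n")), 1:5))
```

Because advancing and the end-of-stream check happen in the same call, there is no separate `done` that has to read from the stream as a side effect.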