Make eachindex more efficient for linear arrays #10704
@@ -713,7 +713,7 @@ function skip_deleted(h::Dict, i)
 end

 start(t::Dict) = skip_deleted(t, 1)
-done(t::Dict, i) = done(t.vals, i)
+done(t::Dict, i) = i > length(t.vals)
I'm curious why this change is necessary...
This changes the iterator state to be a tuple of `(IndexIterator, IteratorState)` instead of just the next index. Dict was (incorrectly) passing a state to the Array iteration protocol that did not come from the Array's `start` or `next`.
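For readers following along, here is a minimal sketch of the tuple-state idea. It is written against the current `iterate` protocol rather than the `start`/`next`/`done` functions this PR touches, and `Wrapped` is a hypothetical type invented purely for illustration; it is not code from this PR.

```julia
# Hypothetical wrapper type, for illustration only (not part of Base or this PR).
struct Wrapped{T}
    data::Vector{T}
end

# The state is a tuple: (index iterator, state of that index iterator).
# On the first call only the index iterator is present.
function Base.iterate(w::Wrapped, state=(eachindex(w.data),))
    y = iterate(state...)            # advance the index iterator
    y === nothing && return nothing  # index iterator exhausted
    i, s = y
    return w.data[i], (state[1], s)  # hand back the element and the new tuple state
end

Base.length(w::Wrapped) = length(w.data)
Base.eltype(::Type{Wrapped{T}}) where {T} = T

result = collect(Wrapped([10, 20, 30]))
```

The state is opaque to callers: the only valid states are the ones the iterator itself returned, which is why a bare integer index can no longer be handed to the Array methods.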
Got it, thanks.
+1 I worry about the extra complexity in AbstractArray iteration. Maybe we should still use the old, super-simple definitions for
I'm completely in favor. Have you examined performance and/or looked at the generated code to make sure there is no performance loss? Given Jeff's concern, it would be interesting to know whether all the apparent complexity ends up getting inlined out for Arrays. (That said, for people digging into base with
I imagine everything can be inlined, and this can be really fast. But iterating over Arrays is so common that this could create significant extra work for type inference and the inliner, even if there is no effect on the final machine code.
True indeed. I'm in favor of the idea of reintroducing specialized versions for Arrays.
That makes sense. For linear fast arrays, this inlines perfectly and has no noticeable performance impact, but I agree that we should keep the specialized methods for
The bad news is that this is not inlined for Linear Slow arrays with a huge performance impact. I suppose that makes sense, though… the previous
What are "Linear Slow arrays"?
I updated the documentation for
I think that
Sprinkling in a few more inline annotations has helped some, but I'm still allocating 128 bytes per loop for iterating over a sparse matrix.
It's almost perfect 😄. I would consider rewording the last sentence to something like
OK! Getting closer to parity with master here. No more allocations (I think the core issue was having nested tuples). This is now actually a bit faster for loops like
Now to chase down those test failures...
Nice! There were already a few oddities about our performance, and overall even on master we're a bit slow on small arrays (see the benchmarks in #10507, focusing on
Ah, great suggestion. I ran the full suite (and updated the gist). I'm on-par or better than everything except that wonky SubArray case. In some cases performance here is 10x better. So I'm considering performance fixed.
That's awesome! For any other performance oddities, also bear #9080 in mind. It seems to affect all "complex" iterators. (Unless you've solved it??)
julia> @time sumcart_manual(A)
elapsed time: 0.14881743 seconds (184 bytes allocated)
5.000083609636129e7
julia> @time sumcart_iter(A)
elapsed time: 0.145975981 seconds (184 bytes allocated)
5.000083609636129e7
:)
Yes, but I'm guessing you're cheating :-). With this PR, you have to switch
Aw, shucks, I didn't notice that. Right you are, this doesn't solve it.
You had me hoping there...anyway, just wanted to make sure you knew about that issue---might help interpret any performance oddities.
This greatly improves the ability to use `eachindex` in generic code. For arrays with fast linear storage, `eachindex` now simply returns a `UnitRange` to linearly index the array. For arrays with slow linear access, this still returns a `CartesianRange`. In typical code constructs, the resulting elements from the iterators can be used identically. As a result, we can simplify AbstractArray cartesian iteration by moving all the complexity into the `eachindex` iterator.
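As a rough illustration of the "used identically" point, the sketch below runs one generic loop over both kinds of array. It assumes a current Julia release, where `CartesianRange` has since been renamed `CartesianIndices` and `eachindex` on a dense `Array` returns a unit range; `sum_generic` is a hypothetical helper name, not something defined in this PR.

```julia
# Hypothetical helper: one loop body that works for either index style.
function sum_generic(A::AbstractArray)
    s = zero(eltype(A))
    for i in eachindex(A)   # an integer for linear-fast arrays, a CartesianIndex otherwise
        s += A[i]
    end
    return s
end

A = reshape(collect(1.0:12.0), 3, 4)  # dense Array: fast linear indexing
V = view(A, 1:2, :)                   # non-contiguous view: slow linear indexing

linear_inds    = eachindex(A)   # a unit range over 1:length(A)
cartesian_inds = eachindex(V)   # a Cartesian iterator over the 2x4 index space

total_A = sum_generic(A)
total_V = sum_generic(V)
```

The loop body never needs to know which kind of index it received, because `A[i]` accepts both.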
Alright! All tests pass locally (including the full subarray suite). I've squashed the commits so this should be good to go (pending CI and a final thumbs up).
# 0-d cartesian ranges are special-cased to iterate once and only once
start{I<:CartesianIndex{0}}(iter::CartesianRange{I}) = false
next{I<:CartesianIndex{0}}(iter::CartesianRange{I}, state) = iter.start, true
done{I<:CartesianIndex{0}}(iter::CartesianRange{I}, state) = state
That's indeed much nicer than the old trick with `finished`.
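The Bool-as-state trick in the 0-d special case can be sketched in today's `iterate` protocol as follows. `Once` is a hypothetical type made up for this example; the last two lines only check that a 0-d `CartesianIndices` (the current name for `CartesianRange`) still yields exactly one index.

```julia
# Hypothetical iterator that yields its value exactly once.
struct Once{T}
    value::T
end

# The state is a Bool meaning "already yielded": false at the start, true afterwards.
Base.iterate(o::Once, done::Bool=false) = done ? nothing : (o.value, true)
Base.length(::Once) = 1
Base.eltype(::Type{Once{T}}) where {T} = T

yielded = collect(Once(5))

# A 0-d Cartesian index space has exactly one (empty) index.
zero_d = CartesianIndices(())
n_iterations = count(_ -> true, zero_d)
```

Using the state itself as the "done" flag avoids carrying a separate field on the iterator.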
Looks great to me!
## iteration support for arrays as ranges ##
## iteration support for arrays ##
macro _inline_meta()
    Expr(:meta, :inline)
A comment here about `pushmeta!` not yet being available in the bootstrap might be handy. Otherwise there's a (small) risk we might see usages of `Base.@_inline_meta` popping up in package code from folks who learn by browsing through the source.
👍 Thanks for doing this!
Make eachindex more efficient for linear arrays
This greatly improves the ability to use `eachindex` in generic code. For arrays with fast linear storage, `eachindex` now simply returns a `UnitRange` to linearly index the array. For arrays with slow linear access, this still returns a `CartesianRange`. In typical code constructs, the resulting elements from the iterators can be used identically. (If not, I would consider it a missing functionality in CartesianIndex.)
As a result, we can simplify AbstractArray cartesian iteration by moving all the complexity into the `eachindex` iterator.
Cc @timholy, ref the discussion in #4774 (comment)