ReshapedArrays, take2 #15449
Conversation
(force-pushed from 10c6c99 to 35777df)
OK, this is getting more serious now. I'm deliberately taking the "hard-core" road, deleting virtually all other

@mbauman, I reworked

Perhaps of greatest interest is a performance benchmark in this gist. The first line corresponds to uncommenting a new definition of

Here are the performance results:

julia> include("perf_reshaped.jl")
0.000006 seconds (159 allocations: 11.137 KB)
reshaped linearslow, integers
1.206878 seconds (5 allocations: 176 bytes)
reshaped linearslow, linear indexing
1.186159 seconds (5 allocations: 176 bytes)
reshaped linearslow, eachindex
0.278382 seconds (5 allocations: 176 bytes)
original subarray, integers (linearslow)
0.107050 seconds (5 allocations: 176 bytes)
original subarray, eachindex (linearslow)
0.212035 seconds (5 allocations: 176 bytes)
reshaped linearfast, integers
0.109079 seconds (5 allocations: 176 bytes)
reshaped linearfast, linear indexing
0.110559 seconds (5 allocations: 176 bytes)
reshaped linearfast, eachindex
0.109705 seconds (5 allocations: 176 bytes)
original array, integers
0.108872 seconds (5 allocations: 176 bytes)
original array, linear indexing
0.108599 seconds (5 allocations: 176 bytes)
original array, eachindex
0.110465 seconds (5 allocations: 176 bytes)

It seems that
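For readers following along: the div cost in the "linearslow" timings above comes from converting each destination index into parent coordinates and back. Here is a minimal Python sketch of the two conversions involved (not the actual Julia implementation; the function names are invented):

```python
# Column-major index conversions, as a reshaped view must perform them
# for every element access when the parent is "linear slow".
def linear_to_cartesian(i, dims):
    """0-based linear index -> cartesian tuple (column-major, like Julia).

    Costs one divmod per dimension, which is what dominates the slow loops.
    """
    coords = []
    for d in dims:
        i, r = divmod(i, d)
        coords.append(r)
    return tuple(coords)

def cartesian_to_linear(coords, dims):
    """Inverse mapping: cartesian tuple -> 0-based linear index."""
    i, stride = 0, 1
    for c, d in zip(coords, dims):
        i += c * stride
        stride *= d
    return i
```

A LinearSlow parent forces this round trip (linear in the reshaped shape, cartesian in the parent) on every access, which is why the first two timings above are roughly an order of magnitude slower than the rest.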
return SparseMatrixCSC(mS, nS, colptr, rowval, nzval)
end
This change is one we should surely revert before merging this. But it's a good test of generality.
Wait, scratch that. Unlike the reshape methods for BitArray and SharedArray, this copies the data. So keeping this is probably not on the table, if we're serious about having reshape always return a view.
This is akin to the sparse views problem. It's inconsistent, but I think you should restore this to the copying method for now. Making reshape default to a view is still a huge improvement over the status quo, even if we don't get to 100% consistency.
It may not be all that terrible to just have sparse reshapes and slices always return copies as an exception to the rule. Heck, NumPy gets away with their fancy-indexing-returns-a-copy, but I have no idea how error prone that is in practice.
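For reference, the NumPy split behavior mentioned above can be demonstrated directly; this snippet only illustrates stock NumPy semantics (reshape of a contiguous array is a view, fancy indexing always copies):

```python
import numpy as np

a = np.arange(12)
v = a.reshape(3, 4)      # reshape of contiguous memory: a view onto `a`
v[0, 0] = 99
assert a[0] == 99        # a write through the view is visible in the parent

c = a[[0, 1, 2]]         # fancy indexing: always a fresh copy
c[0] = -1
assert a[0] == 99        # the parent is untouched
```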
Crazy half-baked thought: what if we expose this functionality as a new index type instead of a new wrapper? With APL indexing (#15431), we can use a specialized multidimensional array as a mapping from indices in the reshaped shape to the parent's CartesianIndexes:

immutable ReshapedIndices{N,M} <: AbstractArray{CartesianIndex{M}, N}
dims::NTuple{N, Int}
mi::NTuple{M, SignedMultiplicativeInverse{Int}}
end
function ReshapedIndices(dest_size::Dims, src_size::Dims)
strds = front(tail(size_strides((1,), src_size...)))
mi = map(SignedMultiplicativeInverse, strds)
ReshapedIndices(dest_size, mi)
end
Base.size(ri::ReshapedIndices) = ri.dims
Base.getindex(A::ReshapedIndices, indexes::Int...) = CartesianIndex(ind2sub_rs(A.mi, sub2ind(size(A), indexes...)))
julia> R = ReshapedIndices((6,2), (4, 3))
6x2 ReshapedIndices{2,1}:
CartesianIndex{2}((1,1)) CartesianIndex{2}((3,2))
CartesianIndex{2}((2,1)) CartesianIndex{2}((4,2))
CartesianIndex{2}((3,1)) CartesianIndex{2}((1,3))
CartesianIndex{2}((4,1)) CartesianIndex{2}((2,3))
CartesianIndex{2}((1,2)) CartesianIndex{2}((3,3))
CartesianIndex{2}((2,2)) CartesianIndex{2}((4,3))

Of course, this doesn't address the linear fast or iteration special cases that you've worked so hard on getting good performance with… but I think those can just be expressed as dispatch specializations on the SubArray index tuple. I think the hardest part would be reshapes of SubArrays; the naive
And the solution there would be to make
Yes, I agree that a lazy

Given this line of thinking: I would posit that
I may be missing something, but I'm pretty sure that this is almost isomorphic to your current design. The sketch of the ReshapedIndices type doesn't pre-compute the mapping; it does so lazily, on demand. I just stripped out the index part of the

Now, I don't think it'll gain us anything major, but I think there are a few advantages here:
Now, the only difference that I see potentially causing trouble is that this would lean more on
Ah, I misunderstood your main intent; I thought you meant that we should store

Sure, I'd be fine with a semi-flattening of the hierarchy, putting the index gymnastics into the indexes tuple. There's a certain logic to that. If one did

I'm not sure I yet see a dramatic advantage in your design, but the bottom line is that I'd be fine in principle with an isomorphic design, or more properly, ANY design that works! Right now I'm pretty focused on (aka, worried about) efficient iteration and generic code; I would guess we're facing the same obstacles with either design, unless I'm completely missing an insight you've had.
Yup, exactly. I'd feel more strongly if there were more cases where
I have a sense that some folks seem to be making a bit of an effort to move toward "end-game" on 0.5. If that's true, a key question is: how should we think about this PR? In particular, how worried should I be about the remaining cases of poor performance? I've been taking the stance that we really need to solve essentially "all" of the troublesome cases before merging, which basically means solving #15648. That's a big job. (It's a good one, but big.) However, the cases of bad performance are, arguably, a relatively small subset: the poster child is

Especially if we're unlikely to return views-by-default from
(force-pushed from 35777df to 866cd4d)
OK, this seems ready to go, pending the benchmark results and a merge conflict that seems to have arisen since this morning. @mbauman, I think I addressed all your review comments:
Finally, while I've been vocal about some slow corner cases, it's worth pointing out that this isn't all doom and gloom. In particular, if you must use a linear index with a

julia> function badsum(A)
s = 0.0
for i = 1:length(A)
s += A[i]
end
s
end
badsum (generic function with 1 method)
julia> A = rand(1000,1001);
julia> S = sub(A, 1:999, 1:998);
julia> R = vec(S);
julia> typeof(R)
Base.ReshapedArray{Float64,1,SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false},Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}
# after suitable warmup
julia> @time badsum(S)
0.030715 seconds (5 allocations: 176 bytes)
498730.01196278195
julia> @time badsum(R)
0.011148 seconds (5 allocations: 176 bytes)
498730.01196278195

Something on the scale of a 3x improvement, mostly thanks to #15357. This is still roughly 7x slower than with a
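As an aside, the trick behind #15357 is replacing a hardware integer division with a precomputed multiply-and-shift. A rough Python sketch of the simple unsigned "round-up" variant follows (Julia's Base uses the more careful SignedMultiplicativeInverse; the names here are invented for illustration):

```python
# Division by a fixed d, rewritten as a multiply and a shift.
def multiplicative_inverse(d, bits=64):
    """Precompute a magic multiplier m so that n // d == (n * m) >> bits."""
    return (1 << bits) // d + 1

def div_by_inverse(n, m, bits=64):
    # Exact as long as n is well below 2**bits / d; hardware versions
    # carry a correction to cover the full input range.
    return (n * m) >> bits
```

Since reshaped-array indexing does the same division (by the parent's sizes) over and over, hoisting the expensive part into a one-time precomputation is exactly what makes the badsum(R) loop above competitive.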
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
No generated functions, works for CartesianIndex.
(force-pushed from 866cd4d to 5f7e1cb)
Amazing work, Tim! This looks great.

🎉
@inline setindex!(A::ReshapedArrayLF, val, index::Int) = (@boundscheck checkbounds(A, index); @inbounds parent(A)[index] = val; val)
@inline setindex!(A::ReshapedArray, val, indexes::Int...) = (@boundscheck checkbounds(A, indexes...); _unsafe_setindex!(A, val, indexes...))
@inline setindex!(A::ReshapedArray, val, index::ReshapedIndex) = (@boundscheck checkbounds(parent(A), index.parentindex); @inbounds parent(A)[index.parentindex] = val; val)
line lengths in this file are a tad out of hand
I feel like maybe "functions containing multiple statements should use the long form function definition syntax" would be a good style guideline.
Fixed in #15973.
I believe this broke many doctests. One example from http://julia.readthedocs.io/en/latest/manual/interfaces/#abstract-arrays:

julia> immutable SquaresVector <: AbstractArray{Int, 1}
count::Int
end
julia> Base.size(S::SquaresVector) = (S.count,)
julia> Base.linearindexing(::Type{SquaresVector}) = Base.LinearFast()
julia> Base.getindex(S::SquaresVector, i::Int) = i*i
julia> s = SquaresVector(7)
7-element SquaresVector:
1
4
9
16
25
36
49
julia> s \ rand(7, 2)
ERROR: MethodError: no method matching qrfact(::Base.ReshapedArray{Int64,2,SquaresVector,Tuple{}}, ::Type{Val{true}})
Closest candidates are:
qrfact{T<:Union{Complex{Float32},Complex{Float64},Float32,Float64}}(::Union{Base.ReshapedArray{T<:Union{Complex{Float32},Complex{Float64},Float32,Float64},2,A<:DenseArray,MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N}}},DenseArray{T<:Union{Complex{Float32},Complex{Float64},Float32,Float64},2},SubArray{T<:Union{Complex{Float32},Complex{Float64},Float32,Float64},2,A<:Union{Base.ReshapedArray{T,N,A<:DenseArray,MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N}}},DenseArray},I<:Tuple{Vararg{Union{Base.AbstractCartesianIndex,Base.NoSlice,Colon,Int64,Range{Int64}},N}},L}}, ::Any)
qrfact{T}(::Union{Base.ReshapedArray{T,2,A<:DenseArray,MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N}}},DenseArray{T,2},SubArray{T,2,A<:Union{Base.ReshapedArray{T,N,A<:DenseArray,MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N}}},DenseArray},I<:Tuple{Vararg{Union{Base.AbstractCartesianIndex,Base.NoSlice,Colon,Int64,Range{Int64}},N}},L}}, ::Any)
qrfact(::SparseMatrixCSC{Tv,Ti<:Integer}, ::Type{Val{true}})
in \(::Base.ReshapedArray{Int64,2,SquaresVector,Tuple{}}, ::Array{Float64,2}) at ./linalg/generic.jl:347
in \(::SquaresVector, ::Array{Float64,2}) at ./linalg/generic.jl:350
in eval(::Module, ::Any) at ./boot.jl:230

There are also other places in the doc that use
This is because the old
Is that what "similar" is supposed to be?
That is sometimes what it means, and we might need to use that more systematically in the
Rats, I didn't think to run the doctests.
Wow, I can't even run the doctests locally; I got a python error in

tim@cannon:~/src/julia-0.5$ python --version
Python 2.7.6
@andreasnoack, now I see there are already
We need to promote to a mutable array with an element type that is stable under the necessary arithmetic operations of the factorizations. So far the promotion has mainly focused on getting the element type right and then doing a

AC = similar(A, PromoteType, returnsize)
copy!(AC, A)

or define a convenience function for that.
@timholy You should probably just copy the code and run it separately; there are many failures in the doc tests now. I noticed this while rebasing #15753. It also seems that python2 is complaining about the unicode in some of the tests (that's causing the python error you got). I've also seen the hang in the metaprogramming doc but haven't had time to look at it more carefully. For me the interface test is the first (or the second) one, though.
I think that's needed only for
Given the number of other things on my plate right now, I'm just gonna wait until someone fixes the rest of the problems first 😉.
This contains my current implementation of ReshapedArrays. It's built on top of #15357; only the last commit is new.

This provides two means of access: the traditional R[i,j,k], and an iteration-specific R[I] where I is a special index that "pops up" to the parent array. This avoids the cost of calculating a div, which even with #15357 is somewhat slow.

Currently, this is for reference only; there is more work that needs to be done on our iteration paradigms before this will pass tests.