-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-working SubArray #3496
Comments
Here is a tentative proposal, import Base.getindex
abstract AbstractSubArray{T,N,A<:AbstractArray} <: AbstractArray{T,N}
# for contiguous block
immutable ContiguousSubArray{T,N,A<:Array} <: AbstractSubArray{T,N,A}
parent::A
dims::NTuple{N,Int}
offset::Int
end
getindex(a::ContiguousSubArray, i::Int) = a.parent[a.offset + i]
getindex(a::ContiguousSubArray, i::Int, j::Int) = a.parent[a.offset + i + (j - 1) * a.dims[1]]
# for strided block
immutable StridedSubVector{T,A<:Array} <: AbstractSubArray{T,1,A}
parent::A
dim1::Int
offset1::Int
stride1::Int
end
getindex(a::StridedSubVector, i::Int) = a.parent[a.offset1 + (i - 1) * a.stride1]
immutable StridedSubMatrix{T,A<:Array} <: AbstractSubArray{T,2,A}
parent::A
dim1::Int
dim2::Int
offset1::Int
offset2::Int
stride1::Int
stride2::Int
end
immutable MultiDimStridedSubArray{T,N,A<:Array} <: AbstractSubArray{T,N,A}
parent::A
dims::NTuple{N,Int}
offsets::NTuple{N,Int}
strides::NTuple{N,Int}
end
# more generic case where the parent is not an instance of Array
immutable GenericSubArray{T,N,A<:AbstractArray,I<:(RangeIndex...,)} <: AbstractSubArray{T,N,A}
parent::A
shape::NTuple{N,Int}
dims::I
strides::NTuple{N,Int}
end
# this allows to define functions that work consistently over arrays with contiguous memory
typealias ContiguousArray{T,N} Union(Array{T,N}, ContiguousSubArray{T})
typealias StridedArray{T,N} Union(Array{T,N}, SubArray{T,N})
# specialized methods to create sub-array
# we need a bunch of these & probably bound checking
function sub{T}(a::Array{T}, ::Colon, j::Int)
m = size(a, 1)
ContiguousSubArray{T,1,Array{T}}(a, (m,), m * (j-1))
end
function sub{T}(a::Array{T}, rgn::Range1{Int}, j::Int)
m = size(a, 1)
ContiguousSubArray{T,1,Array{T}}(a, (length(rgn),), (rgn[1]-1) + m * (j-1))
end
function sub{T}(a::Array{T}, rgn::Range{Int}, j::Int)
m = size(a, 1)
StridedSubVector{T,Array{T}}(a, length(rgn), (rgn[1]-1) + m * (j-1), step(rgn))
end Of course, this is just a small part. But, hopefully, you get the basic idea. If the compiler can get those If we agree on this general direction, I may then make a pull request and work on it soon. |
I haven't looked this over in detail except to note that this version is at least memory-safe, even if it forces construction. In the long run this may be resolvable, but what I like about this is that it gives us a path forward that won't have to be rewritten in the long term. |
An improved version that achieves both good indexing performance & safety: immutable SubVec{T, A<:Array} <: AbstractArray{T, 1}
a::A
ptr::Ptr{T}
len::Int
SubVec(a::A, offset::Int, len::Int) = new(a, pointer(a, offset+1), len)
end
getindex(s::SubVec, i::Int) = unsafe_load(s.ptr, i)
setindex!(s::SubVec, v, i::Int) = unsafe_store!(s.ptr, v, i)
length(s::SubVec) = s.len
function sub{T}(a::Array{T}, ::Colon, j::Int)
m = size(a, 1)
SubVec{T,Array{T}}(a, (j - 1) * m, m)
end Benchmark shows that this works as fast as hand-crafted loops. This version incorporates both the parent and an offseted pointer in the immutable:
|
Very cool! I do think it's probably worth having a dedicated 1d type, since getting good performance is so much easier and it's a very common case. |
Not only 1D vectors, any sub-array that is contiguous can in principle get such performance. However, I just find another issue -- any attempt to do bound checking of |
For regular arrays, Jeff does this in the compiler ( |
Presumably, it's an inlining issue? (I usually check this by the profiler---you can tell by the nesting depth---but if you can read the disassembly it's more direct.) |
I like this approach, and it is much better thought out than what I was thinking of. Let's certainly replace the sub-array implementation with a better one. On @lindahua Perhaps you should also try disabling the bounds check in the yet to be merged #3268 |
We can also implement a non-strided view. That would solve your consistency concern. |
Even when I and J are arbitrary things (e.g. integer vector), we can still make a view by maintaining references to I and J. So we will have a variety of subarrays, contiguous, strided, and more generic one that simply maintain references to given indexers. Let's first focus on rewriting |
The thing about non-strided views is that it will have extra indexing overhead, and will be slower. The default can always be views, and the user has to make copies explicitly in some cases. I agree that we should make |
I suspect this can be closed now. |
I created this issue to move the discussion in #3224 over here.
I wrote a benchmark script on Gist to compare the performances of different approaches:
Here is what I get:
We can observe:
sub
, which createsSubArray
is way too slow -- the overhead and indexing cost just dominate.a[:, i]
which makes copies also kills performancepointer_to_array
to create a shared-memory view work much better. That said, both lead to significant overhead when the columns are short.I may work on this as the first step towards better performance of Julia. But we need some discussion to settle on a approach first.
A problem with immutable is that methods defined on them do not get properly inlined, resulting in performance hit.
The text was updated successfully, but these errors were encountered: