From 4223b58ec17cb1ad46e61998c55dc537022b3db5 Mon Sep 17 00:00:00 2001 From: Matt Bauman Date: Sat, 20 Jun 2015 23:19:31 -0400 Subject: [PATCH 1/5] Add an interfaces manual chapter Document the iterable, indexable, and abstract array interfaces. [av skip] --- doc/index.rst | 1 + doc/manual/interfaces.rst | 223 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 224 insertions(+) create mode 100644 doc/manual/interfaces.rst diff --git a/doc/index.rst b/doc/index.rst index f11335a886ac2..7a62aeb794287 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -30,6 +30,7 @@ manual/methods manual/constructors manual/conversion-and-promotion + manual/interfaces manual/modules manual/documentation manual/metaprogramming diff --git a/doc/manual/interfaces.rst b/doc/manual/interfaces.rst new file mode 100644 index 0000000000000..66fcdbbe81683 --- /dev/null +++ b/doc/manual/interfaces.rst @@ -0,0 +1,223 @@ +.. _man-interfaces: + +************ + Interfaces +************ + +A lot of the power and extensibility in Julia comes from a collection of informal interfaces. By extending few specific methods to work for a custom type, objects of that type not only receive those functionalities, but they are also able to be used in other methods that are written to generically build upon those behaviors. 
+ +Iteration +--------- + +================================= ======================== =========================================== +Required methods Brief description +================================= ======================== =========================================== +:func:`start(iter) ` Returns the initial iteration state +:func:`next(iter, state) ` Returns the current item and the next state +:func:`done(iter, state) ` Tests if there are any items remaining +**Important optional methods** **Default definition** **Brief description** +:func:`eltype(IterType) ` ``Any`` The container's element type +:func:`length(iter) ` (*undefined*) The container's length +================================= ======================== =========================================== + +Sequential iteration is implemented by the methods :func:`start`, :func:`done`, and :func:`next`. Instead of mutating objects as they are iterated over, Julia provides these three methods to keep track of the iteration state externally from the object. The :func:`start(iter)` method returns an initial ``state`` object that gets passed along to :func:`done(iter, state)`, which tests if there are any elements remaining, and :func:`next(iter, state)`, which returns a tuple containing the current element and an updated ``state``. The ``state`` object can be anything, and is generally considered to be an implementation detail private to the iterable object. 
+ +Any object that has these three methods appropriately defined can be used in a ``for`` loop since the syntax:: + + for i in iter # or "for i = iter" + # body + end + +is translated into:: + + state = start(iter) + while !done(iter, state) + (i, state) = next(iter, state) + # body + end + +A simple example is an iterable collection of square numbers with a defined length:: + + immutable Squares + count::Int + end + Base.start(::Squares) = 1 + Base.next(S::Squares, state) = (state*state, state+1) + Base.done(S::Squares, s) = s > S.count + +With only those definitions, the ``Squares`` type is already pretty powerful. We can iterate over all the elements:: + + julia> for i in Squares(10) + print(i, ", ") + end + 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, + +We can compute the sum of all squares up to a certain number:: + + julia> sum(Squares(1803)) + 1955361914 + +Or even the mean and standard deviation:: + + julia> mean(Squares(100)), std(Squares(100)) + (3383.5,3024.355854282583) + +There are a few more methods we can extend to give Julia more information about this iterable collection. We know that the elements in a ``Squares`` collection will always be ``Int``. By extending the :func:`eltype` method, we can give that information to Julia and help it make more specialized code in the more complicated methods. 
We also know the number of elements in our collection, so we can extend :func:`length`, too:: + + Base.eltype(::Type{Squares}) = Int # Note that this is defined for the type + Base.length(S::Squares) = S.count + +Now, when we ask Julia to :func:`collect` all the elements into an array it can preallocate a ``Vector{Int}`` of the right size instead of blindly ``push!``\ ing each element into a ``Vector{Any}``:: + + julia> collect(Squares(100))' # transposed to save space + 1x100 Array{Int64,2}: + 1 4 9 16 25 36 49 64 81 100 … 9025 9216 9409 9604 9801 10000 + +While we can rely upon generic implementations, we can also extend specific methods where we know there is a simpler algorithm. For example, there's a formula to compute the sum of squares, so we can override the generic iterative version with a more performant solution:: + + julia> sum(S::Squares) = (n = S.count; return n*(n+1)*(2n+1)÷6) + sum(Squares(1803)) + 1955361914 + +This is a very common pattern throughout the Julia standard library: a small set of required methods defines an informal interface that enables many fancier behaviors. In some cases, types will want to additionally specialize those extra behaviors when they know a more efficient algorithm can be used in their specific case. + +Indexing +-------- + +====================================== ================================== +Methods to implement Brief description +====================================== ================================== +:func:`getindex(X, i) ` ``X[i]``, indexed element access +:func:`setindex!(X, v, i) ` ``X[i] = v``, indexed assignment +:func:`endof(X) ` The last index, used in ``X[end]`` +====================================== ================================== + +For the ``Squares`` collection above, we can easily compute the ``i``\ th element of the collection by squaring it. We can expose this as an indexing expression ``S[i]``. 
To opt into this behavior, ``Squares`` simply needs to define :func:`getindex`:: + + julia> function Base.getindex(S::Squares, i::Int) + 1 <= i <= S.count || throw(BoundsError(S, i)) + return i*i + end + Squares(100)[23] + 529 + +Additionally, to support the syntax ``S[end]``, we must define :func:`endof` to specify the last valid index:: + + julia> Base.endof(S::Squares) = length(S) + Squares(23)[end] + 529 + +Abstract Arrays +--------------- + +========================================================== ============================================ ======================================================================================= +Methods to implement Brief description +========================================================== ============================================ ======================================================================================= +:func:`size(A) ` Returns a tuple containing the dimensions of A +:func:`Base.linearindexing(Type) ` Returns either ``Base.LinearFast()`` or ``Base.LinearSlow``. See the description below. +:func:`getindex(A, i::Int) ` (if ``LinearFast``) Linear scalar indexing +:func:`getindex(A, i1::Int, ..., iN::Int) ` (if ``LinearSlow``, where ``N = ndims(A)``) N-dimensional scalar indexing +:func:`setindex!(A, v, i::Int) ` (if ``LinearFast``) Scalar indexed assignment +:func:`setindex!(A, v, i1::Int, ..., iN::Int) ` (if ``LinearSlow``, where ``N = ndims(A)``) N-dimensional scalar indexed assignment with N ``Int`` arguments +**Optional methods** **Default definition** **Brief description** +:func:`getindex(A, I...) ` defined in terms of scalar :func:`getindex` Multidimensional and nonscalar indexing +:func:`setindex!(A, I...) 
` defined in terms of scalar :func:`setindex!` Multidimensional and nonscalar indexed assignment +:func:`start`/:func:`next`/:func:`done` defined in terms of scalar :func:`getindex` Iteration +:func:`length(A) ` ``prod(size(A))`` Number of elements +:func:`similar(A) ` ``similar(A, eltype(A), size(A))`` Return a mutable array with the same shape and element type +:func:`similar(A, ::Type{S}) ` ``similar(A, S, size(A))`` Return a mutable array with the same shape and the specified element type +:func:`similar(A, dims::NTuple{Int}) ` ``similar(A, eltype(A), dims)`` Return a mutable array with the same element type and the specified dimensions +:func:`similar(A, ::Type{S}, dims::NTuple{Int}) ` ``Array(S, dims)`` Return a mutable array with the specified element type and dimensions +========================================================== ============================================ ======================================================================================= + +If a type is defined as a subtype of ``AbstractArray``, it inherits a very large set of complicated behaviors including iteration and multidimensional indexing built on top of single-element access. + +A key part in defining an ``AbstractArray`` subtype is :func:`Base.linearindexing`. Since indexing is such an important part of an array and often occurs in hot loops, it's important to make both indexing and indexed assignment as efficient as possible. Array data structures are typically defined in one of two ways: either it's most efficient to access the elements using just one index (using linear indexing) or it intrinsically accesses the elements with indices specified for every dimension. These two modalities are identified by Julia as ``Base.LinearFast()`` and ``Base.LinearSlow()``. Converting a linear index to multiple indexing subscripts is typically very expensive, so this provides a traits-based mechanism to enable efficient generic code for all array types. 
+ +Returning to our collection of squares from above, we could instead define it as a subtype of an ``AbstractArray``:: + + immutable SquaresVector <: AbstractArray{Int, 1} + count::Int + end + Base.size(S::SquaresVector) = (S.count,) + Base.linearindexing(::Type{SquaresVector}) = Base.LinearFast() + Base.getindex(S::SquaresVector, i::Int) = i*i + +Note that it's very important to specify the two parameters of the ``AbstractArray``; the first defines the :func:`eltype`, and the second defines the :func:`ndims`. But that's it takes for our squares type to be an iterable, indexable, and completely functional array:: + + julia> s = SquaresVector(7) + 7-element SquaresVector: + 1 + 4 + 9 + 16 + 25 + 36 + 49 + + julia> s[s .> 20] + 3-element Array{Int64,1}: + 25 + 36 + 49 + + julia> s \ rand(7,2) + 1x2 Array{Float64,2}: + 0.0116789 0.0155006 + +As a more complicated example, let's define our own toy N-dimensional sparse-like array type built on top of ``Dict``:: + + immutable SparseArray{T,N} <: AbstractArray{T,N} + data::Dict{NTuple{N,Int}, T} + dims::NTuple{N,Int} + end + SparseArray{T}(::Type{T}, dims::Int...) 
= SparseArray(T, dims) + SparseArray{T,N}(::Type{T}, dims::NTuple{N,Int}) = SparseArray{T,N}(Dict{NTuple{N,Int}, T}(), dims) + + Base.size(A::SparseArray) = A.dims + Base.similar{T}(A::SparseArray, ::Type{T}, dims::Dims) = SparseArray(T, dims) + # Define scalar indexing and indexed assignment up to 3-dimensions + Base.getindex{T}(A::SparseArray{T,1}, i1::Int) = get(A.data, (i1,), zero(T)) + Base.getindex{T}(A::SparseArray{T,2}, i1::Int, i2::Int) = get(A.data, (i1,i2), zero(T)) + Base.getindex{T}(A::SparseArray{T,3}, i1::Int, i2::Int, i3::Int) = get(A.data, (i1,i2,i3), zero(T)) + Base.setindex!{T}(A::SparseArray{T,1}, v, i1::Int) = (A.data[(i1,)] = v) + Base.setindex!{T}(A::SparseArray{T,2}, v, i1::Int, i2::Int) = (A.data[(i1,i2)] = v) + Base.setindex!{T}(A::SparseArray{T,3}, v, i1::Int, i2::Int, i3::Int) = (A.data[(i1,i2,i3)] = v) + +Notice that this is a ``LinearSlow`` array, so we must manually define :func:`getindex` and :func:`setindex!` for each dimensionality we'd like to support. Unlike the ``SquaresVector``, we are able to define :func:`setindex!`, and so we can mutate the array:: + + julia> A = SparseArray(Float64,3,3) + 3x3 SparseArray{Float64,2}: + 0.0 0.0 0.0 + 0.0 0.0 0.0 + 0.0 0.0 0.0 + + julia> rand!(A) + 3x3 SparseArray{Float64,2}: + 0.418674 0.0901867 0.835166 + 0.85045 0.211394 0.0715443 + 0.569111 0.0535879 0.747284 + + julia> A[:] = 1:length(A); A + 3x3 SparseArray{Float64,2}: + 1.0 4.0 7.0 + 2.0 5.0 8.0 + 3.0 6.0 9.0 + +Since the ``SparseArray`` is mutable, we were able to override :func:`similar`. 
This means that when a base function needs to return an array, it's able to return a new ``SparseArray``:: + + julia> A[1:2,:] + 2x3 SparseArray{Float64,2}: + 1.0 4.0 7.0 + 2.0 5.0 8.0 + +And now, in addition to all the iterable and indexable methods from above, these types can interact with eachother and use all the methods defined in the standard library for ``AbstractArrays``:: + + julia> A[SquaresVector(3)] + 3-element SparseArray{Float64,1}: + 1.0 + 4.0 + 9.0 + + julia> dot(A[:,1],A[:,2]) + 32.0 From 552cbd992e1a031deec2111789d0ef8519894f4f Mon Sep 17 00:00:00 2001 From: Matt Bauman Date: Sun, 21 Jun 2015 00:23:21 -0400 Subject: [PATCH 2/5] Add doctests --- doc/manual/interfaces.rst | 147 ++++++++++++++++++++++++-------------- 1 file changed, 94 insertions(+), 53 deletions(-) diff --git a/doc/manual/interfaces.rst b/doc/manual/interfaces.rst index 66fcdbbe81683..b2fea54cd089d 100644 --- a/doc/manual/interfaces.rst +++ b/doc/manual/interfaces.rst @@ -4,7 +4,7 @@ Interfaces ************ -A lot of the power and extensibility in Julia comes from a collection of informal interfaces. By extending few specific methods to work for a custom type, objects of that type not only receive those functionalities, but they are also able to be used in other methods that are written to generically build upon those behaviors. +A lot of the power and extensibility in Julia comes from a collection of informal interfaces. By extending a few specific methods to work for a custom type, objects of that type not only receive those functionalities, but they are also able to be used in other methods that are written to generically build upon those behaviors. 
Iteration --------- @@ -36,46 +36,66 @@ is translated into:: # body end -A simple example is an iterable collection of square numbers with a defined length:: +A simple example is an iterable collection of square numbers with a defined length: - immutable Squares - count::Int - end - Base.start(::Squares) = 1 - Base.next(S::Squares, state) = (state*state, state+1) - Base.done(S::Squares, s) = s > S.count +.. doctest:: + + julia> immutable Squares + count::Int + end + Base.start(::Squares) = 1 + Base.next(S::Squares, state) = (state*state, state+1) + Base.done(S::Squares, s) = s > S.count; -With only those definitions, the ``Squares`` type is already pretty powerful. We can iterate over all the elements:: +With only those definitions, the ``Squares`` type is already pretty powerful. We can iterate over all the elements: - julia> for i in Squares(10) - print(i, ", ") +.. doctest:: + + julia> for i in Squares(7) + println(i) end - 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, + 1 + 4 + 9 + 16 + 25 + 36 + 49 + +We can compute the sum of all squares up to a certain number: -We can compute the sum of all squares up to a certain number:: +.. doctest:: julia> sum(Squares(1803)) 1955361914 -Or even the mean and standard deviation:: +Or even the mean and standard deviation: + +.. doctest:: julia> mean(Squares(100)), std(Squares(100)) (3383.5,3024.355854282583) -There are a few more methods we can extend to give Julia more information about this iterable collection. We know that the elements in a ``Squares`` collection will always be ``Int``. By extending the :func:`eltype` method, we can give that information to Julia and help it make more specialized code in the more complicated methods. We also know the number of elements in our collection, so we can extend :func:`length`, too:: +There are a few more methods we can extend to give Julia more information about this iterable collection. We know that the elements in a ``Squares`` collection will always be ``Int``. 
By extending the :func:`eltype` method, we can give that information to Julia and help it make more specialized code in the more complicated methods. We also know the number of elements in our collection, so we can extend :func:`length`, too: - Base.eltype(::Type{Squares}) = Int # Note that this is defined for the type - Base.length(S::Squares) = S.count +.. doctest:: -Now, when we ask Julia to :func:`collect` all the elements into an array it can preallocate a ``Vector{Int}`` of the right size instead of blindly ``push!``\ ing each element into a ``Vector{Any}``:: + julia> Base.eltype(::Type{Squares}) = Int # Note that this is defined for the type + Base.length(S::Squares) = S.count; + +Now, when we ask Julia to :func:`collect` all the elements into an array it can preallocate a ``Vector{Int}`` of the right size instead of blindly ``push!``\ ing each element into a ``Vector{Any}``: + +.. doctest:: julia> collect(Squares(100))' # transposed to save space 1x100 Array{Int64,2}: 1 4 9 16 25 36 49 64 81 100 … 9025 9216 9409 9604 9801 10000 -While we can rely upon generic implementations, we can also extend specific methods where we know there is a simpler algorithm. For example, there's a formula to compute the sum of squares, so we can override the generic iterative version with a more performant solution:: +While we can rely upon generic implementations, we can also extend specific methods where we know there is a simpler algorithm. For example, there's a formula to compute the sum of squares, so we can override the generic iterative version with a more performant solution: + +.. 
doctest:: - julia> sum(S::Squares) = (n = S.count; return n*(n+1)*(2n+1)÷6) + julia> Base.sum(S::Squares) = (n = S.count; return n*(n+1)*(2n+1)÷6) sum(Squares(1803)) 1955361914 @@ -92,7 +112,9 @@ Methods to implement Brief description :func:`endof(X) ` The last index, used in ``X[end]`` ====================================== ================================== -For the ``Squares`` collection above, we can easily compute the ``i``\ th element of the collection by squaring it. We can expose this as an indexing expression ``S[i]``. To opt into this behavior, ``Squares`` simply needs to define :func:`getindex`:: +For the ``Squares`` collection above, we can easily compute the ``i``\ th element of the collection by squaring it. We can expose this as an indexing expression ``S[i]``. To opt into this behavior, ``Squares`` simply needs to define :func:`getindex`: + +.. doctest:: julia> function Base.getindex(S::Squares, i::Int) 1 <= i <= S.count || throw(BoundsError(S, i)) @@ -101,7 +123,9 @@ For the ``Squares`` collection above, we can easily compute the ``i``\ th elemen Squares(100)[23] 529 -Additionally, to support the syntax ``S[end]``, we must define :func:`endof` to specify the last valid index:: +Additionally, to support the syntax ``S[end]``, we must define :func:`endof` to specify the last valid index: + +.. doctest:: julia> Base.endof(S::Squares) = length(S) Squares(23)[end] @@ -134,16 +158,24 @@ If a type is defined as a subtype of ``AbstractArray``, it inherits a very large A key part in defining an ``AbstractArray`` subtype is :func:`Base.linearindexing`. Since indexing is such an important part of an array and often occurs in hot loops, it's important to make both indexing and indexed assignment as efficient as possible. 
Array data structures are typically defined in one of two ways: either it's most efficient to access the elements using just one index (using linear indexing) or it intrinsically accesses the elements with indices specified for every dimension. These two modalities are identified by Julia as ``Base.LinearFast()`` and ``Base.LinearSlow()``. Converting a linear index to multiple indexing subscripts is typically very expensive, so this provides a traits-based mechanism to enable efficient generic code for all array types. -Returning to our collection of squares from above, we could instead define it as a subtype of an ``AbstractArray``:: +Returning to our collection of squares from above, we could instead define it as a subtype of an ``AbstractArray``: - immutable SquaresVector <: AbstractArray{Int, 1} - count::Int - end - Base.size(S::SquaresVector) = (S.count,) - Base.linearindexing(::Type{SquaresVector}) = Base.LinearFast() - Base.getindex(S::SquaresVector, i::Int) = i*i +.. doctest:: -Note that it's very important to specify the two parameters of the ``AbstractArray``; the first defines the :func:`eltype`, and the second defines the :func:`ndims`. But that's it takes for our squares type to be an iterable, indexable, and completely functional array:: + julia> immutable SquaresVector <: AbstractArray{Int, 1} + count::Int + end + Base.size(S::SquaresVector) = (S.count,) + Base.linearindexing(::Type{SquaresVector}) = Base.LinearFast() + Base.getindex(S::SquaresVector, i::Int) = i*i; + +Note that it's very important to specify the two parameters of the ``AbstractArray``; the first defines the :func:`eltype`, and the second defines the :func:`ndims`. But that's it takes for our squares type to be an iterable, indexable, and completely functional array: + +.. testsetup:: + + srand(1); + +.. 
doctest:: julia> s = SquaresVector(7) 7-element SquaresVector: @@ -163,28 +195,33 @@ Note that it's very important to specify the two parameters of the ``AbstractArr julia> s \ rand(7,2) 1x2 Array{Float64,2}: - 0.0116789 0.0155006 + 0.0151876 0.0179393 -As a more complicated example, let's define our own toy N-dimensional sparse-like array type built on top of ``Dict``:: +As a more complicated example, let's define our own toy N-dimensional sparse-like array type built on top of ``Dict``: - immutable SparseArray{T,N} <: AbstractArray{T,N} - data::Dict{NTuple{N,Int}, T} - dims::NTuple{N,Int} - end - SparseArray{T}(::Type{T}, dims::Int...) = SparseArray(T, dims) - SparseArray{T,N}(::Type{T}, dims::NTuple{N,Int}) = SparseArray{T,N}(Dict{NTuple{N,Int}, T}(), dims) +.. doctest:: - Base.size(A::SparseArray) = A.dims - Base.similar{T}(A::SparseArray, ::Type{T}, dims::Dims) = SparseArray(T, dims) - # Define scalar indexing and indexed assignment up to 3-dimensions - Base.getindex{T}(A::SparseArray{T,1}, i1::Int) = get(A.data, (i1,), zero(T)) - Base.getindex{T}(A::SparseArray{T,2}, i1::Int, i2::Int) = get(A.data, (i1,i2), zero(T)) - Base.getindex{T}(A::SparseArray{T,3}, i1::Int, i2::Int, i3::Int) = get(A.data, (i1,i2,i3), zero(T)) - Base.setindex!{T}(A::SparseArray{T,1}, v, i1::Int) = (A.data[(i1,)] = v) - Base.setindex!{T}(A::SparseArray{T,2}, v, i1::Int, i2::Int) = (A.data[(i1,i2)] = v) - Base.setindex!{T}(A::SparseArray{T,3}, v, i1::Int, i2::Int, i3::Int) = (A.data[(i1,i2,i3)] = v) + julia> immutable SparseArray{T,N} <: AbstractArray{T,N} + data::Dict{NTuple{N,Int}, T} + dims::NTuple{N,Int} + end + SparseArray{T}(::Type{T}, dims::Int...) 
= SparseArray(T, dims) + SparseArray{T,N}(::Type{T}, dims::NTuple{N,Int}) = SparseArray{T,N}(Dict{NTuple{N,Int}, T}(), dims) + SparseArray{T,N} + + julia> Base.size(A::SparseArray) = A.dims + Base.similar{T}(A::SparseArray, ::Type{T}, dims::Dims) = SparseArray(T, dims) + # Define scalar indexing and indexed assignment up to 3-dimensions + Base.getindex{T}(A::SparseArray{T,1}, i1::Int) = get(A.data, (i1,), zero(T)) + Base.getindex{T}(A::SparseArray{T,2}, i1::Int, i2::Int) = get(A.data, (i1,i2), zero(T)) + Base.getindex{T}(A::SparseArray{T,3}, i1::Int, i2::Int, i3::Int) = get(A.data, (i1,i2,i3), zero(T)) + Base.setindex!{T}(A::SparseArray{T,1}, v, i1::Int) = (A.data[(i1,)] = v) + Base.setindex!{T}(A::SparseArray{T,2}, v, i1::Int, i2::Int) = (A.data[(i1,i2)] = v) + Base.setindex!{T}(A::SparseArray{T,3}, v, i1::Int, i2::Int, i3::Int) = (A.data[(i1,i2,i3)] = v); + +Notice that this is a ``LinearSlow`` array, so we must manually define :func:`getindex` and :func:`setindex!` for each dimensionality we'd like to support. Unlike the ``SquaresVector``, we are able to define :func:`setindex!`, and so we can mutate the array: -Notice that this is a ``LinearSlow`` array, so we must manually define :func:`getindex` and :func:`setindex!` for each dimensionality we'd like to support. Unlike the ``SquaresVector``, we are able to define :func:`setindex!`, and so we can mutate the array:: +.. 
doctest:: julia> A = SparseArray(Float64,3,3) 3x3 SparseArray{Float64,2}: @@ -194,9 +231,9 @@ Notice that this is a ``LinearSlow`` array, so we must manually define :func:`ge julia> rand!(A) 3x3 SparseArray{Float64,2}: - 0.418674 0.0901867 0.835166 - 0.85045 0.211394 0.0715443 - 0.569111 0.0535879 0.747284 + 0.28119 0.0203749 0.0769509 + 0.209472 0.287702 0.640396 + 0.251379 0.859512 0.873544 julia> A[:] = 1:length(A); A 3x3 SparseArray{Float64,2}: @@ -204,14 +241,18 @@ Notice that this is a ``LinearSlow`` array, so we must manually define :func:`ge 2.0 5.0 8.0 3.0 6.0 9.0 -Since the ``SparseArray`` is mutable, we were able to override :func:`similar`. This means that when a base function needs to return an array, it's able to return a new ``SparseArray``:: +Since the ``SparseArray`` is mutable, we were able to override :func:`similar`. This means that when a base function needs to return an array, it's able to return a new ``SparseArray``: + +.. doctest:: julia> A[1:2,:] 2x3 SparseArray{Float64,2}: 1.0 4.0 7.0 2.0 5.0 8.0 -And now, in addition to all the iterable and indexable methods from above, these types can interact with eachother and use all the methods defined in the standard library for ``AbstractArrays``:: +And now, in addition to all the iterable and indexable methods from above, these types can interact with eachother and use all the methods defined in the standard library for ``AbstractArrays``: + +.. 
doctest:: julia> A[SquaresVector(3)] 3-element SparseArray{Float64,1}: From 0174b52c827ac5c3b9f0704cc4efcd4e06143092 Mon Sep 17 00:00:00 2001 From: Matt Bauman Date: Sun, 21 Jun 2015 00:55:18 -0400 Subject: [PATCH 3/5] Update after review * Use sequence instead of collection in some places * Change eltype and length descriptions * Indentation fixes * Talk about how defining getindex manually is a game of whackamole without support from AbstractArray * Link to Array Indexing section [av skip] --- doc/manual/interfaces.rst | 38 ++++++++++++++++++++++++++------------ 1 file changed, 26 insertions(+), 12 deletions(-) diff --git a/doc/manual/interfaces.rst b/doc/manual/interfaces.rst index b2fea54cd089d..95e7648abed11 100644 --- a/doc/manual/interfaces.rst +++ b/doc/manual/interfaces.rst @@ -16,27 +16,27 @@ Required methods Brief description :func:`next(iter, state) ` Returns the current item and the next state :func:`done(iter, state) ` Tests if there are any items remaining **Important optional methods** **Default definition** **Brief description** -:func:`eltype(IterType) ` ``Any`` The container's element type -:func:`length(iter) ` (*undefined*) The container's length +:func:`eltype(IterType) ` ``Any`` The type of the items returned by :func:`next` +:func:`length(iter) ` (*undefined*) The number of items, if known ================================= ======================== =========================================== -Sequential iteration is implemented by the methods :func:`start`, :func:`done`, and :func:`next`. Instead of mutating objects as they are iterated over, Julia provides these three methods to keep track of the iteration state externally from the object. The :func:`start(iter)` method returns an initial ``state`` object that gets passed along to :func:`done(iter, state)`, which tests if there are any elements remaining, and :func:`next(iter, state)`, which returns a tuple containing the current element and an updated ``state``. 
The ``state`` object can be anything, and is generally considered to be an implementation detail private to the iterable object. +Sequential iteration is implemented by the methods :func:`start`, :func:`done`, and :func:`next`. Instead of mutating objects as they are iterated over, Julia provides these three methods to keep track of the iteration state externally from the object. The :func:`start(iter) ` method returns the initial state for the iterable object ``iter``. That state gets passed along to :func:`done(iter, state) `, which tests if there are any elements remaining, and :func:`next(iter, state) `, which returns a tuple containing the current element and an updated ``state``. The ``state`` object can be anything, and is generally considered to be an implementation detail private to the iterable object. Any object that has these three methods appropriately defined can be used in a ``for`` loop since the syntax:: for i in iter # or "for i = iter" - # body + # body end is translated into:: state = start(iter) while !done(iter, state) - (i, state) = next(iter, state) - # body + (i, state) = next(iter, state) + # body end -A simple example is an iterable collection of square numbers with a defined length: +A simple example is an iterable sequence of square numbers with a defined length: .. doctest:: @@ -76,7 +76,7 @@ Or even the mean and standard deviation: julia> mean(Squares(100)), std(Squares(100)) (3383.5,3024.355854282583) -There are a few more methods we can extend to give Julia more information about this iterable collection. We know that the elements in a ``Squares`` collection will always be ``Int``. By extending the :func:`eltype` method, we can give that information to Julia and help it make more specialized code in the more complicated methods. We also know the number of elements in our collection, so we can extend :func:`length`, too: +There are a few more methods we can extend to give Julia more information about this iterable collection. 
We know that the elements in a ``Squares`` sequence will always be ``Int``. By extending the :func:`eltype` method, we can give that information to Julia and help it make more specialized code in the more complicated methods. We also know the number of elements in our sequence, so we can extend :func:`length`, too: .. doctest:: @@ -112,7 +112,7 @@ Methods to implement Brief description :func:`endof(X) ` The last index, used in ``X[end]`` ====================================== ================================== -For the ``Squares`` collection above, we can easily compute the ``i``\ th element of the collection by squaring it. We can expose this as an indexing expression ``S[i]``. To opt into this behavior, ``Squares`` simply needs to define :func:`getindex`: +For the ``Squares`` iterable above, we can easily compute the ``i``\ th element of the sequence by squaring it. We can expose this as an indexing expression ``S[i]``. To opt into this behavior, ``Squares`` simply needs to define :func:`getindex`: .. doctest:: @@ -131,6 +131,20 @@ Additionally, to support the syntax ``S[end]``, we must define :func:`endof` to Squares(23)[end] 529 +Note, though, that the above *only* defines :func:`getindex` with one integer index. Indexing with anything other than an ``Int`` will throw a ``MethodError`` saying that there was no matching method. In order to support indexing with ranges or vectors of Ints, separate methods must be written: + +.. doctest:: + + julia> Base.getindex(S::Squares, i::Number) = S[convert(Int, i)] + Base.getindex(S::Squares, I) = [S[i] for i in I] + Squares(10)[[3,4.,5]] + 3-element Array{Int64,1}: + 9 + 16 + 25 + +While this is starting to support more of the :ref:`indexing operations supported by some of the builtin types `, there's still quite a number of behaviors missing. This ``Squares`` sequence is starting to look more and more like a vector as we've added behaviors to it. 
Instead of defining all these behaviors ourselves, we can officially define it as a subtype of an ``AbstractArray``. + Abstract Arrays --------------- @@ -142,10 +156,10 @@ Methods to implement :func:`getindex(A, i::Int) ` (if ``LinearFast``) Linear scalar indexing :func:`getindex(A, i1::Int, ..., iN::Int) ` (if ``LinearSlow``, where ``N = ndims(A)``) N-dimensional scalar indexing :func:`setindex!(A, v, i::Int) ` (if ``LinearFast``) Scalar indexed assignment -:func:`setindex!(A, v, i1::Int, ..., iN::Int) ` (if ``LinearSlow``, where ``N = ndims(A)``) N-dimensional scalar indexed assignment with N ``Int`` arguments +:func:`setindex!(A, v, i1::Int, ..., iN::Int) ` (if ``LinearSlow``, where ``N = ndims(A)``) N-dimensional scalar indexed assignment **Optional methods** **Default definition** **Brief description** -:func:`getindex(A, I...) ` defined in terms of scalar :func:`getindex` Multidimensional and nonscalar indexing -:func:`setindex!(A, I...) ` defined in terms of scalar :func:`setindex!` Multidimensional and nonscalar indexed assignment +:func:`getindex(A, I...) ` defined in terms of scalar :func:`getindex` :ref:`Multidimensional and nonscalar indexing ` +:func:`setindex!(A, I...) 
` defined in terms of scalar :func:`setindex!` :ref:`Multidimensional and nonscalar indexed assignment ` :func:`start`/:func:`next`/:func:`done` defined in terms of scalar :func:`getindex` Iteration :func:`length(A) ` ``prod(size(A))`` Number of elements :func:`similar(A) ` ``similar(A, eltype(A), size(A))`` Return a mutable array with the same shape and element type From c131150bfdfb89ba37c893b7757a62e0c2c2efc0 Mon Sep 17 00:00:00 2001 From: Matt Bauman Date: Sun, 21 Jun 2015 13:25:55 -0400 Subject: [PATCH 4/5] Update after Milan's comments * complicated -> rich * fix linearindexing description coherence between phrases by changing the first part of the phrase * add a paragraph about how linearindexing impacts which getindex method(s) must get defined * Fix missing *what* * each other [av skip] --- doc/manual/interfaces.rst | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/doc/manual/interfaces.rst b/doc/manual/interfaces.rst index 95e7648abed11..7aeb2358933bb 100644 --- a/doc/manual/interfaces.rst +++ b/doc/manual/interfaces.rst @@ -168,9 +168,11 @@ Methods to implement :func:`similar(A, ::Type{S}, dims::NTuple{Int}) ` ``Array(S, dims)`` Return a mutable array with the specified element type and dimensions ========================================================== ============================================ ======================================================================================= -If a type is defined as a subtype of ``AbstractArray``, it inherits a very large set of complicated behaviors including iteration and multidimensional indexing built on top of single-element access. +If a type is defined as a subtype of ``AbstractArray``, it inherits a very large set of rich behaviors including iteration and multidimensional indexing built on top of single-element access. -A key part in defining an ``AbstractArray`` subtype is :func:`Base.linearindexing`. 
Since indexing is such an important part of an array and often occurs in hot loops, it's important to make both indexing and indexed assignment as efficient as possible. Array data structures are typically defined in one of two ways: either it's most efficient to access the elements using just one index (using linear indexing) or it intrinsically accesses the elements with indices specified for every dimension. These two modalities are identified by Julia as ``Base.LinearFast()`` and ``Base.LinearSlow()``. Converting a linear index to multiple indexing subscripts is typically very expensive, so this provides a traits-based mechanism to enable efficient generic code for all array types. +A key part in defining an ``AbstractArray`` subtype is :func:`Base.linearindexing`. Since indexing is such an important part of an array and often occurs in hot loops, it's important to make both indexing and indexed assignment as efficient as possible. Array data structures are typically defined in one of two ways: either it most efficiently accesses its elements using just one index (linear indexing) or it intrinsically accesses the elements with indices specified for every dimension. These two modalities are identified by Julia as ``Base.LinearFast()`` and ``Base.LinearSlow()``. Converting a linear index to multiple indexing subscripts is typically very expensive, so this provides a traits-based mechanism to enable efficient generic code for all array types. + +This distinction determines which scalar indexing methods the type must define. ``LinearFast()`` arrays are simple: just define :func:`getindex(A::ArrayType, i::Int) `. When the array is subsequently indexed with a multidimensional set of indices, the fallback :func:`getindex(A::AbstractArray, I...)` efficiently converts the indices into one linear index and then calls the above method. 
``LinearSlow()`` arrays, on the other hand, require methods to be defined for each supported dimensionality with ``ndims(A)`` ``Int`` indices. For example, the builtin ``SparseMatrix`` type only supports two dimensions, so it just defines :func:`getindex(A::SparseMatrix, i::Int, j::Int)`. The same holds for :func:`setindex!`. Returning to our collection of squares from above, we could instead define it as a subtype of an ``AbstractArray``: @@ -183,7 +185,7 @@ Returning to our collection of squares from above, we could instead define it as Base.linearindexing(::Type{SquaresVector}) = Base.LinearFast() Base.getindex(S::SquaresVector, i::Int) = i*i; -Note that it's very important to specify the two parameters of the ``AbstractArray``; the first defines the :func:`eltype`, and the second defines the :func:`ndims`. But that's it takes for our squares type to be an iterable, indexable, and completely functional array: +Note that it's very important to specify the two parameters of the ``AbstractArray``; the first defines the :func:`eltype`, and the second defines the :func:`ndims`. That supertype and those three methods are all it takes for ``SquaresVector`` to be an iterable, indexable, and completely functional array: .. 
testsetup:: @@ -228,7 +230,7 @@ As a more complicated example, let's define our own toy N-dimensional sparse-lik # Define scalar indexing and indexed assignment up to 3-dimensions Base.getindex{T}(A::SparseArray{T,1}, i1::Int) = get(A.data, (i1,), zero(T)) Base.getindex{T}(A::SparseArray{T,2}, i1::Int, i2::Int) = get(A.data, (i1,i2), zero(T)) - Base.getindex{T}(A::SparseArray{T,3}, i1::Int, i2::Int, i3::Int) = get(A.data, (i1,i2,i3), zero(T)) + Base.getindex{T}(A::SparseArray{T,3}, i1::Int, i2::Int, i3::Int) = get(A.data, (i1,i2,i3), zero(T)) Base.setindex!{T}(A::SparseArray{T,1}, v, i1::Int) = (A.data[(i1,)] = v) Base.setindex!{T}(A::SparseArray{T,2}, v, i1::Int, i2::Int) = (A.data[(i1,i2)] = v) Base.setindex!{T}(A::SparseArray{T,3}, v, i1::Int, i2::Int, i3::Int) = (A.data[(i1,i2,i3)] = v); @@ -264,7 +266,7 @@ Since the ``SparseArray`` is mutable, we were able to override :func:`similar`. 1.0 4.0 7.0 2.0 5.0 8.0 -And now, in addition to all the iterable and indexable methods from above, these types can interact with eachother and use all the methods defined in the standard library for ``AbstractArrays``: +And now, in addition to all the iterable and indexable methods from above, these types can interact with each other and use all the methods defined in the standard library for ``AbstractArrays``: .. doctest:: From dea117b1cd6a4ca153e0095055fe5e25821688f9 Mon Sep 17 00:00:00 2001 From: Matt Bauman Date: Sun, 21 Jun 2015 13:49:45 -0400 Subject: [PATCH 5/5] More cross-links! [av skip] --- doc/manual/arrays.rst | 5 ++++- doc/manual/interfaces.rst | 12 +++++++++--- doc/stdlib/arrays.rst | 2 ++ doc/stdlib/collections.rst | 13 +++++++++---- 4 files changed, 24 insertions(+), 8 deletions(-) diff --git a/doc/manual/arrays.rst b/doc/manual/arrays.rst index 2c571e7df46ee..68d3ae1bc42e2 100644 --- a/doc/manual/arrays.rst +++ b/doc/manual/arrays.rst @@ -12,7 +12,10 @@ attention to their array implementation at the expense of other containers. 
Julia does not treat arrays in any special way. The array library is implemented almost completely in Julia itself, and derives its performance from the compiler, just like any other code written in -Julia. +Julia. As such, it's also possible to define custom array types by +inheriting from ``AbstractArray``. See the :ref:`manual section on the +AbstractArray interface ` for more details +on implementing a custom array type. An array is a collection of objects stored in a multi-dimensional grid. In the most general case, an array may contain objects of type diff --git a/doc/manual/interfaces.rst b/doc/manual/interfaces.rst index 7aeb2358933bb..e8bfa7b816457 100644 --- a/doc/manual/interfaces.rst +++ b/doc/manual/interfaces.rst @@ -6,6 +6,8 @@ A lot of the power and extensibility in Julia comes from a collection of informal interfaces. By extending a few specific methods to work for a custom type, objects of that type not only receive those functionalities, but they are also able to be used in other methods that are written to generically build upon those behaviors. +.. _man-interfaces-iteration: + Iteration --------- @@ -22,7 +24,7 @@ Required methods Brief description Sequential iteration is implemented by the methods :func:`start`, :func:`done`, and :func:`next`. Instead of mutating objects as they are iterated over, Julia provides these three methods to keep track of the iteration state externally from the object. The :func:`start(iter) ` method returns the initial state for the iterable object ``iter``. That state gets passed along to :func:`done(iter, state) `, which tests if there are any elements remaining, and :func:`next(iter, state) `, which returns a tuple containing the current element and an updated ``state``. The ``state`` object can be anything, and is generally considered to be an implementation detail private to the iterable object. 
-Any object that has these three methods appropriately defined can be used in a ``for`` loop since the syntax:: +Any object that defines these three methods is iterable and can be used in the :ref:`many functions that rely upon iteration `. It can also be used directly in a ``for`` loop since the syntax:: for i in iter # or "for i = iter" # body @@ -101,6 +103,8 @@ While we can rely upon generic implementations, we can also extend specific meth This is a very common pattern throughout the Julia standard library: a small set of required methods define an informal interface that enable many fancier behaviors. In some cases, types will want to additionally specialize those extra behaviors when they know a more efficient algorithm can be used in their specific case. +.. _man-interfaces-indexing: + Indexing -------- @@ -145,6 +149,8 @@ Note, though, that the above *only* defines :func:`getindex` with one integer in While this is starting to support more of the :ref:`indexing operations supported by some of the builtin types `, there's still quite a number of behaviors missing. This ``Squares`` sequence is starting to look more and more like a vector as we've added behaviors to it. Instead of defining all these behaviors ourselves, we can officially define it as a subtype of an ``AbstractArray``. +.. _man-interfaces-abstractarray: + Abstract Arrays --------------- @@ -168,13 +174,13 @@ Methods to implement :func:`similar(A, ::Type{S}, dims::NTuple{Int}) ` ``Array(S, dims)`` Return a mutable array with the specified element type and dimensions ========================================================== ============================================ ======================================================================================= -If a type is defined as a subtype of ``AbstractArray``, it inherits a very large set of rich behaviors including iteration and multidimensional indexing built on top of single-element access. 
+If a type is defined as a subtype of ``AbstractArray``, it inherits a very large set of rich behaviors including iteration and multidimensional indexing built on top of single-element access. See the :ref:`arrays manual page ` and :ref:`standard library section ` for more supported methods. A key part in defining an ``AbstractArray`` subtype is :func:`Base.linearindexing`. Since indexing is such an important part of an array and often occurs in hot loops, it's important to make both indexing and indexed assignment as efficient as possible. Array data structures are typically defined in one of two ways: either it most efficiently accesses its elements using just one index (linear indexing) or it intrinsically accesses the elements with indices specified for every dimension. These two modalities are identified by Julia as ``Base.LinearFast()`` and ``Base.LinearSlow()``. Converting a linear index to multiple indexing subscripts is typically very expensive, so this provides a traits-based mechanism to enable efficient generic code for all array types. This distinction determines which scalar indexing methods the type must define. ``LinearFast()`` arrays are simple: just define :func:`getindex(A::ArrayType, i::Int) `. When the array is subsequently indexed with a multidimensional set of indices, the fallback :func:`getindex(A::AbstractArray, I...)` efficiently converts the indices into one linear index and then calls the above method. ``LinearSlow()`` arrays, on the other hand, require methods to be defined for each supported dimensionality with ``ndims(A)`` ``Int`` indices. For example, the builtin ``SparseMatrix`` type only supports two dimensions, so it just defines :func:`getindex(A::SparseMatrix, i::Int, j::Int)`. The same holds for :func:`setindex!`. 
-Returning to our collection of squares from above, we could instead define it as a subtype of an ``AbstractArray``: +Returning to the sequence of squares from above, we could instead define it as a subtype of an ``AbstractArray{Int, 1}``: .. doctest:: diff --git a/doc/stdlib/arrays.rst b/doc/stdlib/arrays.rst index 4cde84a8baf3a..f392157dddce1 100644 --- a/doc/stdlib/arrays.rst +++ b/doc/stdlib/arrays.rst @@ -1,5 +1,7 @@ .. currentmodule:: Base +.. _stdlib-arrays: + ******** Arrays ******** diff --git a/doc/stdlib/collections.rst b/doc/stdlib/collections.rst index ef9bb1de368cf..0914c0cebaea1 100644 --- a/doc/stdlib/collections.rst +++ b/doc/stdlib/collections.rst @@ -4,6 +4,8 @@ Collections and Data Structures ********************************* +.. _stdlib-collections-iteration: + Iteration --------- @@ -11,18 +13,21 @@ Sequential iteration is implemented by the methods :func:`start`, :func:`done`, :func:`next`. The general ``for`` loop:: for i = I # or "for i in I" - # body + # body end is translated into:: state = start(I) while !done(I, state) - (i, state) = next(I, state) - # body + (i, state) = next(I, state) + # body end -The ``state`` object may be anything, and should be chosen appropriately for each iterable type. +The ``state`` object may be anything, and should be chosen appropriately for +each iterable type. See the :ref:`manual section on the iteration interface +` for more details about defining a custom iterable +type. .. function:: start(iter) -> state