Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Completing the StridedArray interface #186

Open
Jutho opened this issue Mar 3, 2015 · 20 comments
Open

RFC: Completing the StridedArray interface #186

Jutho opened this issue Mar 3, 2015 · 20 comments
Labels
arrays [a, r, r, a, y, s] design Design of APIs or of the language itself

Comments

@Jutho
Copy link
Contributor

Jutho commented Mar 3, 2015

Many methods in Julia base, especially those that interface with external C libraries like BLAS, LAPACK and FFTW are only defined for StridedArray, which currently is a Union type. It has been argued before by @timholy that properties like strided and contiguous should actually be a trait (a THT).

However, my main question is about the interface for a StridedArray, which is currently incomplete and somewhat inconsistent. The manual describes StridedArray as being an array A such that, increasing index k by 1 is equivalent to linear indexing where the linear index is increased by stride(A,k). But that is of course not true, since linear indexing is expected to be such that the linear index increases by prod(size(A)[1:k-1]). A simple example is A=rand(5,5);B=sub(A,1:3,1:3). Clearly, stride(B,2)=5, but linear indexing with i+5*(j-1) would not give me element (i,j), you need to index with i+3*(j-1) for that. In that respect, stride(A,k) is also defined for a general AbstractArray and just returns prod(size(A)[1:k-1]), which does not correspond to its description in the manual as "distance in memory".

So what is the role of StridedArray in Julia. Is it only to interface with C libraries and to be used in combination with pointer? It is certainly useful to also exploit the stridedness in pure Julia methods, see e.g. permutedims!. Currently, this is dealt with by checking explicitly whether the StridedArray is a SubArray, and in that case, to extract the underlying parent array and the offset, and then apply linear indexing to that. I noticed that parent is now part of the exported method list, but there is no corresponding method to get the offset or first_index. Maybe it would be good to have this method as well and then define these methods (stride and strides, parent, offset?) as the interface for a strided array (as type or as trait).

@timholy
Copy link
Member

timholy commented Mar 3, 2015

It's a tough call; on one hand it seems like "internal implementation detail" but on the other hand we export StridedArray itself. I lean towards filling this interface out, but I guess the other approach would be to stop exporting StridedArray and parent. (Actually, given that many AbstractArrays wrap another array, parent seems more fundamental, but OTOH it's not very meaningful without more details about the wrapper.)

@Jutho
Copy link
Contributor Author

Jutho commented Mar 3, 2015

Indeed, I could also see the other proposal work. Stop exporting StridedArray, maybe even rename it to BlasArray in the context of BLAS, just like BlasFloat etc. Or probably just BlasVector and BlasMatrix. I guess it would even be possible to restrict BlasMatrix such that it is automatically stride 1 in the first index, (e.g. union of DenseArray and subarrays with UnitRange indexing in the first index?). I don't know anything about the requirements of FFTW. But then, I think there is also no point of exporting strides, or maybe it does, but it should be moved in the C interface section of the docs, and it should not be defined for general AbstractArray.

@lindahua
Copy link

@Jutho The value of strided array is not limited to BLAS. There are many C libraries out there that require strided or contiguous memory layout. Unexporting such types would discourage new packages that work with such C libraries.

I think we should clearly define the semantics of a strided array, and standardize the interface.

From a syntax standpoint, all strided array types should provide these functions (in addition to those that provided by all arrays): strides and pointer. From a semantics standard point, the underlying memory layout should conform to what these methods say.

@Jutho
Copy link
Contributor Author

Jutho commented Apr 19, 2015

I agree that the concept of strided arrays is very useful. Both for C libraries, but maybe also for pure julia methods. The main point about this issue is that the interface for accessing a strided array is not complete.

Either strided array is only for C libraries, then pointer and strides is sufficient as interface but it should belong in the C interop part of the standard library. If pure julia methods should also be allowed to use it, then the interface is not complete and too conflated with linear indexing.

@mbauman mbauman added the arrays [a, r, r, a, y, s] label Aug 29, 2017
@dlfivefifty
Copy link
Contributor

dlfivefifty commented Jan 5, 2018

Ref: JuliaLang/julia#25321

This added documentation for a strided array interface consisting of overriding strides(A) and Base.unsafe_convert(::Type{Ptr{T}}, A). I didn't need to use a trait to do this.

I may put together another pull request to deprecate StridedArray by adding support for a trait that specifies how an arrays memory is layed out, allowing support for banded matrices and triangular matrices in addition to strided arrays. What I'm thinking is something like the following:

abstract type ArrayMemoryLayout{T} end

struct UnknownLayout{T} <: ArrayMemoryLayout{T} end
struct StridedLayout{T,trans} <: ArrayMemoryLayout{T} end
struct SymmetricStridedLayout{T} <: ArrayMemoryLayout{T} end
struct UpperTriangularLayout{T} <: ArrayMemoryLayout{T} end


memorylayout(::Type{A}) where A <: AbstractArray{T} where T = UnknownLayout{T}()
memorylayout(::Type{A}) where A <: Array{T} where T = StridedLayout{T,:N}()
memorylayout(A::AbstractArray) = memorylayout(typeof(A))
...

Then linear algebra routines will dispatch to BLAS and LAPACK when memorylayout returns StridedLayout{<:BlasFloat}, etc. Also, BandedMatrices.jl can add a BandedLayout and SymBandedLayout, and enable automatic dispatching to the appropriate LAPACK routines.

Note that this construction is not quite the same as a trait/interface as memory layout is always unique, whereas a single array can have multiple traits/interfaces.

I still need to think how to handle adjoints/transposes, but maybe something like this will work:

adjoint(::ArrayMemoryLayout{T}) where T = UnknownLayout{T}()
adjoint(::StridedLayout{T,:N}) where T = StridedLayout{T,:C}()

memorylayout(::Type{A}) where A <: Adjoint{T, AA} where {T, AA<:AbstractArray} = adjoint(memorylayout(AA))

@mbauman
Copy link
Member

mbauman commented Jan 5, 2018

Yes, the big question is what the kinds of properties are important enough to make it into the trait hierarchy and how that gets put together. In general we'd been thinking that this trait can layer onto 1.x as a new feature, but I'd most value a second set of eyes in making sure that things are indeed structured in such a way to support that.

My current thought is that this will require multiple traits that work together for different orthogonal dimensions of variance in how arrays get stored. Here's a start of a document on what I've been thinking: https://gist.github.com/mbauman/3fa8a1ac00f574480968a4f47be2c361

@dlfivefifty
Copy link
Contributor

dlfivefifty commented Jan 5, 2018

Let me reiterate my main point: "trait" isn't the right word here since memory storage is unique, hence we can ask a type what it is. So instead of querying isstrided(A) (which doesn't scale well when you have isbanded(A), isuppertriangular(A), etc.) we can query memorylayout(A) which tells us the memory layout.

I'm imagining that instead of the current code

mul!(y::StridedVector{T}, A::StridedVecOrMat{T}, x::StridedVector{T}) where {T<:BlasFloat} = gemv!(y, 'N', A, x)
mul!(y::StridedVector{T}, transA::Transpose{<:Any,<:StridedVecOrMat{T}}, x::StridedVector{T}) where {T<:BlasFloat} = (A = transA.parent; gemv!(y, 'T', A, x))

we would have

mul!(y, A, x) = _mul!(y, A, x, memoryformat(y), memoryformat(A), memoryformat(x))
_mul!(y::AbstractVector{T}, A::AbstractMatrix{T}, x::AbstractVector{T}, ::StridedFormat{T,:N}, ::StridedFormat{T,:N}, ::StridedFormat{T,:N}) where T<:BlasFloat = gemv!(y, 'N', A, x)
_mul!(y::AbstractVector{T}, A::AbstractMatrix{T}, x::AbstractVector{T}, ::StridedFormat{T,:N}, ::StridedFormat{T,:T}, ::StridedFormat{T,:N}) where T<:BlasFloat = gemv!(y, 'T', A, x)

_mul!(y::AbstractVector, A::AbstractMatrix, x::AbstractVector, ::UnknownFormat, ::StridedFormat, ::StridedFormat)= # native julia fallback

(We don't need the transA.parent call in the proposed version since we would add pointer(A::Transpose) = pointer(A.parent) and strides(A::Transpose) = strides(A.parent).)

@mbauman
Copy link
Member

mbauman commented Jan 5, 2018

Yup, you have roughly the same vision I do. I would just call memoryformat — exactly as you've written it — a trait. I don't see the distinction you're trying to make. The fact that you can ask a type (and not just values) about a property like this is precisely what makes it an effective technique; it'd be type-unstable and slow otherwise. That's the core of a trait.

@dlfivefifty
Copy link
Contributor

It’s only a point about semantics. “Traits” implies possibly multiple traits at the same time (a matrix can be banded and upper triangular at the same time) whereas the actual storage in memory is unique (the entries of a banded and upper triangular matrix could be stored either in eitherBandedLayout Or UpperTriangularLayout , but only one of these is valid for a given implementation)

@Jutho
Copy link
Contributor Author

Jutho commented Jan 5, 2018

Also mentioned in JuliaLang/julia#10889: There might be some advantage to making it really a hierarchy instead of just a bunch of mutually exclusive types, e.g.

abstract type ArrayMemoryLayout{T} end
abstract type UnknownMemoryLayout{T} <: AbstractMemoryLayout{T} end
abstract type StridedFormat{T,...} <: AbstractMemoryLayout{T} end
abstract type UnitStridedFormat{T,...} <: StridedFormat{T} end
struct Contiguous{T,...} <: UnitStridedFormat{T} end

since some routines just require strided, other require strided with unit first stride, and e.g. linear indexing requires contiguous layout. The downside is that you need to use the type itself instead of the singleton instance as trait (unless you also make a concrete struct for every level in the hierarchy), such that the method definitions would need to use arguments of the form ::Type{<:UnitStridedFormat}.

Note added: I think that's also the contents of gist of @mbaumann, and so there should be an additional level between StridedFormat and UnitStrided which is MonotonousStrided or something alike.

@Jutho
Copy link
Contributor Author

Jutho commented Jan 30, 2018

@dlfivefifty , thanks for all your nice work for completing the strided interface. I do have a question/request, after reading
https://docs.julialang.org/en/latest/manual/interfaces/#man-interface-strided-arrays-1

It would be good if information about the strided memory layout is not only something to link to external C or Fortran libraries, but can also be used in Julia algorithms, i.e. to write native Julia algorithms that access memory in an optimized way. For plain Arrays, one would just use linear indexing using indices that you computed using the stride information. But what for other types that have only strides and unsafe_convert(Ptr,..)? For SubArrays, there is parent and first_index to basically use linear indexing into the parent array. parent actually also works on Array and returns itself, first_index does not.

Maybe parent and first_index can become part of the Strided interface, so that in a Julia algorithm, where parent should always return a LinearIndexing array.

@dlfivefifty
Copy link
Contributor

Is your suggestion that parent(A::Array) = A ?

@Jutho
Copy link
Contributor Author

Jutho commented Jan 30, 2018

That is already the case in Base. But first_index(A::Array) = 1 does not exist.

@dlfivefifty
Copy link
Contributor

It would be good if information about the strided memory layout is not only something to link to external C or Fortran libraries, but can also be used in Julia algorithms

It is in PR JuliaLang/julia#25558: the order memory is accessed by generic_matvecmul! in Julia-native code depends on whether A is ColumnMajor or RowMajor.

@mbauman
Copy link
Member

mbauman commented Jan 31, 2018

I'm not sure parent is actually a useful generic function. The only way you know what to do with the parent is if you also know what the wrapper was. It's used quite a bit across the package ecosystem, but in every single case I've examined, its argument is constrained to a specific implementation with the enclosing function.

first_index is not exported, and I think should remain so.

@Jutho How do you use parent and first_index? Is it accomplishing something beyond what pointer and stride[s] can do?

@StefanKarpinski
Copy link
Member

I've always been a bit suspicious of the array parent function.

@dlfivefifty
Copy link
Contributor

Here’s an example: both A and view(A’,:,:)’ have the same entries stored in the same order, but their parents are two different matrices.

@mbauman
Copy link
Member

mbauman commented Feb 1, 2018

Amusingly, I was looking for a solution to a broken test for one of my PRs and found an attempt to use parent generically — by myself. The key was to recurse until you found a fixed point.

JuliaLang/julia#15209 (comment)

Even then, it's not sufficient for that task as some arrays have multiple "parents" — like Bidiagonal's two vectors or (in the context of aliasing detection) SubArray's indices.

@Jutho
Copy link
Contributor Author

Jutho commented Feb 2, 2018

My main use case is the algorithms TensorOperations.jl, and a new package Strided.jl that I am working on (both on https://github.com/Jutho). If you want to permutedims! one multidimensional array into another (or any joint operation between two (or even three or more) multidimensional arrays with strides that are completely different and, importantly, non-monotously increasing) you want to write some kind of blocked algorithm to exploit memory locality (cache and all that), similar to transpose and matrix multiplication.

The easiest way to do this is to work with strides directly and compute some kind of linear index (memory offset) for each of the arrays involved. But how should you then use this index, in a Julia algorithm. Doing unsafe_load(pointer(A), i) does not seem very julian, and, last time I checked,

  • this only works for bit types
  • this does not combine with vectorization (@simd)

The fact that pointer(A) exists, means that there always exists some underlying dense array, not all entries of which might be valid, but at least the once that you correctly compute using the strides should be. So maybe I should just use my own 'parent' function, let's call it, say, denseparent(A) = unsafe_wrap(Array, pointer(A), dims; own = false).

@brenhinkeller
Copy link
Contributor

C.f. ArrayInterface.jl (and things that build on it like StrideArrays.jl)

@nsajko nsajko added the design Design of APIs or of the language itself label May 4, 2024
@KristofferC KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] design Design of APIs or of the language itself
Projects
None yet
Development

No branches or pull requests

10 participants