Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getindex(.) return type does not match eltype #2

Open
colinxs opened this issue Jul 17, 2019 · 71 comments
Open

getindex(.) return type does not match eltype #2

colinxs opened this issue Jul 17, 2019 · 71 comments

Comments

@colinxs
Copy link

colinxs commented Jul 17, 2019

First off, awesome idea/package! It's been quite useful in my own work.

I was really excited to use your package in conjunction with StructArrays but ran into an issue (that I attempted to fix with a PR to StructArrays) as explained here. The owner of that package pointed out, however, that the real issue is the following:

julia> x=nestedview(rand(2,3,10), 2);
julia> x[1] isa eltype(x)
false

What are your thoughts on modifying eltype to be consistent with the true return type of getindex?

-Colin

@oschulz
Copy link
Collaborator

oschulz commented Jul 18, 2019

Thanks!

Yes, the getindex/eltype inconsistency is annoying, esp. when using it with StructArray or TypedTable.

I definitely want to solve that, but it's a bit tricky - just using the underlying array type as eltype is not enough, because we get into trouble (e.g. with broadcast) if it's a view.

@oschulz
Copy link
Collaborator

oschulz commented Jul 18, 2019

There's also this problem: If we just make eltype returns the right thing, then there would be a mismatch between eltype and the element type deduced from T in foo(A::AbstractArray{T) on an ArraysOfArrays array. I think a lot of code is written based on the assumtion that T and eltype(A) match.

@piever
Copy link

piever commented Jul 18, 2019

Just to pitch in (as this was brought up in JuliaArrays/StructArrays.jl#84), I think that changing the eltype should be done not by adding a custom eltype method but by giving the correct type parameter to AbstractArray, i.e. ArrayOfArray<:AbstractArray{SubArray...}. Implementation-wise, this should be an extra type parameter of ArrayOfArray that gets computed in an internal constructor.

@oschulz
Copy link
Collaborator

oschulz commented Jul 18, 2019

I fully agree, @piever. That's what makes it a bit more tricky - but I have an idea (like you said, via constructor), I'll try it out.

@oschulz
Copy link
Collaborator

oschulz commented Aug 21, 2019

@piever, @colinxs, haven't forgotten about this - holiday season, but it's definitely on my short-term ToDo list.

@oschulz
Copy link
Collaborator

oschulz commented Oct 21, 2019

Update: Haven't forgotten about this, just had to be put on hold for some other urgent work. Will get back on this soon.

@oschulz
Copy link
Collaborator

oschulz commented Mar 17, 2020

Sorry this is taking so long - it's not forgotten and definitely still on my to-do list!

@colinxs
Copy link
Author

colinxs commented Mar 18, 2020

@oschulz no worries, I actually figured this out and have been using it in my own codebases:

# For AbstractArray A and indices I, compute V = typeof(view(A, I...)) by:
# 1) Try to infer V from typeof(A) and typeof(I) by calling viewtype(typeof(A), typeof(I))
# 2) If !isconcretetype(V), fall back to typeof(view(A, I...))
# Custom subtypes of AbstractArray (e.g. UnsafeArray) could provide customized implementations
# of viewtype(A::AbstractArray, I::Tuple) or viewtype(::Type{<:AbstractArray}, ::Type{<:Tuple})
# if required.

function viewtype(A::AbstractArray, I::Tuple)
    T = viewtype(typeof(A), typeof(I))
    _viewtype(A, I, T, Val(isconcretetype(T)))
end
viewtype(A::AbstractArray, I...) = viewtype(A, I)

function viewtype(::Type{A}, ::Type{I}) where {A<:AbstractArray,I<:Tuple}
    Core.Compiler.return_type(view, _view_signature(A, I))
end
Base.@pure _view_signature(::Type{A}, ::Type{I}) where {A,I<:Tuple} = Tuple{A, I.parameters...}

_viewtype(A, I, T, ::Val{true}) = T
function _viewtype(A, I, T, ::Val{false})
    try
        return typeof(view(A, I...))
    catch e
        Istring = join(map(string, I), ", ")
        printstyled(stderr,
        """
        Unable to infer typeof(view($(typeof(A)), $(join(map(i -> string(typeof(i)), I), ", "))))
        Only other option is to try typeof(view(A, $Istring)) but that resulted in below error.
        Try passing in valid indices or implement:
            viewtype(::Type{$(typeof(A))}, ::Type{$(typeof(I))})
        """, color=:light_red)
        rethrow(e)
    end
end

I've got an in-works implementation something similar to ArrayOfSimilarArrays that I'll make a PR for.

Thanks again for keeping on top of this! :)

@colinxs
Copy link
Author

colinxs commented Mar 18, 2020

Also, two other options for _viewtype(A, I, T, ::Val{false}) that I considered were:

_viewtype(A, I, T, ::Val{false}) = @inbounds view(A, I...)
@propagate_inbounds _viewtype(A, I, T, ::Val{false}) = view(A, I...)

The first has the benefit of working for empty arrays, but that seems unsafe. The second option
I rejected because that would require e.g. @inbounds nestedview(A_flat, 2), which seemed weird to do when calling a constructor.

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

Thanks, I really got to add this soon!

@colinxs
Copy link
Author

colinxs commented Apr 14, 2020

As an alternate idea, you could use the SlicedArray that I made here.

I ended up needing the ability to slice along arbitrary dimensions instead just the inner M dimensions and ran across JuliennedArrays.jl. I took the best parts of that and ArraysOfArrays to make SlicedArray which I use internally for Lyceum. An ArrayOfArrays{T,M} could be implemented as:

function sliceinner(A::AbstractArray{<:Any,L}, ::Val{M}) where {L,M}
  slice(A, ntuple(_ -> Colon(), Val(M))..., ntuple(_ -> *, Val(L-M))...)
end

You could also use JuliennedArrays directly, SlicedArray is the same idea just with more overloads for Base functions, comprehensive tests, and better performance. I'd like to upstream those changes in the future but haven't had time yet. If this sounds useful I could register SpecialArrays (currently it's only registered in the Lyceum registry).

As an aside, you might find some of the test utilities I wrote useful, given that you have a lot of packages providing various custom arrays. That test suite is what originally prompted the changes to ElasticArrays I made in JuliaArrays/ElasticArrays.jl#20. Perhaps factoring that out to something like ArrayTestTools.jl would be useful to JuliaArrays/others?

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

I could register SpecialArrays

Hm, since ArraysOfArrays is already registered, and since there would be conflicts with some exported functions (like innersize), I guess I would prefer to backport your improvements to ArrayOfSimilarArrays. If the package has to evolve in the process (incl. some API changes), that would be fine. I've actually been planning to move it under the "JuliaArrays" organization to make it a bit more "official" - talked to Tim Holy about it a JuliaCon, just haven't gotten 'round to it yet.

Perhaps factoring that out to something like ArrayTestTools.jl would be useful to JuliaArrays/others?

Yes, that might indeed be very useful!

@colinxs
Copy link
Author

colinxs commented Apr 14, 2020

Makes sense to backport the changes ArraysOfArrays.jl! Given that SlicedArray (and also FlattenedArray, which is a flattened view of a nested array) were inspired largely by JuliennedArrays.jl it'd be good to check with its dev (@bramtayl) first? The way forward could then be to upstream SlicedArray/FlattenedArray to JuliennedArrays.jl and then have ArraysOfArrays.jl depend on that? The alternative would be to just put everything in ArraysOfArrays.jl directly.

In the meantime, I'll work on factoring out ArrayTestTools.jl.

@bramtayl
Copy link

It makes me so happy when I find out people are reading my packages! Happy to accept PRs to JuliennedArrays

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

@bramtayl, I just came across JuliennedArrays a while ago - from what I saw, it's quite similar to
ArrayOfSimilarArrays, but trades increased flexibility (slicing in any dimension) for some performance?

Would you possibly be interested in moving it into ArraysOfArrays, or would you prefer to keep it as a lightweight separate package?

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

@colinxs having a FlattenedArray in ArraysOfArrays would be great - I've been meaning to add something like that, to replace the current generic implementation of flatview

@inline flatview(A::AbstractArray{<:AbstractArray}) = Base.Iterators.flatten(A)

with something that actually returns an AbstractArray-subtype. One of the things that kept sliding to the bottom of my to-do list ... :-)

@bramtayl
Copy link

@oschultz I'd be thrilled to move it into ArraysOfArrays so I don't have to maintain the code anymore. I'd be curious to know more about performance differences

@colinxs
Copy link
Author

colinxs commented Apr 14, 2020

hahah and I love when devs are super responsive and eager to accept PRs :).

@bramtayl, when you have a moment want to look over SlicedArray and FlattenedArray. The major changes are:

Slices

  • renamed to SlicedArray to match Base's naming convention (SubArray, PermutedDimsArray)
  • Exposes slice(A, dims::Tuple), slice(A, dims::Integer...), and slice(A, Union{Colon,typeof(*)}) as the user-facing API (similar to SubArray and view). slice(A, :, *, :) would be equivalent to Slices(A, True(), False(), True()), it's just syntactical sugar.
  • Adds several methods for Base functions like resize!, append! etc.
  • Adds a drop-in replacement for Base.mapslices
  • Adds adjoints for Zygote.jl (conditionally loaded using Requires.jl)

Align

  • Renamed to FlattenedArray
  • Exposes flatview(A) as the user-facing API
  • Removed the ability to specify alongs since you can get the same functionality by composing with PermutedDimsArray
  • Adds adjoints for Zygote.jl

I also significantly re-wrote the internals for better inference/performance and added full test coverage.

@colinxs
Copy link
Author

colinxs commented Apr 14, 2020

Concerning the performance difference: there isn't any. All of the fancy indexing that allows slicing along arbitrary dimensions effectively compiles out.

@bramtayl
Copy link

@colinxs those all sound like improvements to me. Earlier versions of JuliennedArrays used * and : to specify dimensions (I think that was a Tim Holy suggestion). I changed to True and False because I got excited about the wider potential of Typed Bools (hasn't panned out yet). I'm not sure I understand about PermutedDimsArray.

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

I'd be thrilled to move it into ArraysOfArrays so I don't have to maintain the code anymore.

Haha, and here I thought we'd maintain it together. ;-)

I'd be curious to know more about performance differences

Good point, we should definitely do some benchmarking!

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

Regarding PermutedDimsArray - when I wrote ArrayOfSimilarArrays I actually thought that applications that needed to slice along another axis could us an ArrayOfSimilarArrays wrapped around a PermutedDimsArray. Would be interesting to see if that could achieve similar performance to JuliennedArrays.Slices. If so, maybe we could reduce code there quite a bit and make things easier to maintain if we join it all in ArraysOfArrays.

@oschulz
Copy link
Collaborator

oschulz commented Apr 14, 2020

Adds adjoints for Zygote.jl

Oh, that would be nice to have here (ArraysOfArrays) too!

@colinxs
Copy link
Author

colinxs commented Apr 15, 2020

@bramtayl PermutedDimsArray just remaps dimensions e.g.:

julia> A=rand(2,3)
2×3 Array{Float64,2}:
 0.307533  0.822543  0.530326
 0.591914  0.782564  0.948809

julia> B=PermutedDimsArray(A, (2,1))
3×2 PermutedDimsArray(::Array{Float64,2}, (2, 1)) with eltype Float64:
 0.307533  0.591914
 0.822543  0.782564
 0.530326  0.948809

julia> A[1,2] == B[2,1]
true

So if you had A = [rand(2,3) for i=1:4], Align(A, True(), True(), False()) and PermutedDimsArray(Align(A, False(), True(), True()), (2, 3, 1)) would be equal.

@bramtayl
Copy link

Oh I see. Yes that makes sense to me. I'm not sure if there would be a performance penalty for PermutedDimsArray?

@colinxs
Copy link
Author

colinxs commented Apr 15, 2020

@oschulz

I thought about doing the same thing, but you miss out on some optimizations with the extra layer of indirection. If we had:

A = rand(2,3,4)
B = slice(A, :, *, *)

Then doing B[:, 1] as implemented is basically a no-op because we can just forward the Colon to the parent array and "re-slice". See this comment and the subsequent getindex definition for an example.

If you rely on PermutedDimsArray, then you'd be falling back to Cartesian indexing and collecting the output in a new array.

@colinxs
Copy link
Author

colinxs commented Apr 15, 2020

@bramtayl

SlicedArray still uses True/False internally, but I figured using :/* was a friendlier user-facing API.

@oschulz
Copy link
Collaborator

oschulz commented Apr 15, 2020

I like : / *!

But in your example above - slice(A, :, *, *) could already be done with just an ArrayOfSimilarArrays, without PermutedDimsArray, right? Or do if mix : and * up, here`?

@oschulz
Copy link
Collaborator

oschulz commented Apr 17, 2020

Thanks for writing the summary!

@oschulz
Copy link
Collaborator

oschulz commented Apr 17, 2020

Should we avoid having variable return types? e.g. flatview(slice(rand(2,3,4), args...))

I would strongly favor using the optimal return type for each case, as long as it can be done in a type-stable way. So if appropriate, flatview should just return the parent array.

@c42f
Copy link
Member

c42f commented Apr 17, 2020

could you return admin rights to ArraysOfArrays and UnsafeArrays to me

Of course, done. I think you should now be able to add collaborators and manage all the things.

@oschulz
Copy link
Collaborator

oschulz commented Apr 17, 2020

Thanks, @c42f ! @colinxs, you can write to ArraysOfArrays now - probably best to still make a PR for the new features though, will make it easier to discuss. Travis/Documenter for ArraysOfArrays is configured to generate doc previews for PRs, btw.

@colinxs
Copy link
Author

colinxs commented Apr 23, 2020

Haven't gotten around to making the PRs yet as I'm cleaning things up in SpecialArrays.jl first, but I wanted to give an update on the replacement for JuliennedArrays.Align. The behaviour of Align can be achieved with FlattenedArray + PermutedDimsArray and benchmarking shows very little overhead (1-2%). Given the simplicity of that approach (and some of the tricks that PermutedDimsArray does) I'm going to move forward with this approach.

@colinxs
Copy link
Author

colinxs commented Apr 29, 2020

@oschulz @bramtayl you happen to see my message on Slack?

@bramtayl
Copy link

Just commented there. Sorry, I don't check slack very often.

@oschulz
Copy link
Collaborator

oschulz commented Apr 29, 2020

No, sorry, missed it - I'm an intermittent slacker ... :-)

@oschulz
Copy link
Collaborator

oschulz commented Apr 29, 2020

@colinxs, looks like a bug in ArraysOfArrays, can you open an issue an repost your comment from slack?

@colinxs
Copy link
Author

colinxs commented Apr 29, 2020

@oschulz done!

Concerning the merging of SpecialArrays/JuliennedArrays: Given that we all appear to be fairly busy, I think it's easiest for everyone if SpecialArrays is merged into JuliennedArrays.

ArraysOfArrays has many moving parts and is quite different, while SpecialArrays/JuliennedArrays are effectively the same thing, just implemented a bit differently. Given that @bramtayl has been working on the challenges of type stable "dimension surgery" (Brandon I think I've read damn near every issue you've commented on), he may also be able to better maintain/push forward the whole mapslices story.

@oschulz, given that you have many packages depending on ArraysOfArrays, this may also make things easier for you.

If everyone is alright with this I'll make a PR this weekend. There's two registered packages depending on JuliennedArrays, both of which have compat entries upper bounding it to v0.2, so I think we can just include this in the next minor release without a deprecation cycle. @bramtayl could you add me as a collaborator?

@bramtayl
Copy link

I think I've read damn near every issue you've commented on

Im sorry you had to suffer through them!

could you add me as a collaborator?

Sure!

@oschulz
Copy link
Collaborator

oschulz commented Apr 29, 2020

Given that we all appear to be fairly busy, I think it's easiest for everyone if SpecialArrays is merged into JuliennedArrays.

Sure, I guess that's maybe the fastest route for you

@oschulz, given that you have many packages depending on ArraysOfArrays, this may also make things easier for you.

Not that many, but those do need it - I'd be fine to adapt to API changes, though. I just needed the statistics support, but I believe that's now fully compatible with Statistics/StatsBase, semantically.

Maybe longer term, we can still converge? Until then, it would be nice if we could avoid conflicts between exported functions (e.g. flatview, nestedview), though.

@colinxs
Copy link
Author

colinxs commented Apr 29, 2020

@bramtayl no worries, it was educational! I'll go ahead and make one big PR then and we can disect from there.

I definitely see things converging in the future! Right now there's a ton of packages with similar goals (RecursiveArrayTools, SliceMap, StarSlice, etc.) but I think the diversity has lead to better ideas (e.g. I may not have made SpecialArrays if both ArraysOfArrays and JuliennedArrays didn't already exist).

flatview and the inner_* are the only conflicts right now. Perhaps we can discuss how to work around that in the PR to JuliennedArrays.

Sorry for all the trouble!

@oschulz
Copy link
Collaborator

oschulz commented Apr 30, 2020

"RecursiveArrayTools" is actually almost fully orthogonal - from what I understand, it's purpose access nested array with a flat-array-like API. So kinda the reverse from what he do, actually.

@colinxs
Copy link
Author

colinxs commented Apr 30, 2020

Yup, it's purpose is definitely nested-->flat, but that's just the inverse of flat->nested. As the JuliaennedArrays/SpecialArrays code shows they share quite a bit in common.

@oschulz
Copy link
Collaborator

oschulz commented May 3, 2020

I definitely see things converging in the future!

@colinxs , as a first step in that direction, how would you feel about publishing your FlattenedArray implementation as a lightweight, standalone "FlattenedArrays.jl" or so? I'm sure it would be useful on it's own for many use cases anyhow.

@colinxs
Copy link
Author

colinxs commented May 6, 2020

@oschulz I think it's easiest to keep it with SlicedArray gives that they are inverses of each-other. There's very few dependencies, however:

(SpecialArrays) pkg> st
Project SpecialArrays v0.1.1
Status `~/workspace/juliadev/SpecialArrays/Project.toml`
  [79e6a3ab] Adapt v1.0.1
  [ffbed154] DocStringExtensions v0.8.1
  [ae029012] Requires v1.0.1
  [c4a57d5a] UnsafeArrays v1.0.1
  [e88e6eb3] Zygote v0.4.20

julia> @time using SpecialArrays
  0.253651 seconds (421.11 k allocations: 22.848 MiB)

Note that Zygote is loaded with Requires.

Still haven't made the PR into JuliennedArrays, I've been cleaning things up on my end before :)

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

Sure, sounds good. Maybe ArraysOfArray can depend on JuliennedArrays then for a while, until we come up with a plan to converge everything.

@colinxs
Copy link
Author

colinxs commented May 7, 2020

Sounds good! I haven't yet made the PR to JuliennedArrays, but did just finish cleaning/simplifying a bunch of things in Lyceum/SpecialArrays.jl#2. I'll make the PR this weekend and we can chat specifics there :)

cc: @bramtayl

@oschulz
Copy link
Collaborator

oschulz commented May 29, 2020

I think that efficient arrays-of-arrays data structure could become very relevant in the future, since with Julia v1.5, views become very cheap. Here are some benchmarks: JuliaArrays/UnsafeArrays.jl#8

@Moelf
Copy link
Contributor

Moelf commented Sep 27, 2021

could become very relevant in the future

bump, the future is now! Let's do Jim's (future) work for him
image

@oschulz
Copy link
Collaborator

oschulz commented Sep 27, 2021

I wrote to Jim last night, letting him know that we already have it :-)

@Moelf
Copy link
Contributor

Moelf commented Sep 27, 2021

@oschulz
Copy link
Collaborator

oschulz commented Sep 27, 2021

AK supports n-jaggedness

Do you mean vectors of vectors of [...] of arrays? ArraysOfArrays.VectorOfArrays can do that, actually.

@oschulz
Copy link
Collaborator

oschulz commented Sep 27, 2021

I definitely reinvented StructArray in UnROOT.jl

Do you think it would be possible to use StructArray then, instead, longer term? It's a very well supported package.

@Moelf
Copy link
Contributor

Moelf commented Sep 27, 2021

let's see if we survive into longer-term... but yes that's definitely something worth trying

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants