`A[i, end]` assumes that `i` indexes one dimension #35681

mcabbott · 2020-05-01T12:10:24Z

Indexing by end is lowered here to lastindex(A, 2), which isn't correct when the other indices take up more than one dimension (or none):

julia> zeros(2,2,1)[CartesianIndex(2,2), 1]
0.0

julia> zeros(2,2,1)[CartesianIndex(2,2), end]
ERROR: BoundsError: attempt to access 2×2×1 Array{Float64,3} at index [2, 2, 2]
Stacktrace:
 [1] getindex at ./array.jl:788 [inlined]
 [2] getindex(::Array{Float64,3}, ::CartesianIndex{2}, ::Int64) at ./multidimensional.jl:552
 [3] top-level scope at REPL[19]:1

julia> g(A, i, j) = A[i, end, j];

julia> @code_lowered g(zeros(4,3,2), CartesianIndex(1,1), CartesianIndex())
CodeInfo(
1 ─ %1 = Base.lastindex(A, 2)
│   %2 = Base.getindex(A, i, %1, j)
└──      return %2
)

I assumed this worked, but @c42f actually checked!

The text was updated successfully, but these errors were encountered:

StefanKarpinski · 2020-05-01T13:35:18Z

I'm not really sure how this can be fixed. cc @mbauman

mcabbott · 2020-05-01T13:54:52Z

Another error case, from @simeonschaub:

julia> f(n) = CartesianIndex(1, n);

julia> rand(2,3)[f(end)]
ERROR: BoundsError: attempt to access 2×3 Array{Float64,2} at index [1, 6]

One idea is to make all such things into runtime errors. Perhaps by lowering to A[FixDim(1, f(lastindex(A,1)))] and then checking with something like this:

function to_indices(A, ax, I::Tuple{FixDim, Vararg})
    I[1].dim == ndims(A) + 1 - length(ax) || error("can't use end here")
    I[1].val isa CartesianIndex && !( I[1].val isa CartesianIndex{1}) && error("can't use end here")
    to_indices(A, ax, (I[1].val, tail(I)...))
end

mbauman · 2020-05-01T14:15:09Z

Yeah, I don’t have a good solution here so we explicitly warn about this in the CartesianIndex docs.

mcabbott · 2020-05-01T14:47:16Z

Alright, there is indeed a giant yellow warning box about this!

c42f · 2020-05-03T01:48:21Z

Actually I think there's some hope to fix this, at least for non-pathological uses of end. We just need to lower end more conservatively by not assuming that each preceding index in a Expr(:ref) corresponds to exactly one dimension. Rather, push the computation of the number of dimensions into a function (let's call it count_index_dims for now):

julia> A = zeros(2,2,1)
2×2×1 Array{Float64,3}:
[:, :, 1] =
 0.0  0.0
 0.0  0.0

julia> c = CartesianIndex(2,2)
CartesianIndex(2, 2)

julia> A[c, end]
ERROR: BoundsError: attempt to access 2×2×1 Array{Float64,3} at index [2, 2, 2]
Stacktrace:
 [1] getindex at ./array.jl:788 [inlined]
 [2] getindex(::Array{Float64,3}, ::CartesianIndex{2}, ::Int64) at ./multidimensional.jl:543
 [3] top-level scope at REPL[21]:1

Sketch of alternative lowering for A[c, end]:

julia> count_index_dims(a, inds) = length(Base.to_indices(a, axes(a), inds))
count_index_dims (generic function with 1 method)

julia> getindex(A, c, lastindex(A, count_index_dims(A,(c,))+1))
0.0

For your g(A, i, j) = A[i, end, j] example, it should be lowered to

g(A, i, j) = getindex(A, i, lastindex(A, count_index_dims(A,(i,))), j)

The case from @simeonschaub is truly diabolical, I like it! But edge cases like that shouldn't prevent us from fixing the common situations.

mbauman · 2020-05-04T16:40:23Z

julia> count_index_dims(a, inds) = length(Base.to_indices(a, axes(a), inds))
count_index_dims (generic function with 1 method)

julia> getindex(A, c, lastindex(A, count_index_dims(A,(c,))+1))
0.0

Yes, perhaps I have been letting the lack of the perfect get in the way of the good here; 👍 for re-opening this.

I don't particularly like calling to_indices multiple times, although I suppose it's not as bad now that we use lazy logical indexing. Another option would be to simply also emit a check to ensure the number of indexed dimensions is what we wanted and error otherwise. That could also be entirely hidden behind a @boundscheck block.

c42f · 2020-05-05T04:30:17Z

I don't particularly like calling to_indices multiple times.

Right, that does seem kind of icky. I feel like we should have the end in A[i, j, end] become something simpler like

lastindex_afterdims(A, i, j)

Read "lastindex for the dimension which comes after dims i,j. (Better naming suggestions extremely welcome!)

For splatting, eg, A[i, j, ks..., end] we lower to:

lastindex_afterdims(A, i, j, ks...)

Current lowering rules just implement the following extremely simple rule:

lastindex_afterdims(A, inds...) = lastindex(length(inds) + 1)

I could have a go at doing this lowering tweak and you can separately work out exactly what the best implementation is for CartesianIndex, etc?

tkf · 2020-05-05T05:14:52Z

How about lowering f(begin, end) to LazyIndex(f) and handle everything in to_indices? That is to say, to_indices can count leading dimensions and use it to materialize a LazyIndex when it is encountered.

c42f · 2020-05-05T05:24:45Z

How about lowering f(begin, end) to LazyIndex(f) and handle everything in to_indices

That sounds kind of promising but I don't quite understand your notation. How do we handle A[i, min(j,end)]?

tkf · 2020-05-05T05:31:47Z

What I meant was something like this:

# A[i, min(j,end)]
getindex(A, i, LazyIndex((_begin, _end) -> min(j, _end)))

# A[i, j, ks..., begin:2:end]
getindex(A, i, j, ks..., LazyIndex((_begin, _end) -> _begin:2:_end))

tkf · 2020-05-05T05:35:05Z

Hmm.... Actually, it might break some code that does something custom and bypasses to_indices?

c42f · 2020-05-05T05:50:09Z

Yes backward compat seems quite hard with that option.

It's a really interesting idea though, I quite like the way that it allows getindex itself to be aware of and handle the special syntax.

I can't really see how we can have a lazy version and also backward compatibility, unless we lower to some new getindex_ which handles the LazyIndex wrappers and passes the results to normal getindex by default. But that seems like an awful lot of machinery to fix an edge case :-/

tkf · 2020-05-05T06:17:33Z

OK, that's too bad. I thought we'll have faster closures after this as everybody will start nudging Jeff about indexing slowed down....

tkf · 2020-05-05T06:19:37Z

So I guess we need something like this?

indexdim(x) = indexdim(typeof(x))
indexdim(::Type) = 1  # best guess?
indexdim(::Type{T}) where {T <: CartesianIndex} = length(T)
indexdim(::Type{T}) where {T <: AbstractArray} = indexdim(eltype(T))
indexndims(args...) = sum(indexdim, args)
lastindex_afterdims(A, inds...) = lastindex(A, indexndims(inds...) + 1)
firstindex_afterdims(A, inds...) = firstindex(A, indexndims(inds...) + 1)

tkf · 2020-05-05T06:32:16Z

This issue makes me wonder if this lowering

julia> Meta.@lower x[i, f(end)...]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope'
1 ─ %1 = Base.lastindex(x, 2)
│   %2 = f(%1)
│   %3 = Core.tuple(x, i)
│   %4 = Core._apply_iterate(Base.iterate, Base.getindex, %3, %2)
└──      return %4
))))

can be improved. If x is an Array{T, 3} and i is an Int, it kind of makes sense to lower f(end)... to f((lastindex(x, 2), lastindex(x, 3)))...? Of course, it'd mean x[i, f(end)..., f(end)...] would be an error at lowering time.

c42f · 2020-05-05T06:41:57Z

Can we avoid lifting indexdim into the type domain? Also the Array case is quite troubling. For correctness It would need to be roughly:

indexdim(a::AbstractArray) =
    mapreduce(indexdim, (d1,d2)->(d1==d2 ? d1 : error("xxx")), a)

Which catches terrible oddities such as A[[1, CartesianIndex(2,2)], end], while allowing A[Any[1,2], end] to work correctly.

tkf · 2020-05-05T07:22:26Z

Ah, of course, Any array is the problem. Though it sounds really hard to solve this without going to LazyIndex direction. I agree type-level functions are ugly but I still think it's the upper limit (or something very close to it) of what we can do without breaking anything.

Which catches terrible oddities such as A[[1, CartesianIndex(2,2)], end]

Maybe we can make something like A.[[1, CartesianIndex(2,2)], end, [CartesianIndex(3,3), 4]] work with #30845. (Though I have no idea when you'd need this.)

c42f · 2020-05-05T07:47:13Z

Is there anything wrong in principle with the mapreduce approach to solve the Any problem? I expect it can be completely optimized away for arrays of concrete type, and indexing with an array of Any is never going to be fast anyway.

tkf · 2020-05-05T19:32:52Z

I expect it can be completely optimized away for arrays of concrete type

How about Array{Union{Int,Nothing}}? I think this is a likely type to get if you use findfirst or something. Though maybe special casing Union{T,Missing} and Union{T,Nothing} is not so hard?

I'm also a bit worried about relying on the optimization in the happy path. I can see that

Base.@pure donothing(_) = nothing
@btime foreach(donothing, $(zeros(1000000)))

is optimized away but it looks like this happens at LLVM level, not at Julia level.

mbauman · 2020-05-05T21:03:17Z

Yeah, this is why I ended up with a warning box 😛

I still think a reasonable option is to lower to an error function that simply asserts we're not mixing ends/begins and CartesianIndex. Note that we can dispatch on AbstractArray{T} where T >: CartesianIndex to catch all unions and Any that can possibly contain CartesianIndex (with a fallback no-op, of course).

tkf · 2020-05-05T21:42:47Z

asserts we're not mixing ends/begins and CartesianIndex

Where do you assert this? My guess is that doing this in to_indices would have the same compat issue as LazyIndex and doing this in lowering (= before getindex) would be equivalent to lastindex_afterdims in terms of the complexity of the code.

mbauman · 2020-05-05T21:58:57Z

Currently:

julia> Meta.@lower A[end, begin, 1:end]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope'
1 ─ %1 = Base.lastindex(A, 1)
│   %2 = Base.axes(A, 2)
│   %3 = Base.first(%2)
│   %4 = Base.lastindex(A, 3)
│   %5 = 1:%4
│   %6 = Base.getindex(A, %1, %3, %5)
└──      return %6
))))

This would just add a line before the final %6 = getindex that simply did assert_no_cartesians(%1, %3, %5). We'd only need to emit the assert_no_cartesians if begin/end was present. And the entire implementation of assert_no_cartesians can be inside a @boundscheck block.

tkf · 2020-05-05T22:03:29Z

I see, yes, that's much simpler.

c42f · 2020-05-08T06:28:12Z

We'd only need to emit the assert_no_cartesians if begin/end was present

I like that this is simpler, it's true that introducing a pile of complexity for this relative edge case would be somewhat disappointing.

It still needs to handle icky situations like when [1, CartesianIndex(2,2)] is passed though, unless it's an approximate mechanism which only produces an error in a subset of the more obvious cases.

timholy · 2020-10-27T12:04:41Z

While we wait for the scheme-heroics, EndpointRanges.jl is a simple workaround:

julia> a = reshape(1:24, 2, 3, 2, 1, 2)
2×3×2×1×2 reshape(::UnitRange{Int64}, 2, 3, 2, 1, 2) with eltype Int64:
[:, :, 1, 1, 1] =
 1  3  5
 2  4  6

[:, :, 2, 1, 1] =
 7   9  11
 8  10  12

[:, :, 1, 1, 2] =
 13  15  17
 14  16  18

[:, :, 2, 1, 2] =
 19  21  23
 20  22  24

julia> a[CartesianIndex(2, 2), end, CartesianIndex(1, 1)]
ERROR: BoundsError: attempt to access 2×3×2×1×2 reshape(::UnitRange{Int64}, 2, 3, 2, 1, 2) with eltype Int64 at index [2, 2, 3, 1, 1]
Stacktrace:
 [1] throw_boundserror(A::Base.ReshapedArray{Int64, 5, UnitRange{Int64}, Tuple{}}, I::NTuple{5, Int64})
   @ Base ./abstractarray.jl:601
 [2] checkbounds
   @ ./abstractarray.jl:566 [inlined]
 [3] _getindex
   @ ./abstractarray.jl:1146 [inlined]
 [4] getindex(::Base.ReshapedArray{Int64, 5, UnitRange{Int64}, Tuple{}}, ::CartesianIndex{2}, ::Int64, ::CartesianIndex{2})
   @ Base ./abstractarray.jl:1120
 [5] top-level scope
   @ REPL[31]:1

julia> using EndpointRanges

julia> a[CartesianIndex(2, 2), iend, CartesianIndex(1, 1)]
10

julia> a[2, 2, 2, 1, 1]
10

fredrikekre added the arrays [a, r, r, a, y, s] label May 1, 2020

mcabbott closed this as completed May 1, 2020

c42f reopened this May 3, 2020

JeffBezanson added the compiler:lowering Syntax lowering (compiler front end, 2nd stage) label May 4, 2020

tkf mentioned this issue Jun 10, 2020

Broadcasting dictionaries with unmatched keys gives confusing result andyferris/Dictionaries.jl#19

Closed

mbauman mentioned this issue Sep 15, 2020

"end" indexing not working properly SciML/EllipsisNotation.jl#19

Open

mcabbott mentioned this issue Oct 27, 2020

Failure in mixed indexing with multidimensional logicals #38192

Closed

mcabbott mentioned this issue Nov 4, 2020

indexing with Begin and End as types #38296

Closed

simeonschaub mentioned this issue Apr 25, 2021

begin/end wrong with CartesianIndices #40601

Closed

mbauman mentioned this issue Jun 16, 2021

Unable to index with non-concrete Array{CartesianIndex} #41242

Open

Tokazama mentioned this issue Aug 3, 2021

Cleaning up indexing and dimension mapping code. JuliaArrays/ArrayInterface.jl#181

Merged

mcabbott mentioned this issue May 18, 2022

Ellipsis slicing #5405

Open

Tokazama mentioned this issue May 31, 2022

Create staticint.jl #44538

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`A[i, end]` assumes that `i` indexes one dimension #35681

`A[i, end]` assumes that `i` indexes one dimension #35681

mcabbott commented May 1, 2020 •

edited

Loading

StefanKarpinski commented May 1, 2020

mcabbott commented May 1, 2020

mbauman commented May 1, 2020

mcabbott commented May 1, 2020

c42f commented May 3, 2020 •

edited

Loading

mbauman commented May 4, 2020

c42f commented May 5, 2020 •

edited

Loading

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 5, 2020 •

edited

Loading

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

mbauman commented May 5, 2020

tkf commented May 5, 2020

mbauman commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 8, 2020

timholy commented Oct 27, 2020

A[i, end] assumes that i indexes one dimension #35681

A[i, end] assumes that i indexes one dimension #35681

Comments

mcabbott commented May 1, 2020 • edited Loading

StefanKarpinski commented May 1, 2020

mcabbott commented May 1, 2020

mbauman commented May 1, 2020

mcabbott commented May 1, 2020

c42f commented May 3, 2020 • edited Loading

mbauman commented May 4, 2020

c42f commented May 5, 2020 • edited Loading

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 5, 2020 • edited Loading

tkf commented May 5, 2020

c42f commented May 5, 2020

tkf commented May 5, 2020

mbauman commented May 5, 2020

tkf commented May 5, 2020

mbauman commented May 5, 2020

tkf commented May 5, 2020

c42f commented May 8, 2020

timholy commented Oct 27, 2020

`A[i, end]` assumes that `i` indexes one dimension #35681

`A[i, end]` assumes that `i` indexes one dimension #35681

mcabbott commented May 1, 2020 •

edited

Loading

c42f commented May 3, 2020 •

edited

Loading

c42f commented May 5, 2020 •

edited

Loading

c42f commented May 5, 2020 •

edited

Loading