
Try implementing N-dimensional indexing for fast linear SubArrays #30266

Merged
merged 1 commit into from
Dec 17, 2018

Conversation

mbauman
Member

@mbauman mbauman commented Dec 4, 2018

Ref:
#30218 (comment).

This is largely a test balloon to see what Nanosoldier thinks.

@nanosoldier runbenchmarks(ALL, vs = ":master")

@kshyatt kshyatt added the arrays label Dec 4, 2018
@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@mbauman
Member Author

mbauman commented Dec 5, 2018

Well that's interesting. I'll have to dig into this a bit more — that improved a few benchmarks but it didn't improve the ones I was targeting.

@StefanKarpinski
Member

> that improved a few benchmarks but it didn't improve the ones I was targeting.

Win?

@mbauman
Member Author

mbauman commented Dec 5, 2018

Just to expand on this a little bit: this PR only changes things for views of arrays with fast linear indexing: V = view(A, I, J, K). Assuming A uses linear indexing as its canonical lookup, there are three ways to compute the linear index into A for V[i, j, k], depending on whether we cache an offset and stride at construction time.

  1. Convert the tuple (i, j, k) into one linear index into V, then multiply by the stride and add the offset to find the linear index into A. This ends up being:

     # precompute offset and stride at construction time, then
     offset + ((i-1) + (j-1)*size(V, 1) + (k-1)*size(V, 1)*size(V, 2))*stride + 1
    
  2. In cases where we statically know the indices are unit-stride, we can avoid the multiplication and the -1/+1 dance. This is definitely simpler and we do it when we can:

     # precompute offset and stride at construction time, then
     offset + (i + (j-1)*size(V, 1) + (k-1)*size(V, 1)*size(V, 2))
    
  3. Or we could just re-index into the stored indices and then squash them down into a linear index into A. It's not immediately obvious whether this is a win, since it depends on the complexity of the re-indexing:

     I[i] + (J[j]-1)*size(A, 1) + (K[k]-1)*size(A, 1)*size(A, 2)
    

Now, the complexity of re-indexing into our stored indices (I[i] and such) is limited, because only a limited set of indices can be represented with the alternate offset/stride computation. The regressions in #30218 occurred where I, J, and K were all integers, so the indexing operations were totally free. In general we're limited, by definition, to at most one range in our set of indices (with everything else being :s or integers, which are free), so re-indexing is just an addition and maybe a multiplication. So (3) should be no worse than (1) or (2).
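To make the comparison concrete, here is a small sketch (in Python, with the 1-based, column-major arithmetic kept to mirror the formulas above; the 4×5×3 parent shape and the `(:, :, 2:3)` view are invented for illustration) checking that all three computations produce the same linear index into A for a contiguous view:

```python
# Hypothetical check (shape and indices invented) that the three ways of
# computing the parent linear index agree. Everything is 1-based to mirror
# the Julia formulas; the parent A is 4x5x3 and V = view(A, :, :, 2:3).
size_A = (4, 5, 3)
I, J, K = range(1, 5), range(1, 6), range(2, 4)   # the stored view indices
size_V = (len(I), len(J), len(K))                 # (4, 5, 2)

def parent_lin(a, b, c):
    # Column-major linear index of A[a, b, c], 1-based
    return a + (b - 1) * size_A[0] + (c - 1) * size_A[0] * size_A[1]

# Cached at construction time
offset = parent_lin(I[0], J[0], K[0]) - 1   # 20 for this view
stride = 1                                  # this view is contiguous

def way1(i, j, k):
    # (1) linearize (i, j, k) into V, then scale by stride and shift by offset
    v = (i - 1) + (j - 1) * size_V[0] + (k - 1) * size_V[0] * size_V[1]
    return offset + v * stride + 1

def way2(i, j, k):
    # (2) unit-stride special case: no multiply, no -1/+1 dance
    return offset + (i + (j - 1) * size_V[0] + (k - 1) * size_V[0] * size_V[1])

def way3(i, j, k):
    # (3) re-index the stored indices, then linearize into A directly
    return parent_lin(I[i - 1], J[j - 1], K[k - 1])

assert all(way1(i, j, k) == way2(i, j, k) == way3(i, j, k)
           for i in range(1, 5) for j in range(1, 6) for k in range(1, 3))
```

For this contiguous view the cached stride is 1, so (1) and (2) differ only in whether the ±1 bookkeeping is folded away; (3) never touches the cached offset and stride at all.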

But here's the kicker: that's all a red herring. The thing that is foiling LLVM's vectorization is computing the offset and stride at construction time. That's why this PR didn't solve the issue.

@mbauman
Member Author

mbauman commented Dec 6, 2018

There was a 2x regression in LinAlg; I'm not sure whether it's real.

@nanosoldier runbenchmarks("linalg", vs = ":master")

Assuming that comes back clean, I think we should merge this.

I think we'll have to eat the synthetic benchmark regression from #29895, though: this change ends up being an improvement for the usage of 0-d subarrays, but it's their construction that ends up foiling vectorization in these particular cases.

@nanosoldier
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against comparison commit: failed process: Process(`make -j3 USECCACHE=1`, ProcessExited(2)) [2]

Logs and partial data can be found here
cc @ararslan

@mbauman
Member Author

mbauman commented Dec 7, 2018

Try again?

@nanosoldier runbenchmarks("linalg", vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @ararslan

@@ -222,13 +222,14 @@ end

 # In general, we simply re-index the parent indices by the provided ones
 SlowSubArray{T,N,P,I} = SubArray{T,N,P,I,false}
-function getindex(V::SlowSubArray{T,N}, I::Vararg{Int,N}) where {T,N}
+function getindex(V::SubArray{T,N}, I::Vararg{Int,N}) where {T,N}
Member

Brilliant optimization --- just remove the word "slow"! 😄

@JeffBezanson
Member

I assume this is intended to fix a regression for 1.1?

@mbauman
Copy link
Member Author

mbauman commented Dec 12, 2018

Yeah, that was the goal but I missed and improved performance elsewhere instead. I suppose we could still target it for 1.1 as a mitigating perf improvement.

@JeffBezanson JeffBezanson added this to the 1.1 milestone Dec 12, 2018
@KristofferC KristofferC mentioned this pull request Dec 12, 2018
52 tasks
@JeffBezanson JeffBezanson merged commit 433ba13 into master Dec 17, 2018
@JeffBezanson JeffBezanson deleted the mb/cartesiansubarray branch December 17, 2018 20:51
KristofferC pushed a commit that referenced this pull request Dec 20, 2018
Labels
arrays, performance (Must go faster)

6 participants