Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type stable strides #16541

Merged
merged 1 commit into from
Jun 16, 2016
Merged

type stable strides #16541

merged 1 commit into from
Jun 16, 2016

Conversation

Jutho
Copy link
Contributor

@Jutho Jutho commented May 23, 2016

This PR introduces a type stable definition for strides, and a companion replacement for stride that looks nicer (though that's up to interpretation and debate). I think / hope that this follows the (non @generated function) programming style of some of the other array changes introduced by @timholy and @mbauman .

@tkelman
Copy link
Contributor

tkelman commented May 23, 2016

Would @inferred be able to test the type stability here?

@timholy
Copy link
Member

timholy commented May 24, 2016

👍

If these are performance-critical, then you might need some @_inline_metas to avoid the splat penalty.

@Jutho
Copy link
Contributor Author

Jutho commented May 24, 2016

I'll try to add an @inferred test later today.

What is the best way to check if @_inline_meta is necessary? I have not been closely following all the developments recently. Is this always required with these kind of constructions (involving splatting and recursive tuple construction) or only to force inlining when the default inline detection mechanism rules otherwise?

@timholy
Copy link
Member

timholy commented May 24, 2016

The best way is to test performance: @nanosoldier runbenchmarks("array", vs = ":master")

@timholy
Copy link
Member

timholy commented May 24, 2016

I should add that it's always possible that the benchmarks don't really depend on strides, so you might run your own. If they pick up problems that the existing benchmarks miss, it's pretty easy to add them to BaseBenchmarks.jl.

using BenchmarkTools

function benchstrides(f, A)
    s = f(A)
    sum(s)
end

A = rand(3,3,3,3,3,3,3);

# I'm on master, and I named your new function "newstrides"
julia> @benchmark benchstrides(strides, A)
Trial(765.00 ns)

julia> @benchmark benchstrides(newstrides, A)
Trial(3.73 μs)

Rather surprisingly, it's slower, and adding @inline doesn't seem to make it any faster.

You can also use @code_llvm to monitor the number of apparent stack allocations, but I'm not sure it's actually reflective of the actual memory usage once JIT-compiled. In the end, performance is really the only thing that matters.

Note that the apparently-unstable version of strides could now be written

strides{T,N}(a::AbstractArray{T,N}) = ntuple(i->stride(a,i), Val{N})

On benchstrides, this rates as the fastest by far (it's more than 4x faster than the version in master, and more than 20x faster than the one in this PR). I'm rather confused by this.

@timholy
Copy link
Member

timholy commented May 24, 2016

Weird. It's calls to asize_from that nukes your implementation. CC @JeffBezanson (afd4eb0)

Even though different by only a tiny bit, this is the fastest of them all:

newstrides3{T}(a::AbstractArray{T,0}) = ()
newstrides3(a::AbstractArray) = _strides3(a, (1,))
_strides3{_,N}(a::AbstractArray{_,N}, s::NTuple{N,Int}) = s
_strides3{_,N,M}(a::AbstractArray{_,N}, s::NTuple{M,Int}) = _strides3(a, (s...,s[M]*size(a, M)))

I guess I've just gotten in the habit of passing the array, and didn't realize how bad it was to call size.

@timholy
Copy link
Member

timholy commented May 24, 2016

See #16553.

@jrevels, looks like a nanosoldier error.

@jrevels
Copy link
Member

jrevels commented May 24, 2016

@jrevels, looks like a nanosoldier error.

Array benchmarks are unfortunately broken on master right now due to some linalg bug(s). I'm planning a patch to BenchmarkTools that will allow other benchmarks to run even if some error out, but until that patch lands or master is fixed, array benchmarks will fail to execute.


strides(a::AbstractArray) = ntuple(i->stride(a,i), ndims(a))::Dims
stride(a::AbstractArray, d) = d <= 1 ? 1 : size(a, d-1)*stride(a, d-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

julia> str(a::AbstractArray, d) = d <= 1 ? 1 : size(a, d-1)*str(a, d-1)

julia> a = rand(2,2,2,2);

julia> f(n, a) = for i=1:n;stride(a,4);end

julia> g(n, a) = for i=1:n;str(a,4);end

julia> @time f(10000000, a);
  0.055935 seconds (4 allocations: 160 bytes)

julia> @time g(10000000, a);
  0.096278 seconds (4 allocations: 160 bytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it is still slower, though I hadn't thought it was a factor of two. I'll change it back to the old more lengthy definition. I guess it was to be expected that loops win over recursion. I just wanted to mimick the strides definition where recursive is the way to go to construct tuples in a type stable manner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, the first one evaluates in the same time, the second one in about 0.06 seconds, so about 20 % slower. The fastest one, however, seems to be the following definitition

mystride2(a::AbstractArray, d) = d < 1 ? 1 : d > ndims(a) ? length(a) : strides(a)[d]

where strides is definitely already the one from this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's probably only true if you are probing the last stride, otherwise you are calculating too much. I just don't have that much difference between f and g, but it is true that f (the old definition of strides) is slightly faster.

@Jutho
Copy link
Contributor Author

Jutho commented Jun 12, 2016

I finally got some time to update this. If the tests turn green (hopefully), is there anything that remains to be done? What about documentation? (I haven't created any PR anymore since the new doc system is in place so I am a bit outdated on that).

@tkelman
Copy link
Contributor

tkelman commented Jun 12, 2016

It looks like these are actually documented already, is anything that you're changing here externally visible to users? It may be worth taking the opportunity to move the docstrings out of base/docs/helpdb/Base.jl and put them inline instead if you're going to make any changes to them. Running make docs will propagate those changes to the RST version of the docs which you can then also commit.

_strides{T,N}(out::NTuple{N}, A::AbstractArray{T,N}) = out
function _strides{M,T,N}(out::NTuple{M}, A::AbstractArray{T,N})
@_inline_meta
_strides((out..., out[M]*size(a, M)), A)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capital A ?

@Jutho
Copy link
Contributor Author

Jutho commented Jun 13, 2016

Thanks for the review and for catching the bug. I changed to capital A in the last minute since I thought that was more common in Base, though both seem to be used.

Nothing needs to be changed to the docs, but I add the docstrings inline tonight.

@Jutho
Copy link
Contributor Author

Jutho commented Jun 14, 2016

Ok I think this completes this PR, unless there are other comments

if i > ndims(a)
return length(a)
end
i > ndims(a) && return length(a)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes the code coverage reporting a little less specific in terms of which branches are getting tested, but I guess we've got enough other issues with coverage right now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should actively change code to a format that generates worse coverage specificity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good time to bump #11802?

@tkelman
Copy link
Contributor

tkelman commented Jun 15, 2016

LGTM, merge if green I think.

@Jutho
Copy link
Contributor Author

Jutho commented Jun 16, 2016

Not sure what happened with Appveyor?

@tkelman
Copy link
Contributor

tkelman commented Jun 16, 2016

#16556

@tkelman tkelman merged commit e124787 into JuliaLang:master Jun 16, 2016
@Jutho
Copy link
Contributor Author

Jutho commented Jun 16, 2016

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants