-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unified, efficient code for mutable and immutable AbstractArrays #32
Comments
Already done :) m = @SMatrix rand(3,3)
m[1,:] # SVector{3}
m[(1,),:] # SMatrix{1,3}
m[[1],:] # 1x3 Matrix
rand(3)[(1,2,3)] # SVector{3} Or at least that's the way it should work... I believe there are missing methods, especially beyond 2D. |
Sure, that's great. What I was getting at is that if we did have constant propagation, we'd get into issues like: v = @SVector [1,2,3]
v[1:2] # 1
v[range] # 2 For 1 we can potentially return SVector in a wonderful world of integrated const prop + inference. Admittedly this is pretty specific and speculative. I'm sure there's some much more basic and important issues which should be in the list above. |
Thanks for opening this issue Chris. I have thought about this deeply before. From memory the compiler changes necessary to make things nice are:
There might have been others. These two (in addition to more robust constant prop and branch elimination), would (I think) remove all the If we want fast mutable containers, then we need:
With this final one, I think we can fix all the API issues with JuliaLang/julia#17115. My idea is:
I realize the last is pretty API-breaking... if that concept won't fly then at the least we could define The final part of the API I would like to touch on is dispatch-by-size. If you want to do this with StaticArrays then we can do it with a trait/thunk: f(x::StaticVector} = _f(Val{length(typeof(x))}, x)
_f(::Type{Val{3}}, x) = ... # code for size 3
_f(::Type{Val{4}}, x) = ... # code for size 4
_f{N}(::Type{Val{N}}, x) = error("Expected length 3 or 4") However trying this with a standard function g(x::AbstractVector)
if length(x) === 3
# code for size 3
elseif length(x) === 4
# code for size 4
else
error("Expected length 3 or 4")
end
end but unfortunately a user couldn't later add a definition that makes sense for length 5. Sorry - that was a bit of a disorganized mind-dump - I hope it makes some sense. |
I think we might need static ranges for that, e.g. On the other hand... if this occurs it is because the return type is inferable. If |
One idiom that I'm currently trying to work around is this sort of thing: n = length(v)
v2 = similar(v, 2*n) (or similarly, make a vector of length of the minimum dimension of a matrix, etc) The One thought is to have @inline similar(SV::StaticVector, s::Int) = similar_type(SV, s)()
# etc where In summary, to make this work we would need to change in StaticArrays:
Again, we need this feature in the compiler
With these, more of the routines in |
Regarding loop unrolling. If the range is statically known LLVM will unroll the loop if its heuristic finds it beneficial to do so. So is there really a need for an |
Umm you might be right but I did some benchmarking of matrix multiplication and explicit unrolling of up to thousands of individual multiply/add instructions in 3 nested loops was beneficial. I think LLVM would unroll the sum of a three vector well but nothing very complex (it would be too conservative). I also suspect having optimización opportunities happen sooner in Julia would have other benefits. For example, you could iterate over tuple of mixed types and inference could make this type stable. Finally having automatic heuristics is great but having optional user control might be even better (depending if it gets abused). Inlining is a great model here. |
While unrolling might be beneficial for micro benchmarks it might not always carry over to more realistic scenarios due to icache misses and the instructions pushing out other hot data from L2 cache. But yeah, user control is probably good (and maybe not too hard to implement?) |
Good point about icache pollution. I'm be a bit doubtful about the performance benefit of fully unrolling vectors larger than size 20 or something in real code (warning: I'm pulling "20" out of thin air here - don't take it seriously, run a benchmark instead ;-) ). I don't know the |
I think I'll close this. There's a bit of interesting discussion here of issues we never completely solved, but I think the compiler has moved on enough that we'd need to take a completely fresh look at the problems. |
As mentioned in #31, nobody wants to write two versions of code, one for mutable arrays, vs one for immutable arrays like
StaticArray
. I'd like a central place where we identify the main issues, and discuss minimal changes to theAbstractArray
interface or julia compiler which would allow packages likeStaticArrays
to exist without rewriting a lot of algorithms from Base. Here's a very incomplete list, I hope people will jump in with more detail:API
similar()
assumes a mutable container can be created efficiently filled in an iterative way. Julep: setfield! for mutable references to immutables JuliaLang/julia#17115 goes some way to solving this, though we'd be working with a statically sized buffer, something likeRef{T<:StaticArray}
and would need to unwrap this to return the value.hvcat()
API doesn't reflect the structure of the input expression AST in a way which can give a type inferred output. (Potentially solvable with constant prop, without API changes?)Optimization
similar()
with a size argument is not type inferrable forStaticArray
. Constant prop in inference could potentially fix this in many cases, Ref. Do some constant propagation before type inference runs JuliaLang/julia#5560.getindex()
with fancy indexing is not type inferrable forStaticArray
. Constant prop in inference could fix a bunch of important instances of this, eg literal rangesv[3:5]
could be known to be aSVector{3}
. But what about non-inferrable cases, even after constant prop? We'd like to return anArray
in those cases, can we achieve this in principle?StaticArrays
we're forced to do it at the AST level via@generated
functions which no doubt pushes the compiler pretty hard and generates a heap of garbage. More interesting would be an@unroll
meta to instruct codegen to unroll a given loop if the range is statically known.Obviously one thing we can't fix is differences in the most appropriate algorithm for small vs large objects. For example, eigen decomposition of 2x2 matrices should probably just be done using a specialized analytical expression for that size.
The text was updated successfully, but these errors were encountered: