-
-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matrix Multiplication API #473
Comments
Ref. #160, JuliaLang/julia#20053, #464. Best! |
Thanks for the references. I suppose this issue has more to do with adding the |
As a starting point, would there be support for having |
Yes. That would be a good start. We need to consider the ordering of the arguments though. Originally, I thought it was more natural to follow the BLAS ordering but we fairly consistent about having output arguments first which is also the case for the current three-argument |
Why not just introduce a generic |
Sounds good. We can continue the discussion about the actual argument ordering and method renaming in #160. This is more about just having the five-argument version available, so I'll keep the current
Are you suggesting renaming |
To be clearer, I do think we should end up having a single method |
I think |
Is the notation "finalised" as the official syntax? And this would be in addition to I'm not too big a fan of having |
I'd say no. Originally, I liked the idea of following BLAS (and the order also matched how this is usually written mathematically) but now I think it makes sense to just add the scaling arguments as optional fourth and fifth arguments. |
So just to clarify, you'd like optional arguments in the sense function mul!(C, A, B, α=1, β=0)
...
end The other option would be optional keyword arguments function mul!(C, A, B; α=1, β=0)
...
end but I'm not sure people are too happy with unicode. |
I'm very happy with unicode but it is true that we try always have an ascii option and it would be possible here. The names |
#26117) * mul!(α, A, x, β, y) -> mul!(y, A, x, a, b) (#23919) * missing )
In my opinion, a more logical nomenclature would be to let The
|
Another vote to have |
I also have been tempted to write |
I'm in favour, though I think it should probably have a different name than just |
Maybe |
Maybe we can have something like |
Isn’t this a form of |
Note that we do mutate non-first elements in some cases, e.g. |
Wouldn't it be nice if you leave an option to the implementers to dispatch on the types of the scalars α and β? It's easy to add sugars for end-users. |
I thought we already settled on |
Thanks! It would be nice if that's documented. |
SparseArrays uses it, but I don't recall it being discussed anywhere. |
In some ways the It seems to be a case of pedantry vs consistency whether to call this |
There is an interesting exception to this when |
I'd vote for strong zero (:heart:) if it were "Merge for 1.x" instead of "Merge for 1.3". Wouldn't the options make sense if all "Merge for 1.3" are "Merge 1.x"? |
Did you check out LazyArrays.jl? FYI, it has really nice fused matrix multiplication support. You can also use |
I actually truly need the generic mul! since we use a wide variety of structured and sparse and plain old dense matrices, for representing different optimization problems most efficiently. I'm here for the genericity and speed. |
I see. And I just remembered that we discussed things in LazyArrays.jl so of course you already knew it... Regarding "ASAP", Julia's four months release cycle is, at least as a design, to avoid "merge rush" just before feature freeze. I know it's not fair for me to say this because I've tried the same thing before... But I think someone needs to mention this as a reminder. The bright side is Julia is super easy to build. You can start using it right after it is merged until next release. Edit: release -> merge |
Thanks. I find deadlines to be helpful motivators, and I would like to avoid letting this go stale again. So I would propose that we try to use the deadline as a goal. |
It's great that you are actively injecting energy to this thread! |
A 5-argument |
Thanks, I will try LazyArrays again. Since the 5-arg mul seems to be generally considered as a temporary stopgap (until some solution like LazyArrays can be used in 2.0), I think we can aim to get it merged without necessarily being the ideal or perfect solution for the long run. @chethega when do you think we should count the tallies for the new nonbinding vote? |
@tkf Sure, you're right that strong/weak/undef zeros make sense for 1.x as well. However, I think there are quite a few people who would rather have 1.3
Triage must make a decision (delay / strong / weak / backstop) this week for the 1.3 alpha. I think Thursday 15th or Wednesday 14th are sensible options for triage to make a count, and take it into consideration. I probably won't be able to join on Thursday, so someone else will have to count. Realistically, it is OK to be conservative here, and miss the deadline, and continue the discussion and wait for 1.4. On the other hand, we may already have reached consensus without noticing it: @andreasnoack made some powerful arguments that a zero coefficient should be a strong zero. It may well be that he managed to convince all the weak zero proponents. It may well be that there is a large majority that really wants 5-arg mul!, preferably last year, and does not really care about this little detail. If that was the case, then it would be a pity to further delay the feature, just because no one wants to shut up the discussion. |
Why not just throw an error for now: β == 0.0 && any(isnan,C) && throw(ArgumentError("use β = false")) |
I added that option to the poll. Great compromise idea! |
Just to set expectations: the feature freeze for 1.3 is in three days so there’s basically no way this can make it in time. We’re being pretty strict about feature freezes and branching since it’s the only part of the release cycle that we can really control the timing of. |
The work is already done in JuliaLang/julia#29634 though. Just needs to be adjusted and rebased. |
@tkf for JuliaLang/julia#29634 could you list the work that remains to be done (incl renaming and handling zeros according to the vote)? I know you're busy so perhaps we can find a way to split any remaining todos so the burden won't fall on you again. |
The TODOs I can think of ATM are:
My PR implements BLAS semantics of the |
Sorry, my memory was stale; my implementation was not consistent and propagates NaN sometimes. So additional TODO is to make the behavior of |
The |
Yes, it's completely an internal detail. My worries were (1) there may be too many specializations (beta=0 etc. is encoded in the type parameter) and (2) it decreases the readability of the source code. |
Those are valid concerns. We already produce a ton of specializations in the linear algebra code so it's good to think through if we really need to specialize here. My thinking has generally been that we should optimize for small matrices since it not free (as you said, it complicates the source code and could increase compile times) and people are better off using |
FYI soft zeros do have simple implementations: if iszero(β) && β !== false && !iszero(α)
lmul!(zero(T),y) # this handles soft zeros correctly
BLAS.gemv!(α, A, x, one(T), y) # preserves soft zeros
elseif iszero(α) && iszero(β)
BLAS.gemv!(one(T), A, x, one(T), y) # puts NaNs in the correct place
lmul!(zero(T), y) # everything not NaN should be zero
elseif iszero(α) && !iszero(β)
BLAS.gemv!(one(T), A, x, β, y) # puts NaNs in the correct place
BLAS.gemv!(-one(T), A, x, one(T), y) # subtracts out non-NaN changes
end |
@andreasnoack Sorry, I forgot that we actually needed the specialization to optimize the inner-most loop for some structured matrices like |
Great! |
Fixed by JuliaLang/julia#29634 |
@andreasnoack Thanks a lot for reviewing and merging this in time! Now that it is merged for 1.3 it started makes me very nervous about the implementation 😄. I appreciate if people here can test their linear algebra code more thoroughly when 1.3-rc comes out! |
No need to worry, there will be plenty of time for 1.3 RCs + PkgEvals to shake out bugs. |
Currently, there is the following line in the sparse matmult code:
https://github.com/JuliaLang/julia/blob/056b374919e11977d5a8d57b446ad1c72f3e6b1d/base/sparse/linalg.jl#L94-L95
I am assuming that this means we want to have the more general
A_mul_B*!(α,A,B,β,C) = αAB + βC
methods which overwriteC
(the BLASgemm
API) for dense arrays. Is that still the case? (It also seems possible to keep both APIs, i.e. keep theA_mul_B*!(C,A,B)
methods, which would simply be defined asA_mul_B*!(C,A,B) = A_mul_B*!(1,A,B,0,C)
.)I would personally like to have the
gemm
API defined for all array types (this has been expressed elsewhere). Implementing these methods for dense arrays seems fairly simple, since they would just directly callgemm
et al. The sparse case is already implemented. The only real modification would be to the pure julia generic matmult, which does not acceptα
andβ
arguments.This would lead to generic code that works with any type of array/number. I currently have a simple implementation of expm (after having done the modification to
_generic_matmatmult!
) which works with bigfloats and sparse arrays.The text was updated successfully, but these errors were encountered: