
Support fma (fused multiply-add) #31

Merged — 1 commit merged into PainterQubits:master on Oct 4, 2016

Conversation

@timholy (Contributor) commented Oct 4, 2016

In trying to use this for various real world applications, I finally got tired of not having decent ranges. The result was JuliaLang/julia#18777, for which fma support seems likely to be important.

I also have a package brewing so we can use this on 0.5.

@ajkeller34 (Collaborator)

This looks great, and I really appreciate your efforts to improve ranges. Even better that some of this functionality will be available on 0.5!

Before merging, I wanted to bring to your attention a problem I've encountered, which seems relevant here. It is not a blocker for this PR, so I'm willing to merge and leave an open issue regarding promotion, if you'd like. It is a larger problem, though, appearing also in #24 regarding @fastmath.

The problem is that because quantities have different types for different units and dimensions, promotion fails to return a common type when you promote quantities of differing dimensions. If you try to do promote(1m, 1N), for example, you will get (1 m, 1 N), whereas promote(1m, 1mm) gives (1//1 m, 1//1000 m). The result is that if a user calls your more generic method improperly, by adding quantities of differing dimensions, I think a stack overflow will occur rather than a graceful failure: the method will keep calling itself, because promotion couldn't return a common type.
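The failure mode can be sketched with hypothetical stand-in types (Meter, Newton, and foo are illustrative, not Unitful's actual definitions; modern where syntax):

```julia
# Stand-ins for unit-tagged quantity types; no promote_rule is defined
# between them, mimicking quantities of differing dimensions.
struct Meter{T};  val::T; end
struct Newton{T}; val::T; end

foo(a::T, b::T) where {T} = a.val + b.val   # the "real" method
foo(a, b) = foo(promote(a, b)...)           # generic fallback: promote, retry

# foo(Meter(1.0), Meter(2.0)) works: both arguments already share a type.
# On Julia 0.5 (the version under discussion), promote(Meter(1), Newton(1))
# returned its arguments unchanged, so the fallback re-selected itself forever
# and overflowed the stack. Modern Julia instead throws from promote itself
# ("promotion of types ... failed to change any arguments").
```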

One idea I had would be to have a values-based (rather than type-based) quantity as a fallback, which could represent any units or dimensions in one type, at the expense of run-time penalties. This way, I could guarantee that promotion could always yield a common type (type-based quantities could be promoted to value-based quantities if there were no other option). Obviously I don't like the idea of sacrificing performance to get @fastmath to work, however, and it seems like overkill to fix the problem of graceful failure in this PR.

@ajkeller34 (Collaborator)

On second thought it's not really as if the problem I'm describing is worth holding this up, so I'll just merge it. At least you're aware of the issue now.

@ajkeller34 ajkeller34 merged commit 2ab08cf into PainterQubits:master Oct 4, 2016
@timholy timholy deleted the pull-request/5a2cd545 branch October 4, 2016 21:19
@timholy (Contributor, Author) commented Oct 4, 2016

I've noticed this as a problem, too.

Perhaps we need to use the following design:

foo{T}(a::T, b::T) = ...          # the real method, i.e., the one that implements the computation
foo(a, b) = _foo(promote(a, b)...)
_foo{T}(a::T, b::T) = foo(a, b)   # promotion worked
_foo(a, b) = error("promotion of $a and $b failed to produce a common type")
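A concrete, runnable version of this pattern (modern where syntax; add_q, _add_q, and the types A and B are hypothetical):

```julia
struct A <: Number; x::Float64; end   # two types with no promote_rule
struct B <: Number; x::Float64; end

add_q(a::T, b::T) where {T<:Number} = a + b            # the real method
add_q(a::Number, b::Number) = _add_q(promote(a, b)...)
_add_q(a::T, b::T) where {T} = add_q(a, b)             # promotion worked
_add_q(a, b) = error("promotion of $a and $b failed to produce a common type")

# add_q(1, 2.5) promotes to (1.0, 2.5) and hits the real method.
# add_q(A(1.0), B(2.0)) fails gracefully: on Julia 0.5 promote returned the
# arguments unchanged and the _add_q fallback raised the error; on modern
# Julia, promote itself throws, so an error is raised either way rather
# than recursing.
```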

@ajkeller34 (Collaborator)

I can certainly implement that design pattern in Unitful for your fma methods, and elsewhere as appropriate. I feel like line 241 of fastmath.jl may need a PR along these lines for Unitful to work with it, though... I've tried, but it's pretty awkward to work around that method, especially when unitless numbers and quantities are mixed in legitimate ways for some operations. I'll think about it a little more.

@timholy (Contributor, Author) commented Oct 5, 2016

Can you submit a julia PR? If not, I'll do it (eventually).

@ajkeller34 (Collaborator)

I'll add that to my list, but do let me know if I'm taking too long and you want to take care of it instead... having some build problems with Julia on macOS Sierra that I need to figure out.

Please see the linked changes, lines 590–625, for a proposed revision of your fma contributions. It passes more test cases than before and errors gracefully, but I have not benchmarked performance, and the code is not exactly elegant. There are no detected ambiguities.

@ajkeller34 (Collaborator)

Having learned my lesson about submitting julia PRs prematurely some months ago, I thought some more, and I think I may be able to get fastmath working without one. The goal is to get around an existing method in Base like this one (e.g. line 241 of fastmath.jl):

f(x::Number, y::Number, zs::Number...) = f(promote(x,y,zs...)...) # method (1)

It seems you can write these methods, with no dispatch ambiguity:

f(x::Number, y::Quantity) = _f(promote(x,y)...) # (2)
f(x::Quantity, y::Number) = _f(promote(x,y)...) # (3)
f(x::Quantity, y::Quantity) = _f(promote(x,y)...) # (4)
f(x::Number, y::Number, z::Number, ts::Number...) = _f(promote(x,y,z,ts...)...) # (5)

Method (1) would only be called when passing two arguments, neither of which is a Quantity. I feel silly for not trying this before; I assumed method (5) would be ambiguous with method (1). Until I see some new problem with this route, I don't think we will need a julia PR after all.
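The dispatch claim is easy to check in isolation (f, Q, and the symbol return values are illustrative stand-ins for the fastmath machinery):

```julia
struct Q <: Number; v::Float64; end   # stand-in for Unitful.Quantity

f(x::Number, y::Number, zs::Number...) = :base              # method (1), as in Base
f(x::Number, y::Q) = :quantity                              # (2)
f(x::Q, y::Number) = :quantity                              # (3)
f(x::Q, y::Q) = :quantity                                   # (4)
f(x::Number, y::Number, z::Number, ts::Number...) = :many   # (5)

# (1) wins only for exactly two non-Q arguments. With three or more
# arguments, (5) is more specific than (1) because it declares more
# required positional arguments, so there is no ambiguity.
```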

Anyway, thanks for humoring me, I'll stop bothering you now 😅

@timholy (Contributor, Author) commented Oct 6, 2016

Cool GitHub trick: navigate to the page containing the code you want to link to, select the range of line numbers you want to highlight (you can use the shift key), press 'y' (this inserts the current commit SHA into the URL, making it a permalink even as the code evolves), and then copy-paste the URL into a discussion page.

The one issue I see with the _fma methods is they seem to do a lot of runtime checking. Can any of that be done by the type system? fma (when hardware support is available) is pretty high performance, so having all those branches will be bad news.

@timholy (Contributor, Author) commented Oct 6, 2016

With regards to eliminating runtime checking: you can either wait until we get triangular dispatch (julia 0.6, see JuliaLang/julia#18457) or do stuff like this:

_fma{T<:Quantity}(x, y::T, z::T) = _fma_dimension(x, dimension(x), y, z)
_fma_dimension(x, ::Dimension{()}, y, z) = fma(x, y, z)
_fma_dimension(x, ::Dimension, y, z) = throw(DimensionError())

The general trick is to add arguments that "peel out" the things you need to examine, then use dispatch to direct the execution to either error messages or computation.

All this gets inlined by the compiler, so (unlike the runtime checks) it doesn't have any negative impact on performance.
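A self-contained sketch of this peeling trick (DimsTag, Qty, and checked_fma are hypothetical names, not Unitful's; modern where syntax):

```julia
struct DimsTag{D} end                    # D == () encodes "dimensionless"
struct Qty{D,T} <: Number; val::T; end
dims(::Qty{D,T}) where {D,T} = DimsTag{D}()

# Peel the dimensions out as an extra argument, then let dispatch pick
# either the computation or the error path; no runtime branch is needed,
# because the tag is fully determined by the argument's type.
checked_fma(x::Qty, y::Real, z::Real) = _fma_dims(dims(x), x, y, z)
_fma_dims(::DimsTag{()}, x, y, z) = fma(x.val, y, z)          # dimensionless: compute
_fma_dims(::DimsTag, x, y, z) = error("dimension mismatch")   # anything else: fail
```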

@ajkeller34 (Collaborator)

Thank you for the tips! Some follow-up questions:

  • The dimension and unit functions in Unitful give a result based only on the type of the argument. Should I be annotating the dimension and unit functions to indicate that to the compiler (maybe with @pure or some such thing)? I guess even then a runtime check like this would still be a runtime check, and would not be magically moved to compile time.
  • What's the best way to verify things are being inlined? Would you want explicit @inline annotations on any of that?

@ajkeller34 (Collaborator)

Alright, here is what I'm specifically confused about. Using the master branch, which has your fma implementation, the following call gives:

julia> @code_llvm Unitful.fma(1.0, 1.0m, 2.0m)

define %Quantity @julia_fma_71023(double, %Quantity*, %Quantity*) #0 {
top:
  %3 = getelementptr inbounds %Quantity, %Quantity* %1, i64 0, i32 0
  %4 = load double, double* %3, align 8
  %5 = getelementptr inbounds %Quantity, %Quantity* %2, i64 0, i32 0
  %6 = load double, double* %5, align 8
  %7 = call double @llvm.fma.f64(double %0, double %4, double %6)
  %8 = insertvalue %Quantity undef, double %7, 0
  ret %Quantity %8
}

Whereas with my implementation on the fma branch, the following call appears to have a run-time check in the source, though @code_llvm suggests otherwise:

julia> @code_llvm Unitful._fma(1.0, 1.0m, 2.0m)

define %Quantity @julia__fma_70971(double, %Quantity*, %Quantity*) #0 {
top:
  %3 = getelementptr inbounds %Quantity, %Quantity* %1, i64 0, i32 0
  %4 = load double, double* %3, align 8
  %5 = getelementptr inbounds %Quantity, %Quantity* %2, i64 0, i32 0
  %6 = load double, double* %5, align 8
  %7 = call double @llvm.fma.f64(double %0, double %4, double %6)
  %8 = insertvalue %Quantity undef, double %7, 0
  ret %Quantity %8
}

Even more surprisingly, this doesn't appear to have a run-time check either:

julia> @code_llvm Unitful._fma(1.0N/m, 1.0m, 2.0N)

define %Quantity.66 @julia__fma_71230(%Quantity.73*, %Quantity*, %Quantity.66*) #0 {
top:
  %3 = getelementptr inbounds %Quantity.66, %Quantity.66* %2, i64 0, i32 0
  %4 = getelementptr inbounds %Quantity.73, %Quantity.73* %0, i64 0, i32 0
  %5 = load double, double* %4, align 8
  %6 = getelementptr inbounds %Quantity, %Quantity* %1, i64 0, i32 0
  %7 = load double, double* %6, align 8
  %8 = load double, double* %3, align 8
  %9 = call double @llvm.fma.f64(double %5, double %7, double %8)
  %10 = insertvalue %Quantity.66 undef, double %9, 0
  ret %Quantity.66 %10
}

So I don't think they are doing run-time checks, despite appearances...?

@timholy (Contributor, Author) commented Oct 6, 2016

Hey, if the compiler is eliding them, great! Much easier this way.

@ajkeller34 (Collaborator)

Alright, so fma is maybe 90% working, but I can't devote any more time to this right now; seemingly small changes to the code (to me, anyway) gave pretty different LLVM outputs, and I kept hitting method ambiguities. You may want to check out master and test for yourself that you're satisfied. I tried to avoid regressing on the method you implemented, since I presume it is the most important case for you. runtests.jl gives some indication of where I saw problems and where the LLVM output was "good" or "bad." Unless you're using dimensionless quantities a lot, you shouldn't run into trouble, at least on julia 0.5.0.

With a view to future changes: it has been very interesting to see the limits of promotion as I've currently implemented it. I think one problem is that ::Number gives you no insight into whether something has dimensions, since Quantity <: Number. My hope is that I can simplify all of this with tweaks to promotion on my end. For instance, promote_type(typeof(1.0), typeof(3m)) results in Number right now, but the situation may improve if it instead returned Quantity{Float64}. If Numbers from Base were wrapped in a dimensionless, unitless Quantity type, maybe dispatch could be simplified, since then I could just write fma(x::Quantity, y::Quantity, z::Quantity) and go from there, rather than starting with the 7 generic fma methods I currently need to work around what is in Base.
