Skip to content

Commit

Permalink
muladd in evalpoly macro
Browse files Browse the repository at this point in the history
  • Loading branch information
simonbyrne committed Jan 22, 2015
1 parent 45a27da commit 1df19b8
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions base/math.jl
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,8 @@ macro evalpoly(z, p...)
for i = length(p)-2:-1:1
ai = symbol("a", i)
push!(as, :($ai = $a))
a = :($b + r*$ai)
b = :($(esc(p[i])) - s * $ai)
a = :(muladd(r, $ai, $b))
b = :(muladd(-s, $ai, $(esc(p[i]))))

This comment has been minimized.

Copy link
@stevengj

stevengj Apr 21, 2016

Member

This doesn't seem to generate efficient code in LLVM: it generates an explicit negation instruction for -s.

(Unfortunately, LLVM doesn't seem to have intrinsics for fused multiply-subtract and similar. Nor does @fastmath seem to help here; the compiler still isn't smart enough to turn the negation back into a subtraction.)

This comment has been minimized.

Copy link
@simonbyrne

simonbyrne Apr 21, 2016

Author Contributor

Interesting: I seem to recall trying this a long time ago and it worked, though I can't find it now.

This comment has been minimized.

Copy link
@stevengj

stevengj Apr 21, 2016

Member

It seems that LLVM does the right thing for integers, but not for floating-point values. That makes it seem like there should be some @fastmath-like flag to get it to work.

This comment has been minimized.

Copy link
@yuyichao

yuyichao Apr 21, 2016

Contributor

I believe this is due to FastISel. With it disabled, llvm does the right thing.

julia> f(x, y, z) = muladd(x, y, -z)
f (generic function with 1 method)

julia> @code_llvm f(1.0, 1.0, 1.0)

define double @julia_f_53973(double, double, double) #0 {
top:
  %3 = fsub double -0.000000e+00, %2
  %4 = call double @llvm.fmuladd.f64(double %0, double %1, double %3)
  ret double %4
}

julia> @code_native f(1.0, 1.0, 1.0)
        .text
Filename: REPL[1]
Source line: 0
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 1
        vfmsub213sd     %xmm2, %xmm1, %xmm0
        popq    %rbp
        retq
        nopl    (%rax,%rax)

I don't think llvm has a fmulsub instruction but the (non fast) instruction selection can do that.

This comment has been minimized.

Copy link
@eschnett

eschnett Apr 21, 2016

Contributor

@fastmath should suffice to get this to work. LLVM does not expose low-level ways to let us explicitly fold the - into the muladd (there is e.g. no mulsub).

I think the issue here is that one cannot add a fast flag for the fmuladd LLVM intrinsic; the fast flag only works for regular arithmetic operations such as + and *. Thus, when @fastmath is used, it might make sense to expand this to x*y+z with respective fast annotations instead of an fmuladd intrinsic, and hope that LLVM's optimizer will synthesize the platform's fmuladd machine instruction.

This comment has been minimized.

Copy link
@yuyichao

yuyichao Apr 21, 2016

Contributor

I don't think fast is needed since negation of floating point doesn't loose any precision.

Lowering muladd to * and + will strictly give LLVM a harder time to optimize.

This comment has been minimized.

Copy link
@eschnett

eschnett Apr 21, 2016

Contributor

Without the fast flag, the compiler might not fuse the multiply and the add, which muladd allows.

This comment has been minimized.

Copy link
@yuyichao

yuyichao Apr 21, 2016

Contributor

I mean I don't think fast is needed to turn fmuladd into fmulsub.

This comment has been minimized.

Copy link
@stevengj

stevengj Apr 21, 2016

Member

I filed issue #15985 for future continuation of this discussion.

end
ai = :a0
push!(as, :($ai = $a))
Expand All @@ -72,7 +72,7 @@ macro evalpoly(z, p...)
:(r = x + x),
:(s = x*x + y*y),
as...,
:($ai * tt + $b))
:(muladd($ai, tt, $b)))

This comment has been minimized.

Copy link
@stevengj

stevengj Apr 21, 2016

Member

This is actually slower because of #15969, but I should have a patch for that shortly.

This comment has been minimized.

Copy link
@eschnett

eschnett Apr 21, 2016

Contributor

@stevengj Are you already working on a patch? If not, I might look into this.

This comment has been minimized.

Copy link
@stevengj

stevengj Apr 21, 2016

Member

Yes, see #15980

R = Expr(:macrocall, symbol("@horner"), :tt, p...)
:(let tt = $(esc(z))
isa(tt, Complex) ? $C : $R
Expand Down

0 comments on commit 1df19b8

Please sign in to comment.