
fma calls the system libm, not Julia's openlibm #9890

Closed
eschnett opened this issue Jan 23, 2015 · 19 comments · Fixed by #9898

@eschnett
Contributor

I am tracking down a test failure where fma returns a wrong result. This is a 64-bit Intel Linux system with glibc 2.12 installed.

code_native tells me that fma expands to a libcall on this system (as it should), and also gives me the address of this fma function, but without a symbol name. /proc/*/maps shows that this address lies in a range where /lib64/libm-2.12.so is mapped. That is, Julia's fma call goes to the system libm, not to Julia's openlibm.
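For reference, the check above is roughly the following (output is machine-specific):

julia> code_native(fma, (Float64, Float64, Float64))   # ends in a call to a bare address with no symbol name

followed by matching that address against /proc/<julia pid>/maps from a shell.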

The reason is that we expand Julia's fma to an LLVM intrinsic, and the libcall is then generated by LLVM because no fma machine instruction is available on this CPU. LLVM chooses to call the system libm.

I suggest checking whether the system has an fma instruction when expanding Julia's fma function, and, if not, expanding it to a call to openlibm instead of the LLVM intrinsic (a sketch of that fallback follows). This requires #9855 to be available.
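A rough sketch of the proposed fallback, written as plain Julia for illustration only (the real change would live in codegen; the library name assumes the bundled openlibm):

# Hypothetical fallback: call openlibm's fma directly instead of relying on the LLVM intrinsic.
fma_openlibm(x::Float64, y::Float64, z::Float64) =
    ccall((:fma, "libopenlibm"), Float64, (Float64, Float64, Float64), x, y, z)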

This issue may be related to #9847.

@tkelman
Contributor

tkelman commented Jan 23, 2015

I saw a similar fma-related test failure inside a CentOS 5 Docker container using the devtoolset, but it didn't reproduce on the Ubuntu 14.04 host system.

@eschnett
Contributor Author

The failure should depend on the libc version.

@vtjnash
Member

vtjnash commented Jan 23, 2015

You should be able to set UNTRUSTED_SYSTEM_LIBM if you want it to be much more likely that all libm calls in Julia end up getting redirected to openlibm (this is necessary on Windows, for example).

@eschnett
Contributor Author

Who looks at this variable -- Julia or LLVM? How does LLVM know about openlibm?

@vtjnash
Member

vtjnash commented Jan 23, 2015

it's a makefile variable
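For a source build, that override would typically go in Make.user (a sketch; see Make.inc for the exact semantics):

# Make.user
UNTRUSTED_SYSTEM_LIBM = 1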

@tkelman
Contributor

tkelman commented Jan 23, 2015

Last time I looked at it, it statically links libopenlibm.a into libjulia / the julia executable.

@ViralBShah
Member

The real solution is for openlibm to have a Makefile variable that allows prefixing its exported symbols with olm_.

@tkelman
Contributor

tkelman commented Jan 23, 2015

Probably a good idea. We should be able to apply the same method I used for OpenBLAS.

@nalimilan
Member

But that won't fix things for distribution packages. :-/ Or should we ship both standard names and names with the olm_ prefix?

@eschnett
Contributor Author

Setting UNTRUSTED_SYSTEM_LIBM=1 did not help. The resulting julia executable does have fma linked in statically, but LLVM still picks a dynamically linked libm at run time.

@vtjnash
Member

vtjnash commented Jan 23, 2015

I'm not sure how that would happen, unless libLLVM is also dynamically linked. You can't distrust your system libm but trust your system LLVM.

@eschnett
Contributor Author

I installed LLVM 3.5.1 manually. Yes, it's linked dynamically -- good point.

@MikaelSlevinsky

Bizarre fma behaviour. I wrote simple Horner macros using Base.fma_llvm and Base.fma_libm. The LLVM result is incorrect, and the LIBM result is 100x slower:

julia> versioninfo()
Julia Version 0.4.1-pre+22
Commit 669222e* (2015-11-01 00:06 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

julia> macro hornerfma_llvm(x, p...)
           ex = esc(p[end])
           for i = length(p)-1:-1:1
               ex = :(Base.fma_llvm(t, $ex, $(esc(p[i]))))
           end
           Expr(:block, :(t = $(esc(x))), ex)
       end

julia> macro hornerfma_libm(x, p...)
           ex = esc(p[end])
           for i = length(p)-1:-1:1
               ex = :(Base.fma_libm(t, $ex, $(esc(p[i]))))
           end
           Expr(:block, :(t = $(esc(x))), ex)
       end

julia> myexp1(x::Float64) = @hornerfma_llvm(x,1.0,1.0,0.5,0.16666666666666666,0.041666666666666664,0.008333333333333333,0.001388888888888889,0.0001984126984126984,2.48015873015873e-5,2.7557319223985893e-6,2.755731922398589e-7,2.505210838544172e-8,2.08767569878681e-9,1.6059043836821613e-10,1.1470745597729725e-11,7.647163731819816e-13,4.779477332387385e-14,2.8114572543455206e-15,1.5619206968586225e-16)
myexp1 (generic function with 1 method)

julia> myexp2(x::Float64) = @hornerfma_libm(x,1.0,1.0,0.5,0.16666666666666666,0.041666666666666664,0.008333333333333333,0.001388888888888889,0.0001984126984126984,2.48015873015873e-5,2.7557319223985893e-6,2.755731922398589e-7,2.505210838544172e-8,2.08767569878681e-9,1.6059043836821613e-10,1.1470745597729725e-11,7.647163731819816e-13,4.779477332387385e-14,2.8114572543455206e-15,1.5619206968586225e-16)
myexp2 (generic function with 1 method)

julia> @vectorize_1arg Float64 myexp1
myexp1 (generic function with 4 methods)

julia> @vectorize_1arg Float64 myexp2
myexp2 (generic function with 4 methods)

julia> myexp1(1.0)
2.528361447231352

julia> myexp2(1.0)
2.718281828459045

julia> x = linspace(-1,1,1_000_000);

julia> exp(x);@time exp(x);
  0.018781 seconds (6 allocations: 7.630 MB)

julia> myexp1(x);@time myexp1(x);
  0.012698 seconds (6 allocations: 7.630 MB)

julia> myexp2(x);@time myexp2(x);
  1.020566 seconds (6 allocations: 7.630 MB)

@simonbyrne
Contributor

@MikaelSlevinsky You have a Haswell machine, which has a native fma instruction, so LLVM is able to use that. The openlibm fma is implemented in software in terms of basic arithmetic and bit-twiddling functions, see here. This is going to be much slower than a hardware fma, or even separate multiply and add instructions.

Even if the libm fma did call the hardware instruction, it would still likely be slightly slower due to the overhead of the function call (this is the same reason we call the LLVM sqrt instead of the libm one).
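One quick way to confirm that the hardware instruction is being used (output is machine-dependent; on a Haswell chip one would expect a vfmadd instruction rather than a call):

julia> @code_native fma(1.0, 2.0, 3.0)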

@MikaelSlevinsky

Thanks, that clarifies the timing. But why is LLVM FMA incorrect in the macro?

@simonbyrne
Contributor

I'm not sure why you get that: I get the correct answer with the same code. Maybe try it on 0.4.2?

@simonbyrne
Contributor

Also, the built-in @horner macro uses the muladd function, which should use fma instructions in this case, but looking at @code_native it doesn't seem to work (at least on my Haswell machine). Hopefully this will get fixed when we upgrade LLVM.

@MikaelSlevinsky

Ok, thanks. Merry Christmas!

@eschnett
Contributor Author

Julia 0.4 lowers muladd to LLVM 3.3 multiply and add instructions, setting flags indicating that they may be combined. Apparently, LLVM 3.3 does not combine them. LLVM 3.7 offers a special muladd intrinsic that Julia uses there, which leads to a native fma instruction.
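A way to see the difference, assuming builds against each LLVM version (the IR shown will differ accordingly):

julia> @code_llvm muladd(2.0, 3.0, 1.0)
# with LLVM 3.3: separate multiply and add instructions, flagged as combinable (per the above)
# with LLVM 3.7: the dedicated muladd intrinsic, which the backend can lower to a native fma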
