fma calls the system libm, not Julia's openlibm #9890
Comments
I saw a similar fma-related test failure inside a Docker container running CentOS 5 with the devtoolset, but the host system, Ubuntu 14.04, did not reproduce it. |
The failure should depend on the libc version. |
You should be able to set UNTRUSTED_SYSTEM_LIBM if you want to make it much more likely that all libm calls in Julia get redirected to openlibm (this is necessary on Windows, for example). |
Who looks at this variable -- Julia or LLVM? How does LLVM know about openlibm? |
It's a Makefile variable. |
It statically links libopenlibm.a into |
The real solution is for openlibm to have a Makefile variable that allows |
Probably a good idea. We should be able to apply the same method I used for OpenBLAS. |
But that won't fix things for distribution packages. :-/ Or should we ship both standard names and names with the |
Setting |
I'm not sure how that would happen, unless libLLVM is also dynamically linked. You can't distrust your system libm and at the same time trust your system LLVM. |
I installed LLVM 3.5.1 manually. Yes, it's linked dynamically -- good point. |
Bizarre fma behaviour. I wrote simple Horner macros using
julia> versioninfo()
Julia Version 0.4.1-pre+22
Commit 669222e* (2015-11-01 00:06 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin14.5.0)
CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.3
julia> macro hornerfma_llvm(x, p...)
           ex = esc(p[end])
           for i = length(p)-1:-1:1
               ex = :(Base.fma_llvm(t, $ex, $(esc(p[i]))))
           end
           Expr(:block, :(t = $(esc(x))), ex)
       end
julia> macro hornerfma_libm(x, p...)
           ex = esc(p[end])
           for i = length(p)-1:-1:1
               ex = :(Base.fma_libm(t, $ex, $(esc(p[i]))))
           end
           Expr(:block, :(t = $(esc(x))), ex)
       end
julia> myexp1(x::Float64) = @hornerfma_llvm(x,1.0,1.0,0.5,0.16666666666666666,0.041666666666666664,0.008333333333333333,0.001388888888888889,0.0001984126984126984,2.48015873015873e-5,2.7557319223985893e-6,2.755731922398589e-7,2.505210838544172e-8,2.08767569878681e-9,1.6059043836821613e-10,1.1470745597729725e-11,7.647163731819816e-13,4.779477332387385e-14,2.8114572543455206e-15,1.5619206968586225e-16)
myexp1 (generic function with 1 method)
julia> myexp2(x::Float64) = @hornerfma_libm(x,1.0,1.0,0.5,0.16666666666666666,0.041666666666666664,0.008333333333333333,0.001388888888888889,0.0001984126984126984,2.48015873015873e-5,2.7557319223985893e-6,2.755731922398589e-7,2.505210838544172e-8,2.08767569878681e-9,1.6059043836821613e-10,1.1470745597729725e-11,7.647163731819816e-13,4.779477332387385e-14,2.8114572543455206e-15,1.5619206968586225e-16)
myexp2 (generic function with 1 method)
julia> @vectorize_1arg Float64 myexp1
myexp1 (generic function with 4 methods)
julia> @vectorize_1arg Float64 myexp2
myexp2 (generic function with 4 methods)
julia> myexp1(1.0)
2.528361447231352
julia> myexp2(1.0)
2.718281828459045
julia> x = linspace(-1,1,1_000_000);
julia> exp(x);@time exp(x);
0.018781 seconds (6 allocations: 7.630 MB)
julia> myexp1(x);@time myexp1(x);
0.012698 seconds (6 allocations: 7.630 MB)
julia> myexp2(x);@time myexp2(x);
1.020566 seconds (6 allocations: 7.630 MB)
|
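For cross-checking the two macros above, the same Horner recurrence can be written as a plain function. This is only a minimal sketch, assuming a recent Julia 1.x release where `fma` is exported from Base; `horner_fma` and `coeffs` are hypothetical names, and the coefficients 1/k! are regenerated here rather than copied from the session.

```julia
# Horner evaluation using fused multiply-add, written as a plain function.
# `horner_fma` is a hypothetical helper name, not part of Base.
function horner_fma(x::Float64, coeffs::Vector{Float64})
    acc = coeffs[end]
    # Walk the coefficients from the second-highest order down to the constant term.
    for i in length(coeffs)-1:-1:1
        acc = fma(x, acc, coeffs[i])
    end
    return acc
end

# Taylor coefficients of exp: 1/k! for k = 0, ..., 18 (assumed to match the macro arguments).
coeffs = [1.0 / factorial(k) for k in 0:18]

println(horner_fma(1.0, coeffs), " vs ", exp(1.0))
```

If the printed values disagree in more than the last few digits, that would point at the `fma` implementation rather than at the macro expansion.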
@MikaelSlevinsky You have a Haswell machine, which has a native fma instruction, so LLVM is able to use that. The openlibm fma is implemented in software in terms of basic arithmetic and bit-twiddling functions, see here. This is going to be much slower than a hardware fma, or even separate multiply and add instructions. Even if the libm fma did call the hardware instruction, it would still likely be slightly slower due to the overhead of the function call (this is the same reason we call the LLVM sqrt instead of the libm one). |
Thanks, that clarifies the timing. But why is LLVM FMA incorrect in the macro? |
I'm not sure why you get that: I get the correct answer with the same code. Maybe try it on 0.4.2? |
Also, the built-in |
Ok, thanks. Merry Christmas! |
Julia 0.4 lowers |
I am tracking down a test failure where fma returns a wrong result. This is a 64-bit Intel Linux system with glibc 2.12 installed.
code_native tells me that fma expands to a libcall on this system (as it should), and also tells me the address of this fma function, but without giving it a name. /proc/*/maps tells me that this address is part of a range where /lib64/libm-2.12.so is mapped. That is, Julia's fma call goes to the system libm, not to Julia's openlibm.
The reason is that we expand Julia's fma to an LLVM intrinsic, and the libcall is then generated by LLVM since there is no fma machine instruction available. LLVM chooses to call libm.
I suggest checking whether the system has an fma instruction when expanding Julia's fma function and, if not, expanding it to a call to openlibm instead of an LLVM intrinsic. This requires #9855 to be available.
This issue may be related to #9847.
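A wrong-result symptom like the one described above can be probed with an input whose exact answer is known. The following is a rough sketch, assuming a recent Julia 1.x release; `fma_rounds_once` is a hypothetical helper name, and the check is a simple sanity test that may not catch every incorrect implementation.

```julia
# Probe whether `fma` rounds only once, using a case with an exactly known answer.
# `fma_rounds_once` is a hypothetical helper name used only for this illustration.
function fma_rounds_once()
    x = 1.0 + eps()   # 1 + 2^-52
    p = x * x         # exact product 1 + 2^-51 + 2^-104 rounds to 1 + 2^-51
    # A true fused multiply-add recovers the 2^-104 term that the rounded product lost.
    return fma(x, x, -p) == ldexp(1.0, -104)
end

println(fma_rounds_once())
```

A correctly rounded fma should make this return `true`, whereas an implementation that rounds the product before adding would yield zero for the residual and return `false`.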