Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically choose a correct fma implementation at run time #9898

Merged
merged 1 commit into from
Jan 24, 2015

Conversation

eschnett
Copy link
Contributor

Fixes #9890. May fix #9847.

@eschnett eschnett mentioned this pull request Jan 23, 2015
@nalimilan
Copy link
Member

Do you have an idea about what's broken in the system libm? Or any simple C code to reproduce the bug? It would be nice to report this upstream too.

@eschnett
Copy link
Contributor Author

The broken system libm is from glibc 2.12, several years out of date (from 2010). Newer versions of glibc are fine (I tested 2.19).

I have a simple example that reproduces the problem; in fact, this is what is used at run time to check fma. Since this is just a function call, it could be easily converted to C where it should break as well. However, since this is already fixed upstream, there's nothing we can do except avoid broken installations.

We could presumably write a configure test, detect this a Julia build time, and pass in a flag to the run-time library. This would be considerably more complex than the current method. The current method is also "obviously correct", as it is guaranteed to check that fma implementation that Julia actually uses, instead of second-guessing this at configuration time.

@nalimilan
Copy link
Member

OK, great to know that it's been fixed upstream. I'm fine with the detection code.

@eschnett eschnett mentioned this pull request Jan 23, 2015
@staticfloat
Copy link
Member

I agree this is better than a build-time test, as what with us shipping binaries and such, build-time environment can vary quite a bit from the run-time environment. Note that this doesn't quite solve #9847 for me, but I think that's a separate issue, so that shouldn't hold this up.

@tkelman
Copy link
Contributor

tkelman commented Jan 24, 2015

Looks good to me, thanks for the fix.

I'll reopen #9847 in a second, given E's report that this doesn't quite fix it.

tkelman added a commit that referenced this pull request Jan 24, 2015
Automatically choose a correct `fma` implementation at run time
@tkelman tkelman merged commit 4ff8145 into JuliaLang:master Jan 24, 2015
@rickhg12hs
Copy link
Contributor

Unfortunately, this generates a bootstrap error on my 32-bit Linux.

constants.jl
error during bootstrap: LoadError(at "sysimg.jl" line 268: LoadError(at "constants.jl" line 52: ErrorException("assertion failed: float32(π) == float32(big(π))")))
make[1]: *** [/usr/local/src/julia/julia/usr/lib/julia/sys0.o] Error 1
make: *** [release] Error 2

@tkelman
Copy link
Contributor

tkelman commented Jan 24, 2015

Damn, thanks for testing. This should maybe be reverted if no one has a quick fix.

@rickhg12hs
Copy link
Contributor

Beats me why the fma edits would affect the constant conversions ... on a 32-bit system.

Who's the constant conversion expert? 8-)

@eschnett
Copy link
Contributor Author

You could remove the respective @asserts to complete the bootstrap, and then output float32(pi), big(pi), and float32(big(pi)). That would give a further pointer as to what is going wrong.

@rickhg12hs
Copy link
Contributor

Building now ...

@eschnett
Copy link
Contributor Author

There's a problem with the system fma on some 32-bit Linux systems, as it changes the rounding mode (see #9847). This could affect conversions. My patch above checks whether fma works by calling it, which could corrupt the rounding mode.

@rickhg12hs
Copy link
Contributor

... still building.

Wondering if you set_rounding after your fma tests if all might be good. ???

@eschnett
Copy link
Contributor Author

Yes, saving and restoring the rounding mode may help -- but it doesn't help for #9847; apparently there are two rounding modes for 32-bit systems.

@rickhg12hs
Copy link
Contributor

Well, the original assert was correct ...

$ ./julia 
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+2879 (2015-01-24 04:02 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 4ff8145* (0 days old master)
|__/                   |  i686-redhat-linux

julia> float32(pi)
3.1415925f0

julia> big(pi)
3.141592653589793238462643383279502884197169399375105820974944592307816406286198e+00 with 256 bits of precision

julia> float32(big(pi))
3.1415927f0

julia> float32(pi) |> num2hex
"40490fda"

julia> float32(big(pi)) |> num2hex
"40490fdb"

julia> get_rounding(Float32)
Base.Rounding.RoundingMode{:Nearest}()

julia> set_rounding(Float32, RoundNearest)
0

julia> float32(big(pi))
3.1415927f0

julia> float32(pi)
3.1415925f0

julia> set_rounding(Float64, RoundNearest)
0

julia> float32(pi)
3.1415925f0

julia> float32(big(pi))
3.1415927f0

julia> set_rounding(BigFloat, RoundNearest)
0

julia> float32(pi)
3.1415925f0

julia> float32(big(pi))
3.1415927f0

Based on previous tests, I don't believe get_rounding, but set_rounding didn't seem to fix things.

@eschnett
Copy link
Contributor Author

What glibc version do you have installed?

@vtjnash
Copy link
Member

vtjnash commented Jan 24, 2015

I agree this is better than a build-time test, as what with us shipping binaries and such, build-time environment can vary quite a bit from the run-time environment. Note that this doesn't quite solve #9847 for me, but I think that's a separate issue, so that shouldn't hold this up.

i'm not sure I follow. this PR is a build-time test

@rickhg12hs
Copy link
Contributor

$ ldd ./julia 
    linux-gate.so.1 =>  (0xb7785000)
    libjulia.so => not found
    libdl.so.2 => /lib/libdl.so.2 (0x434e6000)
    librt.so.1 => /lib/librt.so.1 (0x43541000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x434ed000)
    libstdc++.so.6 => /lib/libstdc++.so.6 (0x4361e000)
    libm.so.6 => /lib/libm.so.6 (0x434a1000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x43522000)
    libc.so.6 => /lib/libc.so.6 (0x432e1000)
    /lib/ld-linux.so.2 (0x432be000)
$ /lib/libc.so.6
GNU C Library (GNU libc) stable release version 2.17, by Roland McGrath et al.
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.3 20140624 (Red Hat 4.8.3-1).
Compiled on a Linux 3.14.17 system on 2014-08-28.
Available extensions:
    The C stubs add-on version 2.1.2.
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    Native POSIX Threads Library by Ulrich Drepper et al
    BIND-8.2.3-T5B
    RT using linux kernel aio
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.

@eschnett
Copy link
Contributor Author

I'm looking at (one of) the fma implementations in glibc. It sets the rounding mode to FE_TOWARDZERO, but doesn't to set it back.

@tkelman
Copy link
Contributor

tkelman commented Jan 24, 2015

i'm not sure I follow. this PR is a build-time test

D'oh, right, yeah this ends up making the decision at system image build time, doesn't it.

If these instructions are going to be buggy in a lot of places, we might have to make them explicitly opt-in somehow. Or move the check to some __init__ method where it will truly be run-time.

@simonbyrne
Copy link
Contributor

I guess the easiest option is to add a call to set_rounding after a problematic fma has been detected.

@eschnett
Copy link
Contributor Author

set_rounding doesn't seem to work reliably on 32-bit Linux systems. See #9847; this is also mentioned in glibc's documentation.

We may have to disable LLVM's fma intrinsic for the time being, and only use it when we know the hardware has an fma instruction. Instead, we should be using openlibm's fma function.

@eschnett
Copy link
Contributor Author

I don't have easy access to a 32-bit Linux system for testing. Could someone test whether the set_rounding idea works?

@staticfloat
Copy link
Member

@eschnett this seems to have an effect:

julia> 0.1 + 0.2
0.30000000000000004

julia> fma(1.0,1.0,1.0)
2.0

julia> 0.1 + 0.2
0.3

julia> set_rounding(Float64, RoundNearest)
0

julia> 0.1 + 0.2
0.30000000000000004

julia> versioninfo()
Julia Version 0.4.0-dev+2893
Commit 427417a (2015-01-24 21:24 UTC)
Platform Info:
  System: Linux (i686-linux-gnu)
  CPU: Intel Xeon E312xx (Sandy Bridge)
  WORD_SIZE: 32
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

(Note that I reverted 975a7dc in order to get this to compile)

@eschnett
Copy link
Contributor Author

Can you add set_rounding(Float32, RoundNearest) and set_rounding(Float64, RoundNearest) and the end of the code block introduced by this patch? Does this fix your bootstrap?

@staticfloat
Copy link
Member

I had to do:

Rounding.set_rounding_raw(Float32, Rounding.JL_FE_TONEAREST)
Rounding.set_rounding_raw(Float64, Rounding.JL_FE_TONEAREST)

Because if I didn't, I got "function undefined" errors when using set_rounding() due to to_fenv() not being defined. This is kind of weird since to_fenv() is most definitely defined, but it's defined inside of the Rounding module, and I think the module subsystem is in kind of a weird state at this point. But by adding those two lines, things seem to work just fine. Bootstrap finishes, although make testall still fails with the repr(0.1 + 0.2) != 0.3 test.

@eschnett
Copy link
Contributor Author

Was this repr failure present before, or was this also introduced by this change?

@staticfloat
Copy link
Member

The repr failure is the same as #9847, where we believe it is due to set_rounding() not being called after every fma() invocation.

@eschnett
Copy link
Contributor Author

I checked openlibm's fma, and it seems correct, whereas glibc's fma seems incorrect. The patch here switches the code to use openlibm's fma all the time, except when initializing the library. Thus I don't understand what is going on.

The best explanation I've heard is that get_rounding is not working correctly on 32-bit Linux systems. This would mean that restoring the original rounding mode via set_rounding would fail. We could forcefully call set_rounding(:Nearest) after every call to fma.

@staticfloat
Copy link
Member

On Ubuntu 14.04 32-bit, I'm using glibc 2.19. Your tests pass, but the rounding mode still gets clobbered:

julia> 0.1 + 0.2
0.30000000000000004

julia> Base.fma_llvm(1.0000305f0, 1.0000305f0, -1.0f0) == 6.103609f-5
true

julia> Base.fma_llvm(1.0000000009313226, 1.0000000009313226, -1.0) == 1.8626451500983188e-9
true

julia> 0.1 + 0.2
0.3

@simonbyrne
Copy link
Contributor

@staticfloat If this is still a problem with glibc, then we should probably file the bug upstream.

@eschnett eschnett deleted the choose-correct-fma branch January 31, 2015 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fma calls the system libm, not Julia's openlibm test/numbers.jl fails on 32-bit platform
7 participants