use Tang's algorithm for log and log1p #10008
Conversation
Is there any advantage of using the hex float representation?
It ensures we get exactly the floats we want, which is critical for these types of functions: occasionally weird bugs do arise in decimal float parsing. It is also typically faster to parse, but that's not really a concern here.
The other issue is what to do when
MSVC compatibility is and has been broken since #9266 added some inline assembly in Julia's C code. It's still not fixed because I didn't understand a possible workaround using longjmp that Jameson briefly mentioned on IRC a while ago. I think we're broken on anything that isn't GCC or Clang at the moment, according to the last issue from someone trying to use the Intel compiler.
Yes, until we have this fixed and have MSVC CI, the compatibility can always be silently broken. Still, I think we should try our best to keep MSVC compatibility where possible.
@tknopp I guess it depends on your definition of reliable: I don't think there is any implementation that has been proven correct, but the problem cases that arise typically involve incorrect rounding when using a lot of decimal places and extreme exponents, e.g. this one. We would be extremely unlikely to hit such issues here (at least if using a modern libc), so it shouldn't be an issue to change them. It's more that it makes the algorithm clearer: e.g. some numbers are chosen so that they have zeros in the last few bits.
If it's worth it then it should be used. In the end we need to clearly define anyway whether hex floats are part of the language or not.
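As an aside on the exactness point above: whether a decimal literal is a safe substitute for a hex float can be checked mechanically by comparing bit patterns. A small illustrative sketch (not code from this PR; the constant shown is the Float64 nearest log(2)):

```julia
# Illustrative only: verify that a decimal literal round-trips to the
# exact bit pattern of the hex-float constant it would replace.
hexval = 0x1.62e42fefa39efp-1   # hex float literal for log(2)
decval = 0.6931471805599453     # proposed decimal replacement
reinterpret(UInt64, hexval) == reinterpret(UInt64, decval)  # true: safe to swap
```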
To test it, you can e.g. compare a thousand (random?) values against openlibm with a certain accuracy. You could also compare to

Is this log function vectorizable? I assume it is not, given that there is a table lookup and an

On what architectures does this give what kind of speed-up? For example, Intel's new AVX-512 instruction set contains a

It would be interesting to keep the old

I am planning a similar approach to yours for
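A sketch of the accuracy comparison suggested above (hypothetical harness, not from the PR; the library name "libopenlibm" and the 2-ulp tolerance are assumptions):

```julia
# Hypothetical test harness: compare the native log against openlibm's
# over random values spanning many magnitudes.
openlibm_log(x::Float64) = ccall((:log, "libopenlibm"), Float64, (Float64,), x)

for _ in 1:1000
    x = exp2(40 * (rand() - 0.5))   # random values in roughly [2^-20, 2^20]
    a, b = log(x), openlibm_log(x)
    # both implementations should agree to within a couple of ulps
    @assert abs(a - b) <= 2 * max(eps(a), eps(b), eps())
end
```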
True, but the cost of computing
No, as you point out I don't think this is a pattern that could be easily made to vectorise.
It varies: on a Xeon Westmere (no AVX), I saw about a 20% boost. On a Sandy Bridge i7, about 5%, and
We could create a
Agreed, I think having specialised
I think we should not worry about

Come to think of it, I really would just like to have different namespaces for math functions - Base.Math.OpenLibm, Base.Math.SysLibm, Base.Math.Native (Julia implementations). Then we can have all of these co-exist, and export different defaults as our implementations improve, while still having the ability to call a specific version. It would also make it easy then to have a
fwiw, i'm pushing for removal of the hex float parsing support entirely and require using
@vtjnash Distributing a working
I'm pretty sure that the double-conversion library includes a strtod implementation.
It would be nice to get this merged.
Okay, I can change the hex floats to decimal floats (we can revert later if Windows support improves). It did occur to me that we could generate the tables via

I don't have time to look at the
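One hedged sketch of such table generation (a hypothetical script, not part of the PR): compute the entries in extended precision and emit correctly rounded Float64 decimal literals, so no accuracy is lost by abandoning hex floats.

```julia
# Hypothetical generator: emit a log lookup table as correctly rounded
# Float64 decimal literals, computed via BigFloat at 256 bits.
setprecision(BigFloat, 256) do
    for j in 0:128
        entry = Float64(log(big(1) + big(j) / 128))  # round-to-nearest Float64
        println("    ", entry, ",")
    end
end
```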
# determine if hardware FMA is available
# should probably check with LLVM, see #9855.
const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) == -4.930380657631324e-32
This is a bit of a hack until #9855 is solved.
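For reference, my gloss on why the probe works (a sketch, not from the PR): with u = eps(1.0) = 2^-52, the exact product nextfloat(1.0)^2 = (1+u)^2 = 1 + 2u + u^2 rounds to 1 + 2u in plain double arithmetic, so the subtraction cancels to zero; a fused multiply-add keeps the product exactly and leaves the residual u^2 = 2^-104 ≈ 4.93e-32.

```julia
# Sketch of the FMA probe (illustrative, Float64 only).
u = eps(1.0)     # 2.0^-52
a = 1 + u        # nextfloat(1.0)
b = 1 + 2u       # nextfloat(1.0, 2)
# Without hardware FMA, a*a first rounds to b, so the result is 0.0.
# With a fused multiply-add, the residual u^2 survives the subtraction.
residual = muladd(a, a, -b)
abs(residual) in (0.0, 2.0^-104)  # one of the two, depending on hardware
```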
The MSVC build is mostly working again (some small issues remaining, ref #10775), so using decimal floats would be preferred. I'll check VS 2015 at some point after it gets officially released; so far I've only been using the prerelease versions remotely on AppVeyor.
Okay, I've updated and added some tests.
Should we retain a way to call the Openlibm log? The exported one can be the new one.
That's probably a good idea. Any ideas on a good way to do this? |
How about OpenLibm? In the future this could co-exist with the system Libm module too.
I am thinking that we release the new log in a new module that contains fast Julia implementations, and after it has been in the wild for a little bit, with more people kicking the tyres, we make it the default - kind of in two steps. Just being a bit paranoid.
Since the APIs are more or less compatible between openlibm and the system libm, wouldn't the code on the Julia side be the same between the two, just with different library names in ccall?
Yes - just some macros to generate both are needed.
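A minimal sketch of that macro idea (hypothetical names; the library names are assumptions): generate identical one-argument wrappers for both libraries, differing only in the name passed to ccall.

```julia
# Hypothetical: generate the same thin wrappers for openlibm and the
# system libm; only the library name in the ccall differs.
for f in (:log, :exp, :sin)
    @eval $(Symbol(:openlibm_, f))(x::Float64) =
        ccall(($(QuoteNode(f)), "libopenlibm"), Float64, (Float64,), x)
    @eval $(Symbol(:syslibm_, f))(x::Float64) =
        ccall(($(QuoteNode(f)), "libm"), Float64, (Float64,), x)
end
```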
So is the idea that we have 3 modules - Base.Math.OpenLibm, Base.Math.SysLibm, and Base.Math.Native -
and have a build variable to switch between them?
Perhaps eventually. What do you think? Is it overkill? Perhaps the SystemLibm could just be a package, and then we'd have just two.
One thing that I found when trying to wrap an alternate blas implementation as a package is that library bindings for things already in base would pretty much just be copying code straight out of base, and would be fairly messy to keep up to date over time (I didn't really bother trying). For libraries like blas, lapack, libm (MKL's FFTW maybe?), it might be worth trying to make the base wrapper code a little more modular, so we expose a "wrap this library" entry point and base just uses one instance of that.
Could we allow users to change the libname that gets passed to ccall, so that a different library gets picked up? That way we don't have to replicate code for every library and bloat the system image, but users can still switch. In this particular case, I still prefer JuliaLibm to contain the new Julia implementations as they come.
Okay, so should I just wrap this PR in a module? Or would we be better off keeping it in a package?
My preference is to reinstate the original
use Tang's algorithm for log and log1p
I just realized that it would be nice to update the documentation to mention the new faster log functions as well, and how to access them.
Any suggestion for where would be appropriate?
I was thinking of having a note in the documentation of the existing
I've added a note in 939e9e2 (though I didn't mention Yeppp.jl).
Thanks!
This has been here for a while... time to update the default
This implements log and log1p in native Julia, using Tang's table lookup algorithm (see #8869). Depending on architecture, it seems to be about 5-20% faster than openlibm, and is more accurate (relative error is below 0.56 ulps).

Some issues:

If we're happy to use this, it should be fairly straightforward to modify this code for log2 and log10 (as well as their 1p versions, see #6148).
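For readers of this thread, a simplified sketch of the table-lookup reduction (my illustration only, not the PR's actual code, which uses carefully chosen table constants and a minimax polynomial): write x = 2^k * m with m in [1,2), pick the table point F = 1 + j/128 nearest m, and evaluate log(x) = k*log(2) + log(F) + log1p((m-F)/F), where the last term needs only a short polynomial because |(m-F)/F| <= 2^-8.

```julia
# Simplified Tang-style log sketch (illustrative; assumes normal, positive x).
const LOGTABLE = [log(1 + j/128) for j in 0:128]   # would be precomputed literals

function log_sketch(x::Float64)
    k = exponent(x)                # x = 2^k * m
    m = significand(x)             # m in [1, 2)
    j = round(Int, (m - 1) * 128)  # nearest table index
    F = 1 + j/128
    r = (m - F)/F                  # small remainder, |r| <= 2^-8
    # short series for log1p(r); a real implementation uses a minimax fit
    p = r - r^2/2 + r^3/3
    return k*log(2.0) + LOGTABLE[j + 1] + p
end
```

The table lookup plus the data-dependent index is also what makes straight-line vectorisation awkward, as discussed above.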