use Tang's algorithm for log and log1p #10008
Conversation
Is there any advantage of using the hex float representation?
It ensures we get exactly the floats we want, which is critical for these types of functions: occasionally weird bugs do arise in decimal float parsing. It is also typically faster to parse, but that's not really a concern here.
The other issue is what to do when
MSVC compatibility is and has been broken since #9266 added some inline assembly in Julia's C code. It's still not fixed because I didn't understand a possible workaround using longjmp that Jameson briefly mentioned on IRC a while ago. I think we're broken on anything that isn't GCC or Clang at the moment, according to the last issue from someone trying to use the Intel compiler.
Yes, until we have this fixed and have MSVC CI, the compatibility can always be silently broken. Still, I think we should try our best to keep MSVC compatibility where possible.
@tknopp I guess it depends on your definition of reliable: I don't think there is any implementation that has been proven correct, but the problem cases that arise typically involve incorrect rounding when using a lot of decimal places and extreme exponents, e.g. this one. We would be extremely unlikely to hit such issues here (at least if using a modern libc), so it shouldn't be an issue to change them. It's more that it makes the algorithm clearer: e.g. some numbers are chosen so that they have zeros in the last few bits.
If it's worth it then it should be used. In the end we need to clearly define anyway whether hex floats are part of the language or not.
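As an aside on the exactness point above: whether a decimal literal is a safe substitute for a hex float can be checked mechanically by comparing bit patterns. A small illustrative sketch (not code from this PR; the constant shown is the Float64 nearest log(2)):

```julia
# Illustrative only: verify that a decimal literal round-trips to the
# exact bit pattern of the hex-float constant it would replace.
hexval = 0x1.62e42fefa39efp-1   # hex float literal for log(2)
decval = 0.6931471805599453     # proposed decimal replacement
reinterpret(UInt64, hexval) == reinterpret(UInt64, decval)  # true: safe to swap
```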
To test it, you can e.g. compare a thousand (random?) values against openlibm with a certain accuracy. You could also compare to

Is this log function vectorizable? I assume it is not, given that there is a table lookup and an

On what architectures does this give what kind of speed-up? For example, Intel's new AVX-512 instruction set contains a

It would be interesting to keep the old

I am planning a similar approach to yours for
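A sketch of the accuracy comparison suggested above (hypothetical harness, not from the PR; the library name "libopenlibm" and the 2-ulp tolerance are assumptions):

```julia
# Hypothetical test harness: compare the native log against openlibm's
# over random values spanning many magnitudes.
openlibm_log(x::Float64) = ccall((:log, "libopenlibm"), Float64, (Float64,), x)

for _ in 1:1000
    x = exp2(40 * (rand() - 0.5))   # random values in roughly [2^-20, 2^20]
    a, b = log(x), openlibm_log(x)
    # both implementations should agree to within a couple of ulps
    @assert abs(a - b) <= 2 * max(eps(a), eps(b), eps())
end
```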
True, but the cost of computing
No, as you point out I don't think this is a pattern that could be easily made to vectorise.
It varies: on a Xeon Westmere (no AVX), I saw about a 20% boost. On a Sandy Bridge i7, about 5%, and
We could create a
Agreed, I think having specialised
I think we should not worry about

Come to think of it, I really would just like to have different namespaces for math functions - Base.Math.OpenLibm, Base.Math.SysLibm, Base.Math.Native (Julia implementations). Then we can have all of these co-exist, and export different defaults as our implementations improve, while still having the ability to call a specific version. It would also make it easy then to have a
fwiw, i'm pushing for removal of the hex float parsing support entirely and require using
@vtjnash Distributing a working
I'm pretty sure that the double-conversion library includes a strtod implementation.
It would be nice to get this merged.
Okay, I can change the hex floats to decimal floats (we can revert later if Windows support improves). It did occur to me that we could generate the tables via

I don't have time to look at the
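One hedged sketch of such table generation (a hypothetical script, not part of the PR): compute the entries in extended precision and emit correctly rounded Float64 decimal literals, so no accuracy is lost by abandoning hex floats.

```julia
# Hypothetical generator: emit a log lookup table as correctly rounded
# Float64 decimal literals, computed via BigFloat at 256 bits.
setprecision(BigFloat, 256) do
    for j in 0:128
        entry = Float64(log(big(1) + big(j) / 128))  # round-to-nearest Float64
        println("    ", entry, ",")
    end
end
```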
# determine if hardware FMA is available
# should probably check with LLVM, see #9855.
const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) == -4.930380657631324e-32
This is a bit of a hack until #9855 is solved.
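For reference, my gloss on why the probe works (a sketch, not from the PR): with u = eps(1.0) = 2^-52, the exact product nextfloat(1.0)^2 = (1+u)^2 = 1 + 2u + u^2 rounds to 1 + 2u in plain double arithmetic, so the subtraction cancels to zero; a fused multiply-add keeps the product exactly and leaves the residual u^2 = 2^-104 ≈ 4.93e-32.

```julia
# Sketch of the FMA probe (illustrative, Float64 only).
u = eps(1.0)     # 2.0^-52
a = 1 + u        # nextfloat(1.0)
b = 1 + 2u       # nextfloat(1.0, 2)
# Without hardware FMA, a*a first rounds to b, so the result is 0.0.
# With a fused multiply-add, the residual u^2 survives the subtraction.
residual = muladd(a, a, -b)
abs(residual) in (0.0, 2.0^-104)  # one of the two, depending on hardware
```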
The MSVC build is mostly working again (some small issues remaining, ref #10775), so using decimal floats would be preferred. I'll check VS 2015 at some point after it gets officially released; so far I've only been using the prerelease versions remotely on AppVeyor.
Okay, I've updated and added some tests.
Should we retain a way to call the Openlibm log? The exported one can be the new one.
That's probably a good idea. Any ideas on a good way to do this? |
How about OpenLibm? In the future this could co-exist with the system Libm module too.
I am thinking that we release the new log in a new module that contains fast Julia implementations, and after it has been in the wild for a little bit, with more people kicking the tyres, we make it the default - kind of in two steps. Just being a bit paranoid.
Since the APIs are more or less compatible between openlibm and the system libm, wouldn't the code on the Julia side be the same between the two, just with different library names in ccall?
Yes - just some macros to generate both are needed.
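A minimal sketch of that macro idea (hypothetical names; the library names are assumptions): generate identical one-argument wrappers for both libraries, differing only in the name passed to ccall.

```julia
# Hypothetical: generate the same thin wrappers for openlibm and the
# system libm; only the library name in the ccall differs.
for f in (:log, :exp, :sin)
    @eval $(Symbol(:openlibm_, f))(x::Float64) =
        ccall(($(QuoteNode(f)), "libopenlibm"), Float64, (Float64,), x)
    @eval $(Symbol(:syslibm_, f))(x::Float64) =
        ccall(($(QuoteNode(f)), "libm"), Float64, (Float64,), x)
end
```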
So is the idea that we have 3 modules - Base.Math.OpenLibm, Base.Math.SysLibm, and Base.Math.Native -
and have a build variable to switch between them?
Perhaps eventually. What do you think? Is it overkill? Perhaps the SystemLibm could just be a package, and then we'd have just two.
One thing that I found when trying to wrap an alternate blas implementation as a package is that library bindings for things already in base would pretty much just be copying code straight out of base, and would be fairly messy to keep up to date over time (I didn't really bother trying). For libraries like blas, lapack, libm (MKL's FFTW maybe?), it might be worth trying to make the base wrapper code a little more modular, so we expose a "wrap this library" entry point and base just uses one instance of that.
Could we allow users to change the libname that gets passed to ccall, so that a different library gets picked up? That way we don't have to replicate code for every library and bloat the system image, but users can still switch. In this particular case, I still prefer JuliaLibm to contain the new Julia implementations as they come.
Okay, so should I just wrap this PR in a module? Or would we be better off keeping it in a package?
My preference is to reinstate the original
use Tang's algorithm for log and log1p
I just realized that it would be nice to update the documentation to mention the new faster log functions as well, and how to access them.
Any suggestion for where would be appropriate?
I was thinking of having a note in the documentation of the existing
I've added a note in 939e9e2 (though I didn't mention Yeppp.jl).
Thanks!
This has been here for a while... time to update the default
This implements log and log1p in native Julia, using Tang's table lookup algorithm (see #8869). Depending on architecture, it seems to be about 5-20% faster than openlibm, and is more accurate (relative error is below 0.56 ulps).

Some issues:

If we're happy to use this, it should be fairly straightforward to modify this code for log2 and log10 (as well as their 1p versions, see #6148).
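For readers of this thread, a simplified sketch of the table-lookup reduction (my illustration only, not the PR's actual code, which uses carefully chosen table constants and a minimax polynomial): write x = 2^k * m with m in [1,2), pick the table point F = 1 + j/128 nearest m, and evaluate log(x) = k*log(2) + log(F) + log1p((m-F)/F), where the last term needs only a short polynomial because |(m-F)/F| <= 2^-8.

```julia
# Simplified Tang-style log sketch (illustrative; assumes normal, positive x).
const LOGTABLE = [log(1 + j/128) for j in 0:128]   # would be precomputed literals

function log_sketch(x::Float64)
    k = exponent(x)                # x = 2^k * m
    m = significand(x)             # m in [1, 2)
    j = round(Int, (m - 1) * 128)  # nearest table index
    F = 1 + j/128
    r = (m - F)/F                  # small remainder, |r| <= 2^-8
    # short series for log1p(r); a real implementation uses a minimax fit
    p = r - r^2/2 + r^3/3
    return k*log(2.0) + LOGTABLE[j + 1] + p
end
```

The table lookup plus the data-dependent index is also what makes straight-line vectorisation awkward, as discussed above.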