
Generic implementations? #15

Closed
StefanKarpinski opened this issue Aug 26, 2016 · 7 comments

Comments

@StefanKarpinski

In C-based libm libraries, it doesn't make too much sense to share code between Float32 and Float64 implementations, let alone Complex{Float32} and Complex{Float64}. The monomorphism of the language and the complexity of writing generic implementations make it counterproductive – it's easier to understand what's happening when each implementation is concretely spelled out.

In Julia, I think we could do considerably better: multiple dispatch, macros, the occasional generated function, and other features should make it possible to write very generic versions of algorithms and still get maximal performance. Generic versions are also often easier to understand in much the same way that a more general theorem is easier to understand than a very specific application of it may be. This way we could fairly easily get efficient libm code for Float16, DoubleDouble, and ultimately Float128 once any hardware supports that. Thoughts?

@ViralBShah
Member

See #14

We can certainly do quite a bit in this direction.

@musm
Collaborator

musm commented Sep 1, 2016

Here's the situation: the two major things that change between the Float32 and Float64 methods are the constants used and the degree of the polynomial approximant.

Is it possible to do something where, based on the argument type, we could call the correct constants array with zero overhead (two arrays of different lengths, one for Float32 and the other for Float64) and sum the polynomial via the horner macro using that array? I.e. build the two constant arrays, one for Float64 and the other for Float32, then, depending on the type signature, call the correct table and apply the horner macro. I'm not sure if this can be done with zero overhead.
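
To make that concrete, here's a minimal sketch of the idea (the names coeffs, horner_sum, and poly_kernel and all coefficient values are placeholders, not a real approximation). The coefficient tuple is selected purely from the argument type, so the "table lookup" should cost nothing at runtime.

# Hypothetical coefficient tables; real values would come from a minimax fit.
coeffs(::Type{Float64}) = (1.0, 0.5, 1/6, 1/24, 1/120)  # longer table, higher degree
coeffs(::Type{Float32}) = (1.0f0, 0.5f0, 0.1666667f0)   # shorter table

# Horner evaluation over a tuple; the tuple length is fixed by the argument
# type, so the compiler can specialize (and typically unroll) the loop.
@inline function horner_sum(x, c::Tuple)
    r = c[end]
    for i in length(c)-1:-1:1
        r = muladd(x, r, c[i])
    end
    return r
end

# Dispatch picks the right table; no runtime branching or array indexing.
poly_kernel(x) = horner_sum(x, coeffs(typeof(x)))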

@simonbyrne
Member

simonbyrne commented Sep 1, 2016

The two easiest options would be to either use an inlined function, e.g.

function foo(x)
   ...
   y = _foo(x)
   ...
end
@inline _foo(x::Float32) = @horner x ...
@inline _foo(x::Float64) = @horner x ...

or an if else block, e.g.

function foo{T}(x::T)
    ...
    if T == Float32
        y = @horner x ...
    else
        y = @horner x ...
    end
    ...
end
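
For concreteness, here's a toy version of the first option (coefficients are made up; @horner stands for this package's macro, which in Base lives unexported in Base.Math):

import Base.Math: @horner   # or use the package's own @horner

# The wrapper is generic; the polynomial evaluation is picked by dispatch
# and inlined, so there is no runtime branch on the type.
function foo(x::Union{Float32,Float64})
    # argument reduction etc. would go here
    y = _foo(x)
    # reconstruction etc. would go here
    return y
end

@inline _foo(x::Float32) = @horner x 1.0f0 0.5f0 0.25f0
@inline _foo(x::Float64) = @horner x 1.0 0.5 0.25 0.125 0.0625

Both options specialize at compile time; the dispatch-based one just keeps the per-type details out of the main body.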

@vchuravy

vchuravy commented Sep 1, 2016

The inlined function will generalise to Float16 and DoubleDouble, so I would say that is preferable.
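
For example, building on the _foo methods sketched above, a hypothetical Float16 path could be just one more method (here widening to Float32, as Base often does for Float16):

@inline _foo(x::Float16) = Float16(_foo(Float32(x)))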

@musm
Collaborator

musm commented Sep 2, 2016

So I discovered that in some cases having things inside let blocks can really throw off the compiler, resulting in very bad code gen, e.g.:

let
function foo(x)
   ...
   y = _foo(x)
   ...
end
@inline _foo(x::Float32) = @horner x ...
@inline _foo(x::Float64) = @horner x ...
end

Edit: already reported by Simon as a bug

@musm
Collaborator

musm commented Sep 16, 2016

@simonbyrne do you recall the exact issue number for the bug above?

@simonbyrne
Member

simonbyrne commented Sep 17, 2016

I opened JuliaLang/julia#18201, but it's due to JuliaLang/julia#15276.

musm closed this as completed Jan 17, 2017