LLVM compilation #548

Merged: 39 commits from sn6uv:LLVMCompile into mathics:master, Sep 23, 2016

Conversation

@sn6uv (Member) commented Sep 14, 2016

Allows compiling simple expressions with llvmlite.

At the moment nothing else is aware of CompiledFunction, so the performance benefits are marginal (most of the time is spent looking for patterns), but this has the potential to speed up plotting significantly.

In[1]:= cf = Compile[{{x, _Real}}, Sin[x + 2]]
Out[1]= CompiledFunction[{x}, Sin[2 + x], -CompiledCode-]

In[2]:= data = RandomReal[1, 100];
In[3]:= Map[cf, data]; // Timing
Out[3]= {0.084535, Null}

In[4]:= Map[Sin[#1 + 2]&, data]; // Timing
Out[4]= {0.120563, Null}

Requires a minor bugfix to llvmlite: numba/llvmlite#205.
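
For readers unfamiliar with llvmlite, here is a minimal standalone sketch (not the code from this PR) of the underlying technique: build LLVM IR for sin(x + 2) with llvmlite's ir module, JIT-compile it with MCJIT, and wrap the resulting native address in a ctypes callable. The intrinsic declaration is presumably the kind of call affected by the llvmlite fix linked above.

    import ctypes
    import math

    import llvmlite.binding as llvm
    from llvmlite import ir

    # One-time LLVM initialisation.
    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    # Build IR equivalent to: double f(double x) { return sin(x + 2.0); }
    dbl = ir.DoubleType()
    module = ir.Module(name="compile_demo")
    func = ir.Function(module, ir.FunctionType(dbl, [dbl]), name="f")
    builder = ir.IRBuilder(func.append_basic_block("entry"))
    sin_fn = module.declare_intrinsic("llvm.sin", [dbl])
    tmp = builder.fadd(func.args[0], ir.Constant(dbl, 2.0))
    builder.ret(builder.call(sin_fn, [tmp]))

    # JIT-compile the module and wrap the native entry point with ctypes.
    engine = llvm.create_mcjit_compiler(
        llvm.parse_assembly(str(module)),
        llvm.Target.from_default_triple().create_target_machine())
    engine.finalize_object()
    cf = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)(
        engine.get_function_address("f"))

    assert abs(cf(1.0) - math.sin(3.0)) < 1e-12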

@ghost commented Sep 14, 2016

Please just make it optional, so Mathics can be used without llvmlite; on PyPy it would probably only slow things down.

@sn6uv (Member, Author) commented Sep 14, 2016

@TiberiumPy, yes, the LLVM dependency is optional. I think even PyPy can be optimised with this (although cffi would be a better fit than ctypes there).

The example above is still faster on PyPy (after warming up the JIT):

In[46]:= data = RandomReal[1, 10000];
In[47]:= Map[f, data]; // Timing
Out[47]= {1.31894, Null}

In[48]:= Map[cf, data]; // Timing
Out[48]= {0.940861, Null}
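
As a hypothetical illustration of the cffi-over-ctypes point (reusing the engine object from the sketch above): cffi can cast the raw JITed address straight to a typed C function pointer, a call path PyPy's JIT handles far better than a ctypes wrapper.

    import cffi

    ffi = cffi.FFI()
    # Cast the integer address returned by MCJIT to a typed C function
    # pointer; on PyPy, calls through cffi pointers are JIT-friendly.
    cf_cffi = ffi.cast("double(*)(double)", engine.get_function_address("f"))
    print(cf_cffi(1.0))   # same result as the ctypes wrapper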

@sn6uv sn6uv force-pushed the LLVMCompile branch 2 times, most recently from 62c000f to 86a12da on September 14, 2016 at 13:14
@sn6uv (Member, Author) commented Sep 14, 2016

I've got some experimental plotting working (pushed to my CompilePlot branch). The speed-ups are significant:

With Compile (CompilePlot branch):

(venv_pypy) angus@thinkpad> python mathics/benchmark.py -s Plot
Plot
  'Plot[0, {x, -3, 3}]'
      100 loops, avg: 11.8 ms, best: 5.45 ms, median: 8.62 ms per loop
  'Plot[x^2 + x + 1, {x, -3, 3}]'
     1000 loops, avg: 5.44 ms, best: 3.12 ms, median: 4.39 ms per loop
  'Plot[Sin[Cos[x^2]], {x, -3, 3}]'
     1000 loops, avg: 6.07 ms, best:    5 ms, median: 5.33 ms per loop
  'Plot[Sin[100 x], {x, -3, 3}]'
     1000 loops, avg: 5.18 ms, best: 4.43 ms, median:  4.6 ms per loop

(venv_pypy) angus@thinkpad> python mathics/benchmark.py -s Plot3D
Plot3D
  'Plot3D[0, {x, -1, 1}, {y, -1, 1}]'
      100 loops, avg: 18.4 ms, best: 9.09 ms, median: 13.6 ms per loop
  'Plot3D[x + y^2, {x, -3, 3}, {y, -2, 2}]'
      100 loops, avg: 20.5 ms, best: 13.4 ms, median: 16.5 ms per loop
  'Plot3D[Sin[x + y^2], {x, -3, 3}, {y, -3, 3}]'
      100 loops, avg: 46.4 ms, best: 36.1 ms, median:   39 ms per loop
  'Plot3D[Sin[100 x + 100 y ^ 2], {x, 0, 1}, {y, 0, 1}]'
      100 loops, avg: 38.9 ms, best: 31.2 ms, median: 33.1 ms per loop

Without Compile (current master):

(venv_pypy) angus@thinkpad> python mathics/benchmark.py -s Plot
Plot
  'Plot[0, {x, -3, 3}]'
       10 loops, avg:  120 ms, best: 87.2 ms, median:  116 ms per loop
  'Plot[x^2 + x + 1, {x, -3, 3}]'
        5 loops, avg:  395 ms, best:  218 ms, median:  313 ms per loop
  'Plot[Sin[Cos[x^2]], {x, -3, 3}]'
        5 loops, avg:  499 ms, best:  328 ms, median:  445 ms per loop
  'Plot[Sin[100 x], {x, -3, 3}]'
        5 loops, avg:  245 ms, best:  164 ms, median:  216 ms per loop

(venv_pypy) angus@thinkpad> python mathics/benchmark.py -s Plot3D
Plot3D
  'Plot3D[0, {x, -1, 1}, {y, -1, 1}]'
       10 loops, avg:  141 ms, best: 68.2 ms, median:  138 ms per loop
  'Plot3D[x + y^2, {x, -3, 3}, {y, -2, 2}]'
        5 loops, avg:  277 ms, best:  162 ms, median:  258 ms per loop
  'Plot3D[Sin[x + y^2], {x, -3, 3}, {y, -3, 3}]'
        5 loops, avg:  366 ms, best:  272 ms, median:  339 ms per loop
  'Plot3D[Sin[100 x + 100 y ^ 2], {x, 0, 1}, {y, 0, 1}]'
        5 loops, avg:  378 ms, best:  323 ms, median:  351 ms per loop

@teaalltr (Contributor) commented:

What about adding packed array support? It would be very useful throughout Mathics. It could be implemented either with NumPy arrays or directly via SIMD-enabled LLVM compilation. It should of course be completely transparent to the user, and as transparent as possible to functions. Maybe automatic conversion could be done during idle time between evaluations.

teaalltr referenced this pull request in sn6uv/Mathics Sep 14, 2016
@sn6uv sn6uv force-pushed the LLVMCompile branch 6 times, most recently from 16c21be to fb79ede on September 15, 2016 at 08:04
@sn6uv (Member, Author) commented Sep 15, 2016

Packed arrays would be a useful addition from a performance perspective. As you mentioned, the important (and difficult) thing is to make them transparent to the user. I'm not sure LLVM is a good candidate for this, but NumPy might be.
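
As a rough sketch of the NumPy route (purely illustrative, not part of this PR): a packed list of machine reals would be backed by a contiguous array, so an element-wise operation becomes a single vectorized call instead of per-element evaluator round trips.

    import numpy as np

    # Hypothetical packed representation of RandomReal[1, 100]: one
    # contiguous buffer of doubles instead of 100 boxed expressions.
    data = np.random.rand(100)

    # Sin[# + 2]& over the packed array runs as one vectorized call.
    result = np.sin(data + 2.0)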

@poke1024 (Contributor) commented:

LLVM should handle SIMD optimizations for packed arrays beautifully: http://llvm.org/devmtg/2014-02/slides/golin-AutoVectorizationLLVM.pdf. NumPy will always be at a disadvantage, since it only ships fixed prepackaged operations and cannot optimize the specific problem at the instruction level. I'm really excited about this introduction of LLVM into Mathics.

@sn6uv (Member, Author) commented Sep 15, 2016

Some really interesting stuff on calling back into python from within llvm: http://eli.thegreenplace.net/2015/calling-back-into-python-from-llvmlite-jited-code/.

I've got a basic Print function working with a similar approach, and the callbacks are quite powerful: we can read/write the Mathics symbol table (Definitions) at LLVM runtime and evaluate arbitrary un-compilable expressions, à la MainEvaluate.
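
A minimal self-contained sketch of that callback technique, following the linked post (hypothetical names, not this PR's implementation): expose a Python function through ctypes, embed its raw address in the IR as a function pointer, and call it from the JITed code.

    import ctypes

    import llvmlite.binding as llvm
    from llvmlite import ir

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    # A Python function the JITed code will call back into; this is where
    # a MainEvaluate-style hook into Definitions would live.
    CB = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

    def py_callback(x):
        print("callback from LLVM with", x)
        return 2.0 * x

    cb = CB(py_callback)                              # must stay referenced
    cb_addr = ctypes.cast(cb, ctypes.c_void_p).value  # raw function address

    # Build IR for: double g(double x) { return py_callback(x) + 1.0; }
    # (assumes a 64-bit platform for the pointer-sized integer below)
    dbl = ir.DoubleType()
    module = ir.Module(name="callback_demo")
    func = ir.Function(module, ir.FunctionType(dbl, [dbl]), name="g")
    builder = ir.IRBuilder(func.append_basic_block("entry"))
    cb_ptr = builder.inttoptr(ir.Constant(ir.IntType(64), cb_addr),
                              ir.FunctionType(dbl, [dbl]).as_pointer())
    builder.ret(builder.fadd(builder.call(cb_ptr, [func.args[0]]),
                             ir.Constant(dbl, 1.0)))

    engine = llvm.create_mcjit_compiler(
        llvm.parse_assembly(str(module)),
        llvm.Target.from_default_triple().create_target_machine())
    engine.finalize_object()
    g = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)(
        engine.get_function_address("g"))
    print(g(3.0))   # prints the callback message, then 7.0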

@wolfv (Member) commented Sep 19, 2016

I am wondering whether compiling to LLVM is better than taking the resulting SymPy nodes and using SymPy's code generation to produce C++, then compiling and importing that into Python, e.g. with cppimport (https://github.com/tbenthompson/cppimport). Just a thought :)
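
For reference, a minimal sketch of that route using SymPy's codegen utility (the compile-and-import step via cppimport or similar is omitted):

    import sympy as sp
    from sympy.utilities.codegen import codegen

    x = sp.symbols('x')
    expr = sp.sin(x + 2)

    # Emit C source for the expression; compiling and importing it into
    # Python (e.g. with cppimport) would be a separate step.
    (c_name, c_code), (h_name, h_code) = codegen(("f", expr), "C99")
    print(c_code)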

@sn6uv (Member, Author) commented Sep 21, 2016

I've rebased on master to fix conflicts in setup.py. I've fixed the llvmlite intrinsic issue, so I think this can be merged soon.

@sn6uv sn6uv added this to the 1.0 milestone Sep 21, 2016
@sn6uv sn6uv merged commit 6d9c95b into mathics:master Sep 23, 2016
@sn6uv sn6uv deleted the LLVMCompile branch September 23, 2016 01:38