This project is no longer being maintained.
Parakeet was a runtime accelerator for an array-oriented subset of Python. In retrospect, I don't think that whole-function type specialization at the AST level is a scalable approach to speeding up a sufficiently large subset of Python. General-purpose Python code should probably be accelerated using a bytecode JIT, whereas high-performance numerical code should use a DSL with explicit parallel operators.
To accelerate a function, wrap it with Parakeet's @jit decorator:
import numpy as np
from parakeet import jit
alpha = 0.5
beta = 0.3
x = np.array([1,2,3])
y = np.tanh(x * alpha) + beta
@jit
def fast(x, alpha = 0.5, beta = 0.3):
return np.tanh(x * alpha) + beta
@jit
def loopy(x, alpha = 0.5, beta = 0.3):
y = np.empty_like(x, dtype = float)
for i in xrange(len(x)):
y[i] = np.tanh(x[i] * alpha) + beta
return y
@jit
def comprehension(x, alpha = 0.5, beta = 0.3):
return np.array([np.tanh(xi*alpha) + beta for xi in x])
assert np.allclose(fast(x), y)
assert np.allclose(loopy(x), y)
assert np.allclose(comprehension(x), y)
You should be able to install Parakeet from its PyPI package by running:
pip install parakeet
Parakeet is written for Python 2.7 (sorry internet) and depends on:
The default backend (which uses OpenMP) requires gcc
4.4+.
Windows: If you have a 32-bit Windows install, your compiler should come from Cygwin or MinGW. Getting Parakeet working on 64-bit Windows is non-trivial and seems to require colossal hacks.
Mac OS X: By default, your machine probably either has only clang or an outdated version of gcc
. You can get a more recent version using HomeBrew
If you want to use the CUDA backend, you need to have an NVIDIA graphics card and install both the CUDA Toolkit and PyCUDA.
Your untyped function gets used as a template from which multiple type specializations are generated (for each distinct set of input types). These typed functions are then churned through many optimizations before finally getting translated into native code.
- Ask questions on the discussion group
- Watch the Parakeet presentation from this year's PyData Boston, look at the HotPar slides from last year
- Contact the main developer directly
Parakeet cannot accelerate arbitrary Python code, it only supports a limited subset of the language:
- Scalar operations (i.e.
x + 3 * y
) - Control flow (if-statements, loops, etc...)
- Nested functions and lambdas
- Tuples
- Slices
- NumPy array expressions (i.e.
x[1:, :] + 2 * y[:-1, ::2]
) - Some NumPy library functions like
np.ones
andnp.sin
(look at the mappings module for a full list) - List literals (interpreted as array construction)
- List comprehensions (interpreted as array comprehensions)
- Parakeet's higher order array operations like
parakeet.imap
,parakeet.scan
, andparakeet.allpairs
Parakeet currently supports compilation to sequential C, multi-core C with OpenMP (default), or LLVM (deprecated). To switch between these options change parakeet.config.backend
to one of:
- "openmp": compiles with gcc, parallel operators run across multiple cores (default)
- "c": lowers all parallel operators to loops, compile sequential code with gcc
- "cuda": launch parallel operations on the GPU (experimental)
- "llvm": older backend, has fallen behind and some programs may not work
- "interp" : pure Python intepreter used for debugging optimizations, only try this if you think CPython is about 10,000x too fast for your taste