Skip to content

Latest commit

 

History

History
507 lines (353 loc) · 8.95 KB

functional-python.md

File metadata and controls

507 lines (353 loc) · 8.95 KB

Functional Programming in Python

About me

Data Scientist @ idalab (mainly Python) Used Ruby, JS, Python, Haskell, Swift for nontrivial projects Played with Clojure, Scala, Erlang, Elixir

http://kirelabs.org/fun-js

About this talk

  • Not a motivation of functional programming
  • How can FP by used in Python

Disclaimer

There should be one — and preferably only one — obvious way to do it. -- PEP 20 — The Zen of Python

^ Python has a philosophy and many of the things you will see go against this philosopy This goes so far that

Disclaimer (cont)

The fate of reduce() in Python 3000

not having the choice streamlines the thought process -- Guido van Rossum

^ 2005 Disagree: language should support me Don’t recommend you to use what I’m presenting here Show you the entrance to the rabbit hole Jupyter Notebook! Don’t be afraid to interrupt me!

Functional Programming (in Python)

  • first class functions
  • higher order functions
  • purity
  • immutability
  • composition
  • partial application & currying
  • recursion

^ Vocabulary of fuctional concepts

Functional Programming (in Python)

  • first class functions
  • higher order functions
  • purity
  • immutability (not today)
  • composition
  • partial application & currying
  • recursion (neither)

Purity

functions without side-effects

def add(a, b):
	return a + b
	
additions_made = 0
def add(a, b):
	global additions_made
	additions_made += 1
	return a + b

First class functions

def add(a, b):
	return a + b
	
add_function = add
	
add = lambda a,b: a + b

higher order functions

def timer(fn):
	def timed(*args, **kwargs):
		t = time()
		fn(*args, *kwargs)
		print "took {time}".format(time=time()-t)
		
	return timed

def compute():
	#…

timed_compute = timer(compute)
timed_compute()

Decorators

@timer
def compute():
    sleep(1)
    
compute()

Partial function application

def add1(num):
	return add(1, num)
add1(1)

# simpler
from functools import partial
add1 = partial(add, 1)
add1(1)

^ Toy example - just building up a toolbox for the fun stuff

Currying

[…] transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument (partial application) — Wikipedia

Currying

def curried_add(a):
	def inner(b):
		return add(a,b)
	return inner
		
add(1)    # => <function …>
add(1)(1) # => 2

Currying

from toolz import curry
add = curry(add)

add(1)    # => <function …>
add(1, 1) # => 2

Interlude: Closures

def curried_add(a):
	def inner(b):
		return add(a,b)
	return inner
		
add(1)    # => <function …>
add(1)(1) # => 2

Currying example from the stdlib

from operator import itemgetter, attrgetter, methodcaller

obj.method()

from operator import methodcaller
methodcaller("method")(obj)

^ We will see how this is useful

[fit] Functional collection transformations

map

map(f, iter)

[f(el) for el in seq]

filter

filter(p, seq)

[el for el in seq if p(el)]

reduce

from functools import reduce
reduce(f, seq, initial)

result = initial
for el in seq:
	result = f(result, el)

^ almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what's actually being fed into that function before I understand what the reduce() is supposed to do

function composition

[f(x) for x in seq if p(x)]

map(f, filter(p, seq))

from toolz.curried import compose, map, filter
compute = compose(map(f), filter(p))
compute(seq)

^ Illustrational purposes

Example: A bad CSV parser (1/3)

csv = """firstName;lastName
Jim;Drake
Ben;James
Tim;Banes"""

target = [{'firstName': 'Jim', 'lastName': 'Drake'},
          {'firstName': 'Ben', 'lastName': 'James'},
          {'firstName': 'Tim', 'lastName': 'Banes'}]

Example: Imperative Python (2/3)

lines = csv.split("\n")
matrix = [line.split(';') for line in lines]
header = matrix.pop(0)
records = []
for row in matrix:
    record = {}
    for index, key in enumerate(header):
        record[key] = row[index]
    records.append(record)

Example: Functional Python (3/3)

from toolz.curried import compose, map
from functools import partial
from operator import methodcaller

split = partial(methodcaller, 'split')
split_lines = split("\n")
split_fields = split(';')
dict_from_keys_vals = compose(dict, zip)
csv_to_matrix = compose(map(split_fields), split_lines)

matrix = csv_to_matrix(csv)
keys = next(matrix)
records = map(partial(dict_from_keys_vals, keys), matrix)

Example: PySpark

docker run --rm -v ${PWD}:/home/jovyan/work -p 8888:8888 jupyter/pyspark-notebook
def sample(p):
    x, y = random(), random()
    return 1 if x*x + y*y < 1 else 0

count = sc.parallelize(range(0, NUM_SAMPLES)) \
	.map(sample) \
	.reduce(lambda a, b: a + b)
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))

^ http://spark.apache.org/docs/latest/programming-guide.html#transformations

Example: K-Means

(Stolen and modified from Joel Grus)

def kmeans(points, k):
	return until_convergence(
		iterate(
			find_new_means(points),
			random.sample(points, k)))

def until_convergence(it):
    return last(accumulate(no_repeat, it))
    
def no_repeat(prev, curr):
    if prev == curr: raise StopIteration
    else: return curr

import random
from toolz.curried import iterate, accumulate, curry, groupby, last, compose

def kmeans(k, points):
    return until_convergence(iterate(find_new_means(points), random.sample(points, k)))
    
@curry
def find_new_means(points, old_means):
    k = len(old_means)
    clusters = groupby(compose(str, closest_mean(old_means)), points).values()
    return list(map(cluster_mean, clusters))

@curry
def closest_mean(means, point):
    return min(means, key=squared_distance(point))

@curry
def squared_distance(p, q):
    return sum((p_i - q_i)**2 for p_i, q_i in zip(p, q))

def cluster_mean(points):
    num_points = len(points)
    dim = len(points[0]) if points else 0
    sum_points = [sum(point[j] for point in points)
                  for j in range(dim)]
    return [s / num_points for s in sum_points]

Main takeaways

  • FP is possible in Python (to a degree)
  • small composable functions are good
  • FP == build general tools and compose them

^ Functional programming enables writing small composable functions Decide for yourself and with your team if this is a good idea

[fit] Whats missing in Python (or what I am missing)

  • More list functions
  • Nicer lambda syntax
  • Automatic currying, composition syntax
  • ADTs (sum types)1
  • Pattern Matching

Functional libraries

(More list functions)

^ Matthew Rocklin last year’s keynote ALEXEY KACHAYEV

Nicer lambda syntax

map(lambda x: x**2, range(5)) # => [0, 1, 4, 9, 16]

from fn import _
map(_**2, range(5)) # => [0, 1, 4, 9, 16]

^ Might not work as expected in i.e. PySpark

Other interesting stuff

Other talks (where I have stolen material)

^ Matthew Rocklin PyData NYC 2013 - pytoolz ALEXEY KACHAYEV PyCon UA 2012 - fn.py Joel Grus: PyData Seattle 2015

More FP?

original

Thank you

Daniel Kirsch daniel.kirsch@idalab.de @kirel https://github.com/kirel/functional-python

Footnotes

  1. Possible but ugly http://stupidpythonideas.blogspot.de/2014/08/adts-for-python.html