Create simple visualisations of mathematical operations on small datasets by rendering an expression graph, show your friends or serialise it for later.
Examples follow using the Python interactive shell
Import some things from the module
from expr import Expr, NumExpr
Construct an expression
expr = Expr(
operation_name='+',
arguments=[
NumExpr(number=1),
Expr(
operation_name='/',
arguments=[
NumExpr(number=2),
NumExpr(number=3),
]
)
]
)
Get an answer
>> expr.resolve()
1.6666666666666665
Draw a graph
>> graph = expr.graph()
>> graph.write_png('example.png')
True
example.png
Import things using as
to save your typing fingers by aliasing those
characters away. In Python we don't really need to care if functions specify
keyword or positional signatures, so let's not bother with those wordy
keywords, hey.
>>> from expr import Expr as E, NumExpr as N
>>> expr = E('/', [N(22), N(7)])
>>> expr.resolve()
3.142857142857143
>>> expr.graph().write_png('pi.png')
True
pi.png
We can create expressions that involve more than just numbers ...
>>> import pandas
>>> from expr import (
... Expr as E,
... NumExpr as N,
... DataFrameExpr as D,
... )
Create some stupid datasets
>>> def two_by_four():
... data = [(n + 1, n + 1) for n in range(4)]
... return pandas.DataFrame.from_records(data=data, columns=['a', 'b'])
>>> df_A = two_by_four()
>>> df_B = two_by_four()
>>> df_A
a b
0 1 1
1 2 2
2 3 3
3 4 4
Create the expression object, the DataFrameExpr
object (aliased here
as D
) takes an optional argument name
which will be used as a label if
present, otherwise an automatically generated label will applied.
>>> expr = E('*', [N(3), E('+', [D(df_A, 'A'), D(df_B, 'B')])])
>>> expr.graph().write_png('dataframe.png')
True
>>> expr.resolve()
0 1
0 6 6
1 12 12
2 18 18
3 24 24
dataframe.png
Let's serialise the above example using JSON, any arguments passed to the
serialise
method are used when the serialiser function is applied.
>>> import json
>>> print(expr.serialise(json.dumps, indent=4))
{
"__type__": "Expr",
"operation_name": "*",
"arguments": [
{
"__type__": "NumExpr",
"number": 3.0
},
{
"__type__": "Expr",
"operation_name": "+",
"arguments": [
{
"__type__": "DataFrameExpr",
"name": "A",
"dataframe": {
"a": {
"0": 1,
"1": 2,
"2": 3,
"3": 4
},
"b": {
"0": 1,
"1": 2,
"2": 3,
"3": 4
}
}
},
{
"__type__": "DataFrameExpr",
"name": "B",
"dataframe": {
"a": {
"0": 1,
"1": 2,
"2": 3,
"3": 4
},
"b": {
"0": 1,
"1": 2,
"2": 3,
"3": 4
}
}
}
]
}
]
}
The final expression provided in this package is FuncExpr
, which applies
arbitrary functions to the arguments supplied (which should be objects
implementing a resolve
method). Keywords are also applied, probably for
specifying options, which do not have to be expression objects. A contrived
example of FuncExpr
use follows.
Do some smart imports
import pandas
from expr import (
Expr as E,
NumExpr as N,
DataFrameExpr as D,
FuncExpr as F,
)
Define a factory function for creating DataFrame
objects
def x_by_y(x, y, columns=False):
data = [[a + b for b in range(x)] for a in range(y)]
if columns is False:
columns = [chr(97 + c) for c in range(x)]
return pandas.DataFrame.from_records(data=data, columns=columns)
Crank out some df's for later use
df_A = x_by_y(3, 4)
df_B = x_by_y(3, 4)
df_C = x_by_y(3, 4)
Now set up out expression and look at its graph
expr = E('+',
[
E('*',
[
N(3),
F('pandas.concat',
[
D(x_by_y(1, 4, ['a'])),
D(x_by_y(1, 4, ['b'])),
D(x_by_y(1, 4, ['c'])),
], axis=1
)
]
),
D(x_by_y(3, 4))
]
)
G = expr.graph()
G.write_png('func.png')
func.png
Why not very clumsily visualise a call stack for your audience's pleasure?
If you like YAML, serialising pandas
objects, may be problematic, but
JSON should be fine.
Colours courtesy of clrs.cc