
Added KanrenRelationSub for distributive rewrites #634

Open: wants to merge 3 commits into main from new_opts
Conversation


@kc611 kc611 commented Oct 28, 2021

This PR adds the following optimizations:

  • x*a + y*a + z*a -> (x + y + z)*a
  • x/a + y/a + z/a -> (x + y + z)/a
  • x*a - y*a -> (x - y)*a
  • x/a - y/a -> (x - y)/a

Resolves #606

The graph now compiles to:

import aesara
import aesara.tensor as at

eta_at = at.scalar("eta")
kappa_at = at.scalar("kappa")

graph_at = eta_at / kappa_at + (1 - eta_at) / kappa_at
graph_fn = aesara.function([eta_at, kappa_at], graph_at)

aesara.dprint(graph_fn.maker.fgraph)
# Elemwise{reciprocal,no_inplace} [id A] ''   0
#  |kappa [id B]

Gist elaborating the implementation:
https://gist.github.com/kc611/b33e45ed2086597ed9c9df4f387c84b0
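For readers unfamiliar with these rewrites, the collection step can be illustrated with a small stand-alone sketch over tuple-encoded terms. This is illustrative only, not the PR's actual implementation (which handles commutativity and n-ary terms via kanren); `collect_common_factor` and the tuple encoding are made up for this sketch:

```python
# Toy term representation: ('add'|'sub'|'mul'|'div', left, right).

def collect_common_factor(term):
    """x*a + y*a -> (x + y)*a and x/a + y/a -> (x + y)/a (plus the '-' analogues)."""
    if not (isinstance(term, tuple) and term[0] in ("add", "sub")):
        return term
    outer, l, r = term
    for inner in ("mul", "div"):
        if (
            isinstance(l, tuple) and isinstance(r, tuple)
            and l[0] == inner and r[0] == inner
            and l[2] == r[2]  # common right-hand factor/divisor
        ):
            return (inner, (outer, l[1], r[1]), l[2])
    return term


print(collect_common_factor(("add", ("mul", "x", "a"), ("mul", "y", "a"))))
# ('mul', ('add', 'x', 'y'), 'a')
```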

@kc611 kc611 marked this pull request as draft October 28, 2021 16:15

brandonwillard commented Oct 28, 2021

See the comments at the end of this reply. They refer to the fact that the distributive identity (i.e. what's being implemented here) comes with some numeric concerns.

By adding general distributive identities, we simplify the rewrite process, but we might also introduce issues (e.g. like the kinds mitigated by #275 and Kahan summation).

One way to approach this is to perform the distributive rewrite only as an intermediate rewrite within a larger sequence of rewrites that guarantee cancellation of terms.

This doesn't necessarily prevent such rewrites, but we need to consider the trade-offs and how we can navigate and possibly even avoid them.
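The numeric concern is easy to reproduce with plain Python floats: floating-point addition is not associative, which is exactly what reassociating rewrites assume. `math.fsum` plays the role of a compensated (Kahan-style) summation here:

```python
import math

# Reassociating a sum changes the floating-point result:
left = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)  # 0.6
print(left == right)       # False

# A compensated summation (Shewchuk's algorithm, in the spirit of Kahan)
# yields the correctly rounded result regardless of grouping:
print(math.fsum([0.1, 0.2, 0.3]))  # 0.6
```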

@@ -744,6 +744,15 @@ def add_compile_configvars():
in_c_key=False,
)

config.add(
"fastmath_opts",
@kc611 (author):

Added this configuration which will make the rewrite optional.

Member:

Instead, don't make them canonicalizations (i.e. don't use register_canonicalize) and register them manually to a different optimization DB. As it currently stands, local_add_sub_collector is being called far more often than necessary.

In general, optimization selection and filtering is accomplished using the optimizations DBs and tags.
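The tag-based selection mentioned above can be sketched with a stand-alone toy registry. `ToyOptDB` is a made-up class, not an Aesara API; Aesara's real mechanism is its optimization DBs (e.g. SequenceDB) queried via OptimizationQuery:

```python
# Minimal, illustrative sketch of tag-based rewrite selection.

class ToyOptDB:
    def __init__(self):
        self._rewrites = {}  # name -> (rewrite_fn, tags)

    def register(self, name, rewrite_fn, *tags):
        self._rewrites[name] = (rewrite_fn, set(tags))

    def query(self, include):
        # Select only the rewrites carrying at least one requested tag
        include = set(include)
        return [
            rewrite_fn
            for rewrite_fn, tags in self._rewrites.values()
            if tags & include
        ]


db = ToyOptDB()
db.register("local_add_sub_collector", lambda g: g, "fastmath")
db.register("some_canonicalization", lambda g: g, "canonicalize", "basic")

# Only the "fastmath"-tagged rewrite is selected, so the collector is not
# run as part of every canonicalization pass:
print(len(db.query(include=["fastmath"])))  # 1
```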

codecov bot commented Oct 30, 2021

Codecov Report

Merging #634 (9d5537b) into main (240827c) will decrease coverage by 0.18%.
The diff coverage is 100.00%.

❗ Current head 9d5537b differs from pull request most recent head 5a864a2. Consider uploading reports for the commit 5a864a2 to get more accurate results


@@            Coverage Diff             @@
##             main     #634      +/-   ##
==========================================
- Coverage   78.35%   78.18%   -0.18%     
==========================================
  Files         152      152              
  Lines       47685    47682       -3     
  Branches    10881    10882       +1     
==========================================
- Hits        37364    37280      -84     
- Misses       7773     7844      +71     
- Partials     2548     2558      +10     
Impacted Files Coverage Δ
aesara/tensor/math_opt.py 86.64% <100.00%> (+0.41%) ⬆️
aesara/graph/type.py 75.92% <0.00%> (-3.23%) ⬇️
aesara/compile/debugmode.py 57.42% <0.00%> (-3.07%) ⬇️
aesara/tensor/type_other.py 80.76% <0.00%> (-2.38%) ⬇️
aesara/tensor/sharedvar.py 82.22% <0.00%> (-1.46%) ⬇️
aesara/tensor/basic.py 86.25% <0.00%> (-1.21%) ⬇️
aesara/compile/function/pfunc.py 82.25% <0.00%> (-1.08%) ⬇️
aesara/tensor/type.py 91.25% <0.00%> (-1.00%) ⬇️
aesara/tensor/basic_opt.py 84.37% <0.00%> (-0.78%) ⬇️
aesara/sparse/type.py 70.66% <0.00%> (-0.60%) ⬇️
... and 35 more



kc611 commented Nov 3, 2021

Alright, so I registered the rewrites in a SequenceDB, but when I do:

import aesara
import aesara.tensor as at
from aesara.graph.optdb import OptimizationQuery
from aesara.tensor.math_opt import fastmath_db
from aesara.graph.fg import FunctionGraph

eta_at = at.scalar("eta")
kappa_at = at.scalar("kappa")

graph_at = eta_at / kappa_at + (1 - eta_at) / kappa_at
graph_fn = FunctionGraph(
    inputs=[eta_at, kappa_at],
    outputs=[graph_at],
)

fastmath_db.query(OptimizationQuery(include=["basic"])).optimize(graph_fn)
# AttributeError: 'FromFunctionLocalOptimizer' object has no attribute 'optimize'

aesara.dprint(graph_fn)

Is it not supposed to be used like that?


kc611 commented Nov 3, 2021

x*a + y*a + z*a -> (x + y + z)*a
x/a + y/a + z/a -> (x + y + z)/a
x*a - y*a -> (x - y)*a
x/a - y/a -> (x - y)/a

Is this particular collection of rewrites subject to #649? Should we be separating them out?

@brandonwillard

Is this particular collection of rewrites subject to #649, should we be separating them out?

Do you mean the +/- or arity difference? The +/- difference is a case of #648. Otherwise, these rewrites should be able to handle all arities.

@brandonwillard

Now that we have kanren support, this would be a good exercise for that.

Using kanren would allow us to more succinctly perform all the intermediate computations without the need for potentially (numerically) destabilizing canonicalizations.

We would need to devise a kanren relation (well, a goal constructor) that uses the properties implemented here (i.e. distributive properties) to "search" for the reductions described in #606. We would need the resulting goals to succeed only when at least one reduction has been made. That could be a little tricky to do entirely in miniKanren, though.

Anyway, we can start discussing it here (or somewhere else).
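The tricky requirement above — goals that succeed only when at least one reduction has been made — can be prototyped outside miniKanren as a driver that reports whether a rule fired anywhere in a term. This is a pure-Python sketch; `rewrite_once`, `double_neg`, and the tuple encoding are illustrative names, not part of the PR:

```python
def rewrite_once(term, rule):
    """Apply `rule` top-down, at most once; return (new_term, changed)."""
    new = rule(term)
    if new != term:
        return new, True
    if isinstance(term, tuple):
        head, *args = term
        for i, arg in enumerate(args):
            new_arg, changed = rewrite_once(arg, rule)
            if changed:
                return (head, *args[:i], new_arg, *args[i + 1:]), True
    return term, False


# Example rule: rewrite ('neg', ('neg', x)) -> x
def double_neg(t):
    if (
        isinstance(t, tuple) and t[0] == "neg"
        and isinstance(t[1], tuple) and t[1][0] == "neg"
    ):
        return t[1][1]
    return t


print(rewrite_once(("add", ("neg", ("neg", "x")), "y"), double_neg))
# (('add', 'x', 'y'), True)  -- the "changed" flag is the success condition
print(rewrite_once(("add", "x", "y"), double_neg))
# (('add', 'x', 'y'), False) -- no reduction, so the "goal" should fail
```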


kc611 commented Jan 1, 2022

I made a kanren version of those distributive optimizations, but it kind of seems like it isn't able to generalize them to an arbitrary number of distributive optimizations. Have a look at https://gist.github.com/kc611/b33e45ed2086597ed9c9df4f387c84b0

@brandonwillard

I made a kanren version of those distributive optimizations, but it kind of seems like it isn't able to generalize them to an arbitrary number of distributive optimizations. Have a look at https://gist.github.com/kc611/b33e45ed2086597ed9c9df4f387c84b0

Just added a comment in that Gist.

@kc611 kc611 changed the title Added local_add_sub_collector optimization Added KanrenRelationSub for distributive rewrites Jan 4, 2022
@kc611 kc611 marked this pull request as ready for review January 10, 2022 06:08
@brandonwillard brandonwillard left a comment

You can make the kanren goals more flexible with respect to the supported Ops, instead of creating goals for each one separately. For instance, make the Op a logic variable op_lv, add a goal that confirms the value of op_lv is one of the accepted Ops, and use op_lv in place of the Op.

To perform the check in the second step, conde can be used (e.g. conde([eq(op_lv, at.mul)], [eq(op_lv, at.true_div)], ...)), as can type constraints (e.g. isinstanceo). Just note that if you use something like the examples I just gave, they each have their own limitations (e.g. conde testing for at.mul will only work for those exact at.mul Op instances).

"orig_operation, optimized_operation",
[
(a_at * x_at + a_at * y_at, a_at * (x_at + y_at)),
(x_at * a_at + y_at * x_at, (x_at + y_at) * a_at),
@kc611 kc611 commented Jan 15, 2022

So I managed to generalize them for div and mul, but it seems that these rewrites aren't taking the facts() into account. For instance, this particular case passes but the one above it doesn't. Isn't this supposed to be handled by fact(commutative, at.mul) in the rewrite? (Or does that stand for something else?)

Member:

This might have to do with the eq goal that's being used at specific steps (even behind the scenes). There are special eq_* goals that do (and don't) take into account the associativity/commutativity (AC) information set by facts, and you may need to use those explicitly at certain points. Just try not to use them when they're not needed; otherwise, the streams resulting from the goals will become very long due to all the permutations induced by AC relations.

@kc611 kc611 force-pushed the new_opts branch 2 times, most recently from c9ac672 to 31da77d Compare January 26, 2022 14:27

twiecki commented Jan 26, 2022

This might be the wrong place for this discussion, but there are also shape-related rewrites that can be optimized. For example, b * M * a, where a and b are scalars and M is a matrix, is more efficient to rewrite as b * a * M so that there is only a single scalar-matrix multiplication instead of two.


kc611 commented Jan 26, 2022

If a and b are constants, I guess that case would be handled by constant folding. But yeah, we would need a rewrite for when they're symbolic variables if we are to implement something like that.

@brandonwillard

This might be the wrong place for this discussion, but there are also shape-related rewrites that can be optimized. For example, b * M * a, where a and b are scalars and M is a matrix, is more efficient to rewrite as b * a * M so that there is only a single scalar-matrix multiplication instead of two.

Are you talking about transforming ((b * M) * a) to ((b * a) * M) so that there's only one matrix/scalar product? We have some things like that for sums and products in aesara.tensor.math_opt, and the AlgebraicCanonizer does some similar things, but it looks like we might not have a rewrite that covers that exact case.

Here's a way you can check these kinds of things:

import aesara

import aesara.tensor as at
from aesara.graph.opt_utils import optimize_graph


a, b = at.scalars("ab")
M = at.matrix("M")

z = (a * M) * b

aesara.dprint(z)
# Elemwise{mul,no_inplace} [id A] ''
#  |Elemwise{mul,no_inplace} [id B] ''
#  | |InplaceDimShuffle{x,x} [id C] ''
#  | | |a [id D]
#  | |M [id E]
#  |InplaceDimShuffle{x,x} [id F] ''
#    |b [id G]

# This will only perform canonicalizations, but others can be added via the
# `include` keyword
z_opt = optimize_graph(z)

aesara.dprint(z_opt)
# Elemwise{mul,no_inplace} [id A] ''
#  |InplaceDimShuffle{x,x} [id B] ''
#  | |a [id C]
#  |M [id D]
#  |InplaceDimShuffle{x,x} [id E] ''
#    |b [id F]


# This will perform all the default optimizations
z_fn = aesara.function([a, b, M], z)

aesara.dprint(z_fn)
# Elemwise{mul,no_inplace} [id A] ''   2
#  |InplaceDimShuffle{x,x} [id B] ''   1
#  | |a [id C]
#  |M [id D]
#  |InplaceDimShuffle{x,x} [id E] ''   0
#    |b [id F]

@ricardoV94

Are you guys talking about #287?


brandonwillard commented Jan 26, 2022

Are you guys talking about #287?

Essentially, yes, but for multiplication, which is probably something we could consider implementing quickly.


twiecki commented Jan 27, 2022

Are you talking about transforming ((b * M) * a) to ((b * a) * M) so that there's only one matrix/scalar product?

Yes, exactly.


# This does the optimization
# 1. (x + y + z) * A = x * A + y * A + z * A
# 2. (x + y + z) / A = x / A + y / A + z / A
Contributor:

Could this lead to numerical issues?

@kc611 (author):

Ah no, the actual optimization being done here is the reverse of that:

# Line 3545
distribute_over_add_opt = KanrenRelationSub(lambda x, y: distribute_over_add(y, x))

So it combines the additive terms rather than distributing them. It's just that it's easier to represent it this way in kanren, and it works just as well if we implement it the other way around purely in kanren.


kc611 commented Feb 26, 2022

Alright, so it seems that the optimizations work with arbitrary orderings of the common terms. All of these test cases work:

x_at * a_at + a_at * y_at = a_at * (x_at + y_at)
a_at * x_at + y_at * a_at = a_at * (x_at + y_at)
a_at * x_at + a_at * y_at = a_at * (x_at + y_at)
x_at * a_at + y_at * a_at = (x_at + y_at) * a_at
x_at / a_at + y_at / a_at = (x_at + y_at) / a_at
a_at * x_at - a_at * y_at = a_at * (x_at - y_at)
x_at * a_at - y_at * a_at = (x_at - y_at) * a_at
a_at * x_at - y_at * a_at = a_at * (x_at - y_at)
a_at * x_at - a_at * y_at = (x_at - y_at) * a_at
x_at / a_at - y_at / a_at = (x_at - y_at) / a_at

The current issue I'm trying to work on is expanding bare terms into 1 * term:
For instance, an Elemwise with a bare common term in it:

x * A + A = x * A + 1 * A = (x + 1) * A

@brandonwillard Any ideas on how to do this using goals? Or is it even possible to add the extra terms inside the kanren relation's filters?
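One way to prototype the x * A + A case is to normalize each addend into a (coefficient, factor) view before collecting. This is a pure-Python sketch of the idea, not a kanren goal; `as_coeff_factor`, `collect_with_unit`, and the tuple encoding are all illustrative:

```python
def as_coeff_factor(term, factor):
    """View `term` as coeff * factor: A -> '1'; ('mul', x, A) -> x; else None."""
    if term == factor:
        return "1"
    if isinstance(term, tuple) and term[0] == "mul" and term[2] == factor:
        return term[1]
    return None


def collect_with_unit(term, factor):
    """x*A + A -> (x + 1)*A, using the normalized view above."""
    if isinstance(term, tuple) and term[0] == "add":
        cl = as_coeff_factor(term[1], factor)
        cr = as_coeff_factor(term[2], factor)
        if cl is not None and cr is not None:
            return ("mul", ("add", cl, cr), factor)
    return term


print(collect_with_unit(("add", ("mul", "x", "A"), "A"), "A"))
# ('mul', ('add', 'x', '1'), 'A')
```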

@brandonwillard brandonwillard left a comment

This PR is turning out to be an amazing example of genuine relational programming (and kanren)!

If it's possible to abstract/parameterize the implementation of distribute_over_add so that it also covers subtraction, the footprint of these features would be impressively small.

# x_lv, if any, is the logic variable version of some term x,
# while x_at, if any, is the Aesara tensor version for the same.
def distributive_collect(in_lv, out_lv):
from kanren import eq
Contributor:
Why a local import?


Contributor:

I would add a docstring.


twiecki commented Mar 21, 2022

This PR is very powerful in its conciseness and the future it demonstrates. It really brings @brandonwillard's vision to light. So I think we should turn this into a blog post that describes what's happening, how powerful it is, how unique it is, etc.

@kc611 kc611 force-pushed the new_opts branch 2 times, most recently from 0ae8664 to e55dcf3 Compare March 23, 2022 17:38
@brandonwillard

Looks like the tests are taking too long to finish. I can't tell if it's GitHub Actions or not, yet.


kc611 commented Apr 2, 2022

Alright, so it seems like there are two separate issues in the failing test here:

  1. Things like a + a being rewritten into a * (1 + 1). This is technically correct, but it isn't an optimization, so it should be fixed. That would need a condition in kanren corresponding to "at least one": iterating over the contents of the logic variable cdr_lv in the distributive_collect optimization and checking that at least one of the values is not equal to A_lv.
  2. The bad-view map error. This seems to happen simply because the optimization has no way of knowing whether there are aliases between outputs in that particular case. Last time this happened, I remember solving the test case by simply changing the position of the optimization. I'm not sure what the true fix should be, though.


rlouf commented Sep 7, 2022

I think (1) can be "solved" when what is being discussed here is implemented. Constraints like the one you suggested should be kept out of miniKanren goals, IMHO.

Ideally we would apply all the possible rewrites and then use a scoring function to choose the "optimal" graph. We could also adopt a greedy approach and choose the "optimal" rewrite out of all possible rewrites at a given step.
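A node-count score is the simplest instance of such a scoring function, and it also rejects non-reductions like a + a -> a * (1 + 1). A toy sketch over tuple-encoded terms (all names are illustrative, not from this PR):

```python
def node_count(term):
    """Count the nodes in a tuple-encoded expression tree."""
    if isinstance(term, tuple):
        return 1 + sum(node_count(t) for t in term[1:])
    return 1


def is_improvement(original, rewritten):
    # a rewrite "counts" only if it actually shrinks the graph
    return node_count(rewritten) < node_count(original)


before = ("add", "a", "a")               # a + a
after = ("mul", "a", ("add", "1", "1"))  # a * (1 + 1)
print(is_improvement(before, after))     # False: not a reduction
```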

@brandonwillard

Ideally we would apply all the possible rewrites and then use a scoring function to choose the "optimal" graph. We could also adopt a greedy approach and choose the "optimal" rewrite out of all possible rewrites at a given step.

In this case, we would like to avoid the cost of re-searching the AC "graph space" on each application of these rewrites.

The e-graph data structures mentioned in #1082 could help with this, albeit not in a direct way perhaps. #1165 is a little closer, though, because it sets us up for rewrite caching and the like. #1165 also helps with #1082 and the use of e-graph-like data structures.

@rlouf rlouf force-pushed the new_opts branch 2 times, most recently from b160ef5 to 67e1262 Compare October 17, 2022 08:39

rlouf commented Oct 17, 2022

Rebased this on main and resolved the merge conflicts that appeared after #1054.

Labels: enhancement (New feature or request), graph rewriting
Successfully merging this pull request may close: Missing rational function simplifications
5 participants