[TVM] Zero elimination #2634
Conversation
@sgrechanik-h please request reviews
@ZihengJiang @were @xqdan @derisavi @grwlf Please review.
Here is a partial review containing a few questions.
Visit(op->buffer_var);
IRVisitor::Visit_(op);
}
};
Not an issue, but a generic thought: how can we be sure that we really visit all IR operations involving variables? What if somebody adds a new operation containing variables later? How do we let them know that they need to fix this place too? Maybe we should think about using a direct switch on IRNodeTypes rather than the visitor pattern in such cases, and instruct the C++ compiler to issue a warning if the case for some node is missing. @tqchen what do you think?
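As a rough sketch of the switch-based idea: with a scoped enum and no `default:` case, gcc and clang warn via `-Wswitch` (part of `-Wall`, promotable to an error with `-Werror=switch`) whenever a case is missing. The `IRNodeType` values and `NodeName` function below are hypothetical stand-ins for illustration, not TVM's actual dispatch mechanism.

```cpp
#include <string>

// Hypothetical subset of IR node kinds; real TVM dispatches on type keys,
// this enum exists only to illustrate the exhaustiveness check.
enum class IRNodeType { Variable, LetStmt, Load };

// No `default:` case on purpose: if somebody adds a new IRNodeType value,
// -Wswitch flags this switch as incomplete at compile time.
std::string NodeName(IRNodeType t) {
  switch (t) {
    case IRNodeType::Variable: return "Variable";
    case IRNodeType::LetStmt:  return "LetStmt";
    case IRNodeType::Load:     return "Load";
  }
  return "unreachable";  // silences -Wreturn-type
}
```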
If we want a compile-time error/warning, then the simplest solution is to create an alternative of IRVisitor with every Visit_ function declared as pure virtual:
class IRVisitorNoDefault : public IRVisitor {
public:
virtual void Visit_(const Variable* op) = 0;
virtual void Visit_(const LetStmt* op) = 0;
//...
};
(Using a switch and configuring the compiler to raise an error on an incomplete switch could be an alternative, but it may be a bit compiler-dependent.)
However, I'm not sure if it's worth doing since new operations are rarely added into IR.
Thanks for putting such a large effort into this work.
I haven't reviewed the whole code yet but these are my comments so far.
} else {
return false;
}
}
This method's implementation is very similar to that of tvm::is_const_int(). I wonder whether it's possible to reuse one in the other. One difference I see is in how UIntImm is handled. It seems to me that your handling is correct and is_const_int's handling is not, so maybe the latter needs to be fixed, and then it will be more similar to yours.
Yes, I think it makes sense to express is_const_int through is_const_value.
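A standalone sketch of what that refactoring could look like; the `Expr` struct here is a toy stand-in for TVM's immediate nodes (IntImm/UIntImm/FloatImm), not the real API, and the real code would pattern-match on the actual node types.

```cpp
#include <cstdint>

// Toy stand-in for TVM's constant nodes; `kind` says which immediate
// node the expression is, `value` holds the constant.
struct Expr {
  enum Kind { kIntImm, kUIntImm, kFloatImm, kOther } kind;
  double value;
};

// Generic check: is e a constant immediate equal to `value`? Works for
// floats as well as ints, and treats UIntImm uniformly with IntImm.
bool is_const_value(const Expr& e, double value) {
  switch (e.kind) {
    case Expr::kIntImm:
    case Expr::kUIntImm:
    case Expr::kFloatImm:
      return e.value == value;
    default:
      return false;
  }
}

// The integer-only check then becomes a thin wrapper that additionally
// requires an integer node kind.
bool is_const_int(const Expr& e, int64_t value) {
  return (e.kind == Expr::kIntImm || e.kind == Expr::kUIntImm) &&
         is_const_value(e, static_cast<double>(value));
}
```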
Looks good to me, thanks.
Thanks for putting great effort into this. Can we comment a bit on how some of the simplification routines might fit into the new infrastructure #2588? Ideally, we would like to have a consolidated effort on index simplifications, so that some of the effort introduced here can benefit general cases. I understand that such discussion and deliberation might slow down bringing this in, but it could benefit us in the long term.
@tqchen Honestly speaking, I don't quite understand the design and motivation behind the new Analyzer class. In zero-elimination there are several internal analyses/transformations which are basically functions of a certain signature. Concerning the usefulness of adding these transformations to the analyzer infrastructure, I should think a bit more about it. There are analyses like FreeVariables which are probably useful, there are transformations which are too zero-elimination-specific (like LiftNonzeronessConditions), and there are some transformations for which I'm not sure (like div/mod elimination).
It is not about the usefulness of the features, but want to discuss where to best put them, so that we have a clear collection of sub-analyzers, and limited amount of states. Let me give some of the example questions we can ask:
Like replacing
Div/mod elimination is not really a simplification; it is a transformation into a form which can be manipulated more easily by certain algorithms, so I don't think it should be part of any simplifier. This is actually quite common: many transformations don't really simplify expressions in the usual sense.
This is possible. Actually I have several doubts about the new Analyzer infrastructure that may be worth discussing, in particular:
const Array<Var>& outer_axis,
const Map<Var, Range>& vranges) {
// TODO(sgrechanik-h): We don't use divmod elimination here because of some performance problems
auto res = SimplifyDomain(cond, outer_axis, vranges, false);
SimplifyDomain iterates over outer_axis in reverse order at line 1297 (as described by equations 22-26 in the paper?). As such, res.axis and res.conditions are in opposite orders (conditions are added at line 1394 in the original order). This can result in an unnecessary transpose copy at line 1468 when trying to create a TensorFromExpr with res.axis, which contains axes in reverse order from that of new_expr. Even if a simple fix in SimplifyDomain ensured the original order of the simplified axes in res (i.e. no reversal), an unnecessary copy could still result from TensorFromExpr. Any thoughts? Tensor copies can be costly to performance.
@kaitingwang I think it makes sense to reverse the resulting res.axis to match the original order of variables; I'll fix it. However, although it helps in some cases, the real problem is in the ExtractAsTensorMaybe function, which doesn't check whether the expression is costly enough to make the extraction beneficial. Since "costly enough" is rather difficult to define precisely, I think the proper solution is to rely on schedules (compute_inline should help, I guess). However, this is quite naive (especially considering that we don't have an autoscheduler yet), so I think I'll implement some heuristics to prevent such extractions in some obvious cases.
Indeed, in cases where new_expr is a computationally intensive expression, computing it once and memoizing the result for later uses is wise, namely when those uses would otherwise recompute the same computationally intensive expression multiple times. Thanks for the fix.
@sgrechanik-h perhaps some of the discussion should go to #2588. Thanks for being patient. I think one goal would be moving all the state related to context information into the Analyzer, and moving everything related to expression simplification there. For now, I did not implement memoization, mainly because memoization can cause problems when we update context information about a variable x (previously computed transformations of expressions involving x may no longer hold). It would be great if we can consolidate everything into the new infra and have more specific discussions in #2588 on how things can be further improved. I hope by doing so we can have a common infra that we can improve upon, producing as little technical debt as possible.
@tqchen I think that it may be important to implement memoization as early as possible exactly for this reason: it may cause lots of problems requiring us to make completely different design decisions about the whole Analyzer infrastructure.
e = Substitute(e, vmap);
}

return CanonicalSimplify(Simplify(CanonicalSimplify(e, vranges), vranges), vranges);
Can you explain why simplify is called 3 times here?
It is simply the combination that gave the best result. I think I will rewrite this piece of code in the near future so that it doesn't use the Halide simplifier and uses the new simplifiers instead (and hopefully there won't be a need to make 3 calls).
Superseded by #5171
This PR is extracted from the tensor expression autodiff PR #2498. It contains transformations required for optimizing the code generated by the autodiff. Note that although the transformations are independent from the autodiff, they aren't used anywhere else yet.
The main goal of zero elimination is to remove summation over zeros. This is achieved by:
1. lifting the nonzeroness condition out of the expression (LiftNonzeronessCondition does this);
2. simplifying the reduction domain using this condition (SimplifyReductionDomain).

The facade function which combines both these steps is OptimizeAndLiftNonzeronessConditions.

A brief summary of functions added by the PR:

include/tvm/ir_operator.h
- is_const_value - check if an Expr is a const of the given value; works for floats as well as for ints.

src/op/op_util.h
- TensorFromExpr - create a tensor from an expr and itervars. Something like compute, but easier to use in some contexts since it automatically duplicates reductions.
- TransformBody - create a new tensor by transforming the body of the given tensor using the given expression-transforming functions.

src/pass/zero_elimination.h
- IsSumCombiner - check whether the given combiner is a sum.
- CanFactorZeroFromCombiner - check if we can factor out zero from the given combiner.
- LiftNonzeronessCondition - transform the expression into the form select(c, e, 0).
- InlineTailCall - if the given tensor is just a call to another tensor, inline it. This function is used in the autodiff.
- InlineTensors - transform the given expression or tensor by inlining calls to other tensors; also used in the autodiff.
- SolveSystemOfInequalities - perform Fourier-Motzkin elimination.
- SimplifyDomain - simplify the given iteration domain using Fourier-Motzkin elimination and several other transformations.
- SimplifyReductionDomain - simplify the iteration domain of the given reduction using the previous function.
- ExtractAsTensorMaybe - extract the given expr into a separate tensor if this seems reasonable.
- ExtractNonTopReductions - extract non-top-level reductions as separate tensors.
- ExtractReductions - almost the same thing, but may also extract the given expr itself.
- OptimizeAndLiftNonzeronessConditions - a more powerful version of LiftNonzeronessCondition which also works for reductions and tries to optimize the given tensor using the above transformations (mostly by simplifying iteration domains). This is the most important function and is used by the autodiff.

python/tvm/testing.py
- estimate_performance - statically estimate the performance of the given Tensor or Expr; used for regression testing.
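To illustrate the core step behind SolveSystemOfInequalities, here is a minimal standalone sketch of one round of Fourier-Motzkin elimination over double coefficients. The `Ineq` struct and `EliminateVar` function are illustrative assumptions only: TVM's implementation works on symbolic Exprs and performs much more (normalization, deduplication, equality handling).

```cpp
#include <cstddef>
#include <vector>

// A linear inequality: sum_i coef[i] * x[i] <= rhs.
struct Ineq {
  std::vector<double> coef;
  double rhs;
};

// Eliminate variable x[k]: inequalities where x[k] has a zero coefficient
// are kept as-is; every (positive, negative) coefficient pair is combined
// with positive weights chosen so that x[k] cancels out.
std::vector<Ineq> EliminateVar(const std::vector<Ineq>& sys, std::size_t k) {
  std::vector<Ineq> pos, neg, out;
  for (const Ineq& q : sys) {
    if (q.coef[k] > 0) pos.push_back(q);
    else if (q.coef[k] < 0) neg.push_back(q);
    else out.push_back(q);
  }
  for (const Ineq& p : pos) {
    for (const Ineq& n : neg) {
      // Weights: b = -n.coef[k] > 0 for p, a = p.coef[k] > 0 for n,
      // so the new coefficient of x[k] is b*p.coef[k] + a*n.coef[k] = 0.
      double a = p.coef[k], b = -n.coef[k];
      Ineq r;
      r.coef.resize(p.coef.size());
      for (std::size_t i = 0; i < r.coef.size(); ++i)
        r.coef[i] = b * p.coef[i] + a * n.coef[i];
      r.rhs = b * p.rhs + a * n.rhs;
      out.push_back(r);
    }
  }
  return out;
}
```

For example, eliminating x from {x + y <= 4, -x <= 0} yields the single constraint y <= 4, the projection of the system onto y.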