
[RELAY] Turn reshape into nop in graph executor backend. #7945

Merged · 3 commits · May 2, 2021

Conversation

tqchen (Member) commented Apr 29, 2021

Previously we were generating function calls for reshape. This PR updates the optimization to turn reshape into a nop:

- Tag a fused function as reshape-only if it only contains reshape (see the sketch after this list).
- Update the memory planner to force the input and output to share the same piece of memory.
- Update the graph runtime codegen to emit a nop when a reshape-only function is encountered.
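A minimal sketch of the reshape-only check from the first bullet, written as a Relay expression visitor; it follows the shape of the diff discussed later in this thread, but is illustrative rather than the exact code added in the PR:

```cpp
#include <tvm/ir/op.h>
#include <tvm/relay/expr_functor.h>

// Walks a fused function body and records whether every call it contains is
// the "reshape" operator; if so, the function can be tagged as reshape-only.
class ReshapeOnlyChecker : public tvm::relay::ExprVisitor {
 public:
  bool reshape_only{true};

  void VisitExpr_(const tvm::relay::CallNode* call) final {
    static const tvm::Op& reshape_op = tvm::Op::Get("reshape");
    if (!call->op.same_as(reshape_op)) {
      reshape_only = false;  // any non-reshape call disqualifies the function
    }
    tvm::relay::ExprVisitor::VisitExpr_(call);  // keep visiting nested expressions
  }
};
```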

icemelon (Member) left a comment

Could you also update the tvm/src/relay/transforms/memory_alloc.cc to use the kReshapeOnly flag?

tqchen (Member, Author) commented Apr 29, 2021

Thanks @icemelon9, let me take a look and see what is going on there.

altanh (Contributor) left a comment

I think this is an important PR to have, but I also wanted to bring attention to other similar cases and propose that we generalize this so that the code is less ad hoc if possible.

@@ -144,6 +144,9 @@ constexpr const char* kComposite = "Composite";
constexpr const char* kInline = "Inline";
/*! \brief Indicate the function was created by the Pattern Partitioning Pass. */
constexpr const char* kPartitionedFromPattern = "PartitionedFromPattern";

/*! \brief Mark the function as only composed of reshape operations. */
constexpr const char* kReshapeOnly = "relay.reshape_only";
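For context, tagging a fused function with this attribute would look roughly like the following; this is a hedged sketch using TVM's generic WithAttr helper, and the helper name MarkReshapeOnly and exact call site are illustrative rather than taken from the PR:

```cpp
#include <tvm/ir/expr.h>
#include <tvm/ir/function.h>
#include <tvm/relay/function.h>
#include <utility>

// Assuming `func` is the fused relay::Function that was found to contain only
// reshape calls, attach the marker so later stages (memory planner, graph
// executor codegen) can recognize it. The key string matches the kReshapeOnly
// constant defined above.
tvm::relay::Function MarkReshapeOnly(tvm::relay::Function func) {
  return tvm::WithAttr(std::move(func), "relay.reshape_only", tvm::Integer(1));
}
```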
altanh (Contributor) commented:

Are there other ops that could benefit? For example squeeze, expand_dims, etc.? I feel like this should more generally be something like relay.alias_only (although alias might be too general, since you could "alias" a subset of an input by slicing).

void VisitExpr_(const CallNode* op) final { has_call = true; }
bool reshape_only = true;
void VisitExpr_(const CallNode* op) final {
static const Op& reshape_op_ = Op::Get("reshape");
altanh (Contributor) commented:

Similar to my previous comment, we'd at least want to include a check for the reverse_reshape op, but I think we should generalize and make this less ad hoc if possible.

csullivan (Contributor) left a comment

Thanks for the PR! It's great to see some thinking on logical operation handling, but I feel the choice to make reshape a nop as done in the PR is mixing two separate concepts that are worth addressing:

C1. Input-output pair support to achieve storage aliasing for inplace operations

C2. Eliding physical operations on logical-only transformations/ops

For C1, the reshape function attribute strikes me as a special case of what could be a generic input-output aliasing mechanism for inplace operations, e.g. operator annotations that indicate a mapping from input indices to output indices, from which the memory planner can reuse the input token for an output according to this mapping.

With respect to C2, there can be other logical operators that a backend can support depending on the hardware target (e.g. implicit broadcast, transpose, reshape). Two concerns come to mind:

  1. Logical operations on one platform might not be logical on another. For example, Reshape([A, B, C] -> [A, b1, b2, C]) is not guaranteed to be logical for a device api which utilizes the Nd storage allocation. For example, with texture2d storage: shape[A, B, C] would yield a texture of shape {A, B, C} whereas shape[A, b1, b2, C] could yield a texture of shape {A*b1, b2, C}. So the result of a reshape can change the physical memory layout depending on the storage type.

  2. It's worth discussing the pros and cons of handling logical operations by eliding their graph/runtime codegen. I've done this in the past, and while it works, it is a bit of a hack in the sense that the graph-level representation of the compute no longer maps to the device runtime. Removing the invocation definitely has the advantage of not needing to touch the graph or update the type relations etc., but it has the disadvantage of breaking the expectation that the Relay graph is an honest representation of the physical compute. Another option could be to introduce a LogicalExpr at the graph level and use passes to replace exprs with certain properties (which can be hardware dependent, see 1) with these logical exprs. These could then be handled specially by the type checker as type and shape sinks (on the inputs) and sources (on the outputs), and would be understood to be nops in the graph runtime.

I'm sure there are other considerations I've missed here, but overall I feel this needs a bit more discussion, would you consider making an RFC?
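As a hypothetical illustration of the C1 proposal above (this is not TVM API, just a sketch of the idea): an alias annotation could map each output index to the input index whose storage it may reuse, and a memory planner could consult it when assigning storage tokens:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Per-function alias annotation: output index -> input index it may alias.
using AliasMap = std::map<int, int>;

// Toy storage assignment: outputs that appear in the alias map reuse the
// corresponding input's storage id; all other outputs get a fresh id.
std::vector<int64_t> PlanOutputStorage(const std::vector<int64_t>& input_storage_ids,
                                       int num_outputs, const AliasMap& aliases,
                                       int64_t* next_storage_id) {
  std::vector<int64_t> out(num_outputs);
  for (int i = 0; i < num_outputs; ++i) {
    auto it = aliases.find(i);
    if (it != aliases.end()) {
      out[i] = input_storage_ids[it->second];  // share storage with the aliased input
    } else {
      out[i] = (*next_storage_id)++;  // allocate a fresh storage token
    }
  }
  return out;
}
```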

tqchen (Member, Author) commented Apr 29, 2021

Thanks @icemelon9 @altanh, the comments are addressed by looking for a reshape op property.

tqchen (Member, Author) commented Apr 29, 2021

@csullivan to follow up on your points. I agree that a decision about output aliasing can be a big undertaking to mark in the IR itself. The rationale behind how the current PR is structured is as follows:

  • K0: The reshape-only property is a property with respect to the function itself, without breaching into the properties of the physical execution.
  • K1: The graph execution generator takes that property to generate the eliding.

Note that the property marking in K0 is stricter than C1, because aliasing info breaches the semantics, while K0 is considered an annotation of a function property: the backend can still choose to implement a reshape-only function by actually running the code, or by eliding it into a nop. The annotation K0 does not imply that decision.

K1 has to happen at some point, and can be both target and operation dependent. Given that right now the memory planning and execution are coupled in the graph runtime codegen, the current change to the graph runtime is consistent with that choice.

As we start to factor out more of the memory allocation and execution into separate passes, I agree that introducing physical-specific annotations would then make sense, since the decisions will then be passed across a few stages.

The target dependence of K1 is indeed something worth thinking about. Given that the current graph planner already assumes flat memory, my assumption is that a separate planning and execution generator will be needed for the N-D case, so we can skip this choice for now (note that reshape-only is a property tag rather than a tag that enforces the execution).

Back to the general topic of memory aliasing: I think that problem is a bigger decision that indeed deserves an RFC and more in-depth thought (I don't think marking aliasing in the graph directly is the right approach since, as you said, it changes the semantics of the graph at an early stage). This PR is mainly intended as an incremental improvement to the graph executor that handles the case of reshape in a way consistent with the VM (which already elides reshape) in the flat memory scenario.

We can certainly bring followup refactors once we have a good proposal to handle aliasing memory.

tqchen (Member, Author) commented Apr 29, 2021

Added comments about the flat memory assumptions and TODOs about what can be done under non-flat settings once that support is introduced.

csullivan (Contributor) left a comment

It will be good to decouple the memory planning and codegen into separate passes.

I'm not clear on the distinction between K0 and C1. In both cases they are function-level annotations. In C1, the proposal is to annotate the function with something like attr::alias(i0, o0); in K0, the proposal is to annotate the function with attr::kReshapeOnly. In both cases the backend can choose whether to respect them based on target or storage capability.

I'm fine to wait on implementing this as a post memory planning optimization pass once things are factored out. But I want to be sure I understand why you argue that the aliasing info breaches the semantics.

See my comment below for K1. I personally would feel better about the graph executor codegen eliding logical reshapes if it checks to ensure that the operation is inplace (input storage_ids have been forwarded).
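A hypothetical sketch of the suggested safety check (not the PR's actual code): the codegen would elide a reshape-only call only when the memory planner has actually forwarded the input storage to the output:

```cpp
#include <vector>

// Only elide a reshape-only call if the operation is truly in place, i.e. the
// planner assigned the same storage id to the single input and single output.
bool CanElideReshape(const std::vector<int>& input_storage_ids,
                     const std::vector<int>& output_storage_ids) {
  return input_storage_ids.size() == 1 && output_storage_ids.size() == 1 &&
         input_storage_ids[0] == output_storage_ids[0];
}
```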

(Review thread on src/relay/backend/graph_executor_codegen.cc; outdated, resolved.)

tqchen (Member, Author) commented Apr 29, 2021

Reshape-only is a property of the function itself; the property has a well-defined meaning in functional land, while the alias property does not have a clear definition in the functional land and is closer to an execution property.

So a reshape-only tag can appear in an early stage of compilation without a problem, while an alias tag would likely need to appear only when we start to think about memory planning.

For general ops, having alias information alone may not give a complete picture, as the operator semantics itself is also part of the picture. For some ops, aliasing needs to be enforced (e.g. the case of a scatter sparse add into a dense tensor), while for others the impact is more minimal (e.g. reshape). Reshape-only, in that regard, is a narrower tag that specifies the op semantics as well.

csullivan (Contributor) left a comment

Understood, LGTM 👍

tqchen (Member, Author) commented Apr 30, 2021

cc @icemelon9 @altanh please take another look

altanh (Contributor) left a comment

LGTM 👍 thanks!

ZihengJiang (Contributor) left a comment

LGTM

@tqchen tqchen deleted the reshape branch February 26, 2023 13:55