-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the FInplaceIdentity #2572
Fix the FInplaceIdentity #2572
Conversation
nnvm/src/pass/plan_memory.cc
Outdated
@@ -39,6 +40,22 @@ static int GetDTypeSize(int type_flag) { | |||
} | |||
} | |||
|
|||
static bool BitwiseCompatibleTypes(int type1, int type2) { | |||
static std::set<std::pair<int, int>> compatible_pairs { {kUint8, kInt8}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this set is designed for cast op. For others, it may not suitable. For example, saturate_cast from kUint8 to kInt8 will change trunk data. Relu from kInt8 to kUint8 will also break this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to imagine relu and saturate_cast to advertise themselves as identites.
That said, I made this exception for types to accommodate your usecase as I understand it, if you don't need that then I can just replace the check with strict type check, that will also be fine (and slightly safer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relu on uint8 and saturate_cast with same dtype are exactly the same cases as cast, as they are redundant operations. According to your approach, they should add identity attributes. But this will break the cases relu from int8 to uint8 and saturate_cast from uint8 to int8.
If replacing the check with strict type check, then identity can't help on casting from uint8 to int8. Also, relu on fp32 will have trouble.
Overall, you can't find a unified relu here to cover all cases. A graph pass is a better solution as it can see the operator and make different relu for each operator.
I will evalute the effect on mkldnn when CNY holiday ends on next Monday. For this PR itself, I suggest to implement an graph pass alternatively for removing cast with same dtype. |
It is not only about casts with the same dtype - there may be other operations that can be treated as identity, e.g. pass through some types and change other. Currently they can't advertise itself as potential identities, which is a bug. |
From my point of view, cast shouldn't have identity attributes because for most of the usecase, cast will modify trunk data. It's abuse of using of identity for working around redundant operations. This should be implemented as a graph pass that detect and delete those redundant operations, including cast with same dtype, elewise_add with 0, relu on uint8, bn with kernel=[1,1] and stride=[1,1], etc. Otherwise, we have to add identity to all those operators, it's not maintainable. Also, identity can only ensure memory inplace between input and output. Extra check have to be added in each opeators to avoid the computation, making common usecase even slower. Another problem is, the memory copy between previous GPU operator and next identity CPU operator can't be avoided, making identity useless for this case. Overall, this is imperfect solution. |
I will address both of your comments here. Most of the operations you listed are actually impossible to be handled by the graph pass because e.g. add with 0 depends on the data passed to the operator, which graph pass does not have access to. The decision to skip the computation has to be done inside those operators in that case. For the relu case - for it to be able to claim that it is an identity, it would need to know the type of its input (to ensure it is some unsigned integral type), which it can't do in the current architecture of NNVM. It still can say that it wants to be in place and, if there is no other node using its inputs, it will get Making the check for both inplace and inplaceidentity more loose in your PR (instead of just inplace) made FInplaceIdentity usable only by reshape and its variants(flatten and expand_dims). This does not make sense. |
optimizing add with 0 is also impossible within operator because checking if input is all 0 is too heavy. It can be done with graph pass when we have constant node in graph. For rest of case, graph pass can handle them for now. Graph pass can also handle multple operators case like A + B - B. "making common usecase even slower" means we have to run
In my previous PR, I just loose the restriction of inplace and identity to make them can be used on dtype mismatch case. Overall their usage is enlarged rather than restricted. On my opinion, Simply adding restriction back for supporting some corner cases is not a smart choice. Finally, I'm not the maintainer of this pass, and above is just a suggestion for technical disscusion. I'll try this PR on mkldnn next Monday. If this doesn't break the usage of mkldnn. I'm fine with this PR. |
I did some experiments with this PR, and doesn't see any issue with mkldnn. Even removing
is still OK for MKLDNN. I think it's because this PR only adds restriction with FInplaceIdentity while FInplaceOption remains unchanged. And MKLDNN only uses FInplaceOption. If we have to add this restriction to FInplaceIdentity, I suggest to remove compatible type check above because it's quite specific for cast op, and may cause misunderstanding for other developer. |
Thanks @ptrendx @ZhennanQin @eric-haibin-lin @yzhliu this is now merged |
Currently NNVM operators that could be identity under certain conditions (e.g. Cast from type A to type A) cannot be marked with FInplaceIdentity option. It is because PR #1696 changed the requirements of inplace optimization (including identity) to not check for the exact type match, and the operator itself does not have sufficient information inside FInplaceIdentity function to be able to set the option only if it is actually identity.
This PR restricts identity in the memory plan pass to honor FInplaceIdentity option from operator only if shape sizes match and if types of input and output are bitwise compatible (by which I mean cast from dtype1 to dtype2 does not change anything in memory, like going from int32_t to uint32_t).
@eric-haibin-lin FYI
Adding author of PR #1696 @ZhennanQin for comment if this approach still satisfies their needs.