-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BYOC][ACL] Fix list is not supported as an input node #10801
Conversation
@lhutton1 - could you take a look ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @DzAvril, its great to see more support being added to ACL again :) Apologies for the delayed review I was on holiday the past couple of days. Overall LGTM! The comments are mostly just about cleaning things up, testing and tightening the conditions under which concatenate gets offloaded
@lhutton1 Thanks for the comments. Please help to review again. |
Thanks @DzAvril LGTM! cc @manupa-arm @comaniac @Mousius who may be able to help get this merged |
Hi @lhutton1 , thanks for your review. As you have approved these changes, the status of this PR is supposed to be
Will it be merged automatically after the status changes to |
Unfortunately since I'm just a reviewer my approval doesn't unblock merging, we would need to wait for someone who is a committer to approve this change and merge it. Friendly ping @manupa-arm @comaniac @Mousius (I'm not sure who else might be interested in ACL support) |
Hi @DzAvril, I was having a think about this over the weekend and remembered that for the ACL integration we don't run the My apologies for missing this previously - it's been a while since working on this. |
Our team is working on a project which needs hot-upgrade for models, our approach is offloading as many operations as possible to ACL and performance is not a concern. So this PR fixes a separate issue of offloading operations which have a list as inputs and offloading concatenate is a typical case of this issue. |
The JSONRuntimeBase misses a line for multiple EntryID (eid), as the patch pointed out. ( And this issue hits us while hooking ACL concat for a feature of "hot upgradable model":
We leverage ACL as the base of the "common Op library". ) Hi @lhutton1, is it a more appropriate way to Or b) Submit the fixup of json_runtime.h only? Any idea? |
Thanks for the insight @DzAvril, @cee1, I think in that case it would be fine to keep the implementation. It might be a good idea to disable offloading concatenate by default, but then add an optional parameter say |
Hi @lhutton1, would you please review this revision of the patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@masahi @areusch @tmoreau89 - we're struggling to find a committer to take a look at this PR, do you have any suggestions? |
|
||
Returns | ||
------- | ||
ret : annotated and partitioned module. | ||
""" | ||
global offload_concat_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using a global var, can we make use of arm_compute_lib_pattern_table
?
def arm_compute_lib_pattern_table(): |
i.e. register concat
pattern if offload_concat == True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look @masahi. Currently ACL has a mixture of registering operations using composite functions and _register_external_op_helper
which adds the target.arm_compute_lib
attribute, which is not ideal. I think if we wanted to do this for concat it would need to be registered using pattern table rather than the other mechanism.
In the future we should probably use the pattern table for all operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if we are adding a new op support, why not add it via pattern like
def conv_pattern(): |
Then it is easy to enable / disable a pattern based on some flags. @DzAvril
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, please review. I'm not sure it is what you want because it seems a little weird since arm_compute_lib_pattern_table
aims to register fusing patterns for ops and registering op attr is out of it.
BTW, this commit triggered a Sanity Check
error:
python/tvm/relay/op/contrib/arm_compute_lib.py:274: [W0612(unused-variable), arm_compute_lib_pattern_table.concatenate] Unused variable 'concatenate'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't know what @tvm.ir.register_op_attr("nn.conv2d", "target.arm_compute_lib")
thing is for... Why do you need both the pattern + check
function at
def conv_pattern(): |
def check_conv(extract): |
and register_op_attr
thing at
tvm/python/tvm/relay/op/contrib/arm_compute_lib.py
Lines 291 to 292 in 8a0472f
@tvm.ir.register_op_attr("nn.conv2d", "target.arm_compute_lib") | |
def conv2d(expr): |
?
I never needed the latter in the BYOCs I worked on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't know what
@tvm.ir.register_op_attr("nn.conv2d", "target.arm_compute_lib")
thing is for...
Gently ping @lhutton1, can you help to explain this to masai?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DzAvril, @lhutton1 said "I think if we wanted to do this for concat it would need to be registered using pattern table rather than the other mechanism". This is also what I'm suggesting above. Can you do that? You shouldn't need @tvm.ir.register_op_attr
thing for concat. You just need to add is_op("concatenate")
etc + check function which you already have under @tvm.ir.register_op_attr(concatenate)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't know what
@tvm.ir.register_op_attr("nn.conv2d", "target.arm_compute_lib")
Yeah I don't think we need to register conv2d like this either as it should have already been picked up by the pattern table. This way of registering operators predates the pattern table so I guess this was just overlooked. We can follow up in a separate PR to move all operators so that they are registered using the pattern table - which I believe is the way other BYOC's are going. Apologies for the confusion @DzAvril, like @masahi said, it would be best to register concatenate (and other operators added in the future) using the pattern table since this is the preferred mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing some comments above. I have a question about the concatenate_pattern
, see code below.
def concatenate_pattern():
"""Create an concatenate pattern from equivalent relay operators.
Returns
-------
pattern : dataflow_pattern.AltPattern
Denotes the concatenate pattern.
"""
pattern = is_op("concatenate")(wildcard())
return pattern
As the input of concatenate
is a tuple of tensors, what the pattern of the parameters should be like? wildcard()
in the code block is not working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try is_tuple(None)
? See #7754
@masahi @lhutton1 Thanks for your comments. I re-support
std::vector<JSONGraphNodeEntry> AddCommonSingleJSONNode(const CallNode* cn, std::string name) {
std::vector<JSONGraphNodeEntry> inputs;
for (const auto& arg : cn->args) {
auto res = VisitExpr(arg);
inputs.insert(inputs.end(), res.begin(), res.end());
}
auto node = std::make_shared<JSONGraphNode>(name, /* name_ */
"kernel", /* op_type_ */
inputs, 1 /* num_outputs_ */);
const auto* fn = cn->op.as<FunctionNode>();
ICHECK(fn);
const auto* callNode = fn->body.as<CallNode>();
ICHECK(callNode);
SetCallNodeAttribute(node, callNode);
return AddNode(node, GetRef<Expr>(cn));
} This function calls
It will work without |
def arm_compute_lib_pattern_table(disabled_ops=["concatenate"]): The default value of disabled_ops triggers an error in
I have no idea how to fix this, can you give a hint pls? |
You can add
|
* [BYOC][ACL] Fix list is not supported as an input node * fix clang lint error * fix compile warnning * fix python module import error * rename concatenate test file * fix always MakeACLTensor with same eid 0 * do not offload concat default * fix concattnate test failure * fix test failure * fix lint error * fix lint * remove global var offload_concat * support concatenate with pattern table mechanism * disable pylint dangerous-default-value warning Co-authored-by: XuZhi <xuzhi.xu@alibaba-inc.com>
* [BYOC][ACL] Fix list is not supported as an input node * fix clang lint error * fix compile warnning * fix python module import error * rename concatenate test file * fix always MakeACLTensor with same eid 0 * do not offload concat default * fix concattnate test failure * fix test failure * fix lint error * fix lint * remove global var offload_concat * support concatenate with pattern table mechanism * disable pylint dangerous-default-value warning Co-authored-by: XuZhi <xuzhi.xu@alibaba-inc.com>
The input of op
concatenate
is a list of tensors. We tried to bring it to ACL and found the JSON node of it is like below.The inputs of node
kernel
is:Element in
inputs
is the index ofinput
node. Taking[0, 1, 0]
as an example, the first0
isnodeid
, the second1
isindexid
, and the last0
isversion
.But in function
Run
, it only uses0
asindexid
in statementuint32_t eid = EntryID(nid, 0);
so the other two inputs are ignored.This PR introduces a variable
json_inputid_to_layer_inputid
which maps the input index of JSON node to the index of the ACL layer's inputs.