[Model conversion tool] Support fusing Conv+Add #7799
Conversation
tfjs-converter/python/tensorflowjs/converters/tf_saved_model_conversion_v2.py
tfjs-converter/python/tensorflowjs/converters/normalize_biasAdd.py
        or ancestor_node.op == 'DepthwiseConv2dNative') \
        and len(graph_rewrite_util.get_output_node_names(input_node_map, ancestor_node_name)):
      node.op = 'BiasAdd'
      node.attr['data_format'].s = bytes('NHWC', 'utf-8')
this data_format info should come from the ancestor node
@pyu10055 If the bias is a 1-D tensor, the AddV2 op always broadcasts/adds it to the last dimension of the ancestor's result, so this behavior is always an NHWC BiasAdd.
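The broadcasting argument above can be checked with a quick sketch (NumPy stand-ins only, not the converter code): adding a 1-D tensor to an NHWC tensor aligns it with the last (channel) axis, which is exactly BiasAdd's NHWC semantics.

```python
import numpy as np

# Hypothetical conv output in NHWC layout, C = 4.
conv_out = np.arange(24, dtype=np.float32).reshape(1, 2, 3, 4)
bias = np.array([10.0, 20.0, 30.0, 40.0], dtype=np.float32)  # 1-D, len == C

# AddV2-style broadcast: the 1-D bias aligns with the last axis.
added = conv_out + bias

# Explicit NHWC BiasAdd semantics: add bias[c] to every element of channel c.
bias_add = conv_out + bias.reshape(1, 1, 1, 4)

assert np.array_equal(added, bias_add)
```

This is why the rewrite can hard-code `data_format = 'NHWC'` when the addend is 1-D, regardless of the ancestor's own data format attribute.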
I also took a look at the models that do not fuse ops as expected, mainly for conv/depthwise + bias:
Update: added conversion support for fusing the 'Add' op, in addition to 'AddV2', because of the use cases in the previous comment.
Great work! After this PR is merged, the benchmark models can be updated with this PR, right? Looking forward to a great perf improvement with fused models.
Yes
def get_output_node_names(node_map, target):
  """Return the names of all nodes that consume `target` as an input."""
  output_node_names = []
  for name, node in node_map.items():
    for input_name in node.input:
      if target == input_name:
        output_node_names.append(name)
  return output_node_names
I'm surprised we have node.input but not node.output.
Yes, node.input is the edge information for the model topology; a node.output field would only duplicate that information.
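To make that concrete, here is a self-contained demo (the helper is re-declared from the PR, and the node objects are hypothetical `SimpleNamespace` stand-ins for NodeDef protos): outgoing edges are fully recoverable by scanning every node's `input` list for the target name.

```python
from types import SimpleNamespace

def get_output_node_names(node_map, target):
    """Return the names of all nodes that consume `target` as an input."""
    output_node_names = []
    for name, node in node_map.items():
        for input_name in node.input:
            if target == input_name:
                output_node_names.append(name)
    return output_node_names

# Hypothetical three-node graph: conv -> add -> relu.
node_map = {
    'conv': SimpleNamespace(input=['image', 'filter']),
    'add': SimpleNamespace(input=['conv', 'bias']),
    'relu': SimpleNamespace(input=['add']),
}

print(get_output_node_names(node_map, 'conv'))  # ['add']
print(get_output_node_names(node_map, 'add'))   # ['relu']
```

The scan is O(edges) per call, which is acceptable for a one-shot conversion pass and avoids keeping a redundant reverse index in sync.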
tfjs-converter/python/tensorflowjs/converters/normalize_bias_add.py
…dd.py Co-authored-by: Matthew Soulanille <matthew@soulanille.net>
(Conv here means pointwise conv, depthwise conv, or general conv.)
Problem
The tool currently supports 'Conv + BiasAdd + Activation' fusing, but only when the add op is literally 'BiasAdd'; 'Conv + AddV2 + Activation' cannot be fused even though the two patterns are mathematically the same.
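The mathematical equivalence claimed above can be sketched with NumPy alone (no TensorFlow; the pointwise conv is modeled as a matmul over the channel axis, which is exactly what a 1x1 convolution computes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4, 4, 3)).astype(np.float32)  # NHWC input, C_in = 3
w = rng.standard_normal((3, 8)).astype(np.float32)        # 1x1 conv kernel, C_out = 8
bias = rng.standard_normal(8).astype(np.float32)          # 1-D bias, len == C_out

conv = x @ w                                   # pointwise convolution
via_addv2 = conv + bias                        # generic AddV2 with broadcasting
via_biasadd = conv + bias.reshape(1, 1, 1, 8)  # explicit NHWC BiasAdd semantics

assert np.allclose(via_addv2, via_biasadd)
```

Since the two forms are numerically identical, normalizing 'AddV2' to 'BiasAdd' before the fusing pass loses nothing and lets the existing 'Conv + BiasAdd + Activation' pattern matcher fire.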
Candidate solutions
Assume we cannot fix grappler for this missed fusing pattern. There are a couple of candidate solutions:
Either way, the tool has to identify the 'AddV2' nodes to fuse, subject to the following conditions:
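The shape such a normalization pass takes can be sketched as follows. This is a hedged illustration only: the node objects are hypothetical dicts rather than real NodeDef protos, and it checks only the ancestor op, not the full set of fusing conditions discussed in this thread (e.g. that the bias is a 1-D constant).

```python
# Ops whose output an Add/AddV2 bias can be folded into.
FUSABLE_ANCESTORS = {'Conv2D', 'DepthwiseConv2dNative'}

def normalize_bias_add(nodes):
    """Rewrite Add/AddV2 nodes fed by a conv into BiasAdd (sketch)."""
    node_map = {n['name']: n for n in nodes}
    for node in nodes:
        if node['op'] not in ('AddV2', 'Add'):
            continue
        ancestor = node_map.get(node['input'][0])
        if ancestor and ancestor['op'] in FUSABLE_ANCESTORS:
            node['op'] = 'BiasAdd'
            # Safe because a 1-D bias broadcasts on the last (channel) axis.
            node['data_format'] = 'NHWC'
    return nodes

graph = [
    {'name': 'conv', 'op': 'Conv2D', 'input': ['image', 'filter']},
    {'name': 'add', 'op': 'AddV2', 'input': ['conv', 'bias']},
]
normalize_bias_add(graph)
print(graph[1]['op'])  # BiasAdd
```

After this rewrite, the existing grappler fusing pass sees the canonical 'Conv + BiasAdd' pattern and can fuse it as usual.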
Misc.
From the TF API docs, the major difference between 'AddV2' and 'Add' is the supported input data types: the former additionally supports 'uint16, uint32, uint64', while the latter additionally supports 'string'.