Change Equal 11 for string input #2149

Merged: 4 commits, Apr 14, 2023

mikeessen
Contributor

This pull request fixes issue #2148: the ONNX Equal operator at version 11 does not support string inputs, but the Equal version 11 handler in tf2onnx/onnx_opset/logical.py is implemented as if all types, including string, were supported.

Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
@mikeessen
Contributor Author

@fatcat-z Is there a way to restart the pipeline tests? The Equal 11 string update PR has 2 failing pipeline tests that I think may be unrelated to the changes in the PR.

mikeessen closed this Mar 31, 2023
mikeessen reopened this Mar 31, 2023
@mikeessen
Contributor Author

@fatcat-z please review when you get a chance. Thank you!

@mikeessen
Contributor Author

@fatcat-z please let me know if there are any changes or updates I should make to the PR. Thank you!


def _add_cast_to_same_type_to_inputs(graph, node, supported_dtypes, target_dtype):
    for inp in node.input:
        if graph.get_dtype(inp) not in supported_dtypes:
Collaborator

The original purpose of this method is to unify the dtype of all inputs.
After your changes, if all inputs are in supported_dtypes, we won't unify them any more. Is that correct?

If so, we probably need to add this logic back for the case where all inputs are in supported_dtypes.
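
The unification in question can be sketched roughly as follows. This is a minimal, self-contained illustration and not the tf2onnx implementation: `plan_input_casts` is a hypothetical helper, dtypes are plain strings rather than ONNX enums, and taking the first input's dtype as the common type is an assumption.

```python
def plan_input_casts(input_dtypes, supported_dtypes, target_dtype):
    """Decide which dtype each input must be cast to (None means no cast).

    Hypothetical helper for illustration; dtypes are plain strings here.
    """
    # Assumption: the first input's dtype is the common type when it is supported.
    common_dtype = input_dtypes[0]
    if common_dtype not in supported_dtypes:
        # Unsupported (for example string): fall back to the target dtype.
        common_dtype = target_dtype
    # Every input that differs from the common dtype needs a cast,
    # even when its own dtype is supported.
    return [None if d == common_dtype else common_dtype for d in input_dtypes]


SUPPORTED = {"int32", "int64", "float32"}
print(plan_input_casts(["int32", "int64"], SUPPORTED, "int32"))   # [None, 'int32']
print(plan_input_casts(["string", "int64"], SUPPORTED, "int32"))  # ['int32', 'int32']
```

The first call shows the case raised here: both inputs are supported, yet the second still needs a cast so that the two dtypes match.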

Contributor Author

Thanks, I will add logic to unify the dtype of all inputs.

@fatcat-z
Collaborator

Could you please also add a test (like this) for this change?

Signed-off-by: Mike Essenmacher <essen@us.ibm.com>
common_dtype = graph.get_dtype(node.input[0])
if common_dtype not in supported_dtypes:
Collaborator

My bad, I misunderstood the issue before.

Actually, the issue is: The op Equal doesn't support 'string' with version 11, meaning we should fail the conversion once we detect this case. Even if the content is empty, we should not convert it to an integer value, which might confuse users.

So it would be better to set up a list of unsupported types, which may contain only string right now. If we detect that the current type of input[0] is in it, we fail the conversion with a reasonable message instead of doing tricky things to work around it.

Make sense?
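
A rough sketch of the fail-fast alternative proposed here, for illustration only. The names `UNSUPPORTED_EQUAL_11_DTYPES`, `UnsupportedInputTypeError`, and `check_equal_input_dtype` are hypothetical and do not come from tf2onnx, and dtypes are represented as plain strings.

```python
# Hypothetical list of unsupported types; it contains only string for now.
UNSUPPORTED_EQUAL_11_DTYPES = {"string"}


class UnsupportedInputTypeError(ValueError):
    """Hypothetical error raised when a conversion cannot proceed."""


def check_equal_input_dtype(dtype, opset=11):
    """Return the dtype unchanged, or fail with a readable message."""
    if dtype in UNSUPPORTED_EQUAL_11_DTYPES:
        raise UnsupportedInputTypeError(
            f"Equal (opset {opset}) does not support inputs of type '{dtype}'"
        )
    return dtype


try:
    check_equal_input_dtype("string")
except UnsupportedInputTypeError as exc:
    print(f"conversion failed: {exc}")
```

The trade-off is that the user gets an explicit error instead of a silent cast, at the cost of rejecting models that would otherwise convert.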

> Actually, the issue is: The op Equal doesn't support 'string' with version 11, meaning we should fail the conversion once we detect this case.

In the OpSet 7 version of this, the Equal op works fine, assuming the strings can be converted to the desired type. So this change makes OpSet 11 match the OpSet 7 behavior.

So if we made this an error, then OpSet 7 Equal and OpSet 11 Equal would behave differently even though both have the same stance on strings. Making them behave differently didn't seem right. Rather than remove what was working in OpSet 7, we decided to enable the same setup for OpSet 11. It could be argued that OpSet 7 (and earlier) should also automatically error in this case instead. If that's truly preferred, we could look into it. However, our thought was to expand 11 rather than restrict 7.

> Even if the content is empty, we should not convert it to an integer value which might confuse users.

I'll agree that the empty string was a special case that almost led us to restrict OpSet 7. However, after digging into the TF code, we found they explicitly handle this case. When creating a feature column, if an entry is "" then it's explicitly changed to -1 (https://github.com/tensorflow/tensorflow/blob/4e7f0185c70faf35e12acbfe381a729d1e6cc38c/tensorflow/python/feature_column/feature_column.py#L2286). Since it's explicitly handled, we matched the TF behavior rather than just erroring out. Otherwise, it is also confusing to have a model work in TF but fail to convert.

For reference, we have a model encountering this scenario that indeed works fine in TF but then fails to convert. With these changes, it converts fine. Unfortunately, it's not a model we can share, but it is a real-world scenario.
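
To make the behavior described above concrete, here is a small sketch of comparing string inputs after converting them to integers, with the empty string mapped to -1. It is an illustration under stated assumptions, not the TensorFlow or tf2onnx code: `string_to_id` and `equal_via_ints` are hypothetical names, and the sketch assumes every non-empty string parses as an integer.

```python
def string_to_id(value):
    """Hypothetical conversion: "" is treated as missing and becomes -1."""
    # Assumption: non-empty strings hold integer text; int() raises otherwise.
    return -1 if value == "" else int(value)


def equal_via_ints(left, right):
    """Element-wise equality computed on the converted integer values."""
    return [string_to_id(a) == string_to_id(b) for a, b in zip(left, right)]


print(equal_via_ints(["1", "", "3"], ["1", "", "4"]))  # [True, True, False]
```

One consequence of comparing converted values is that strings such as "01" and "1" compare equal in this sketch, which is the kind of surprise the reviewer's objection points at.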

Collaborator

Thanks for your detailed explanations.

Signed-off-by: Jay Zhang <36183870+fatcat-z@users.noreply.github.com>
Collaborator

fatcat-z left a comment

Thanks for your contributions!

fatcat-z enabled auto-merge (squash) April 14, 2023 02:55
fatcat-z merged commit 276bdea into onnx:main Apr 14, 2023
3 participants