Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#4312 broke Huggingface BERT ONNX import #6263

Closed
mbrookhart opened this issue Aug 13, 2020 · 2 comments · Fixed by #6272
Closed

#4312 broke Huggingface BERT ONNX import #6263

mbrookhart opened this issue Aug 13, 2020 · 2 comments · Fixed by #6272

Comments

@mbrookhart
Copy link
Contributor

mbrookhart commented Aug 13, 2020

Hi @yongwww @kevinthesun @lixiaoquan,

I haven't been running this model recently, but one of the models I was targeting in my recent dynamic shapes work is a BERT Model from HuggingFace, imported via the ONNX importer. It's written in a dynamic fashion, but it's actually a complex static model, so I'm testing my ONNX refactor against it before starting on fully dynamic models. I was testing it again this week after a break, and noticed the model is broken on master.

The basic issue is that before #4312, I get strided slices that look like this:

%40 = strided_slice(%39, begin=[0, 0, 0, 0], end=[9223372036854775807, 2147483647, 2147483647, 9223372036854775807]);

And after #4312, I get strided slices that look like this:

%40 = strided_slice(%39, meta[relay.Constant][2], meta[relay.Constant][3], meta[relay.Constant][4], begin=[0, 0, 0, 0], end=[-1, 2147483647, 2147483647, -1], strides=[1]);

The -1 is a problem, becuase the shape of the input tensor is (1, 1, 1, 128), so I end up with an empty array.

I can reproduce this with a change to the onnx importer tests:

diff --git a/tests/python/frontend/onnx/test_forward.py b/tests/python/frontend/onnx/test_forward.py
index 14b827c79..c376c9aa7 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -478,15 +478,15 @@ def _test_slice_iteration_v10(indata, outdata, starts, ends, axes=None):
     inputs = [
         helper.make_tensor_value_info("data", TensorProto.FLOAT,
                                       list(indata.shape)),
-        helper.make_tensor_value_info("starts", TensorProto.INT32,
+        helper.make_tensor_value_info("starts", TensorProto.INT64,
                                       list(starts.shape)),
-        helper.make_tensor_value_info("ends", TensorProto.INT32,
+        helper.make_tensor_value_info("ends", TensorProto.INT64,
                                       list(ends.shape))
     ]
     initializer = [
-        helper.make_tensor("starts", TensorProto.INT32, list(starts.shape),
+        helper.make_tensor("starts", TensorProto.INT64, list(starts.shape),
                            starts),
-        helper.make_tensor("ends", TensorProto.INT32, list(ends.shape), ends)
+        helper.make_tensor("ends", TensorProto.INT64, list(ends.shape), ends)
     ]
 
     if axes:
@@ -534,7 +534,8 @@ def test_slice():
     _test_slice_iteration_v10(x, x[0:3, 0:10], (0, 0), (3, 10), (0, 1))
     _test_slice_iteration_v10(x, x[:, :, 3:4], (0, 0, 3), (20, 10, 4))
     _test_slice_iteration_v10(x, x[:, 1:1000], (1), (1000), (1))
-    _test_slice_iteration_v10(x, x[:, 0:-1], (0), (-1), (1))
+    x = np.random.randn(1, 1, 1, 128).astype(np.float32)
+    _test_slice_iteration_v10(x, x, (0, 0), (9223372036854775807, 9223372036854775807), (0, 3))
 
 
 def _test_onnx_op_elementwise(inshape, outfunc, npargs, dtype, opname, kwargs):

Which passes before #4312 but fails after with this error:

E       (shapes (1, 1, 1, 128), (0, 1, 1, 127) mismatch)
E        x: array([[[[ 0.857093,  0.894232,  1.201452, -0.637382, -0.028272,
E                  0.989941, -0.312949,  1.000041, -0.895313, -0.075969,
E                  1.994201, -0.617561, -0.712636,  0.255715,  2.050107,...
E        y: array([], shape=(0, 1, 1, 127), dtype=float32)

I haven't figured out which change in #4312 is causing the problem, but if you guys have any ideas, I'd love to hear them.

Thanks!

Matthew

@kevinthesun
Copy link
Contributor

What is the original end? Usually -1 means the end of that dimension.

@mbrookhart
Copy link
Contributor Author

The original end was the max of int64. In numpy, onnx, and tvm's default mode, -1 means take everything but the last element of the axis.

Turned out to be a precision/rounding issue, that PR fixes it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants