-
Notifications
You must be signed in to change notification settings - Fork 108
Add support for tf.nn.depth_to_space lambda #492
Conversation
Thanks for your contribution. There is test failure for your case, dimension mismatch -- This is because the spec of tensorflow DepthToSpace and onnx DepthToSpace may be different. It is not a simple plug-in, you need understand both tf and onnx behavior, and use onnx ops to construct the tf op correctly, so may need some additional manipulation such as Transpose, etc. |
Thanks @jiafatom. Yes, of course I noticed the newly introduced failed test, see also my message :-) I'll dig deeper and see if I can figure out what needs to be done, thanks. (sorry for closing/re-opening, pressed the wrong button) |
Tensorflow documentation for NHWC case (the failing one):
If we number
It does after inserting a transpose [0, 3, 1, 2] at the start and [0, 2, 3, 1] at the end. But if we would implement that, why use the ONNX DepthToSpace operator at all, since essentially it is just two reshapes and a single transpose? So I've just implemented directly Tensorflow's documentation using 2 reshapes and a transpose for the NHWC case. For the NCHW case the test seems to pass already, so the ONNX DCR case seems to match Tensorflow's NCHW implementation. |
I've 'fixed' some tests by skipping old TensorFlows and old opsets, but I still see these kind of failures in the CI builds:
I don't have this issue on my test machine (also CPU only). Did you have similar issues in other tests and if so how do you normally circumvent them? |
The failed CI build is on linux-conda_ci and win32-conda-ci, which uses standalone keras and tf backend (not tf.keras). So if you set |
No, it turned out to be that older versions of TF (pre 2.1.0) don't support NCHW mode for this op on the CPU. So I've added another skip in the tests. |
keras2onnx/_builtin.py
Outdated
block_size = node.get_attr('block_size') | ||
oopb = OnnxOperatorBuilder(container, scope) | ||
if _is_nhwc(node): | ||
_, h, w, c = _cal_tensor_shape(node.inputs[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need some help, because here I have now set n = -1
such that it is OK if the batch dimension is unknown. But what if h
and w
are also unknown, i.e. their values are None
. How to then do the reshapes below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point here. This line has issue when the tensor shape is unknown (dynamic). Search our code base to find the case when _cal_tensor_shape
is None
. We need handle that -- Add a Shape
op after node.inputs[0]
to get the dynamic shape, and use Slice
op to get h
and w
, and concatenate with other dimensions to make the desired_shape
. We have some examples in our code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at an example and it is not trivial, code quickly becomes unreadable because of all the extra nodes added. Then in that case I reconsidered the use of the actual DepthToSpace node, so I worked on that a bit and made it work with two extra transposes. I've added a test case with unknown tensor sizes as well and that one now also passes :-) Could you have another look at the code? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks for your contribution!
Added support for tf.nn.depth_to_space as a Keras lambda, mapped to the ONNX DepthToSpace operator.
The two added test-cases visualised with Netron:
Channels first (NCHW), channel dimension reduced from 4 to 4/(2*2) = 1, width & height dimension increased by a factor 2:
Channels last (NHWC), channel dimension reduced from 8 to 8/(2*2) = 2, width & height dimension increased by a factor 2:
Although the visualisations look correct as far as I can see, the NHWC test-case fails with the following error in the ORT:
This is my first contribution to keras-onnx (and first ONNX contribution in general), so please review carefully. And I'm also asking for help w.r.t. the second test-case. Or is that a bug in ORT?