Conversation
[](const NodeAttrs& attrs) { return std::vector<uint32_t>(1, 1); })
.set_attr<FCompute>("FCompute<cpu>", ShapeCompute<cpu>)
.set_attr<nnvm::FInferShape>("FInferShape",
[](const nnvm::NodeAttrs& attrs,
Please use spaces instead
Yes, the CI failed due to cpplint. I will fix it.
TYPE_ASSIGN_CHECK(*out_attrs, 0, 0U);
return out_attrs->at(0) != -1;
})
.set_attr<nnvm::FGradient>("FGradient", MakeZeroGradNodes)
Shouldn't this op have no gradient definition at all?
No, the shape and size operators will not have gradients, right?
Right, I'm asking the same thing. Why did you define the FGradient attribute for the op?
@@ -388,6 +388,26 @@ void CastCompute(const nnvm::NodeAttrs& attrs,
});
}

template<typename xpu>
void ShapeCompute(const nnvm::NodeAttrs& attrs,
const OpContext& ctx,
Fix the alignment.
TShape in_shape = in_data.shape_;
MSHADOW_TYPE_SWITCH(out_data.type_flag_, DType, {
mxnet_op::Kernel<mshadow_op::identity_with_cast, xpu>::Launch(
s, in_data.ndim(), out_data.dptr<DType>(), in_shape.data());
Why DType for out_data? Isn't that int64, as in the description?
Yes, I will make this change.
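A minimal sketch of the agreed change, assuming the output is always int64 (the same launch with dptr<int64_t> shows up later in the diff; s, in_data, and in_shape are as in the surrounding code):

mxnet_op::Kernel<mshadow_op::identity_with_cast, xpu>::Launch(
    s, in_data.ndim(), out_data.dptr<int64_t>(), in_shape.data());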
const TBlob& in_data = inputs[0];
const TBlob& out_data = outputs[0];
mshadow::Stream<xpu> *s = ctx.get_stream<xpu>();
TShape in_shape = in_data.shape_;
Use const TShape& here.
.describe("Returns a 1D int64 array containing the shape of data.") | ||
.set_num_inputs(1) | ||
.set_num_outputs(1) | ||
.set_attr<nnvm::FInplaceIdentity>("FInplaceIdentity", |
Why add this?
Will remove it.
TShape target_shape(1);
target_shape[0] = in_attrs->at(0).ndim();
SHAPE_ASSIGN_CHECK(*out_attrs, 0, target_shape);
return out_attrs->at(0).ndim() != 0U && out_attrs->at(0).Size() != 0U;
Use shape_is_none(out_attrs->at(0)) instead.
Are you suggesting that I define a shape_is_none function and use it here?
It's already available.
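For reference, a minimal sketch of the shape-inference lambda with that helper in place; everything except the shape_is_none call is taken from the quoted diff, and shape_is_none is the helper that already exists in the operator common code:

[](const nnvm::NodeAttrs& attrs,
   std::vector<TShape>* in_attrs,
   std::vector<TShape>* out_attrs) {
  CHECK_EQ(in_attrs->size(), 1U);
  CHECK_EQ(out_attrs->size(), 1U);
  TShape target_shape(1);
  target_shape[0] = in_attrs->at(0).ndim();  // output is a 1-D tensor of length ndim
  SHAPE_ASSIGN_CHECK(*out_attrs, 0, target_shape);
  return !shape_is_none(out_attrs->at(0));   // replaces the manual ndim()/Size() checks
}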
std::vector<int>* out_attrs) {
CHECK_EQ(in_attrs->size(), 1U);
CHECK_EQ(out_attrs->size(), 1U);
TYPE_ASSIGN_CHECK(*out_attrs, 0, 0U);
Use the type enum name instead of the literal 0U.
Okay, I will use mshadow::kInt64.
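A sketch of the type-inference lambda with that change applied; aside from the enum name, everything follows the quoted diff:

[](const nnvm::NodeAttrs& attrs,
   std::vector<int>* in_attrs,
   std::vector<int>* out_attrs) {
  CHECK_EQ(in_attrs->size(), 1U);
  CHECK_EQ(out_attrs->size(), 1U);
  TYPE_ASSIGN_CHECK(*out_attrs, 0, mshadow::kInt64);  // output dtype is always int64
  return out_attrs->at(0) != -1;
}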
@@ -77,6 +77,9 @@ NNVM_REGISTER_OP(_identity_with_attr_like_rhs)
NNVM_REGISTER_OP(reshape_like)
.set_attr<FCompute>("FCompute<gpu>", UnaryOp::IdentityCompute<gpu>);

NNVM_REGISTER_OP(shape)
Where does the name come from? This looks confusing and conflicts with the property name shape of NDArray in Python. Need to ensure the documentation can be rendered correctly.
What do you suggest the operator's name should be? This is more or less the name used in a couple of other frameworks too.
Please make sure the doc page can be rendered correctly at least.
The shape operator doesn't have a backward pass. My biggest concern is that any computation that uses this operator can't perform a backward computation.
});
}
Please get rid of one blank line here; C++ uses only one blank line between functions.
@@ -399,6 +399,37 @@ NNVM_REGISTER_OP(reshape_like)
.add_argument("lhs", "NDArray-or-Symbol", "First input.")
.add_argument("rhs", "NDArray-or-Symbol", "Second input.");

NNVM_REGISTER_OP(shape)
.describe("Returns a 1D int64 array containing the shape of data.")
Would it be clearer if you added an example here?
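One possible way to do that, sketched with a made-up example value; the exact wording and the final op name in the merged PR may differ:

NNVM_REGISTER_OP(shape)
.describe(R"code(Returns a 1D int64 array containing the shape of data.

Example::

  shape([[1,2,3,4], [5,6,7,8]]) = [2,4]

)code" ADD_FILELINE)
.set_num_inputs(1)
.set_num_outputs(1)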
That's great. Keras and TensorFlow both have this op.
The names shape_op and size_op look ad hoc.
@piiswrong shape and size are already existing ndarray operations. I changed the names to prevent confusion. Some other names that come to mind are shape_operator, tensor_shape, and array_shape.
How about
SHAPE_ASSIGN_CHECK(*out_attrs, 0, target_shape);
return !shape_is_none(out_attrs->at(0));
})
.set_attr<nnvm::FInferType>("FInferType",
Not registering FGradient?
The shape operator is not differentiable. Check the conversation here: #10789 (comment)
CHECK_EQ(req.size(), 1U);
const TBlob& in_data = inputs[0];
const TBlob& out_data = outputs[0];
out_data.dptr<int64_t>()[0] = in_data.Size();
out_data holds a pointer to GPU memory. You need to explicitly use a kernel launch to set the value.
If I use a kernel launch then I will need a data buffer pointing to in_data.Size(). How would I get that? in_data.Size() is of type index_t, which does not have a data or dptr attribute.
You know the output size is only 1, so you can just use 1 for that.
mxnet_op::Kernel<mshadow_op::identity_with_cast, xpu>::Launch(s, 1U, out_data.dptr<int64_t>(), < what goes here? - in_data.Size()?? >);
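One way this gets resolved, based on the size_kernel that appears later in the diff: the size is passed to the kernel by value, so no device-side input buffer is needed. The launch line below is a sketch; the static_cast on the argument is an assumption:

struct size_kernel {
  // out points at the single-element int64 output; in is the size passed by value
  MSHADOW_XINLINE static void Map(int i, int64_t *out, unsigned int in) {
    out[0] = static_cast<int64_t>(in);
  }
};

// launch a single work item and hand it in_data.Size() directly
mxnet_op::Kernel<mshadow_op::size_kernel, xpu>::Launch(
    s, 1U, out_data.dptr<int64_t>(), static_cast<unsigned int>(in_data.Size()));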
Ping for another round of reviews.
If we have a symbol A with 4 dims, how can I get the 1-dim size of the shape_nd(A) result?
src/operator/mshadow_op.h
@@ -96,6 +96,12 @@ struct identity_with_cast {
}
};

struct size_kernel {
MSHADOW_XINLINE static void Map(int i, int64_t *out, unsigned int in) {
out[0] = int64_t(in);
nit: static_cast<int64_t>
const TShape& in_shape = in_data.shape_;
MSHADOW_TYPE_SWITCH(out_data.type_flag_, DType, {
mxnet_op::Kernel<mshadow_op::identity_with_cast, xpu>::Launch(
s, in_data.ndim(), out_data.dptr<int64_t>(), in_shape.data());
in_shape.data is a pointer to CPU memory, which cannot be directly accessed on the GPU. You can use Shape<ndim> instead.
How come this is not caught by CI?
It did; the CI failed for the GPU tests. I need to fix it.
shape_nd still sounds weird as it's also available in Symbol. BTW I think these operators can be useful, but they won't solve issue #10789.
}

MSHADOW_TYPE_SWITCH(out_data.type_flag_, DType, {
mxnet_op::Kernel<mshadow_op::shape_kernel, gpu>::Launch(
Same as above; it's simply copying a tiny amount of data from a CPU array to a GPU array. Launching a kernel is expensive and a waste of resources. You can just call cudaMemcpyAsync to alleviate the workload.
const TBlob& out_data = outputs[0];
mshadow::Stream<gpu> *s = ctx.get_stream<gpu>();
const TShape& in_shape = in_data.shape_;
Shape<10> temp_shape;
What if ndim is greater than 10?
Actually I agree with reminisce: you should use cudaMemcpy here so that you don't need this magic number.
Okay, I will fix it.
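A rough sketch of what the fixed GPU path could look like with cudaMemcpyAsync, as suggested; the stream accessor and the assumption that TShape stores 64-bit dims are mine, not quoted from the PR:

void ShapeComputeGPU(const nnvm::NodeAttrs& attrs,
                     const OpContext& ctx,
                     const std::vector<TBlob>& inputs,
                     const std::vector<OpReqType>& req,
                     const std::vector<TBlob>& outputs) {
  const TBlob& in_data = inputs[0];
  const TBlob& out_data = outputs[0];
  mshadow::Stream<gpu> *s = ctx.get_stream<gpu>();
  const TShape& in_shape = in_data.shape_;
  const int ndim = in_data.ndim();
  // copy the host-side shape array straight into the int64 output on the device;
  // no kernel launch and no fixed-size Shape<10> buffer needed
  cudaMemcpyAsync(out_data.dptr<int64_t>(), in_shape.data(),
                  ndim * sizeof(int64_t), cudaMemcpyHostToDevice,
                  mshadow::Stream<gpu>::GetStream(s));
}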
@reminisce @piiswrong ping for review.
src/operator/mshadow_op.h
@@ -92,7 +92,8 @@ MXNET_UNARY_MATH_OP(identity_grad, 1);
struct identity_with_cast {
template<typename DTypeIn, typename DTypeOut>
MSHADOW_XINLINE static void Map(int i, DTypeOut *out, DTypeIn *in) {
out[i] = DTypeOut(in[i]);
DTypeIn in_data = in[i];
What's the purpose of this change?
@@ -388,6 +388,20 @@ void CastCompute(const nnvm::NodeAttrs& attrs,
});
}

template<typename xpu>
void ShapeCompute(const nnvm::NodeAttrs& attrs,
You don't need to template the shape and size functions on the device type. The CPU and GPU FCompute functions are defined in the .cc and .cu files respectively and don't share anything.
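A sketch of what the CPU side might look like once it is un-templated and moved into the .cc file; the ShapeComputeCPU name matches the later diff, but the direct loop (instead of a kernel launch) is an assumption:

void ShapeComputeCPU(const nnvm::NodeAttrs& attrs,
                     const OpContext& ctx,
                     const std::vector<TBlob>& inputs,
                     const std::vector<OpReqType>& req,
                     const std::vector<TBlob>& outputs) {
  CHECK_EQ(inputs.size(), 1U);
  CHECK_EQ(outputs.size(), 1U);
  const TBlob& in_data = inputs[0];
  const TBlob& out_data = outputs[0];
  const TShape& in_shape = in_data.shape_;
  int64_t* out = out_data.dptr<int64_t>();
  // on CPU the shape values can be written directly, one int64 per dimension
  for (int i = 0; i < in_data.ndim(); ++i) {
    out[i] = static_cast<int64_t>(in_shape[i]);
  }
}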
@haojin2 @piiswrong can this be merged?
@@ -398,6 +398,98 @@ NNVM_REGISTER_OP(reshape_like)
.add_argument("lhs", "NDArray-or-Symbol", "First input.")
.add_argument("rhs", "NDArray-or-Symbol", "Second input.");

void ShapeComputeCPU(const nnvm::NodeAttrs& attrs,
const OpContext& ctx,
nit: fix the alignment of all the ComputeXPU functions
@piiswrong @reminisce please merge this.
* Shape Operator
* cuda
* size op
* lint issues
* docs example
* add docs, change op name to avoid conflict, add convenience confluent method
* change name to _nd
* fix test cases, add new kernel
* test name fix.
* solve gpu memory problem for size and shape
* get rid of FIgnoreInputs attr of shape_nd
* op name change
* fix
* retrigger CI
* retrigger CI
* retrigger CI
* trigger CI
* fix comments
* cpplint
* nit
* trigger CI
Description
Shape and Size operators. Prerequisite for issue #10789.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
@haojin2 @eric-haibin-lin @zheng-da