standard update for sparse sgd_mom_update #9189
Conversation
python/mxnet/optimizer.py
Outdated
@@ -464,16 +465,19 @@ class SGD(Optimizer):
----------
momentum : float, optional
    The momentum value.
lazy_update : bool, optional
    If True, standard updates are applied.
Clarify the default value, too
@@ -96,11 +147,15 @@ only the row slices whose indices appear in grad.indices are updated (for both w
.set_attr_parser(ParamParser<SGDMomParam>)
.set_attr<nnvm::FInferShape>("FInferShape", ElemwiseShape<3, 1>)
.set_attr<nnvm::FInferType>("FInferType", ElemwiseType<3, 1>)
.set_attr<FInferStorageType>("FInferStorageType", ElemwiseStorageType<3, 1, false, true, false>)
Also update the doc here, too?
Updated
src/operator/optimizer_op-inl.h
Outdated
}
if (!dispatched && in_attrs->at(0) == kRowSparseStorage &&
    in_attrs->at(1) == kRowSparseStorage &&
    (in_attrs->at(2) == kRowSparseStorage || in_attrs->at(2) == kDefaultStorage)) {
Save in_attrs->at(2) in a local var to improve readability?
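For concreteness, the suggested refactor might look like this (a minimal sketch; the variable name mom_stype is my own choice, not taken from the PR):

```cpp
// Cache the momentum's storage type once instead of reading
// in_attrs->at(2) twice in the condition. "mom_stype" is a hypothetical name.
const int mom_stype = in_attrs->at(2);
if (!dispatched && in_attrs->at(0) == kRowSparseStorage &&
    in_attrs->at(1) == kRowSparseStorage &&
    (mom_stype == kRowSparseStorage || mom_stype == kDefaultStorage)) {
  // dispatch to the sparse implementation, as in the original branch
}
```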
src/operator/optimizer_op.cu
Outdated
DType* mom_data = mom.dptr<DType>();
DType* out_data = out->dptr<DType>();
nnvm::dim_t num_rows = weight.shape_[0];
auto row_length = weight.shape_.ProdShape(1, weight.ndim());
let's not use auto
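Spelled out with an explicit type, the line would read as below (a sketch; assuming the flattened row length fits the same nnvm::dim_t type used for num_rows above):

```cpp
// Explicit type instead of auto, matching num_rows above.
const nnvm::dim_t row_length = weight.shape_.ProdShape(1, weight.ndim());
```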
src/operator/optimizer_op.cc
Outdated
nnvm::dim_t* prefix_sum = reinterpret_cast<nnvm::dim_t*>(workspace.dptr_);
// mark row flags
Fill<false>(s, TBlob(prefix_sum, Shape1(num_rows), cpu::kDevMask), kWriteTo, 0);
Fill uses memset, which is single-threaded. It's slow for a large number of elements. Let's use Kernel<set_zero, cpu>::Launch instead.
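The replacement would look roughly like this (a sketch; Kernel and set_zero are the element-wise kernel helpers from src/operator/mxnet_op.h):

```cpp
// Zero the row-flag buffer with a parallelized element-wise kernel
// rather than a single-threaded memset.
using namespace mxnet_op;
Kernel<set_zero, cpu>::Launch(s, num_rows, prefix_sum);
```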
python/mxnet/optimizer.py
Outdated
@@ -433,7 +433,8 @@ def _get_wd(self, index):
class SGD(Optimizer):
    """The SGD optimizer with momentum and weight decay.

    The optimizer updates the weight by::
    If any storage type of weight, state or grad is ``default``, \
    **standard updates** are applied by::
I think we should reverse the order and mention ``lazy_update``, since users of the optimizer don't know how state stypes are created:

If the storage types of weight and grad are both ``row_sparse``, and ``lazy_update`` is True, **lazy updates** are applied by::

    for row in grad.indices:
        ...

Otherwise, **standard updates** are applied by::

    ...
src/operator/optimizer_op-inl.h
Outdated
: prefix_sum[i] > prefix_sum[i-1];

for (index_t j = 0; j < row_length; j++) {
  const index_t data_i = i * row_length + j;
i * row_length can be cached and computed only once. Same for (prefix_sum[i]-1) * row_length.
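Hoisting both products out of the inner loop would look something like this (a sketch; the offset variable names are hypothetical, not from the PR):

```cpp
// Compute each row offset once per row instead of once per element.
const index_t data_offset = i * row_length;
const index_t grad_offset = (prefix_sum[i] - 1) * row_length;
for (index_t j = 0; j < row_length; j++) {
  const index_t data_i = data_offset + j;
  // ... index the gradient row with grad_offset + j ...
}
```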
LGTM in general
src/operator/optimizer_op-inl.h
Outdated
@@ -460,6 +461,99 @@ inline void SGDMomUpdateRspRspRspImpl(const SGDMomParam& param,
                                      mom.data(), req, &out_blob);
}

template<int n_rsp, int n_rsp_dns>
Let's add some description on what the template params mean
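Going by the storage-type checks earlier in the review, the description could read roughly as follows (a sketch; the parameter meanings are my inference from that logic, not wording from the PR):

```cpp
/*!
 * \brief  Sketch of the requested documentation; meanings inferred from the
 *         storage-type inference logic reviewed above.
 * \tparam n_rsp      number of inputs that must have row_sparse storage
 * \tparam n_rsp_dns  number of inputs that may be row_sparse or dense
 */
template<int n_rsp, int n_rsp_dns>
```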
* standard sparse sgd mom update
* update
* update comments
* address comments
* revise
* more general infer stype
* fix
* fix
* add comments for stype inference func
* update
Description
#9177
cc @eric-haibin-lin
Checklist
Essentials
Passed code style checking (make lint)
Changes
Comments