Simplify training interface by removing weight decay and scaling #695
Conversation
LGTM!
I am wondering if lamtram needs to be modified now that these changes have been pulled in.
When I run lamtram, it emits the warning: "Trainer::update_epoch has been deprecated and doesn't do anything. Please remove it from your code, and control the learning rate of the trainer directly, for example by: 'trainer.learning_rate /= (1 - rate_decay)', see #695 for details."
It was deprecated in clab#695.
* Remove deprecated Trainer::update_epoch (it was deprecated in #695).
* Remove first variable from examples.
I can't seem to run the ...
Shouldn't it be multiplication? I assume rate_decay is some small positive value, and dividing by (1 - rate_decay) would make learning_rate grow larger.
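As a quick numeric check of that point (the values 0.1 and 0.05 are illustrative, not taken from this PR):

```python
learning_rate = 0.1
rate_decay = 0.05

# Dividing by (1 - rate_decay) increases the rate each epoch.
print(learning_rate / (1 - rate_decay))  # ~0.105, larger than 0.1

# Multiplying by (1 - rate_decay) decreases it, which is the usual intent of decay.
print(learning_rate * (1 - rate_decay))  # ~0.095, smaller than 0.1
```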
There is some confusion regarding the interface of the `Trainer` class; see:
#641
#684
I agree that it's difficult to understand. This commit removes the rate decay and gradient scaling functionality that implicitly changes the learning rate in non-transparent ways. Here are examples of the before/after behavior:
Rate Decay Before:
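For reference, a minimal sketch of the old pattern, assuming the DyNet Python API; `compute_loss`, `data`, and `num_epochs` are placeholders, and constructor argument names varied across versions:

```python
import dynet as dy

model = dy.ParameterCollection()
trainer = dy.SimpleSGDTrainer(model, learning_rate=0.1)

for epoch in range(num_epochs):            # num_epochs: placeholder
    for batch in data:                     # data: placeholder iterable
        loss = compute_loss(batch)         # compute_loss: placeholder
        loss.backward()
        trainer.update()
    # Old behavior: update_epoch() applied the trainer's built-in decay
    # schedule, so the effective learning rate changed without ever
    # appearing in user code.
    trainer.update_epoch()
```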
Rate Decay After:
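The same loop with the decay made explicit; `rate_decay` is an illustrative value, and the comment notes the question above about the form quoted in the deprecation message:

```python
import dynet as dy

model = dy.ParameterCollection()
trainer = dy.SimpleSGDTrainer(model, learning_rate=0.1)
rate_decay = 0.05                          # illustrative value

for epoch in range(num_epochs):
    for batch in data:
        loss = compute_loss(batch)
        loss.backward()
        trainer.update()
    # The decay is now visible in user code. The deprecation message suggests
    # `trainer.learning_rate /= (1 - rate_decay)`, but as noted above that
    # makes the rate grow; multiplying shrinks it as expected.
    trainer.learning_rate *= (1 - rate_decay)
```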
Gradient Scaling Before:
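A sketch of the old scaling pattern, assuming update() accepted a scale factor (the functionality this commit removes); the value 0.5 and the placeholders are illustrative:

```python
# Before: the scale was handed to the trainer, which multiplied the gradients
# internally, so the effective step size was not visible at the call site.
loss = compute_loss(batch)                 # compute_loss, batch: placeholders
loss.backward()
trainer.update(0.5)                        # scale applied inside the trainer
```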
Gradient Scaling After:
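After the change, the scaling is applied to the loss expression itself before the backward pass, so it is explicit in user code:

```python
# After: scale the objective directly; the gradients reaching the trainer are
# already scaled, and update() no longer takes a scale argument.
loss = compute_loss(batch) * 0.5           # explicit scaling of the loss
loss.backward()
trainer.update()
```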