-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RMSprop clean up and rebase #2867
Conversation
0bd5360
to
3922737
Compare
thanks for handling this :) |
@@ -521,7 +531,7 @@ TYPED_TEST(NesterovSolverTest, TestNesterovLeastSquaresUpdateWithMomentum) { | |||
const Dtype kMomentum = 0.5; | |||
const int kNumIters = 1; | |||
for (int i = 0; i <= kNumIters; ++i) { | |||
this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, i); | |||
this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, 0., i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be declared as constants (e.g. const Dtype kRMSDecay = 0
) like the other args to make the meaning clear.
Thanks @erogol for the original work and thanks @ronghanghu for the rebase. This looks good except as noted above. |
@jeffdonahue OK, I'll handle them. Thanks for the comments! |
3e8ab30
to
fbd0533
Compare
Cool, LGTM. @ronghanghu feel free to merge whenever it's easiest for you, before or after the other two PRs. |
7f61b86
to
abe99e8
Compare
Implement RMSProp solver and cleaned up to adjust to new solver interface that uses accumulated gradients and refactored regularization.
|
||
protected: | ||
virtual void InitSolver(const SolverParameter& param) { | ||
this->solver_.reset(new RMSPropSolver<Dtype>(param)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you set the RMS decay here, instead of introducing the decay argument to least squares and snapshotting tests? Since it is unique to this solver I think it is best handled here.
@ronghanghu Sorry I didn't catch this earlier, but I have a suggestion for the RMS decay parameter in the tests. Instead of introducing another argument and setting it for every test, this param could be set by the RMSProp test class for encapsulation. Could you send a follow-up PR to make this change? |
@shelhamer Yes, I can send another PR to do that. Adam solver is also going to introduce a Addressed in #2888. |
Rebased and adapted RMSprop implementation #1890 to the new solver interface #2518 and #1977. The original author is @erogol. Pulled against master instead of dev.
The RMSprop solver is based on G. Hinton's lecture (http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf). Param gradients are divided by average root mean square of gradients in recent batches. It can be seen as a mini-batch version of using only the sign of gradients.
Update rule:
Momentum is not supported for RMSprop solver, as in #1890.