Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adam solver #2918

Merged
merged 2 commits into from
Aug 14, 2015
Merged

Adam solver #2918

merged 2 commits into from
Aug 14, 2015

Conversation

ronghanghu
Copy link
Member

Carried on Adam solver (originally #2856) for merge.

I completed the tests and rebased it to latest master.

Authorship belongs to @PatWie, and is preserved in git commit.

Original message in #2856 :

This commit implements the Adam solver by Kingma et. al for CPU and GPU. All solver parameters are defined in the caffe.proto. This also adds an example for the MNIST dataset.

As you may see, now both solver.cpp and test_gradient_based_solver.cpp are growing to 1000+ lines. This problem will be addressed in #2890.

@ronghanghu
Copy link
Member Author

Encountered error larger than margin on solver tests. Perhaps this is simply numerical issue. Looking into details.
It was a mistake I made when implementing test code ComputeLeastSquaresUpdate for Adam, where I forgot to power beta1 and beta2 by t. Now everything is fine.

@ronghanghu ronghanghu mentioned this pull request Aug 14, 2015
@ronghanghu
Copy link
Member Author

@shelhamer @jeffdonahue @philkr @PatWie Please take a look if you have time. I think this should be ready to merge.

This is last piece of the solver trilogy in #2860. After merging this one, we can address #2890.

@@ -218,6 +218,21 @@ class AdaDeltaSolver : public SGDSolver<Dtype> {
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to cite the ADAM paper somewhere*. I suggest putting a reference here, e.g. in a doxygen formatted comment like this. Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added.

*We probably also need to go back and add references for some of the other recently merged solvers.

@jeffdonahue
Copy link
Contributor

Thanks for the rebase @ronghanghu and thanks @PatWie for the original implementation! See above comment; otherwise looks good.

@ronghanghu
Copy link
Member Author

Citation added for Adam.

Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added.
*We probably also need to go back and add references for some of the other recently merged solvers.

Let's address that in #2890 .


// we create aliases for convenience
size_t update_history_offset = net_params.size();
shared_ptr<Blob<Dtype> > val_m = this->history_[param_id];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to create shared_ptrs for these val_* variables, is there? (I suggest using the raw pointer, e.g. Blob<Dtype>* val_m = this->history_[param_id].get();.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I should use raw pointers.

@jeffdonahue
Copy link
Contributor

Thanks for adding the citation. After a final glance I noticed the one other thing I commented above; sorry about not noticing before. Feel free to merge after addressing that.

@PatWie
Copy link
Contributor

PatWie commented Aug 14, 2015

Looks good.

This commit implements the Adam solver by Kingma et. al for CPU and
GPU. All solver parameters are defined in the caffe.proto. This also
adds an example for the MNIST dataset.
@ronghanghu
Copy link
Member Author

Changed from shared ptrs to raw ptrs in AdamSolver<Dtype>::ComputeUpdateValue.

ronghanghu added a commit that referenced this pull request Aug 14, 2015
@ronghanghu ronghanghu merged commit cbca8fe into BVLC:master Aug 14, 2015
@ronghanghu ronghanghu deleted the adam branch August 14, 2015 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants