Refactor Solver to allow interactive stepping #1228
@longjon I am currently building on top of this branch. For now I simply plan to expose these to Python, so as to have a Python re-implementation of `Solver::Solve`, and then add callbacks there for Python plotting (see #481).
@rodrigob Cool, let me know (or PR this branch) if you find any errors. In my use, I don't worry about things happening at the end of But yes, I essentially use a Python implementation of |
LGTM
I'm using this PR for implementing online reinforcement learning. I think it's very useful but needs a small fix to work safely:
This is how I fixed it: muupan@d098036
@muupan you're right, of course. I've updated this with a rebase of my most recent working version.
5f43178 to 033bafe
Refactor Solver to allow interactive stepping
Refactor Solver to allow interactive stepping Conflicts: src/caffe/solver.cpp
Updated: the snapshot logic has been changed to properly snapshot after every k-th iteration, instead of snapshotting before the iteration following every k-th. (In other words, snapshotting has been moved to the end of the solver loop.) This simplifies the logic (except the "snapshot after train" logic), and ensures that |
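To make the end-of-loop snapshot placement concrete, here is a minimal sketch of a loop that snapshots right after every k-th completed iteration. It is a toy stand-in, not the code from this PR: the function `SnapshotIters` and its parameters are hypothetical names chosen for illustration.

```cpp
#include <vector>

// Toy stand-in for the solver loop (hypothetical, not Caffe's actual code).
// Returns the iteration counts at which a snapshot would be taken.
std::vector<int> SnapshotIters(int start_iter, int iters,
                               int snapshot_interval) {
  std::vector<int> snapshots;
  int iter = start_iter;
  const int stop_iter = start_iter + iters;
  while (iter < stop_iter) {
    // ... forward/backward pass and parameter update would go here ...
    ++iter;  // the iteration is now complete
    // Snapshot at the end of the loop body, right after every k-th iteration.
    if (snapshot_interval > 0 && iter % snapshot_interval == 0) {
      snapshots.push_back(iter);
    }
  }
  return snapshots;
}
```

With the check at the end of the loop body, a run that stops exactly on a multiple of the interval still produces that snapshot, with no need to enter a further iteration first.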
}

template <typename Dtype>
void Solver<Dtype>::PreSolve() {
IMO since this method is common setup specific to the `Solver` class itself, it should just be given a new name like `Initialize()` (and then called separately before `PreSolve()` @ line 236), so that anyone subclassing `Solver` doesn't need to remember to call it.
I agree that forcing subclassers to call `PreSolve` is awkward. Is there any reason we don't just call `PreSolve` from `Init`? That would simplify the logic and get rid of `initialized_`, as there would be no such thing as an uninitialized solver, and then it makes sense to do the base class initialization right in `Init`. I think I'll try it and update the PR.
Sounds good to me!
Actually, we can't do that, because a virtual function called from a constructor doesn't dispatch to a subclass override. Here's a slightly more radical idea, which I've implemented now: get rid of `Solver::PreSolve` and just use the constructors, since constructors now provide the same functionality. `SGDSolver::PreSolve`, which was the only actual `PreSolve`, remains as it was and gets called from the `SGDSolver` constructor.
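The constructor limitation discussed here can be seen in a small standalone sketch. The class names `BaseSolver` and `DerivedSolver` are hypothetical stand-ins, not the Caffe classes; the `log` member exists only to record which `PreSolve` ran.

```cpp
#include <string>

// Hypothetical stand-in for a base solver class.
struct BaseSolver {
  std::string log;
  // While the base constructor runs, the object's dynamic type is still
  // BaseSolver, so this call resolves to BaseSolver::PreSolve even when a
  // derived object is being built.
  BaseSolver() { PreSolve(); }
  virtual ~BaseSolver() {}
  virtual void PreSolve() { log += "base;"; }
};

// Hypothetical stand-in for a subclass with its own setup step.
struct DerivedSolver : BaseSolver {
  // By the time the derived constructor body runs, the dynamic type is
  // DerivedSolver, so the override is the one that gets called.
  DerivedSolver() { PreSolve(); }
  void PreSolve() override { log += "derived;"; }
};
```

Constructing a `DerivedSolver` in this sketch leaves `log` equal to `"base;derived;"`: the base constructor never reaches the override, which is why subclass-specific setup has to be triggered from the subclass constructor, as done for `SGDSolver` above.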
147555a to 9478a48
// Remember the initial iter_ value; will be non-zero if we loaded from a
// resume_file above.
void Solver<Dtype>::Step(int iters) {
  vector<Blob<Dtype>*> bottom_vec;
Can this be removed since you changed the call to `ForwardPrefilled`?
The changed call is in `Solve`, this is in `Step`, where the call is to `ForwardBackward`. An alternative is to remove `bottom_vec` but call `ForwardPrefilled` and `Backward` separately; that's what I've now implemented.
Actually, I think I'll revert that, because it makes #1663 awkward. We should probably just clean up the `Net` interface to avoid these dummy vectors at some later point.
@jeffdonahue I think I'm satisfied with this now; merge when ready.
Refactor Solver to allow interactive stepping
Cool, thanks Jon
In this PR, the main loop of `Solver::Solve` is factored out into `Solver::Step`, which is exposed to Python. This allows interactive stepping of the solver.

Warning: I've only actually tested a parallel version of this on another branch.
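As a rough illustration of the refactoring described above, the sketch below shows a `Solve` that simply delegates to a `Step` that can also be called on its own. `ToySolver`, its members, and the `history` accessor are hypothetical; the real `Solver` does considerably more (testing, display, snapshotting) and this is not its implementation.

```cpp
#include <vector>

// Hypothetical toy class, not Caffe's Solver.
class ToySolver {
 public:
  explicit ToySolver(int max_iter) : iter_(0), max_iter_(max_iter) {}

  // Runs `iters` iterations and returns, so a caller can inspect state
  // between calls and then continue stepping.
  void Step(int iters) {
    const int stop_iter = iter_ + iters;
    while (iter_ < stop_iter) {
      history_.push_back(iter_);  // placeholder for one training iteration
      ++iter_;
    }
  }

  // Runs whatever remains up to max_iter by delegating to Step.
  void Solve() { Step(max_iter_ - iter_); }

  int iter() const { return iter_; }
  const std::vector<int>& history() const { return history_; }

 private:
  int iter_;
  int max_iter_;
  std::vector<int> history_;
};
```

A caller could run `Step(3)`, look at intermediate results, run `Step(2)`, and finally call `Solve()` to finish the remaining iterations; the iteration counter carries over between calls.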