Added gamma to LBFGSState #320
One aspect I am not too sure about though in the arguments of
Force-pushed from 90dbb5b to 3a31685.
We use a circular buffer to maintain the history without lists.
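As a rough illustration of the circular-buffer idea, here is a minimal sketch in JAX. It assumes the history is stored as a fixed-size `(m, n)` array and that iteration `k` overwrites slot `k % m`; the helper name `update_history` is hypothetical and is not taken from jaxopt.

```python
import jax.numpy as jnp


def update_history(history, new_row, k):
    """Write new_row into slot k % m of a fixed-size (m, n) history array."""
    m = history.shape[0]
    # .at[...].set(...) returns an updated copy, so no Python list is needed.
    return history.at[k % m].set(new_row)


m, n = 3, 2
history = jnp.zeros((m, n))
for k in range(5):
    # Store a row filled with the iteration index so the overwrite is visible.
    history = update_history(history, jnp.full((n,), float(k)), k)

# Slots 0 and 1 were overwritten by iterations 3 and 4; slot 2 still holds iteration 2.
```

Because the array shape never changes, an update of this kind stays compatible with `jax.jit`, which is one plausible reason to prefer it over a growing list.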
LGTM, thanks a lot for the contribution
I think I don't have access to the logs of the copybara failure, or at least I am not sure how to access them in order to correct the failure.
jaxopt/_src/lbfgs.py
@@ -142,6 +142,7 @@ class LbfgsState(NamedTuple):
   s_history: Any
   y_history: Any
   rho_history: jnp.ndarray
+  gamma: Any = jnp.array(1.0)
Sorry, I overlooked this in my previous review, but you shouldn't initialize the array here. Please do
-  gamma: Any = jnp.array(1.0)
+  gamma: Any = jnp.ndarray
and instead initialize `gamma` in `init_state`. CC @froystig
Sorry, I meant `gamma: jnp.ndarray`
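To make the requested change concrete, here is a minimal sketch of a state tuple whose `gamma` field carries only a type annotation, with the value created in an init function. `SketchState` is a hypothetical stand-in for `LbfgsState`, and the `init_state` signature shown here is an assumption for illustration, not jaxopt's actual method.

```python
from typing import Any, NamedTuple

import jax.numpy as jnp


class SketchState(NamedTuple):  # hypothetical stand-in for LbfgsState
    s_history: Any
    y_history: Any
    rho_history: jnp.ndarray
    gamma: jnp.ndarray  # annotation only: no array is built in the class body


def init_state(n, history_size=10):  # hypothetical signature
    """Build the initial state; gamma gets its starting value here."""
    return SketchState(
        s_history=jnp.zeros((history_size, n)),
        y_history=jnp.zeros((history_size, n)),
        rho_history=jnp.zeros(history_size),
        gamma=jnp.asarray(1.0),
    )


state = init_state(n=4)
```

With this layout the class definition has no default values at all, and every array in the state is created when `init_state` is called.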
done
Force-pushed from 2267f6e to a18481c, then from a18481c to b270cc1.
Per change in google/jaxopt#320
This PR adds a new attribute, `gamma`, to the `LbfgsState` named tuple. This attribute is needed when one wants to use the approximation of the Hessian inverse, via `inv_hessian_product`, after the algorithm has run. Indeed, I realized that in this implementation the initial value of the approximation of the Hessian is recomputed at each iteration, as suggested by Nocedal (something that, to the best of my knowledge and understanding of Fortran, SciPy does not do).
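To show why the scaling has to be kept in the state, here is a minimal sketch of the standard L-BFGS two-loop recursion in JAX. It assumes the textbook scaling from Nocedal and Wright, gamma = (s^T y) / (y^T y) computed from the most recent pair, and it stores the history as plain Python lists ordered oldest to newest. The names `compute_gamma` and `two_loop_product` are hypothetical; this is not jaxopt's `inv_hessian_product`, whose signature and buffer layout may differ.

```python
import jax.numpy as jnp


def compute_gamma(s, y):
    """Scaling of the initial inverse-Hessian approximation H0 = gamma * I."""
    return jnp.vdot(s, y) / jnp.vdot(y, y)


def two_loop_product(v, s_list, y_list, gamma):
    """Approximate H^{-1} @ v from (s, y) pairs ordered oldest to newest."""
    pairs = list(zip(s_list, y_list))
    rhos = [1.0 / jnp.vdot(s, y) for s, y in pairs]

    # First loop: newest pair to oldest pair.
    q = v
    alphas = []
    for (s, y), rho in reversed(list(zip(pairs, rhos))):
        alpha = rho * jnp.vdot(s, q)
        q = q - alpha * y
        alphas.append(alpha)

    # Apply the initial approximation; this is where gamma is needed.
    r = gamma * q

    # Second loop: oldest pair to newest pair (alphas were stored newest first).
    for ((s, y), rho), alpha in zip(zip(pairs, rhos), reversed(alphas)):
        beta = rho * jnp.vdot(y, r)
        r = r + (alpha - beta) * s
    return r


s_list = [jnp.array([1.0, 0.0]), jnp.array([0.5, 1.0])]
y_list = [jnp.array([2.0, 0.0]), jnp.array([1.0, 3.0])]
gamma = compute_gamma(s_list[-1], y_list[-1])
product = two_loop_product(y_list[-1], s_list, y_list, gamma)
```

In this sketch the product of the approximation with the newest `y` recovers the newest `s` (the secant condition), while products with other vectors depend on `gamma`. Without the stored value, a caller would have to recompute it from the history after the solver has finished.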
The application I have in mind for the approximation of the Hessian inverse is https://arxiv.org/abs/2106.00553.
(I didn't open an issue first because it's such a small PR, and I was going to write this part anyway for my own experiments.)