Inconsistency between jax.scipy.minimize and jaxopt.LBFGS
#322
-
Hi! I tried a minimal example where the results are a bit different from my use case but which still points to an inconsistency:

```python
import jax.numpy as jnp
from jax.scipy.optimize import minimize
from jaxopt import LBFGS
from jaxopt._src import test_util


def fun(x, *args, **kwargs):
    # Simplified Rosenbrock function (coefficient 15 instead of the usual 100).
    return 15.0 * (x[1] - x[0] ** 2.0) ** 2.0 + (1 - x[0]) ** 2.0


x0 = jnp.zeros(2)

lbfgs = LBFGS(fun=fun, tol=1e-3, maxiter=500, maxls=20)
x_jaxopt, lbfgs_state_jaxopt = lbfgs.run(x0)

soln_jaxminimize = minimize(
    fun,
    x0,
    method="l-bfgs-experimental-do-not-rely-on-this",
    options={"gtol": 1e-3, "maxiter": 500, "maxls": 20},
)
x_jax_minimize = soln_jaxminimize.x

print(soln_jaxminimize.success.item(), soln_jaxminimize.status.item(), soln_jaxminimize.nit.item())
test_util.JaxoptTestCase().assertArraysAllClose(x_jaxopt, x_jax_minimize, atol=1e-3)
```

which should output
What this shows is that, in the case of a simplified Rosenbrock function (i.e. a coefficient of 15 instead of 100), the implementation of

I am about to start a debugging quest to find out what the differences in the line search implementations are, but potentially this is already a known topic. It's a tricky one because again, in my actual use case the
-
I also see that there was a similar issue pointed out here, but again in this example
-
We use the zoom line search by default, which comes from
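(For reference, selecting the line search on the jaxopt side might look like the sketch below. The `linesearch` constructor argument and its accepted values, `"zoom"` as the default and `"backtracking"` as an alternative, are assumptions here, so check the docs of the installed version.)

```python
# Sketch: choosing the line search used by jaxopt's LBFGS.
# Assumes a `linesearch` constructor argument with "zoom" (the default)
# and "backtracking" as accepted values.
lbfgs_zoom = LBFGS(fun=fun, tol=1e-3, maxiter=500, maxls=20, linesearch="zoom")
lbfgs_backtracking = LBFGS(fun=fun, tol=1e-3, maxiter=500, maxls=20,
                           linesearch="backtracking")

print(lbfgs_zoom.run(x0).params)
print(lbfgs_backtracking.run(x0).params)
```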
-
I found out! Took quite some time of debugging, but here you go: the discrepancy comes from the fact that in `jax.scipy.minimize`, when the line search fails, i.e. `ls_results.failed == True`, the `status` is then `5`. In turn, this makes the `failed` attribute of the L-BFGS state `True`. When this attribute is `True`, the iterations of course stop.

This is not the case in `jaxopt`. Indeed, no matter whether the line search fails, the step size is always accepted. This is because the LBFGS state in `jaxopt` is less rich than in `jax.scipy.minimize`, and it does not allow some more custom logic in the `IterativeSolver`'s `_cond_fun`. Maybe one way to get back that consistency would be to enrich the `LBFGSState` tuple w…

However, I am not sure it's necessarily something we want, given that the algorithm works even though the line search fails. Also, this explains the discrepancy in this minimal example but not in my actual use case; I will need to do some more logging to see exactly what's going on.
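For reference, here is a minimal sketch (reusing `fun` and `x0` from the example above) of how to observe this on the `jax.scipy` side; the meaning of status code `5` is the one described in this comment:

```python
# When the internal line search fails, jax.scipy's L-BFGS reports status 5
# and stops iterating (per the behaviour described above).
soln = minimize(
    fun,
    x0,
    method="l-bfgs-experimental-do-not-rely-on-this",
    options={"gtol": 1e-3, "maxiter": 500, "maxls": 20},
)
if not soln.success.item() and soln.status.item() == 5:
    print(f"line search failed; stopped after {soln.nit.item()} iterations")
```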
-
And indeed, if I comment out the failure line in the
-
And actually it also explains exactly the discrepancy I noticed in my real use case!! I commented out the same line, and there I got a lot more iterations with

I will actually try to implement the fix I was mentioning; maybe it will be worth considering for
-
Thanks a lot for the investigation! So do you think it's better to stop the algorithm altogether when the line search fails? We need either a way to override the stopping criterion or a mechanism to tell the optimization loop that it has to stop.
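One possible shape for such a mechanism, as a rough sketch rather than a proposal for jaxopt's internals, is to drive the solver manually and let the caller decide when to stop. This assumes the public `init_state`/`update` API of the solvers and that `state.error` carries the value used by the stopping criterion:

```python
# Sketch: step the solver manually so a user-defined rule can stop the loop.
lbfgs = LBFGS(fun=fun, tol=1e-3, maxiter=500, maxls=20)
params, state = x0, lbfgs.init_state(x0)
for _ in range(lbfgs.maxiter):
    params, state = lbfgs.update(params, state)
    if state.error <= lbfgs.tol:
        break
    # A custom rule could also break here, e.g. on a (hypothetical)
    # line-search-failure flag carried by the state.
```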
-
I honestly have no idea atm... That's why I was suggesting just allowing the user to decide for themselves with an attribute

The reason I am confused is that in the Rosenbrock case it appears to be better to ignore the failure, while in my use case it appears to be better not to ignore it... It definitely depends on the situation; the question is "what is the discriminating factor?". Anyway, in the meantime I am submitting a PR with the fix implemented, although it's branched off my previous PR.
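For illustration, usage of such an opt-in attribute might look like the sketch below; the name `stop_if_linesearch_fails` is hypothetical here, not an existing jaxopt option at the time of this discussion:

```python
# Hypothetical opt-in attribute (name assumed for illustration only):
# let the user choose whether a failed line search should stop the solver.
lbfgs = LBFGS(fun=fun, tol=1e-3, maxiter=500, maxls=20,
              stop_if_linesearch_fails=True)
x_opt, state = lbfgs.run(x0)
```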
-
Let's merge your previous PR first, for simplicity. There is just a small fix needed in the way you initialize
-
I have obtained similar discrepancies between
-
@zaccharieramzi I'm having the same issue. Did you ever submit a PR where you implemented