-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes to MoreThuente #75
Conversation
Codecov Report
@@ Coverage Diff @@
## master #75 +/- ##
==========================================
+ Coverage 62.27% 62.82% +0.55%
==========================================
Files 8 8
Lines 607 616 +9
==========================================
+ Hits 378 387 +9
Misses 229 229
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #75 +/- ##
=========================================
+ Coverage 62.27% 62.58% +0.3%
=========================================
Files 8 8
Lines 607 620 +13
=========================================
+ Hits 378 388 +10
- Misses 229 232 +3
Continue to review full report at Codecov.
|
Good catch... and did it work after you fixed this? |
Yes, it now converges, whilst previously it generated NaN step guesses. I have also compared behaviour with a without the |
Alright, looks good to me then. They probably did it for a reason, if it's a floating point reason, maybe @simonbyrne has stumbled across something like it before? |
I'll get in contact with the people who were involved in the implementations of the original codes, and see if they can remember (originally from 1983 and 1991, so we'll see...) |
Dianne O'Leary pointed out to me that I'll merge and tag this on Tuesday, if nobody has any objections. |
Unless you have a specific reason not to, just squash+merge and tag now (or whenever you feel like it). |
After looking through the MoreThuente code, I found that the original translation from the MATLAB code has a bug in it. There are several lines in
cstep
where a values = max(theta, dgx,dg)
is calculated. In the original FORTRAN and MATLAB codes, however, this is set tos = max(abs(theta), abs(dgx), abs(dg))
.I caught this when an optimization problem I was running broke down. It is a high-dimensional problem, and I can't replicate it in a test as the input values to the line search are shown in pretty-mode, and small changes in the eight decimal place mean that the problem does not appear...
There also seems to be a weird choice in the original codes where they calculate
gamma = s* sqrt( (theta/s)^2 - (dg/s) * (dgx / s))
. Ass >= 0
, this should be equivalent (modulo floating points) togamma = sqrt( theta^2 - dg * dgx)
. I have made the change, as it omits the potentially problematic division bys
.In addition, I have moved the finite-value check from #73 outside the of loop, so that it does not interfere with the algorithm in itself, but only ensures that the initial step does not mess things up.