Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serious bug in HillClimbingEstimator? Causing only first iteration to matter #13

Closed
machow opened this issue Mar 8, 2019 · 5 comments
Closed

Comments

@machow
Copy link

machow commented Mar 8, 2019

Hey, I think that oftentimes the HillClimbingEstimator only does 1 round of estimation (rather than 10, iteratively more precise rounds). The issue is...

  • after round 1, round 2 checks values to the left and right of the best theta, starting from the left (value checked before best theta, that best theta has > likelihood than).
  • likelihood of the leftmost estimate is always < best theta (by definition), so it ends round 2 right away.
  • crazy things start happening, because it sets the lowerbound as the upperbound, and interval sizes can become negative, etc..

image

I need to patch within the next couple days on DataCamp/catsim, so can submit a PR to fix!

Solution

Code here:

https://github.com/douglasrizzo/catsim/blob/master/catsim/estimation.py#L138-L139

seems like the solution is that max_ll needs to be reset each round?

@douglasrizzo
Copy link
Owner

Hi there, thanks for the report.

There's definitely something fishy going on with the estimator. It seems like I was trying to implement some kind of early stopping: theta values are checked from left to right and, when the log-likelihood of these values went up, but then started going down, the next candidate theta values didn't need to be checked. However, the implementation seemed to be wrong.

Also, the new intervals I generated in subsequent rounds were also wrong, as you pointed out. All of this made the estimator not be as precise as it can be.

I decided to remove the early stopping idea and implemented something a little more straightforward. After each iteration, the new candidate theta values to be checked are generated around the current best theta, but in smaller intervals. I added some comments in the code and used more descriptive variable names, since it has been a long time since I last worked in this code.

I committed a possible fix to the testing branch. Take a look if it makes sense to you. The lines you were interested in are now here. An example verbose output from the hill climbing estimator can be seen below. Values with larger log-likelihood are printed and we can now see that subsequent passes do generate better thetas.

Pass: 1
	Bounds: -1.735914394995148 0.9443143368283956
	Interval size: 0.2978031924248381
		Theta: -1.735914394995148, LL update: inf
		Theta: -1.43811120257031, LL update: 0.026153563041037664
Pass: 2
	Bounds: -1.735914394995148 -1.140308010145472
	Interval size: 0.06617848720551955
Pass: 3
	Bounds: -1.5042896897758296 -1.3719327153647904
	Interval size: 0.014706330490115382
		Theta: -1.4307580373252522, LL update: 3.4473263991863234e-05
Pass: 4
	Bounds: -1.4454643678153676 -1.4160517068351368
	Interval size: 0.0032680734422478874
		Theta: -1.4291240006041284, LL update: 3.219923880415365e-06
		Theta: -1.4258559271618805, LL update: 1.5761288620907976e-06
Pass: 5
	Bounds: -1.4291240006041284 -1.4225878537196326
	Interval size: 0.0007262385427218021
		Theta: -1.4269452849759632, LL update: 1.9605989365345522e-07
Pass: 6
	Bounds: -1.427671523518685 -1.4262190464332414
	Interval size: 0.00016138634282714115
		Theta: -1.4268645918045497, LL update: 1.023564344393435e-08
		Theta: -1.4267032054617226, LL update: 8.590400391028652e-09
Pass: 7
	Bounds: -1.4268645918045497 -1.4265418191188954
	Interval size: 3.5863631739463386e-05
Pass: 8
	Bounds: -1.426739069093462 -1.4266673418299831
	Interval size: 7.969695942078303e-06
		Theta: -1.4266992206137514, LL update: 1.1692868895352149e-11
Pass: 9
	Bounds: -1.4267071903096935 -1.4266912509178094
	Interval size: 1.771043542708739e-06
		Theta: -1.4266983350919802, LL update: 1.2860823517257813e-12

@machow
Copy link
Author

machow commented Mar 11, 2019

Thanks for your quick response! I think the way you did it previously, going...

  • from left to right
  • a round ending when the current theta has a lower likelihood then the previous

seemed like a good, slightly more efficient strategy, given that the likelihood is concave (at least for 2PL model, off-the-cuff I think this is the case for 3 and 4PL? Could be very wrong about this?). Once the next candidate has a lower likelihood, then the likelihood has started going down, so the rest will also be lower (and don't need to be checked).

As long as the best theta is reset each round it should be okay (since it will pass over the previously best theta with increasing granularity).

@douglasrizzo
Copy link
Owner

I see what you mean. If I reset best_theta and max_ll in every search pass and if I am sure that the current best theta is included in all future search passes, I am guaranteed to at least find the same best_theta every time, but also have the chance to find a better one.

if I am sure that the current best theta is included in all future search passes

The thing is, I never assumed this condition. If you have observed it, it was there by coincidence. I also didn't want to reset best_theta and max_ll on every search pass because, if there was a chance that a better theta was found in a previous search pass, the function would return that value and not the best value found in the current pass.

I decided to just explicitly keep the value of the current log-likelihood (current_ll) and the value of the log-likelihood from the previous theta (previous_ll) and, if current_ll < previous_ll, it means the function is only going to go downhill, so we can stop the search and start another pass.

The implementation of early stopping is here.

I can think of ways to do it better, like search once to the left and once to the right of the current best theta, in order to know which side we actually have to search. This can cut the search space in half and it's not hard to implement.

@douglasrizzo
Copy link
Owner

I'll close this as it appears the issue has been fixed. If you find other anomalies, please report them and I'll try to fix them as quickly as possible. We can continue discussion on this bug here, nevertheless.

@machow
Copy link
Author

machow commented Mar 14, 2019

Thanks for the fix!

That's fair that there could be some small improvements. I guess knowing whether the crnt best theta is to the left or right of the best theta out of the crnt round's values contains a lot of information for limiting the search range.

As is, is super helpful--thanks again for being so responsive!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants