Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix estimate_covar in case of !isempty(fit.wt) #241

Closed
wants to merge 3 commits into from

Conversation

David96
Copy link

@David96 David96 commented Aug 15, 2023

While trying to figure out how LsqFit determines the covariance of the fit parameters, I found this inconsistency between the two used methods in estimate_covar. According to your own docs this factor of sigma^2 is missing at one occasion. I also documented the relation used for the QR decomposition because at least for me this wasn't obvious.

I'm also not sure why there is this special casing for isempty(fit.wt) in the first place since the two methods are mathematically identical but I didn't want to make any wrong assumptions here and therefore only added the fix to the existing case.

Edit: Looking a bit through git blame, it seems like the original idea was to use the weights in this position:
55eefde#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R76
Which was then later removed.

To me, using the inverse weights as variance might somehow make sense, but in the current state I'm pretty sure it is just wrong, it is equivalent to setting sigma^2 = 1.

@pkofod pkofod closed this Oct 3, 2023
@pkofod pkofod reopened this Oct 3, 2023
@pkofod
Copy link
Member

pkofod commented Oct 3, 2023

closed and reopened to see if tests run, if not can I maybe ask you to rebase this PR?

@codecov
Copy link

codecov bot commented Oct 3, 2023

Codecov Report

Merging #241 (165637f) into master (8f8c18a) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

❗ Current head 165637f differs from pull request most recent head a31f793. Consider uploading reports for the commit a31f793 to get more accurate results

@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
- Coverage   90.34%   90.31%   -0.04%     
==========================================
  Files           4        4              
  Lines         259      258       -1     
==========================================
- Hits          234      233       -1     
  Misses         25       25              
Files Coverage Δ
src/curve_fit.jl 86.66% <100.00%> (-0.13%) ⬇️

@pkofod
Copy link
Member

pkofod commented Oct 5, 2023

I remember we discussed this. We may have made the wrong decision, but I'll try to dig out the old discussions.

@pkofod
Copy link
Member

pkofod commented Oct 7, 2023

I now remember the details. The reason why it is the way it is is that it is correct if you are inputting the variances as weights. Then there is no reason to estimate the sigma, because it will be 1 by construction (as I think you found out yourself). This is under the assumption that the variance is correct of course. If you calculate it anyway, the estimated variance will be around 1 if you input the variances as weights.

Hooowever, I realize that people use weights differently than the classical "weight by the variance" application, and in that case you obviously cannot know that MSE will be one. I'm also not sure the interpretation stays as clear, but maybe that's just a problem for the users and not me :)

I think we should maybe implement it as a switch instead of just alway applying it? I agree the inv(J'J) is weird, it should use the same code path as the non weighted. I think it's just the case of someone making a change and then being afraid of messing something unrelated up.

@David96
Copy link
Author

David96 commented Oct 11, 2023

Hey, thanks for your explanation!

After looking at it again I also realized the wt = 1/sigma^2 assumption and that therefore the inv(J'J) was indeed correct. It's maybe a bit confusing also implementation wise, to only use the QR decomposition for the no-weights code path, since it would also be applicable to the weights one.
Using the exact same codepath would not be possible though since the mse function does not include the weights I think? (shouldn't it do that?)
I realize I understand way less of the codebase than I first thought so I best stop messing around with it 😆

Looking at the documentation (https://julianlsolvers.github.io/LsqFit.jl/latest/tutorial/) again, there's this sentence i.e. the covariance of the estimated parameter is calculated as covar = inv(J'*fit.wt*J). which I then think is inconsistent with the implementation and is missing a sigma^2 in front of the inv?

@pkofod
Copy link
Member

pkofod commented Oct 12, 2023

I think inv(J'*fit.wt*J) is actually correct. We just absorb sqrt diagonal elements or cholesky if full matrix into the Jacobian.

I don't mind discussing these things, please keep it coming. I had less time to finish this package than originally planned, so I'm happy that someone keeps it alive by questioning the implementations.

@David96
Copy link
Author

David96 commented Oct 13, 2023

Reading your documentation again, I think you're correct. I changed the commit accordingly, since on the current master this factor fit.wt is missing.

@pkofod
Copy link
Member

pkofod commented Oct 13, 2023

I think you're wrong. If you look at all the curvefit calls, the sort or chol of wt is called u and is applied at the objective type level, so the J you're seeing in the code already has the "sqrt" weight applied.

@David96
Copy link
Author

David96 commented Oct 14, 2023

You are of course right again, I didn't realise that the code uses a different J from the documentation.

@David96 David96 closed this Oct 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants