Refactor chisquare function in least squares #158

fjosw · 2023-03-01T16:33:14Z

This is the second step in refactoring the fit module in which I removed redundant formulations of the chisquare function. Please have another close look @s-kuberski & @PiaLJP

s-kuberski · 2023-03-02T08:44:46Z

Thanks. I would like to discuss the general ansatz we use to distinguish between correlated and uncorrelated fits. In principle, we could use a single chisquare function and shift the distinction between the two variants to the determination of the inverse covariance matrix. This would mean using np.diag(np.diag(corr)) for uncorrelated fits or just skipping the whole inversion via Cholesky decomposition and using the inverse of the diagonal entries.
This would make future changes to the chisquared function easier since we do not have to maintain two versions of the chisquared functions anymore.
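
A minimal sketch of this idea, assuming a plain NumPy covariance matrix as input. The names `whitening_matrix` and `chisq` are hypothetical and not part of the existing fit module; the only place where the two fit variants differ is the construction of the matrix applied to the residuals.

```python
import numpy as np


def whitening_matrix(cov, correlated=True):
    """Return W such that W.T @ W is the inverse of the covariance used in the fit."""
    if correlated:
        # cov = L @ L.T  =>  inv(cov) = inv(L).T @ inv(L)
        chol = np.linalg.cholesky(cov)
        return np.linalg.inv(chol)
    # Uncorrelated fit: skip the decomposition and use the inverse errors only.
    return np.diag(1.0 / np.sqrt(np.diag(cov)))


def chisq(residuals, w):
    """Single chisquare function: squared norm of the whitened residuals."""
    return np.sum((w @ residuals) ** 2)


# Illustrative numbers only.
cov = np.array([[0.04, 0.01], [0.01, 0.09]])
res = np.array([0.1, -0.2])
chisq_corr = chisq(res, whitening_matrix(cov, correlated=True))
chisq_uncorr = chisq(res, whitening_matrix(cov, correlated=False))
```

With this layout a change to the chisquare function would only have to be made in one place.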

Also, as a next step, we could allow the user to provide a covariance matrix on their own. This would make the whole fit routine more general and would allow, e.g., the use of Ledoit-Wolf shrinkage or similar techniques to regularize the covariance matrix before the fit is performed (ensuring a correct interpretation in combination with the expected chisquare). Scans of fit ranges that rely on the covariance matrix, either because one uses correlated fits or because the expected chisquare is computed, could be sped up significantly if the user could pre-compute the covariance matrix once and pass the (sub-)matrix to the fit routine.
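
A rough sketch of the fit-range scan, assuming the covariance matrix is available as a NumPy array. The helper `sub_covariance` and the synthetic data are hypothetical; the fit call itself is left out because the fit routine does not accept a covariance argument yet.

```python
import numpy as np

# Hypothetical data: 200 measurements of 10 data points.
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 10))

# Covariance of the mean, computed once for the full range.
full_cov = np.cov(samples, rowvar=False) / samples.shape[0]


def sub_covariance(cov, start, stop):
    """Select the block of cov belonging to the data points start..stop-1."""
    idx = np.arange(start, stop)
    return cov[np.ix_(idx, idx)]


# Scan over the lower end of the fit range without recomputing the covariance.
windows = {}
for start in range(0, 4):
    windows[start] = sub_covariance(full_cov, start, 10)
    # A fit routine accepting a covariance matrix would be called here.
```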

As a last point, for future reference, one could even allow the user to pass chisquare functions of their own, making the whole routine completely flexible. We don't have to integrate such a feature now, but it would certainly be easier to do so when there is just a single default routine.

fjosw · 2023-03-02T09:30:10Z

I think one of the reasons why we have a separate chisq function for uncorrelated fits is that the arithmetic intensity is lower when one can avoid the matrix multiplication ($N^2$ or $(N^2+N)/2$ vs $N$ multiplications). I never tested how big the impact of this actually is; it might be negligible.
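
The difference in operation count can be illustrated with a small timing sketch. The function names and problem sizes are assumptions chosen for illustration and this is not the code of the fit module; the measured times depend on the machine, so no numbers are quoted.

```python
import timeit

import numpy as np


def chisq_vector(res, inv_err):
    # N multiplications: elementwise weighting of the residuals.
    return np.sum((res * inv_err) ** 2)


def chisq_matrix(res, inv_err_matrix):
    # N^2 multiplications: dense matrix-vector product with a diagonal matrix.
    return np.sum((inv_err_matrix @ res) ** 2)


rng = np.random.default_rng(1)
timings = {}
for n in (10, 100, 1000):
    res = rng.normal(size=n)
    inv_err = 1.0 / rng.uniform(0.5, 1.5, size=n)
    inv_err_matrix = np.diag(inv_err)
    # Both formulations must agree before their cost is compared.
    assert np.isclose(chisq_vector(res, inv_err), chisq_matrix(res, inv_err_matrix))
    timings[n] = (
        timeit.timeit(lambda: chisq_vector(res, inv_err), number=200),
        timeit.timeit(lambda: chisq_matrix(res, inv_err_matrix), number=200),
    )
```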

I agree that it could be useful to provide the covariance matrix and that should also be easy to implement. I would propose to do that in a separate pull request though.

Providing a custom chisq function can be a bit tricky because it has to fulfill a few constraints for the correct error propagation but if it is useful we could still add it with adequate documentation.

But do the changes I made look okay to you?

fjosw · 2023-03-02T10:05:33Z

I did a quick benchmark of the two ways of computing the uncorrelated chisquare function for different problem sizes:

s-kuberski

Could we simplify the current structure even further? I imagine a situation where we only have one if that checks whether we want to do a correlated fit or not. We could define the functions general_chisqfunc_corr, chisqfunc_residuals_corr (if needed) and general_chisqfunc_nocorr, chisqfunc_residuals_nocorr and then just use something like

chisqfunc = general_chisqfunc_corr
chisqfunc_residuals = chisqfunc_residuals_corr

to, once and for all, fix these two functions for later use. This would remove the if clauses in lines 602 and 705 and streamline the code.

You could also just define the two functions in the if of line 583 and use an else for the uncorrelated case.
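
A minimal sketch of the second variant, with both functions defined inside a single if/else. The helper `select_chisq_functions` and its signature are hypothetical and simplified compared to the real code, which also has to deal with priors and error propagation.

```python
import numpy as np


def select_chisq_functions(correlated_fit, y, model, weights):
    """Fix chisqfunc and chisqfunc_residuals once, depending on the fit type.

    weights is a whitening matrix for correlated fits and a vector of
    inverse errors otherwise (an assumption made for this sketch).
    """
    if correlated_fit:
        def chisqfunc_residuals(p):
            return weights @ (y - model(p))
    else:
        def chisqfunc_residuals(p):
            return (y - model(p)) * weights

    def chisqfunc(p):
        return np.sum(chisqfunc_residuals(p) ** 2)

    return chisqfunc, chisqfunc_residuals


# Usage with a linear model and unit weights (illustrative values).
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])


def linear(p):
    return p[0] + p[1] * x


chisqfunc, chisqfunc_residuals = select_chisq_functions(False, y, linear, np.ones(3))
```

After this point no further check on the fit type is needed; the minimizer only sees `chisqfunc` and, for Levenberg-Marquardt, `chisqfunc_residuals`.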

s-kuberski · 2023-03-02T17:04:12Z

> I did a quick benchmark of the two ways of computing the uncorrelated chisquare function for different problem sizes:

Thanks for the timings! If the evaluation of the chisquare function is a relevant part of the computational effort (and this is most likely the case), we should really stick to having two distinct functions.

fjosw · 2023-03-02T18:44:01Z

I tried to follow up on your suggestion @s-kuberski and simplified the logic in least_squares. I moved the functions with the suffix _residuals to the Levenberg-Marquardt block as they are only needed there. I also removed a few if clauses.

fjosw added 2 commits March 1, 2023 16:12

refactor: chisqfunc rewritten as sum over residuals.

ee2944d

refactor: removed redundant formulations of the chisquare function in least_squares.

a140b2a

fjosw requested a review from s-kuberski March 1, 2023 16:54

s-kuberski reviewed Mar 2, 2023

refactor: logic in least square fits simplified.

2ac3851

Co-authored-by: Simon Kuberski <simon.kuberski@uni-muenster.de>

fjosw force-pushed the refactor/fits2 branch from 3181551 to 2ac3851 Compare March 2, 2023 18:43

fjosw mentioned this pull request Mar 3, 2023

Fix/multi xdim fits #160

Merged

fjosw merged commit cb289a5 into develop Mar 3, 2023

fjosw deleted the refactor/fits2 branch March 3, 2023 16:34

fjosw mentioned this pull request Mar 6, 2023

Improved prior fit #161

Merged
