
Allow for user adjusted relative weighting of different data sets in constrained fits #1973

Merged

butlerpd merged 21 commits into main from Weighting on Apr 26, 2022

Conversation

Caddy-Jones
Contributor

This pull request attempts to give the user the option of changing the statistical weighting of data sets when doing a “Constrained or Simultaneous fit”. The change in statistical weighting is achieved by scaling the data used for the weighting, as defined by the “Fit Options” tab, by a factor. The factor is calculated by comparing the statistical weights of the different data sets.
By default the error in the y axis (dI) is used. Modifications to the dI data caused by this pull request only happen internally, meaning the error bars will not be modified on the plot.
The changes appear as a new column in the “Const. Simul. Fit” tab.

[Screenshot: the new weighting column in the “Const. Simul. Fit” tab]

The default option is ‘fixed’, which means the weighting is not modified. The user input is the desired weighting ratio between two data sets.
For example, if there are two data sets, one with ‘fixed’ and the second with ‘1.0’, then the second data set will be modified to have the same statistical weight as the first. If the second data set has a weighting of ‘2.0’, it will be modified to have twice the statistical weight of the first; ‘0.5’ gives half the statistical weight, and so on.
If both data sets have a weighting of ‘1.0’, then both will be modified until they have equal statistical weight.
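For illustration, a minimal sketch of the intended calculation (hypothetical names, not the PR's actual code):

```python
import numpy as np

def weight_factor(dy_subject, dy_other, ratio):
    """Factor by which the second data set's statistical weight must grow so
    that it ends up with `ratio` times the subject's weight. Statistical
    weight is taken here as the summed inverse variance; internally, dy_other
    would be scaled by 1/sqrt(factor)."""
    w_subject = np.sum(1.0 / dy_subject**2)
    w_other = np.sum(1.0 / dy_other**2)
    return ratio * w_subject / w_other

# ratio=1.0 equalizes the two data sets; ratio=2.0 doubles the second's weight
factor = weight_factor(np.full(100, 0.02), np.full(25, 0.04), ratio=1.0)
```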

@pkienzle
Contributor

FitProblem in bumps accepts a weights=[...] option, with one weight per model. Keeping weights in the fit control information rather than changing the Δy on the data seems like a better option.

However, see bumps/bumps#87 — you may need to fix this feature in bumps first.
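For reference, the option looks roughly like this (a sketch assuming the bumps API of the time; the `line` model and datasets are placeholders):

```python
import numpy as np
from bumps.curve import Curve
from bumps.fitproblem import FitProblem

def line(x, a=1.0, b=0.0):      # trivial placeholder model
    return a * x + b

x = np.linspace(0, 1, 20)
M1 = Curve(line, x, line(x) + 0.01, np.full_like(x, 0.02), name="sans")
M2 = Curve(line, x, line(x) - 0.01, np.full_like(x, 0.04), name="saxs")

# one weight per model; the weights live in the fit control information
# rather than being baked into each dataset's dy
problem = FitProblem([M1, M2], weights=[1.0, 0.5])
```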

@Caddy-Jones
Contributor Author

Good point @pkienzle, the code now uses the weights option in FitProblem.

@Caddy-Jones
Contributor Author

I've had a look at bumps/bumps#87, and I see no issue with the weighting option in FitProblem when doing my tests. I'm getting the same results as when I was modifying Δy.

@pkienzle
Contributor

This may depend on the particular path through the fitter. In particular, L-M uses residuals when fitting while the others use nllf. You wouldn't notice a difference if you just test with L-M.

It may also be a bumps version issue, though I'm not sure if weights ever worked.
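Roughly, the distinction is the following (a sketch, not bumps source): if the weight is applied to the residual vector but not to the scalar cost, only L-M sees it.

```python
import numpy as np

w, n = 0.5, 100                               # placeholder weight and size
r = np.random.default_rng(1).normal(size=n)   # stand-in for (y - theory)/dy

# Levenberg-Marquardt minimizes the weighted residual vector directly
weighted_residuals = w * r
# the other optimizers minimize the scalar nllf; if that path drops the
# weight (the bumps/bumps#87 symptom), the two fitters disagree
nllf = 0.5 * np.sum(weighted_residuals**2)    # = w**2 * chi2 / 2 when weighted
```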

@Caddy-Jones
Contributor Author

Caddy-Jones commented Dec 3, 2021

Hi @pkienzle, so I've read through bumps/bumps#87; am I correct in thinking that everything should be fine once this change gets released in bumps?

@pkienzle
Contributor

pkienzle commented Dec 3, 2021

The fix is not available in the current release.

However, looking at the installers workflow, it looks like SasView is not using bumps releases in its build, or any version tagging, so it ought to have the latest version.

I'm not sure that this is a good thing. It'll make reproducible builds impossible since it is pulling the latest versions of all packages.

@butlerpd
Member

butlerpd commented Dec 3, 2021

> However, looking at the installers workflow, it looks like SasView is not using bumps releases in its build, or any version tagging, so it ought to have the latest version.

That is correct, as I learned fairly recently -- but that means, I guess, that the version Iestyn is building against is fixed?

> I'm not sure that this is a good thing. It'll make reproducible builds impossible since it is pulling the latest versions of all packages.

Indeed. At the moment the plan is to change the release yml to the latest official version of bumps, for the very reason you say, @pkienzle. We should probably be building everything else against "latest version", but for now we are worrying about the next release. However, this PR is not part of the next release. The question then is: when will the next release of bumps with the required fix be... released?

@pkienzle
Contributor

pkienzle commented Dec 4, 2021

It can be released when needed. There shouldn't be too many changes since we just did a release, so checking it can be quick.

@Caddy-Jones
Contributor Author

The virtual env I use for SasView is running bumps version 0.8.0.

@Caddy-Jones
Contributor Author

Added a description of the weighting functionality to sasview\src\sas\qtgui\Perspectives\Fitting\media\fitting_help.rst, at the bottom of the Simultaneous Fit Mode section.

The description is quite long, but I thought it would be better to over-explain the functionality rather than under-explain it. I’m willing to trim it down if need be.

Do you mind giving it a read @smk78 ?

@smk78 smk78 requested review from smk78, rozyczko and butlerpd December 9, 2021 10:54
Contributor

@smk78 left a comment

This is a review of the help documentation only, i.e. the section headed 'Simultaneous Fits with a Modified Weighting'.

I think this section should start with some context; ie, why might a user want to modify the weighting? This context should start generically (eg, 'In some instances...because...' or 'Problems can arise if an attempt is made to simultaneously fit datasets with very different...because...') and then give the specific example of co-fitting SAXS/SANS. Then introduce the new functionality (eg, 'To help address this problem an additional column has been added to the...').

I would suggest reversing and extending the sentences 'The weighting column modifies the weight of each data set in the simultaneous fitting. By default the weighting is decided by the number of points of a dataset and the error on each point.' So maybe something like 'By default the SasView fitting engine statistically weights each dataset in a simultaneous fit by the number of points in a dataset and the error on each point. The weighting column allows the user to change this default weighting behaviour.'

However, I also think you need to say what the weighting factor will change; I believe it's just the errors (internally), right, you're not changing the number of points too?

Replace 'float(number)' by 'floating point number' or 'real number' to do away with programmer-ese! :-)

I like the change from 'Compare' to 'Subject'!

Should the sentence (which is grammatically incorrect anyhow) 'This dataset will act as the subject of comparison between for the modification of the weighting for other data sets.' say something like 'This dataset will act as the subject against which the weighting of the other data sets will be compared.'?

And 'Only one dataset should have the "Subject" option...' or 'Only one dataset can have the "Subject" option...'? Maybe also change '...having multiple will lead...' to '...attempting to have multiple will lead...'?

Change '...act as a ration...' to '...act as a ratio...'.

In the sentences that follow, I think you also need to explain the consequences; what I mean by this is, does the simultaneous fitting pay more attention to the dataset with a weighting of 2.0 or 0.5? I could make a convincing argument either way!

Maybe instead of 'A dataset with the “Subject” option in the weighting column is not needed...' it would be clearer to say 'It is not mandatory to have a dataset with the “Subject” option in the weighting column...'?

In the sentence 'This means both datasets will be modified to attain the desired ratio of statistical weight.' the word 'both' should presumably be 'all'?

The sentence 'Two datasets with 1.0 in the weighting column will be both be weighted differently to ensure they both have the same statistical weight.' is very confusing! Does 'differently' need to be 'similarly'?

The examples are good, but maybe make them a bulleted list for clarity?

Typo: should be '...affect the weighting...'

Replace 'Some notes' by a proper note or perhaps warning; eg:

```rst
.. warning::

   The weighting modification...

   If the difference...
```

There are two typos in the notes (changes needed in bold):

  • Should either say 'The weighting **modifications do** not...' or 'The weighting **modification does** not...'
  • Should say '...with **a** difference...', and I'd change 'sizes' to '**bounds**'.

Finally, what about a couple of before-after screenshots (with captions) showing how using weighting improves a fit?

Sorry, this was a long review!

@smk78
Contributor

smk78 commented Dec 9, 2021

PS @Caddy-Jones , don't worry about the length; let's focus on making it useful to the reader!

@Caddy-Jones
Contributor Author

Hi @smk78

I’ve updated the help document following your review. The only change I didn’t include is this:

“And 'Only one dataset should have the "Subject" option...' or 'Only one dataset can have the "Subject" option...'? Maybe also change to '...having multiple will lead...' to '...attempting to have multiple will lead...'?”

My reasoning is that it is possible to type “Subject” for multiple rows, but only one row will act as the subject for comparison; the other datasets will act as if “Default” was selected. This is communicated to the user via a warning message, but since it is physically possible to type “Subject” in multiple rows, saying “Only one dataset can have the "Subject" option...” is technically incorrect and may be needlessly reported as a bug.
Incidentally, the reason I allowed multiple “Subject” rows was in case the user loaded many datasets but was deselecting and then reselecting datasets to fit in the constrained fit tab.

Also, I haven’t included any screenshots, as I am in the process of attempting to fit some SAXS and SANS data that’s been given to me, so I may use that as my example.

Thanks for your review :)

Contributor

@pkienzle left a comment

I think the proposed weighting system will be heavily biased by high Q values. Furthermore, I think that increasing the weights on a dataset decreases the influence, which is opposite of what is described in the docs.

I suggest a much simpler weighting system. Allow the user to specify an error bar scaling for each dataset, preferably limited to values greater than one. The user can adjust these values and refit until the residuals are mostly in the range [-3, 3]. Users are going to need to explain the choice of weights in the results section of the paper, and complicated expressions involving ratios of averaged inverse variances are hard to explain. Furthermore, the proposed weights don't have any real meaning: A constant scaled by an ad hoc value is still just an ad hoc value.

Alternatively, come up with an objective method of choosing the weights, for example, adjusting the weights during the fit until the sum squared residuals equals the number of data points for each dataset in the fit while minimizing the weights and keeping them greater than one.
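As a hedged sketch of that second suggestion (a hypothetical helper, not part of this PR): compute, per dataset, an error-bar scale s ≥ 1 that drives reduced χ² to one, then refit with dy scaled by s and repeat until it stabilizes.

```python
import numpy as np

def error_scale(residuals):
    """Smallest scale s >= 1 such that sum((r/s)**2) <= len(r)."""
    r = np.asarray(residuals)
    chi2_per_point = np.sum(r**2) / len(r)
    return max(1.0, float(np.sqrt(chi2_per_point)))
```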

```python
if flag == 0:
    for id_index in weights.keys():
        weight_increase[id_index]
    return weight_increase
```
Contributor

Fails with `weight_increase` undefined. Use:

```python
if flag == 0:
    return {k: 1.0 for k in weights}
```

You also want to do this if no ratios are specified:

```python
defaults = set(("Subject", "Default"))
if flag == 0 or all(v in defaults for v in ratio.values()):
    return {k: 1.0 for k in weights}
```

```python
for val in weights[id_index]:
    stat_weight += 1.0 / (val ** 2.0)
av_stat_weight = stat_weight / len(weights[id_index])
stat_weights[id_index] = av_stat_weight
```
Contributor

This code simplifies to:

```python
stat_weights = {k: np.mean(v**-2) for k, v in weights.items()}
```

However, I don't think this is the statistic that you want since it'll be dominated by the tiny values at high Q. Relative uncertainty will have better balance.
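A hedged sketch of that alternative (assuming the measured intensities `y_data` are available alongside the uncertainties):

```python
import numpy as np

# balance on relative uncertainty dy/y instead of absolute dy, so the tiny
# absolute errors at high Q no longer dominate the statistic
stat_weights = {k: np.mean((dy / y_data[k])**-2) for k, dy in weights.items()}
```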

```python
first_subject_stat_weight_id = list(subject_stat_weights.keys())[0]
logging.warning(f'Multiple data sets specified for comparison using "Subject", will use '
                f'{first_subject_stat_weight_id} for comparison.')
subject_stat_weight = subject_stat_weights[first_subject_stat_weight_id]
```
Contributor

```python
subject_ids = [k for k, v in ratio.items() if v == "Subject"]
if len(subject_ids) > 1:
    logging.warning(f'Multiple data sets tagged as "Subject"; using {subject_ids[0]}.')
subject_weight = stat_weights[subject_ids[0]] if subject_ids else np.mean(list(stat_weights.values()))
```

```python
else:
    desired_stat_weight = subject_stat_weight * float(ratios[id_index])
    difference = desired_stat_weight / weight
    weight_increase[id_index] = math.sqrt(difference) / num_fits
```
Contributor

```python
scale = lambda k: math.sqrt(subject_weight / stat_weights[k] * float(ratio[k])) / len(weights)
return {k: (1.0 if ratio[k] in defaults else scale(k)) for k in weights}
```

```text
ignored to such a degree that it has little effect on the fit. This can lead
to unexpected results. As a guide, it is not advised to use the weighting
option on 2 datasets where a difference in error bounds on the datapoints
is over 3 orders of magnitude.
```
Contributor

* Reweighting is only necessary when there is systematic error in the measurement
  or when the model is insufficient to describe the data. Weights should be chosen
  such that uncertainty on the measured values increases, which will lead to a
  corresponding increase in the parameter uncertainty. The ad hoc weighting should
  be reported along with the fitted results.

```text
weighting a dataset with ratio of 1.0, and so will have a **greater** influence
on the fit. A dataset with a **ratio of 0.5** will have half the statistical
weighting of a dataset with a ratio of 1.0, and so will have **less** of an
influence on the fit due to the datasets lower statistical weight.
```
Contributor

The weight value appears to scale the error bar (the ratio term is in the numerator of the weight scale), so a larger weight will decrease the influence of the dataset in the fit. If you want to double the statistical weight of one set then you should double the error bars of the other set. You should not halve the error bars of this dataset since this will reduce the uncertainty in the fitted parameters whereas the presence of systematic uncertainty demands that you increase the parameter uncertainty in the fitted result.
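In code terms (illustrative arrays only, mirroring the comment rather than the PR's implementation):

```python
import numpy as np

dy_A = np.full(100, 0.02)   # hypothetical uncertainties for the two data sets
dy_B = np.full(100, 0.04)

# to favour dataset A, inflate B's error bars; never deflate A's, since
# deflating artificially shrinks the uncertainty on the fitted parameters
dy_B_weighted = 2.0 * dy_B
```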

@butlerpd added and then removed the Discuss At The Call label on Mar 1, 2022
@butlerpd
Member

butlerpd commented Mar 1, 2022

Miguel to review the status of this long-standing PR and update on next steps.

@wpotrzebowski added the Discuss At The Call label on Mar 15, 2022
@butlerpd
Member

Miguel will look at this soon; he has a user asking as well.

@butlerpd removed the Discuss At The Call label on Mar 15, 2022
@rozyczko
Member

Minor GUI comment:
The tooltip tells me weighting must be an integer, a float, or the string 'fixed' or 'compare'.
However, neither 'fixed' nor 'compare' passes the validity test.
In the discussion above it was agreed to replace 'compare' with 'Subject', which validates properly.
'fixed' seems to have been replaced with 'Default'.
Checks in `calcWeightIncrease` compare against 'Default' and 'Subject', so I assume only the tooltip is incorrect?
If so, fixing it would be trivial.

Contributor

@pkienzle left a comment

More compact code. I'll leave it to others to judge whether it is more readable.

I still advise against this complex weighting scheme since it is hard to explain in a paper what is going on. Do you have a reference from the stats literature in support of the method?

Be aware that bumps is treating the weighting term as a scale factor on 1/σ, multiplying the unnormalized χ²/2 by w². For the simple case of giving equal weight to two datasets with identical Δy/y but with number of points n₁ = 2n₂, the final negative log likelihood (nllf) is (χ²₁ + 4χ²₂)/2, which overweights the second dataset. You will need to take the square root of the weighting.

Consider two datasets, one with 2% Δy/y and the other with 4% Δy/y but the same number of points. The weight norms Σ(y/Δy)² will be n⋅50² and n⋅25² respectively, so the final nllf is (χ²₁ + 16χ²₂)/2. Again taking the square root will give the same statistical weight to both datasets.
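A small numeric check of that argument (assumed nllf form, not bumps internals):

```python
import numpy as np

rng = np.random.default_rng(0)
n2 = 100
r1 = rng.normal(size=2 * n2)      # dataset 1: twice the points, same dy/y
r2 = rng.normal(size=n2)
c1 = np.sum(r1**2)
for w2 in (2.0, np.sqrt(2.0)):    # raw stat-weight ratio vs its square root
    c2 = w2**2 * np.sum(r2**2)
    print(f"w2={w2:.3f}: dataset1={c1:.0f}, weighted dataset2={c2:.0f}")
# only w2 = sqrt(2) makes the two contributions comparable (~200 each)
```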

The given method is equivalent to artificially reducing the uncertainty on the measurement with poor statistics. This will result in larger minimum χ² and perhaps narrower credible intervals on the fitted parameters compared to the unweighted solution. If you instead normalize to min(w) rather than max(w) so that ratios are less than one then weighted χ² will be reduced and credible intervals will be broader.

If all you have is statistical error then you wouldn't need to do the weighted fit so the weights should be constructed to reflect an increase in uncertainty rather than a decrease.

```python
weight_increase[id_index] = math.sqrt(difference) / num_fits
stat_weights[id_index] = numpy.sum(1.0 / weights[id_index]**2)
if stat_weights[id_index] > max_weight:
    max_weight = stat_weights[id_index]
```
Contributor

```python
import numpy as np
stat_weights = {k: np.sum(v**-2) for k, v in weights.items()}
max_weight = max(v for v in stat_weights.values())
```

```python
max_weight = stat_weights[id_index]

for id_index in weights.keys():
    weight_increase[id_index] = float(ratios[id_index]) * max_weight / stat_weights[id_index]
```
Contributor

```python
weight_increase = {k: float(ratios[k]) * max_weight / v for k, v in stat_weights.items()}
```

Contributor

Thanks for the comments. I agree with all of them, but for the moment I only implemented the change to use min(w) instead of max(w). This should not affect the parameters, but it certainly gives more credible uncertainties.

I agree that it is not a good idea to play with the weights, and there is a clear warning about the dangers of using this option. It is completely empirical and I don't know of any math/stats reference to support it. I'm writing a SasView tutorial where I plan to explain the dangers in more detail, and I will likely include a large part of the discussion that you had with Iestyn.

I also agree that the weights to be sent to bumps should be square rooted. Again, I decided not to do it, based only on empirical reasons. In a test using two data sets I found that I could get an intermediate answer using user weights of 1 and 3, whereas the needed user weights were 500 and 1 when returning the sqrt. But this was a single test and perhaps not very representative, so further tests should be done and this could be changed.

Contributor

While testing, I found that all the FitPages shown in the simultaneous fit tab are fitted, even if a page is not checked to be fit. Furthermore, any change in the tab (e.g. adding a constraint, modifying a weight, etc.) reinitializes the status of the check buttons. And the weights are applied to all the FitPages, so I probably need to include a "Default" option again.

Contributor

Some possibilities:

  1. You have an old version of bumps that doesn't have the w² scaling, but only w.
  2. There is a sqrt introduced somewhere between calculation and use.
  3. My math is wrong.

You can modify the multifitproblem nllf method in bumps to print out the scaled nllf from the parts to see that they have approximately the same value.

@wpotrzebowski added and then removed the Discuss At The Call label on Mar 29, 2022
```python
stat_weights = {k: numpy.sum(v**-2) for k, v in weights.items()}
min_weight = min(v for v in stat_weights.values())

for id_index in weights.keys():
    weight_increase[id_index] = float(ratios[id_index]) * min_weight / stat_weights[id_index]
```
Contributor

I could have suggested:

```python
min_weight = min(stat_weights.values())
```

This is twice as fast! 0.7 μs vs 1.65 μs for five entries 😉

Pick what seems more readable to you.

Contributor

Ah, yes! This is faster and simpler, so I will use your suggestion.

I still have to check the nllf method, and try to generate or find other data sets to test how stable the method is.

@gonzalezma
Contributor

I think this works acceptably. The more philosophical question of whether the weights should be modified at all, and therefore whether we should offer this possibility, remains open!

I attach a document that describes what this option does and how it works, and at the same time warns against its use. But in the end, it is the user's responsibility to use it or not, so I let others decide. The doc is a mixture of a tutorial and self-documentation that I generated while doing my tests, so it is still a draft. However, if the PR is accepted it can probably be polished and converted into a tutorial.

tutorial_modify_weights.pdf

@butlerpd added the Discuss At The Call label on Apr 12, 2022
@butlerpd
Member

butlerpd commented Apr 12, 2022

@pkienzle would like to check whether he is doing the right thing in bumps before we merge, as fixing it in bumps could cause this to then break. @pkienzle will be working on this "soonish."

Agreed to hold off merging for now.

@butlerpd removed the Discuss At The Call label on Apr 12, 2022
@pkienzle
Contributor

TL;DR: use what we have for now, but plan to replace it.

I've made some progress on an approach to weighting. See bumps/bumps#94. This branch explores fitting a scale factor on the error bars to handle systematic error. It works surprisingly well for the toy problem, with normalized χ² ≈ 1 at the end of the fit. Of course it can't distinguish between an incorrect model and systematic error, so use with caution.

It would be easy enough to do the same mods to sasmodels/bumps_fitting.py to see how well this behaves on SAS datasets. You may want to minimize once with fixed weights (perhaps de-weighting the X-ray data by a "usual" value if there is one), then fitting again with varying weights to improve our error estimate. You need the weight factor even when fitting a single dataset since it affects parameter uncertainty estimates. Adding this to SasView is more complicated. You would need a weight parameter per dataset and some way to forward that to the bumps fitter in sascalc, save it to a file, etc.

On the question of the weighting factor, you can modify Curve.dy_scale to explore different choices: σ, 1/σ, σ², 1/σ². The σ and 1/σ forms seem to perform the best. All of them are used by one source or another, though the 1/σ form I chose for simultaneous fitting is the least common. It does have the advantage that reasonable weights are in (0, 1], with smaller values reducing the influence of that fit. The Curve example currently uses the σ form.
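From the user side that experiment might look something like this (a sketch only: `dy_scale` is from the exploratory bumps/bumps#94 branch, not a released API, and the value shown is a guess at its form):

```python
import numpy as np
from bumps.curve import Curve

def model_fn(x, a=1.0):          # placeholder model
    return a * x

x = np.linspace(0, 1, 50)
y = model_fn(x) + 0.01
dy = np.full_like(x, 0.01)

M = Curve(model_fn, x, y, dy)
# exploratory branch only; choose the weighting form on the error bars to test
M.dy_scale = "1/sigma"
```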

@butlerpd
Member

Accept @pkienzle's TL;DR. Getting people to test and give feedback will help inform future directions.

@butlerpd butlerpd changed the title Weighting Allow for user adjusted relative weighting of different data sets in constrained fits Apr 26, 2022
@butlerpd butlerpd merged commit 7427090 into main Apr 26, 2022
@butlerpd butlerpd deleted the Weighting branch April 26, 2022 14:44
smk78 added a commit to SasView/tutorials that referenced this pull request Aug 22, 2024