Allow for user adjusted relative weighting of different data sets in constrained fits #1973
Conversation
However, see bumps/bumps#87 — you may need to fix this feature in bumps first.
Good point @pkienzle, the code now uses the weights option in FitProblem.
I've had a look at bumps/bumps#87, and I see no issue with the weighting option in FitProblem when doing my tests. I'm getting the same results as when I was modifying Δy.
This may depend on the particular path through the fitter. In particular, L-M uses residuals when fitting while the others use nllf. You wouldn't notice a difference if you just test with L-M. It may also be a bumps version issue, though I'm not sure if weights ever worked.
Hi @pkienzle, so I've read through bumps/bumps#87, and am I correct in thinking that everything should be fine when this change gets released to bumps?
The fix is not available in the current release. However, looking at the installers workflow, it looks like SasView is not using bumps releases in its build, or any version tagging, so it ought to have the latest version. I'm not sure that this is a good thing. It'll make reproducible builds impossible since it is pulling the latest versions of all packages.
That is correct, as I learned fairly recently -- but I guess that means the version Iestyn is building against is fixed?
Indeed. At the moment the plan is to change the release yml to the latest official version of bumps, for the very reason you say @pkienzle. We should probably be building everything else against the "latest version", but for now we are worrying about the next release. However, this PR is not part of the next release. The question then is: when will the next release of bumps with the required fix be released?
It can be released when needed. There shouldn't be too many changes since we just did a release, so checking it can be quick.
The virtual env I use for SasView is running bumps version 0.8.0.
Added a description of the weighting functionality to sasview\src\sas\qtgui\Perspectives\Fitting\media\fitting_help.rst, at the bottom of the Simultaneous Fit Mode section. The description is quite long, but I thought it would be better to over-explain the functionality rather than under-explain it. I'm willing to trim it down if need be. Do you mind giving it a read, @smk78?
This is a review of the help documentation only. The section headed 'Simultaneous Fits with a Modified Weighting'.
I think this section should start with some context; ie, why might a user want to modify the weighting? This context should start generically (eg, 'In some instances...because...' or 'Problems can arise if an attempt is made to simultaneously fit datasets with very different...because...') and then give the specific example of co-fitting SAXS/SANS. Then introduce the new functionality (eg, 'To help address this problem an additional column has been added to the...').
I would suggest reversing and extending the sentences 'The weighting column modifies the weight of each data set in the simultaneous fitting. By default the weighting is decided by the number of points of a dataset and the error on each point.' So maybe something like 'By default the SasView fitting engine statistically weights each dataset in a simultaneous fit by the number of points in a dataset and the error on each point. The weighting column allows the user to change this default weighting behaviour.'
However, I also think you need to say what the weighting factor will change; I believe it's just the errors (internally), right, you're not changing the number of points too?
Replace 'float(number)' by 'floating point number' or 'real number' to do away with programmer-ese! :-)
I like the change from 'Compare' to 'Subject'!
Should the sentence (which is grammatically incorrect anyhow) 'This dataset will act as the subject of comparison between for the modification of the weighting for other data sets.' say something like 'This dataset will act as the subject against which the weighting of the other data sets will be compared.'?
And 'Only one dataset should have the "Subject" option...' or 'Only one dataset can have the "Subject" option...'? Maybe also change to '...having multiple will lead...' to '...attempting to have multiple will lead...'?
Change '...act as a ration...' to '...act as a ratio...'.
In the sentences that follow, I think you also need to explain the consequences; what I mean by this is, does the simultaneous fitting pay more attention to the dataset with a weighting of 2.0 or 0.5? I could make a convincing argument either way!
Maybe instead of 'A dataset with the “Subject” option in the weighting column is not needed...' it would be clearer to say 'It is not mandatory to have a dataset with the “Subject” option in the weighting column...'?
In the sentence 'This means both datasets will be modified to attain the desired ratio of statistical weight.' the word 'both' should presumably be 'all'?
The sentence 'Two datasets with 1.0 in the weighting column will be both be weighted differently to ensure they both have the same statistical weight.' is very confusing! Does 'differently' need to be 'similarly'?
The examples are good, but maybe make them a bulleted list for clarity?
Typo: should be '...affect the weighting...'
Replace 'Some notes' by a proper note or perhaps warning; eg:
.. warning::
   The weighting modification...
   If the difference...
There are two typos in the notes (changes needed in bold):
- Should either say 'The weighting modifications do not...' or 'The weighting modification does not...'
- Should say '...with a difference...', and I'd change 'sizes' to 'bounds'.
Finally, what about a couple of before-after screenshots (with captions) showing how using weighting improves a fit?
Sorry, this was a long review!
PS @Caddy-Jones, don't worry about the length; let's focus on making it useful to the reader!
Hi @smk78, I've updated the help document following your review. The only change I didn't include is this: "And 'Only one dataset should have the "Subject" option...' or 'Only one dataset can have the "Subject" option...'? Maybe also change to '...having multiple will lead...' to '...attempting to have multiple will lead...'?" My reasoning is that it is possible to type "Subject" for multiple rows, but only one row will act as the subject for comparison; the other datasets will act as if "Default" was selected. This is communicated to the user via a warning message, but since it is physically possible to type "Subject" in multiple columns, saying "Only one dataset can have the 'Subject' option..." is technically incorrect and may be reported needlessly as a bug. Also, I haven't included any screenshots, as I am in the process of attempting to fit some SAXS and SANS data that's been given to me, so I may use that as my example. Thanks for your review :)
I think the proposed weighting system will be heavily biased by high Q values. Furthermore, I think that increasing the weights on a dataset decreases the influence, which is opposite of what is described in the docs.
I suggest a much simpler weighting system. Allow the user to specify an error bar scaling for each dataset, preferably limited to values greater than one. The user can adjust these values and refit until the residuals are mostly in the range [-3, 3]. Users are going to need to explain the choice of weights in the results section of the paper, and complicated expressions involving ratios of averaged inverse variances are hard to explain. Furthermore, the proposed weights don't have any real meaning: A constant scaled by an ad hoc value is still just an ad hoc value.
Alternatively, come up with an objective method of choosing the weights, for example, adjusting the weights during the fit until the sum squared residuals equals the number of data points for each dataset in the fit while minimizing the weights and keeping them greater than one.
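A minimal sketch of such an objective scheme, assuming normalized residuals from an initial fit are available; all names here are hypothetical, not code from the PR:

```python
import numpy as np

def auto_error_scale(residuals_by_set):
    """For each dataset, choose an error-bar scale s >= 1 so that
    sum((r/s)**2) == n, i.e. reduced chi-squared of 1, keeping s minimal.

    residuals_by_set maps dataset id -> normalized residuals (y - f(x))/dy
    computed from an initial fit.
    """
    scales = {}
    for key, r in residuals_by_set.items():
        chisq = np.sum(np.asarray(r)**2)
        # Never scale below 1: data already consistent with the model
        # keep their stated uncertainties.
        scales[key] = max(1.0, np.sqrt(chisq / len(r)))
    return scales
```

In practice the rescale step would alternate with refits, since the residuals change as the effective uncertainties change.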
if flag == 0:
    for id_index in weights.keys():
        weight_increase[id_index]
    return weight_increase
Fails with weight_increase undefined. Use:
if flag == 0:
    return {k: 1.0 for k in weights}
You also want to do this if no ratios are specified:
defaults = set(("Subject", "Default"))
if flag == 0 or any(v not in defaults for v in ratio.values()):
    return {k: 1.0 for k in weights}
for val in weights[id_index]:
    stat_weight += 1.0 / (val ** 2.0)
av_stat_weight = stat_weight / len(weights[id_index])
stat_weights[id_index] = av_stat_weight
This code simplifies to
stat_weights = {k: np.mean(v**-2) for k, v in weights.items()}
However, I don't think this is the statistic that you want since it'll be dominated by the tiny values at high Q. Relative uncertainty will have better balance.
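One way to picture the difference between the two statistics, as a hypothetical sketch (function names invented for illustration):

```python
import numpy as np

def abs_stat_weight(dy):
    # Mean inverse variance: dominated by the smallest dy values,
    # i.e. the low-intensity high-Q points.
    return np.mean(np.asarray(dy)**-2.0)

def rel_stat_weight(y, dy):
    # Mean inverse *relative* variance (y/dy)**2: balanced across Q
    # when the fractional uncertainty is roughly constant.
    return np.mean((np.asarray(y) / np.asarray(dy))**2.0)

# A decaying curve with a constant 10% relative error: the absolute
# statistic is dominated by the small-dy point, the relative one is flat.
y = np.array([100.0, 1.0])
dy = 0.1 * y
```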
first_subject_stat_weight_id = list(subject_stat_weights.keys())[0]
logging.warning(f'Multiple data sets specified for comparison using "Subject", will use '
                f'{first_subject_stat_weight_id} for comparison.')
subject_stat_weight = subject_stat_weights[first_subject_stat_weight_id]
subject_ids = [k for k, v in ratio.items() if v == "Subject"]
if len(subject_ids) > 1:
    logging.warning(f'Multiple data sets tagged as "Subject"; using {subject_ids[0]}.')
subject_weight = stat_weights[subject_ids[0]] if subject_ids else np.mean(list(stat_weights.values()))
else:
    desired_stat_weight = subject_stat_weight * float(ratios[id_index])
    difference = desired_stat_weight / weight
    weight_increase[id_index] = math.sqrt(difference) / num_fits
scale = lambda k: math.sqrt(subject_weight / stat_weights[k] * float(ratio[k])) / len(weights)
return {k: (1.0 if ratio[k] in defaults else scale(k)) for k in weights}
ignored to such a degree that it has little effect on the fit. This can lead
to unexpected results. As a guide, it is not advised to use the weighting
option on two datasets where the difference in error bounds on the datapoints
is over 3 orders of magnitude.
* Reweighting is only necessary when there is systematic error in the measurement
or when the model is insufficient to describe the data. Weights should be chosen
such that uncertainty on the measured values increases, which will lead to a
corresponding increase in the parameter uncertainty. The ad hoc weighting should
be reported along with the fitted results.
weighting a dataset with ratio of 1.0, and so will have a **greater** influence
on the fit. A dataset with a **ratio of 0.5** will have half the statistical
weighting of a dataset with a ratio of 1.0, and so will have **less** of an
influence on the fit due to the dataset's lower statistical weight.
The weight value appears to scale the error bar (the ratio term is in the numerator of the weight scale), so a larger weight will decrease the influence of the dataset in the fit. If you want to double the statistical weight of one set then you should double the error bars of the other set. You should not halve the error bars of this dataset, since this will reduce the uncertainty in the fitted parameters, whereas the presence of systematic uncertainty demands that you increase the parameter uncertainty in the fitted result.
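A small numeric illustration of this point (hypothetical sketch, names invented): a weight w that multiplies the error bars shrinks a dataset's chi-squared contribution by 1/w², so a larger w means less influence on the fit.

```python
import numpy as np

def chisq(y, fy, dy, w=1.0):
    # w multiplies the error bar, as the weighting does internally
    return np.sum(((y - fy) / (w * dy))**2)

y  = np.array([1.0, 2.0, 3.0])   # data
fy = np.array([1.1, 1.9, 3.2])   # model values
dy = np.array([0.1, 0.1, 0.1])   # stated uncertainties
```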
Miguel to review the status of this long-standing PR and update on next steps.
Miguel will look at this soon; he has a user asking as well.
Minor GUI comment:
More compact code. I'll leave it to others to judge whether it is more readable.
I still advise against this complex weighting scheme since it is hard to explain in a paper what is going on. Do you have a reference from the stats literature in support of the method?
Be aware that bumps is treating the weighting term as a scale factor on 1/sigma, multiplying unnormalized χ²/2 by w². For the simple case of giving equal weight to two datasets with identical Δy/y but with number of points n₁ = 2n₂, the final negative log likelihood (nllf) is (χ²₁ + 4χ²₂)/2, which overweights the second dataset. You will need to take the square root of the weighting.
Consider two datasets, one with 2% Δy/y and the other with 4% Δy/y but the same number of points. The weight norms Σ(y/Δy)² will be n⋅50² and n⋅25² respectively, so the final nllf is (χ²₁ + 16χ²₂)/2. Again taking the square root will give the same statistical weight to both datasets.
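As a numeric sanity check of the arithmetic above (toy values assumed, not taken from the PR):

```python
import numpy as np

# Two datasets with equal point counts, 2% and 4% relative errors,
# and the same fractional misfit f on every point.
n, f = 100, 0.01
rel_err = np.array([0.02, 0.04])
chisq = n * (f / rel_err)**2      # per-dataset chi-squared: [25, 6.25]
w = (rel_err[1] / rel_err[0])**2  # ratio-of-norms weight for set 2: 4.0
# bumps multiplies chi-squared by w**2, over-weighting set 2 fourfold;
# using sqrt(w) instead restores equal statistical weight.
balanced = np.sqrt(w)**2 * chisq[1]
```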
The given method is equivalent to artificially reducing the uncertainty on the measurement with poor statistics. This will result in larger minimum χ² and perhaps narrower credible intervals on the fitted parameters compared to the unweighted solution. If you instead normalize to min(w) rather than max(w) so that ratios are less than one then weighted χ² will be reduced and credible intervals will be broader.
If all you have is statistical error then you wouldn't need to do the weighted fit so the weights should be constructed to reflect an increase in uncertainty rather than a decrease.
weight_increase[id_index] = math.sqrt(difference) / num_fits
stat_weights[id_index] = numpy.sum(1.0/weights[id_index]**2)
if stat_weights[id_index] > max_weight:
    max_weight = stat_weights[id_index]
import numpy as np
…
stat_weights = {k: np.sum(v**-2) for k, v in weights.items()}
max_weight = max(stat_weights.values())
for id_index in weights.keys():
    weight_increase[id_index] = float(ratios[id_index]) * max_weight / stat_weights[id_index]
weight_increase = {k: float(ratios[k]) * max_weight/v for k, v in stat_weights.items()}
Thanks for the comments. I agree with all of them, but for the moment I only implemented the change to use min(w) instead of max(w). This should not affect the parameters, but it certainly gives more credible uncertainties.
I agree that it is not a good idea to play with the weights, and there is a clear warning about the dangers of using this option. It is completely empirical and I don't know of any math/stats reference to support it. I'm writing a SasView tutorial where I plan to explain the dangers in more detail, and I will likely include a large part of the discussion that you had with Iestyn.
I also agree that the weights to be sent to bumps should be square rooted. Again I decided not to do it based only on empirical reasons. In a test using 2 data sets I found that I could get an intermediate answer using user-weights of 1 and 3, while the needed user-weights were 500 and 1 when returning the sqrt. But this was a single test and perhaps not very representative, so further tests should be done and this could be changed.
While testing, I found that all the FitPages shown in the simultaneous fit tab are fitted, even if a page is not checked to be fit. Furthermore, any change in the tab (e.g. adding a constraint, modifying a weight, etc.) reinitializes the status of the checkbuttons. And the weights are applied to all the FitPages, so I probably need to include again a "Default" option.
Some possibilities:
- You have an old version of bumps that doesn't have the w² scaling, but only w.
- There is a sqrt introduced somewhere between calculation and use.
- My math is wrong.
You can modify the multifitproblem nllf method in bumps to print out the scaled nllf from the parts to see that they have approximately the same value.
for id_index in weights.keys():
    weight_increase[id_index] = float(ratios[id_index]) * min_weight / stat_weights[id_index]
stat_weights = {k: numpy.sum(v**-2) for k, v in weights.items()}
min_weight = min(v for v in stat_weights.values())
I could have suggested:
min_weight = min(stat_weights.values())
This is twice as fast! 0.7 μs vs 1.65 μs for five entries 😉
Pick what seems more readable to you.
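For anyone curious, the comparison can be reproduced with a hedged micro-benchmark (absolute timings vary by machine and interpreter; the dict contents below are invented to mirror the snippet):

```python
import timeit

# Hypothetical five-entry dict, mirroring the snippet under discussion.
stat_weights = {chr(97 + i): float(i + 1) for i in range(5)}

direct = min(stat_weights.values())                 # no generator object
via_genexp = min(v for v in stat_weights.values())  # builds a generator

t_direct = timeit.timeit(lambda: min(stat_weights.values()), number=1000)
t_genexp = timeit.timeit(lambda: min(v for v in stat_weights.values()), number=1000)
```

Both forms return the same value; the direct call merely skips creating and iterating a generator object.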
Ah, yes! This is faster and simpler, so I will use your suggestion.
I still have to check the nllf method, and try to generate or find other data sets to test how stable the method is.
… scheme selected in the FitOptions tab
I think this works acceptably. The more philosophical question of whether the weights should be modified, and therefore whether we should offer this possibility, remains open! I attach a document that describes what this option does and how it works, and at the same time warns against its use. But in the end it is the user's responsibility to use it or not, so I let others decide. The doc is a mixture of a tutorial and self-documentation that I generated while doing my tests, so it is still a draft. However, if the PR is accepted it can probably be polished and converted into a tutorial.
TL;DR: use what we have for now but plan to replace it. I've made some progress on an approach to weighting. See bumps/bumps#94. This branch explores fitting a scale factor on the error bars to handle systematic error. It works surprisingly well for the toy problem, with normalized χ² ≈ 1 at the end of the fit. Of course it can't distinguish between an incorrect model and systematic error, so use with caution. It would be easy enough to do the same mods to ... On the question of the weighting factor, you can modify ...
Accept @pkienzle's TL;DR. Getting people to test and provide feedback will help inform future directions.
This pull request attempts to give the user the option of changing the statistical weighting of data sets when doing a "Constrained or Simultaneous fit". The change in statistical weighting is achieved by scaling the data used for the weighting, as defined by the "Fit Options" tab, by a factor. The factor by which to increase the statistical weight is calculated by comparing the statistical weights of the different data sets.
![image](https://user-images.githubusercontent.com/88383842/143031121-4b9504d1-9da2-4d32-b5a1-c90faa002a4c.png)
By default the error in the y axis (dI) is used. Modifications to the dI data caused by this pull request only happen internally, meaning the error bars will not be modified on the plot.
The changes appear as a new column in the “Const. Simul. Fit” tab.
The default option is 'fixed', which just means not to modify the weighting. The user input is the desired weighting ratio between two data sets.
For example, if there are two data sets, one with 'fixed' and the second with '1.0', then the second data set will be modified to have the same statistical weight as the first data set. If the second data set has a weighting of '2.0', then the second data set will be modified to have twice the statistical weight of the first data set. '0.5' will give half the statistical weight, etc.
If both data sets have a weighting of '1.0', then both will be modified until they have equal statistical weighting.
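To illustrate the ratio mechanism described above, here is a hypothetical sketch (not the PR's actual code) of how an error-bar scale factor achieves a desired statistical-weight ratio:

```python
import numpy as np

# The 'fixed' (subject) set keeps its errors; the other set's effective
# errors are rescaled so its statistical weight becomes ratio * subject.
def stat_weight(dy):
    return np.sum(np.asarray(dy)**-2.0)

def weight_factor(dy_subject, dy_other, ratio):
    desired = ratio * stat_weight(dy_subject)
    # dividing dy_other by this factor yields exactly the desired weight
    return np.sqrt(desired / stat_weight(dy_other))

dy1 = np.full(50, 0.1)   # subject dataset uncertainties
dy2 = np.full(50, 0.3)   # second dataset, given a weighting of 2.0
w = weight_factor(dy1, dy2, ratio=2.0)
```

Dividing dy2 by w then gives the second dataset exactly twice the statistical weight of the first, matching the behaviour described for a '2.0' entry in the weighting column.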