Custom fit models #2565
Conversation
Jeff to check two other PRs, then close if OK. Otherwise we need to look at whether it "does what it says on the tin" and at the overall code.
I've tried it on several PDB models to see how it works on models with a large number of atoms. While it generally works fine, I think we will need to put some limits on very large structures (e.g. virus capsids). For the following example, https://github.com/Andre-lab/hbv_trSAXS/blob/master/HBVCP_empty_assembly/bayesian_models/capsid_T4.pdb, it took about 11 min to load the model, and I interrupted the curve computation after 30 min.
I agree. We could set a limit above which the program gives a warning that loading and calculation may be slow. Also, I am wondering if there is a way to estimate the amount of time needed to load and compute the model. Without knowing the details of how the PDB reader code works, I might be wrong.
I agree with @yunliu01 that some benchmarking would be useful. I can probably do some exploration during the code camp. In order not to block this PR, maybe we can put up a warning if we have more than X atoms. I can probably come up with a rough estimate of X.
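A hedged sketch of what such a guard might look like; the threshold value and function name here are hypothetical placeholders, not existing SasView code, and the real value of X would come from the benchmarking discussed above:

```python
import warnings

# Hypothetical cutoff; the real value of X should come from benchmarking.
MAX_ATOMS_FAST = 50_000

def check_model_size(n_atoms, limit=MAX_ATOMS_FAST):
    """Warn if a loaded PDB model is likely to be slow to compute.

    Returns True if the model is within the fast range, False if a
    slow-calculation warning was issued.
    """
    if n_atoms > limit:
        warnings.warn(
            f"Model has {n_atoms} atoms (> {limit}); loading and "
            "P(Q) calculation may be very slow.",
            RuntimeWarning,
        )
        return False
    return True
```

The check could be run right after the PDB file is parsed, before any curve computation starts, so the user can cancel early.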
Just to be clear, the slow calculation of PQ is completely unrelated to this pull request and should be ticketed separately. This should in no way affect merging this PR. Actually, I think there may already be some related tickets. Basically, the PQ calculation, as far as I know, is still the same code originally written >12 years ago and is, as I understand it, technically perfectly correct but the absolute slowest approach. We have discussed over the years a variety of faster algorithms, most of which fail at high Q but are usually plenty good enough. This also may be a case where vectorization and GPU could significantly speed up the calculation? To date, interest in this feature from the existing SasView community has been fairly low, so there has been no pressure to fix it -- this could be a great opportunity to start doing so? If this part of the code were faster and more robust, I believe it would make SasView more useful to the protein biophysics community -- especially those working on pharmaceutical problems?
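For illustration, a minimal NumPy sketch of a vectorized Debye sum, I(q) = Σᵢⱼ bᵢbⱼ sin(q·rᵢⱼ)/(q·rᵢⱼ). The function name and interface are hypothetical and this is not the SasView implementation; note its O(N²) distance matrix would itself be impractical in memory for capsid-sized models, so it only illustrates the vectorization idea:

```python
import numpy as np

def debye_iq(q, coords, b):
    """Vectorized Debye sum over all atom pairs.

    q      : (M,) scattering vector magnitudes
    coords : (N, 3) atomic positions
    b      : (N,) scattering lengths
    Returns (M,) intensities; I(0) = (sum b)^2.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    rij = np.sqrt((diff ** 2).sum(axis=-1))       # (N, N) pair distances
    bb = b[:, None] * b[None, :]                  # (N, N) b_i * b_j
    # np.sinc(x) = sin(pi x)/(pi x), so divide the argument by pi
    # to get sin(q r)/(q r), which is also well defined at r = 0.
    qr = q[:, None, None] * rij[None, :, :]
    return (bb[None, :, :] * np.sinc(qr / np.pi)).sum(axis=(1, 2))
```

For large N, the usual tricks are pair-distance histogramming (binning rᵢⱼ once and reusing the histogram for every q) or moving the double loop to the GPU; both have been discussed as options.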
The Mac installer doesn't start. It seems to work fine from a local dev env.
Change the names of the variables to avoid confusion.
…odels # Conflicts: # src/sas/sascalc/calculator/geni.py
Beta_Q should be one at Q=0, so the calculated Beta_Q needs to be normalized to its value at Q=0. This requires that both F_Q and P_Q are normalized properly at Q=0, i.e., F_Q(Q=0)=1 and P_Q(Q=0)=1. However, P_Q is currently not a normalized form factor. Therefore, the current code obtains the normalized Beta_Q by dividing it by Beta_Q[Q_min], the value of Beta_Q at the lowest Q point.

Note that when using the calculated form factor to fit data, the code also needs to use the normalized form factor, which is likewise obtained by dividing by the value at the lowest Q point. This approach could be problematic if a user forgets to choose a very small value for the first Q point. The documentation needs to be updated to let users know this whenever they use this function.

In principle, it is easy to calculate P_Q(Q=0) and F_Q(Q=0). In the future, it would be better to update the code to normalize with the exact values at Q=0 instead of a value at a finite Q. From a quick look at the code, F_Q appears to be normalized correctly already (this has not been fully tested), but the calculation of P_Q needs to be updated.
So to be clear @yunliu01 -- you are saying that Beta_Q is normalized to Beta_Q(Q_min) but that F(Q) is normalized to F(Q=0)? Also, is P(Q)=F(Q)^2 not then also normalized to P(Q=0) by definition? Finally, if it is that simple, should we not just normalize to Q=0 properly? Or is this considered too hard at this point?
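A minimal sketch of the workaround described above, assuming `f_q` holds the orientationally averaged amplitude ⟨F(Q)⟩ and `p_q` holds the (unnormalized) form factor ⟨|F(Q)|²⟩ on the same Q grid; the function name and interface are hypothetical, not the code in this PR:

```python
import numpy as np

def beta_q(f_q, p_q):
    """beta(Q) = <F(Q)>^2 / <|F(Q)|^2>, rescaled so beta = 1 at the
    first Q point.

    Mirrors the workaround described above: dividing by beta at the
    lowest Q point stands in for the exact Q=0 value, so the first Q
    point must be at very small Q for the result to be accurate.
    """
    beta = np.asarray(f_q) ** 2 / np.asarray(p_q)
    return beta / beta[0]  # beta[0] approximates beta(Q=0)
```

The proper fix discussed above would instead normalize F_Q and P_Q to their analytically computed Q=0 values, removing the dependence on where the user's Q grid starts.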
Testing the install on Windows (after merging the fast Debye calculator) works very nicely (and fast!). It looks to me like it is ready to merge.
Two points that should be addressed as soon as possible (and the first maybe as part of this release even?):
- Since plugin docs will now render in the installer version, it would be nice if some of the documentation currently buried in the help docs were moved (or even copied) into the model docs, as suggested in the new issue Add documentation to Custom Fit model generated by GCC #2872 -- in particular, an explanation of the new swelling and volume parameters and how they should be used.
- Because this model is not based on an analytical equation but on a calculated series of points, it is quite possible for the data to extend beyond the Q range the model can return. Even if not, when the data has resolution smearing, the fit may still need values beyond what the model can return. It would be good to display a helpful message when the data plus resolution will exceed the bounds of the calculated model, telling the user that this is the case and that they should recompute the plugin using a larger Q range. That may actually be a separate sasmodels issue.
Finally, I note that it is also possible that a user, particularly on a Mac, does not provide enough points in the Debye-calculated plugin, such that interpolation is bad in some areas (where there may be significant oscillations). Is there a way to flag that to the user? Is there a way to assess it algorithmically, maybe based on the smoothness of the curve?
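One possible algorithmic check, purely illustrative: drop every other point, interpolate the curve back from the coarsened grid, and flag it if the reconstruction error is large. A smooth curve survives this, an under-sampled oscillatory one does not. The function name and threshold are guesses, not existing SasView behaviour:

```python
import numpy as np

def undersampling_warning(q, iq, rel_tol=0.05):
    """Heuristic: flag curves that are too coarsely sampled for
    smooth interpolation.

    Rebuilds the interior points by linear interpolation from every
    other point; if the worst relative deviation exceeds rel_tol, the
    sampling is probably too coarse for the oscillations present.
    """
    interp = np.interp(q[1:-1], q[::2], iq[::2])  # coarse resample
    denom = np.maximum(np.abs(iq[1:-1]), 1e-300)  # avoid divide-by-zero
    rel_err = np.abs(interp - iq[1:-1]) / denom
    return bool(rel_err.max() > rel_tol)
```

If a check like this fired after the plugin's table was computed, the GUI could suggest recomputing with more Q points before the model is handed to the fit panel.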
It seems to work on Mac as well (with the default GSC calculation in the backend).
Description
Adds a feature to automatically generate plugin models with structure factors from the Generic Scattering Calculator. Models are written as a single combined C + Python file, with two options for the effective radius: equivalent sphere volume and radius of gyration. It also lets you choose your own file name. There are four parameters: SLD, solvent SLD, swelling, and protein volume.
This PR also combines 2 other PRs due to .ui merge conflicts: #2548 and #2538.
How Has This Been Tested?
Upload a file into the nuclear data field of the Generic Scattering Calculator, preferably a .pdb file. Choose one of the 1D scattering options, check the Plugin Models checkbox, and change the filename if desired. Hit Calculate and let it run.
After it is done calculating, the model file should automatically be created and placed in the plugin_models folder of .sasview. You should be able to load it in the Fit Panel by going to Category - Plugin Models, then choosing your model name and a structure factor as desired.