Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable batch processing for P(r) #1448

Open
wpotrzebowski opened this issue Jan 29, 2020 · 12 comments · May be fixed by #2181
Open

Enable batch processing for P(r) #1448

wpotrzebowski opened this issue Jan 29, 2020 · 12 comments · May be fixed by #2181
Assignees
Labels
Inversion Perspective Concerns inversion perspective

Comments

@wpotrzebowski
Copy link
Contributor

A user from Lund University has a case where processing P(r) in the batch mode would be particularly useful (time-resolved data). We can currently process only a single file for P(r) analysis however it would be useful to enable batch mode for it too.

@wpotrzebowski wpotrzebowski added this to the SasView WishList milestone Jan 29, 2020
@smk78
Copy link
Contributor

smk78 commented Jan 31, 2020

I would agree this would be a useful capability. But implementing it would need some careful thought. It might be straightforward if the user is wanting to process subtle variations of the same system where things like Dmax and the number of terms aren't changing. But what if they want to process X different proteins?

@butlerpd
Copy link
Member

I will say that the original "spec" yeeeaaars ago was that everything I can do once I should be able to do in batch ... including Pr... but, as you point out, it is a bit difficult. Even for model fitting. I would agree that the immediate way forward would be to organize it for the most obvious use case if somebody wanted to tackle this.

However I have been thinking that there should be some more automation possible. For example if we fit the low q to an Rg we should be able to estimate Dmax. Also we currently have the exploration tool that will show a plot of Xi^2 (or other things) vs Dmax which we could try to automate maybe with a summer student? trying to figure out what algorithm works? (I won't use the word AI :-) but I would say that is a much larger scope with a longer "time horizon?"

@butlerpd
Copy link
Member

Interestingly I note that the P(r) documentation specifically says that you CAN do batch fitting .. Ooops 🤦 We might want to change that ... until (if and when) it is in fact possible?

@smk78
Copy link
Contributor

smk78 commented Jan 25, 2021

In which case we might want to address the behaviour of the 'Batch mode' checkbox too. Because you can 'enable' batch mode and then select Inversion!

You can do the same with Invariant and Corfunc too.

@smk78
Copy link
Contributor

smk78 commented Jan 25, 2021

@krzywon : whilst checking the above I've noticed (yet?) another issue with P(r). I don't know if it's already on your radar? But the data file dropdown in the P(r) perspective will happily accept as many datasets as you want to send it. If I select the first one and click calculate I get what I'll call 'normal behaviour'. If I then select a different dataset from the dropdown, some output values flicker but the dropdown then resets to the first dataset. So you cannot actually Calculate P(r) for anything but the first dataset.

The Log Explorer reports:
21:35:24 - INFO: Invertor.estimate_alpha: Invertor: could not invert I(Q)
Pinvertor.get_matrix: Some I(Q) points have no error.

And this was running a local build of main.

@butlerpd
Copy link
Member

That is very interesting. I had not noticed the dropdown before (probably because I know there should not be one so never bothered looking :-) However, using my local build which is a very recent branch from main, I see seem to have some different behavior from what is reported by @smk78.

  • If I select multiple data sets and send them to fitting it works (multiple plots and fit panels open up). However if I do that with any other perspective an error is thrown and a popup appears saying Inversion does not allow replacing multiple data sets where Inversion is replaced as appropriate.
  • The Batch mode and Swap data checkboxes both gets greyed out and are inaccessible as soon as you choose anything but Fitting. Of course if you click on batch while in Fitting (which is the default on startup) and then change to Inversion and then send the data, the greyed out box has a greyed out checkmark. While that seems reasonable enough, ideally I guess it would uncheck the box?

That said, I confirm that sending several data sets in a row to the Inversion (or other single data set perspectives) looks like it does a "swap" but actually is populating a drop down. However it seems clear the actual data is no longer available to the perspective.

Finally the Log Error reported is actually unrelated. I am guessing @smk78 was using the emulsion data (core contrast, shell contrast ect). I was frustrated by this myself and was going to report it as a bug but am still thinking about how SasView should handle this. The problem is the data set itself: the very last data point has an uncertainty of 0 (so a divide by zero somewhere?) I was originally thinking that P(r) doesn't deal with 2 column data but that is not the case. It is fine with 2 column or 3 column ... just not 3 column which then has one (or more) line with a zero for the uncertainty. One can always try writing massive amounts of code to fix every possible stupidity but then it risks eventually making the code unwieldy and unsuportable. That said, it happens enough that it will cause confusion and pain to users. Maybe some data integrity checks in the reader on loading data would be appropriate? But that path too is fraught with pitfalls....

@smk78
Copy link
Contributor

smk78 commented Jan 26, 2021

Doing a little digging I find that the paragraph on batch inversion was added to the docs at version 5.0.0 (but the person responsible did not admit it!). There is also no mention of batch enabling P(r) in any of the release notes.

If I was to modify pr_help to:

  • remove the Batch P(r) subheading
  • change the offending paragraph to read "The p(r) calculator accepts any number of data sets in the p(r) data set combo box. Switching between data sets in the combo box allows for individual fits and review of fit parameters. The displayed plots will also update to include the latest fit results for the selected data set." ; ie, remove the words ", and supports batch fitting for all data sets"

Would there be any objections?

@krzywon
Copy link
Contributor

krzywon commented Jan 26, 2021

@smk78 - I spent most of the 2017 code camp in Copenhagen writing the P(r) perspective for v5.0 and likely wrote that paragraph. My original intent was to have batch capabilities in the perspective for release v5.0, and much of the infrastructure is built in to allow it, but wasn't stable enough for release so the batch capabilities were turned off.

That said, I have no objection to changing the documentation.

@smk78
Copy link
Contributor

smk78 commented Jan 26, 2021

@butlerpd commented:

Finally the Log Error reported is actually unrelated. I am guessing @smk78 was using the emulsion data (core contrast, shell contrast ect). I was frustrated by this myself and was going to report it as a bug but am still thinking about how SasView should handle this. The problem is the data set itself: the very last data point has an uncertainty of 0 (so a divide by zero somewhere?) I was originally thinking that P(r) doesn't deal with 2 column data but that is not the case. It is fine with 2 column or 3 column ... just not 3 column which then has one (or more) line with a zero for the uncertainty. One can always try writing massive amounts of code to fix every possible stupidity but then it risks eventually making the code unwieldy and unsuportable. That said, it happens enough that it will cause confusion and pain to users. Maybe some data integrity checks in the reader on loading data would be appropriate? But that path too is fraught with pitfalls....

Nice catch, @butlerpd . Indeed, it is the drop contrast dataset. In it I_dev = 0 because in fact I = 0.

@smk78
Copy link
Contributor

smk78 commented Jan 26, 2021

@smk78 - I spent most of the 2017 code camp in Copenhagen writing the P(r) perspective for v5.0 and likely wrote that paragraph. My original intent was to have batch capabilities in the perspective for release v5.0, and much of the infrastructure is built in to allow it, but wasn't stable enough for release so the batch capabilities were turned off.

That said, I have no objection to changing the documentation.

Change made!

@dehoni dehoni added the Inversion Perspective Concerns inversion perspective label Mar 18, 2022
@ru4en ru4en self-assigned this Jun 29, 2022
ru4en added a commit that referenced this issue Jul 1, 2022
issue #1448 - Enable batch processing for P(r)

NOTE: This issue is not yet fixed.

Feature:

Selected data sent from the load data can now be sent to Pr window and a new tab will appear for each data set. Although the goal is batch calculate all data and neatly display in a table the current “non batch” tab method can be used for quick references of calculations for each file. In the nect commit I intend to implement a batch tab. Notes for those are in the bottom.

Dev note

-  1) Block automatic calculation after data is sent to Pr prospective whilst batch mode is deselected. (Allow users to hit calculate) Or block graphs from showing automatically as it clusters the interface.

-  2) Fix batch Mode
o  When batch mode is selected create only one tab (similar to fitting)
o  New interface for batch / same interface could be used with the addition of “Fit” / ” Graph” button conjunction with the 1st TODO.

3) Clean up Code. The current files ive modified/created are messy. Also comment code
@butlerpd
Copy link
Member

butlerpd commented Jul 5, 2022

So @krzywon , was the intention that the user set a fixed Dmax for everything in the batch mode and that the batch mode pull the suggested values for each of the Reg Constant and Number of terms? If so, how do you ensure that the behind the scenes computations are done before running the next calculation? These do take non zero time (though usually are quick). Also you need, as I recall, to iterate till the suggestion no longer changes.

I don't think it is helpful to have a batch P(R) that assumes all terms are the same as they seem to be quite sensitive to changes in the curves?

@ru4en have you found where the "mostly done" infrastructure is hidden?

@ru4en
Copy link

ru4en commented Jul 6, 2022

@butlerpd I am currently working on the code that loads data onto the GUI for Batch mode. I have also added Tab functionality for the Pr interface, I might have used the existing functions in inversionPerspective.py to iteratively load and calculate the data for each of the tabs (Not sure if this is the mostly done infrastructure). At the moment all loaded files are calculated all at once which causes the program to sometimes stop responding (Will be working on a fix for this). Tabs for Pr show the calculations on individual Tabs (similar to Fitting). Although I’m yet to use that code to have a functional batch mode (Currently multiple files can be imported and calculated but only displays one calculation at a time. My current plan for Pr Batch mode is:

  1. not have everything calculated straight away. But let the user set parrimeters first then let the data be processed.
  2. have a calculate button to calculate all files.
  3. create a new pop-up table interface similar to the Batch Fitting Results used in Fitting.
  4. when the calculate button is clicked all the outputs are sent to table (in step3)
  5. [ Waiting for 1992 corfunc perspective needs export and report capabilities #2065 to get merged to main ] adding export and report capabilities for the Batch data

If you'd want to see what I've done so far, please let me know and I'll add it to the branch I'm working on. origin/1448-Enable-batch-processing-for-Pr

@dehoni dehoni linked a pull request Oct 5, 2022 that will close this issue
7 tasks
dehoni pushed a commit that referenced this issue Jun 6, 2023
issue #1448 - Enable batch processing for P(r)

NOTE: This issue is not yet fixed.

Feature:

Selected data sent from the load data can now be sent to Pr window and a new tab will appear for each data set. Although the goal is batch calculate all data and neatly display in a table the current “non batch” tab method can be used for quick references of calculations for each file. In the nect commit I intend to implement a batch tab. Notes for those are in the bottom.

Dev note

-  1) Block automatic calculation after data is sent to Pr prospective whilst batch mode is deselected. (Allow users to hit calculate) Or block graphs from showing automatically as it clusters the interface.

-  2) Fix batch Mode
o  When batch mode is selected create only one tab (similar to fitting)
o  New interface for batch / same interface could be used with the addition of “Fit” / ” Graph” button conjunction with the 1st TODO.

3) Clean up Code. The current files ive modified/created are messy. Also comment code
krzywon pushed a commit that referenced this issue Oct 25, 2024
issue #1448 - Enable batch processing for P(r)

NOTE: This issue is not yet fixed.

Feature:

Selected data sent from the load data can now be sent to Pr window and a new tab will appear for each data set. Although the goal is batch calculate all data and neatly display in a table the current “non batch” tab method can be used for quick references of calculations for each file. In the nect commit I intend to implement a batch tab. Notes for those are in the bottom.

Dev note

-  1) Block automatic calculation after data is sent to Pr prospective whilst batch mode is deselected. (Allow users to hit calculate) Or block graphs from showing automatically as it clusters the interface.

-  2) Fix batch Mode
o  When batch mode is selected create only one tab (similar to fitting)
o  New interface for batch / same interface could be used with the addition of “Fit” / ” Graph” button conjunction with the 1st TODO.

3) Clean up Code. The current files ive modified/created are messy. Also comment code
krzywon pushed a commit that referenced this issue Oct 25, 2024
issue #1448 - Enable batch processing for P(r)

NOTE: This issue is not yet fixed.

Feature:

Selected data sent from the load data can now be sent to Pr window and a new tab will appear for each data set. Although the goal is batch calculate all data and neatly display in a table the current “non batch” tab method can be used for quick references of calculations for each file. In the nect commit I intend to implement a batch tab. Notes for those are in the bottom.

Dev note

-  1) Block automatic calculation after data is sent to Pr prospective whilst batch mode is deselected. (Allow users to hit calculate) Or block graphs from showing automatically as it clusters the interface.

-  2) Fix batch Mode
o  When batch mode is selected create only one tab (similar to fitting)
o  New interface for batch / same interface could be used with the addition of “Fit” / ” Graph” button conjunction with the 1st TODO.

3) Clean up Code. The current files ive modified/created are messy. Also comment code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Inversion Perspective Concerns inversion perspective
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants