-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fit with theory covmat with n3fit #1528
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add one example runcard (and maybe a regression test for the invcovmat?)
Also -for @Zaharid - would you be against using .npy
instead of .csv
? I'm don't have a strong opinion on this but it would make things a bit faster and leaner (for a 4000x4000 matrix this is 382M vs 123M)
There is some conflict in the docs (I guess you modified the same document in this PR and also in the other one). Please fix that so that the merge is not blocked.
Yes please, do that. |
Or, compute the covmat dynamically, not sure if that is better. |
We also had #1091 which turned out to be a bit more involved and was forgotten about. Also probably @alecandido convinced me that npy is reasonable enough for this kind of purposes. I guess parquet still has the advantage of giving you an richer index. |
As a note, I think having a csv in the regression tests is good. Having the changes in human readable form is nice (you can always parse them of course). But anyway, this would only be for the storing of the thcovmat. Although maybe it makes sense to compute it on the fly indeed. @andreab1997 how long does setupfit take? |
For the runcards I tried, it takes about 40 minutes against the 20 of a fit without theory covmat |
Then I guess it's better to store it and load it. |
I believe that the only left change to apply is the one related to .csv vs .npy, right? Anyway, I have not understood the final decision, should we dump the thcovmat in csv or npy? (leaving as given that we do not want to compute it online) PS: @scarlehoff there are still your requested changes blocking the merging but I took already care of them. How can I solve this problem? |
Yes, please, using
You can't! I have to look through the changes! |
I guess the proper solution would be to re-request a review: if you solved them, you should allow your reviewer to check that the solution is satisfactory (and not breaking anything else...). |
ok done, thank you :) |
I am sorry, I am having problems finding the place where the writing happens. Can you please suggest me the right place where to look?(I believed this should have been in produce_nnfit_theory_covmat but I cannot find the writing) @scarlehoff @Zaharid |
I'm going to guess the action that |
@scarlehoff yes please. |
Can this be merged? @Zaharid |
@Zaharid I have solved the conflict. Can this be merged now? |
@RoyStegeman could you please have a quick review of this PR so we can merge? |
Sure |
validphys2/examples/theory_covariance/Fit_with_theory_covmat.yml
Outdated
Show resolved
Hide resolved
validphys2/examples/theory_covariance/Fit_with_theory_covmat.yml
Outdated
Show resolved
Hide resolved
validphys2/examples/theory_covariance/Fit_with_theory_covmat.yml
Outdated
Show resolved
Hide resolved
validphys2/src/validphys/covmats.py
Outdated
norm_threshold=None, | ||
dataset_inputs_t0_predictions, | ||
loaded_theory_covmat, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Juan says
validphys2/src/validphys/covmats.py
Outdated
use_weights_in_covmat=True, | ||
norm_threshold=None, | ||
dataset_inputs_t0_predictions, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what Juan says (here and functions below)
@andreab1997 could we merge this? |
For me yes. I don't know if @RoyStegeman agrees. |
Sure please go ahead |
Actually give me till this evening, I have to fix a minor thing and then I will merge. |
if f == path: | ||
raise ValueError( | ||
"More than one theory_covmat file in folder tables" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreab1997 this doesn't seem right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is not as general as I originally thought. It probably does do what you intended, but I tried it with a custom covmat and got this error because I did not set use_scalevar_uncertainties
to false, in which case this message does not make a lot of sense since there was only one covmat.
So I guess we just need to update the example here: https://github.com/NNPDF/nnpdf/blob/278310410398b36b95e45246992c8ed07db6d6a6/doc/sphinx/source/tutorials/general_th_covmat.rst? If that's correct, I can update the docs.
P.S. the check can be written in one line for example as
if set(paths)&set(files):
raise ValueError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yes I agree. I would say the check is checking what I wanted (even if it can be written better) but the error message is misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is no situation in which both use_scalevar_uncertainties
and use_user_uncertainties
are true? If so, that should probably be checked separately by vp. Are there valid situations in which both are false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the moment I would say no but in principle yes (for example one can have a different source of theory uncertainties which are not user_uncertainties
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but here use_scalevar_uncertainties
is just synonym for using a covmat that has been constructed from different theoryID's (regardless of the source of the theory uncertainty) with the instructions in the runcard, while user_uncertainties
is an external covmat (not necessarily non-scalevar).
So for now I think use_scalevar_uncertainties
can probably be deprecated as it is just the inverse of user_uncertainties
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you agree, I can take care of the changes. Just you know this better than me, so it's good if you can confirm :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree. In my mind we could have a third option, like use_whatever_uncertainties
, that uses a different prescription from the scalevar
. However it is true that we do not have it now so, for the time being,use_scalevar_uncertainties
is just not user_uncertainties
. So yes, I agree :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave hypothetical future features for hypothetical future people to work on ;)
Ok, I'll open a PR some point this week and ask for your review.
This will make possible to run a NNPDF4.0 fit including theory_covariance matrix
Edit: it also modifies the replica generation (
make_replica
) to utilize the covmat (with theory errors and multiplicative uncertainties if the proper flags are utilized)