Hyperoptimization metrics - implementation of $\varphi^{2}$ estimator
#1849
Comments
Hi @Cmurilochem, I'd have to look into this in detail in order to make well-informed comments. But I will say that one thing you would want to check is what we briefly discussed on Wednesday, that is whether or not
Maybe others already have more insights?
Hi @Cmurilochem, I will prepare something more complete, but for the time being the following code will already let you play a bit with validphys and understand how it works (for more, here, the documentation is not complete but will already give you some idea).

Here I'm using the API to automagically get some results. First we go to the computation of

nnpdf/validphys2/src/validphys/results.py, line 542 in ce6c05c

Which uses the result of

We can start by running this, calling the API with some inputs:

    from validphys.api import API
    API.phi_data(dataset_input={"dataset": "NMC"}, theoryid=400, use_cuts="internal", pdf="NNPDF40_nnlo_as_01180")

This will already produce a value. And you already have all these quantities when you are doing a fit: the first three keys are given in the runcard and the PDF will be given by the vpinterface.

You could as well do this:

    from validphys.api import API
    from validphys.results import phi_data
    chi2 = API.abs_chi2_data(dataset_input={"dataset": "NMC"}, theoryid=400, use_cuts="internal", pdf="NNPDF40_nnlo_as_01180")
    phi_data(chi2)

Then if you go to the definition of

We got there! These are 3 elements that we already have during the fit!

    from validphys.api import API
    from validphys.results import phi_data, abs_chi2_data, results
    ds = API.dataset(dataset_input={"dataset": "NMC"}, theoryid=400, use_cuts="internal")
    covmat = API.covariance_matrix(dataset_input={"dataset": "NMC"}, theoryid=400, use_cuts="internal")
    sqcov = API.sqrt_covmat(dataset_input={"dataset": "NMC"}, theoryid=400, use_cuts="internal")
    pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180")
    res = results(ds, pdf, covmat, sqcov)
    chi2 = abs_chi2_data(res)
    phi_data(chi2)

And we have our final result! A number for phi!

Note: to first approximation (and probably to second and third) you should not need to modify anything in validphys to get this number from n3fit. At most you will need to modify the signature of

nnpdf/n3fit/src/n3fit/performfit.py, line 21 in ce6c05c

which is what will be read by validphys to fill in the different items.

* And I'm being lazy: from the dataset it is possible to get the covmat (and sqrt covmat) without further calls to the API by calling the appropriate functions.
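To make explicit what these ingredients (predictions per replica, data, covmat and its square root) are used for, here is a minimal, self-contained sketch of the arithmetic in plain Python. It does not call validphys: the function names `cholesky`, `chi2` and `phi`, the toy numbers, and the normalisation by the number of data points are assumptions for illustration, and the real `phi_data` may differ in details.

```python
import math


def cholesky(cov):
    """Return lower-triangular L with L L^T = cov (cov symmetric, positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def chi2(prediction, data, sqrt_cov):
    """chi2 = |L^{-1} (prediction - data)|^2, using forward substitution on L."""
    diff = [p - d for p, d in zip(prediction, data)]
    y = []
    for i, row in enumerate(sqrt_cov):
        y.append((diff[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return sum(v * v for v in y)


def phi(replica_predictions, data, cov):
    """sqrt((<chi2 of replicas> - chi2 of central prediction) / ndata).

    The central prediction is taken here as the replica average of the
    predictions (an assumption of this sketch).
    """
    sqrt_cov = cholesky(cov)
    nrep, ndata = len(replica_predictions), len(data)
    chi2_reps = [chi2(rep, data, sqrt_cov) for rep in replica_predictions]
    central = [sum(rep[i] for rep in replica_predictions) / nrep for i in range(ndata)]
    chi2_central = chi2(central, data, sqrt_cov)
    # The difference is non-negative analytically; guard against rounding noise.
    return math.sqrt(max(0.0, sum(chi2_reps) / nrep - chi2_central) / ndata)


if __name__ == "__main__":
    toy_data = [1.0, 2.0]
    toy_cov = [[1.0, 0.0], [0.0, 4.0]]
    toy_replicas = [[2.0, 2.0], [0.0, 2.0]]
    print(phi(toy_replicas, toy_data, toy_cov))
```

Working with the square root of the covmat avoids inverting the full matrix for every replica, which is presumably why `results` asks for both objects.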
Thanks @scarlehoff for your very detailed explanation. It works perfectly here for me. The idea here (as you suggested) is to call `results` directly as the only way around. For this we would need to have as arguments
I saw that you have already suggested an alternative idea for how to obtain
Hi @Cmurilochem, some more notes on that: for the PDF there's no way out, you need the vpinterface N3PDF to "trick" validphys into believing there is a central value, constructing that central value by taking the average of the replicas. For the rest, instead, you have all the information by the time the fit starts. The best way to test this is to go to the definition of the
In this case you would be interested in something like
Let me know if anything doesn't work. I'm writing from memory, so don't trust the details 100% (if it is not
Thank you @scarlehoff! Yep, I tested it. I have two options here that appeared after including these arguments in
Hi @scarlehoff. Thanks to your help I think I found a provisional solution to start with. The idea is to:
I will be reporting more details and possible ToDos/problems in #1726.
We are interested in implementing an additional metric for `hyperopt` that is sensitive to higher moments of the probability distribution, $\varphi^{2}$; see Eq. (4.6) of the NNPDF3.0 paper. As defined therein and extended by @RoyStegeman and @juanrojochacon to the context of hyperoptimization, $\varphi^{2}$ can be calculated for each $k$-fold as

where the first term represents our usual averaged-over-replicas hyper loss, $\chi^2_k$, which is calculated from the dataset used in the fit ($\mathcal{D}$) and the theory predictions from each fitted PDF ($f_{\rm fit}$) replica. The second term of the above equation involves the calculation of the hyper loss, but now using the theory predictions from the central PDF (the averaged-over-replicas PDF, if I understood well).
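
Written out, the two terms described here would read as follows. This is a sketch reconstructed from the description in this issue and from Eq. (4.6) of the NNPDF3.0 paper; the symbol $\mathcal{T}[f]$ for the theory predictions of a PDF $f$ and the notation $\langle\cdot\rangle_{\rm rep}$ for the average over replicas are assumptions of notation, and the second term is written with the replica-averaged predictions:

$$
\varphi^{2}_{k} = \left\langle \chi^2_k\left[\mathcal{T}[f_{\rm fit}], \mathcal{D}\right] \right\rangle_{\rm rep} - \chi^2_k\left[\left\langle \mathcal{T}[f_{\rm fit}] \right\rangle_{\rm rep}, \mathcal{D}\right]
$$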
The idea would be to implement this new metric as an additional `@staticmethod` of the `HyperLoss` class.

I noticed that there already exists an implementation of $\varphi$ (probably from the NNPDF3.0 paper) in the `phi_data` function in `validphys`. This function depends on the `abs_chi2_data` function, which in turn depends on `results`.

To avoid code duplication, I think it would be nice to use these functions, probably via `n3fit/vpinterface.py`.

The problem is that I really do not know how to use these functions from `validphys`, especially `results`, which depends on the `covariance_matrix` and `sqrt_covmat` arguments.

Please, could anybody help me with that or suggest an alternative way to do so? I would very much appreciate your help.
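
To illustrate the kind of interface I have in mind, here is a rough sketch of the static method. Everything in it is hypothetical: `HyperLossSketch` only stands in for the real `HyperLoss` class, the method names are made up, and it assumes the per-replica and central $\chi^2_k$ values have already been computed elsewhere (for example through `validphys`).

```python
from statistics import fmean


class HyperLossSketch:
    """Hypothetical stand-in for HyperLoss; only the proposed metric is shown."""

    @staticmethod
    def phi2_fold(chi2_replicas, chi2_central):
        """phi^2 for one fold: replica-averaged chi2 minus the central-PDF chi2."""
        return fmean(chi2_replicas) - chi2_central

    @staticmethod
    def phi2_per_fold(chi2_replicas_per_fold, chi2_central_per_fold):
        """Apply phi2_fold to every k-fold and return one value per fold."""
        return [
            HyperLossSketch.phi2_fold(reps, central)
            for reps, central in zip(chi2_replicas_per_fold, chi2_central_per_fold)
        ]


if __name__ == "__main__":
    # Toy numbers only: two folds with two replicas each.
    print(HyperLossSketch.phi2_per_fold([[1.0, 3.0], [2.0, 2.0]], [1.5, 2.0]))
```

How the per-fold values are then combined into a single hyper loss is left open here.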