Docs: tutorial for running fit with scale variation theory covmat #1110

Merged: 14 commits, Mar 15, 2021
3 changes: 3 additions & 0 deletions doc/sphinx/source/theory/theoryparamsinfo.md
@@ -1,3 +1,6 @@
```eval_rst
.. _th_parameter_info:
```
# Looking up the parameters of a theory

The parameters for all of the theories can be found in the `theory.db` file,
1 change: 1 addition & 0 deletions doc/sphinx/source/tutorials/index.rst
@@ -14,6 +14,7 @@ Running fits
./run-fit.md
./run-legacy-fit.rst
./run-iterated-fit.rst
./thcov_tutorial.rst

Analysing results
-----------------
272 changes: 272 additions & 0 deletions doc/sphinx/source/tutorials/thcov_tutorial.rst
@@ -0,0 +1,272 @@
How to include a theory covariance matrix in a fit
==================================================
:Author: Rosalyn Pearson (r.l.pearson@ed.ac.uk); contact for further information.

This section details how to include :ref:`scale variation covariance matrices (covmats) <vptheorycov-index>`
in a PDF fit. At the present time this can only be done at next-to-leading order (NLO), for which the
central theory is :ref:`theory 163 <theory-indexes>`.

First, decide which theory covmat you want
------------------------------------------
- Choose the desired point prescription from those listed :ref:`here <prescrips>`.
- Each prescription comes with a ``point_prescription`` flag to include in
  the runcard: one of ``"3 point"``, ``"5 point"``, ``"5bar point"``, ``"7 point"`` or ``"9 point"``.
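
For example, choosing the 3-point prescription amounts to a single flag inside
``theorycovmatconfig`` (a minimal sketch; the full block is given in the next section):

.. code:: yaml

    theorycovmatconfig:
      point_prescription: "3 point"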

Next, add necessary flags to the runcard
----------------------------------------
- Remember to list the required datasets using ``dataset_inputs`` (see :ref:`data_specification`).
- Specify ``metadata_group: nnpdf31_process`` in the runcard. This means that
  the grouping will be done according to the process type specified in the
  ``plotting`` files.

  .. warning::
      If ``metadata_group`` is not set to ``nnpdf31_process`` it will default to
      ``experiment``, and renormalisation scale variation correlations will be
      between experiments rather than processes. See :ref:`backwards-compatibility`
      for details.

- Add ``theorycovmatconfig`` to the runcard, as in the following code snippet:

.. code:: yaml

    ############################################################
    theory:
      theoryid: 163 # database id

    theorycovmatconfig:
      point_prescription: "3 point"
      theoryids:
        from_: scale_variation_theories
      pdf: NNPDF31_nlo_as_0118
      use_thcovmat_in_fitting: true
      use_thcovmat_in_sampling: true

    sampling_t0:
      use_t0: false

    fitting_t0:
      use_t0: true
    ############################################################

- ``pdf`` is the PDF used to generate the scale-varied predictions from which
  the theory covmat is constructed. Choose something close to the PDF you are
  trying to fit, such as a previous iteration if available.
- ``theoryids`` are necessary for the construction of the theory covmat.
  To avoid user error in entering them in the correct configuration and order,
  this is handled by the ``produce_scale_variation_theories`` action in
  `config <https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/config.py>`_,
  using the information in
  `the scalevariations module <https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/scalevariations>`_.
- The flags ``use_thcovmat_in_fitting`` and ``use_thcovmat_in_sampling`` specify
  where to use the theory covmat in the code. There are two possible places:
  the fitting (i.e. :math:`\chi^2` minimiser) and the sampling (i.e. pseudodata
  generation). The default is ``True`` for both.

  .. warning::
      Changing either of these to ``False`` will affect the fit outcome and should
      be avoided unless you know what you are doing.
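
Putting these flags together, the relevant parts of the runcard might look like the
following minimal sketch (dataset list abridged; the top-level placement of
``metadata_group`` here is an assumption, and the full example runcard is given at the
end of this tutorial):

.. code:: yaml

    dataset_inputs:
    - {dataset: NMC, frac: 0.5} # abridged: list all required datasets here

    metadata_group: nnpdf31_process # assumed top-level placement

    theorycovmatconfig:
      point_prescription: "3 point"
      theoryids:
        from_: scale_variation_theories
      pdf: NNPDF31_nlo_as_0118
      use_thcovmat_in_fitting: true
      use_thcovmat_in_sampling: true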

If you want to compare data to another fit
------------------------------------------
- Sometimes we want to compare data to another fit for validation: for example,
  we might want to compare predictions for the NLO fit with MHOUs to the known
  NNLO fit (see :ref:`vptheorycov-tests`).
- To make sure the cuts match between the two fits, edit the ``datacuts``
  section of the runcard to include the following:

.. code:: yaml

    use_cuts: fromintersection
    cuts_intersection_spec:
    - theoryid: 163
    - theoryid: 53

- This ensures that the cuts on the data are the intersection of the cuts in
  theory 53 (default NNLO) and theory 163 (central scale-variation NLO). See
  :ref:`here <theory-indexes>` for theory definitions.
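
In context, the ``datacuts`` section of the runcard then reads as follows (a minimal
sketch, reusing the settings from the full example runcard below):

.. code:: yaml

    datacuts:
      t0pdfset: 190310-tg-nlo-global # PDF set to generate t0 covmat
      use_cuts: fromintersection
      cuts_intersection_spec:
      - theoryid: 163
      - theoryid: 53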

Example runcard
---------------
The following is an example runcard for an NLO NNPDF3.1-style fit with a 3-point theory covmat.
It can be found `here <https://github.com/NNPDF/nnpdf/tree/master/validphys2/examples/theory_covariance/fit_with_thcovmat.yaml>`_.

.. code:: yaml

    #
    # Configuration file for NNPDF++
    #
    ##########################################################################################
    description: Example runcard for NLO NNPDF3.1 style fit with 3pt theory covariance matrix

    ##########################################################################################
    # frac: training fraction
    # ewk: apply ewk k-factors
    # sys: systematics treatment (see systypes)
    dataset_inputs:
    - {dataset: NMCPD, frac: 0.5}
    - {dataset: NMC, frac: 0.5}
    - {dataset: SLACP, frac: 0.5}
    - {dataset: SLACD, frac: 0.5}
    - {dataset: BCDMSP, frac: 0.5}
    - {dataset: BCDMSD, frac: 0.5}
    - {dataset: CHORUSNU, frac: 0.5}
    - {dataset: CHORUSNB, frac: 0.5}
    - {dataset: NTVNUDMN, frac: 0.5}
    - {dataset: NTVNBDMN, frac: 0.5}
    - {dataset: HERACOMBNCEM, frac: 0.5}
    - {dataset: HERACOMBNCEP460, frac: 0.5}
    - {dataset: HERACOMBNCEP575, frac: 0.5}
    - {dataset: HERACOMBNCEP820, frac: 0.5}
    - {dataset: HERACOMBNCEP920, frac: 0.5}
    - {dataset: HERACOMBCCEM, frac: 0.5}
    - {dataset: HERACOMBCCEP, frac: 0.5}
    - {dataset: HERAF2CHARM, frac: 0.5}
    - {dataset: CDFZRAP, frac: 1.0}
    - {dataset: D0ZRAP, frac: 1.0}
    - {dataset: D0WEASY, frac: 1.0}
    - {dataset: D0WMASY, frac: 1.0}
    - {dataset: ATLASWZRAP36PB, frac: 1.0}
    - {dataset: ATLASZHIGHMASS49FB, frac: 1.0}
    - {dataset: ATLASLOMASSDY11EXT, frac: 1.0}
    - {dataset: ATLASWZRAP11, frac: 0.5}
    - {dataset: ATLAS1JET11, frac: 0.5}
    - {dataset: ATLASZPT8TEVMDIST, frac: 0.5}
    - {dataset: ATLASZPT8TEVYDIST, frac: 0.5}
    - {dataset: ATLASTTBARTOT, frac: 1.0}
    - {dataset: ATLASTOPDIFF8TEVTRAPNORM, frac: 1.0}
    - {dataset: CMSWEASY840PB, frac: 1.0}
    - {dataset: CMSWMASY47FB, frac: 1.0}
    - {dataset: CMSDY2D11, frac: 0.5}
    - {dataset: CMSWMU8TEV, frac: 1.0}
    - {dataset: CMSZDIFF12, frac: 1.0, cfac: [NRM]}
    - {dataset: CMSJETS11, frac: 0.5}
    - {dataset: CMSTTBARTOT, frac: 1.0}
    - {dataset: CMSTOPDIFF8TEVTTRAPNORM, frac: 1.0}
    - {dataset: LHCBZ940PB, frac: 1.0}
    - {dataset: LHCBZEE2FB, frac: 1.0}
    - {dataset: LHCBWZMU7TEV, frac: 1.0, cfac: [NRM]}
    - {dataset: LHCBWZMU8TEV, frac: 1.0, cfac: [NRM]}

    ############################################################
    datacuts:
      t0pdfset: 190310-tg-nlo-global # PDF set to generate t0 covmat
      q2min: 13.96                   # Q2 minimum
      w2min: 12.5                    # W2 minimum
      combocuts: NNPDF31             # NNPDF3.1 final kin. cuts
      jetptcut_tev: 0                # jet pt cut for tevatron
      jetptcut_lhc: 0                # jet pt cut for lhc
      wptcut_lhc: 30.0               # Minimum pT for W pT diff distributions
      jetycut_tev: 1e30              # jet rap. cut for tevatron
      jetycut_lhc: 1e30              # jet rap. cut for lhc
      dymasscut_min: 0               # dy inv.mass. min cut
      dymasscut_max: 1e30            # dy inv.mass. max cut
      jetcfactcut: 1e30              # jet cfact. cut
      use_cuts: fromintersection
      cuts_intersection_spec:
      - theoryid: 163
      - theoryid: 53

    ############################################################
    theory:
      theoryid: 163 # database id

    theorycovmatconfig:
      point_prescription: "3 point"
      theoryids:
        from_: scale_variation_theories
      fivetheories: None
      pdf: NNPDF31_nlo_as_0118
      use_thcovmat_in_fitting: true
      use_thcovmat_in_sampling: true

    sampling_t0:
      use_t0: false

    fitting_t0:
      use_t0: true

    ############################################################
    fitting:
      seed: 65532133530 # set the seed for the random generator
      genrep: on        # on = generate MC replicas, off = use real data
      rngalgo: 0        # 0 = ranlux, 1 = cmrg, see randomgenerator.cc
      fitmethod: NGA    # Minimization algorithm
      ngen: 30000       # Maximum number of generations
      nmutants: 80      # Number of mutants for replica
      paramtype: NN
      nnodes: [2, 5, 3, 1]

      # NN23(QED) = sng=0,g=1,v=2,t3=3,ds=4,sp=5,sm=6,(pht=7)
      # EVOL(QED) = sng=0,g=1,v=2,v3=3,v8=4,t3=5,t8=6,(pht=7)
      # EVOLS(QED)= sng=0,g=1,v=2,v8=4,t3=4,t8=5,ds=6,(pht=7)
      # FLVR(QED) = g=0, u=1, ubar=2, d=3, dbar=4, s=5, sbar=6, (pht=7)
      fitbasis: NN31IC # EVOL (7), EVOLQED (8), etc.
      basis:
      # remember to change the name of the PDF according to fitbasis
      # pos: on for NN squared
      # mutsize: mutation size
      # mutprob: mutation probability
      # smallx, largex: preprocessing ranges
      - {fl: sng, pos: off, mutsize: [15], mutprob: [0.05], smallx: [1.046, 1.188], largex: [1.437, 2.716]}
      - {fl: g, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.9604, 1.23], largex: [0.08459, 6.137]}
      - {fl: v, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5656, 0.7242], largex: [1.153, 2.838]}
      - {fl: v3, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.1521, 0.5611], largex: [1.236, 2.976]}
      - {fl: v8, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5264, 0.7246], largex: [0.6919, 3.198]}
      - {fl: t3, pos: off, mutsize: [15], mutprob: [0.05], smallx: [-0.3687, 1.459], largex: [1.664, 3.373]}
      - {fl: t8, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5357, 1.267], largex: [1.433, 2.866]}
      - {fl: cp, pos: off, mutsize: [15], mutprob: [0.05], smallx: [-0.09635, 1.204], largex: [1.654, 7.456]}

    ############################################################
    stopping:
      stopmethod: LOOKBACK # Stopping method
      lbdelta: 0           # Delta for look-back stopping
      mingen: 0            # Minimum number of generations
      window: 500          # Window for moving average
      minchi2: 3.5         # Minimum chi2
      minchi2exp: 6.0      # Minimum chi2 for experiments
      nsmear: 200          # Smear for stopping
      deltasm: 200         # Delta smear for stopping
      rv: 2                # Ratio for validation stopping
      rt: 0.5              # Ratio for training stopping
      epsilon: 1e-6        # Gradient epsilon

    ############################################################
    positivity:
      posdatasets:
      - {dataset: POSF2U, poslambda: 1e6} # Positivity Lagrange Multiplier
      - {dataset: POSF2DW, poslambda: 1e6}
      - {dataset: POSF2S, poslambda: 1e6}
      - {dataset: POSFLL, poslambda: 1e6}
      - {dataset: POSDYU, poslambda: 1e10}
      - {dataset: POSDYD, poslambda: 1e10}
      - {dataset: POSDYS, poslambda: 1e10}

    ############################################################
    closuretest:
      filterseed: 0            # Random seed to be used in filtering data partitions
      fakedata: off            # on = to use FAKEPDF to generate pseudo-data
      fakepdf: MSTW2008nlo68cl # Theory input for pseudo-data
      errorsize: 1.0           # uncertainties rescaling
      fakenoise: off           # on = to add random fluctuations to pseudo-data
      rancutprob: 1.0          # Fraction of data to be included in the fit
      rancutmethod: 0          # Method to select rancutprob data fraction
      rancuttrnval: off        # 0(1) to output training(validation) chi2 in report
      printpdf4gen: off        # To print info on PDFs during minimization

    ############################################################
    lhagrid:
      nx: 150
      xmin: 1e-9
      xmed: 0.1
      xmax: 1.0
      nq: 50
      qmax: 1e5

    ############################################################
    debug: off
15 changes: 10 additions & 5 deletions doc/sphinx/source/vp/theorycov/examples.rst
@@ -14,6 +14,10 @@ You need to provide the central theory under the ``default_theory`` flag,
corresponding to :math:`(\mu_F, \mu_R) = (0,0)`,
which for NLO is theory 163.

You need to provide the required point prescription using the flag in
:ref:`this section <pointprescrips>`, e.g. ``point_prescription: "3 point"``
in the case below.
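
As a minimal sketch, these two flags might appear in the report runcard as follows
(the exact ``default_theory`` layout, with a single ``theoryid`` entry, is an
assumption based on the description above):

.. code-block:: yaml

    default_theory:
      - theoryid: 163 # assumed layout: central (0,0) scale choice

    point_prescription: "3 point"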

``dataspecs`` associates a chosen label (``speclabel``) with each of the theory
choices. This details what scale variation the theory corresponds to.

@@ -22,11 +26,12 @@ Here the cuts and PDF are taken from the central NLO scale-varied fit.
You must also list all the experiments you wish to include, along with any
relevant c-factors.

.. warning::
    In order to ensure backwards compatibility now that the structure
    of data in runcards has been updated and ``experiments`` is deprecated, you must
    also include ``metadata_group: nnpdf31_process`` in the runcards, so that the
    scale variation prescriptions are done by process rather than by experiment. See
    :ref:`backwards-compatibility` for more details.

.. code-block:: yaml
    :linenos:
5 changes: 3 additions & 2 deletions doc/sphinx/source/vp/theorycov/index.rst
@@ -3,7 +3,6 @@
The theorycovariance module
===============================


:Author: Rosalyn Pearson (r.l.pearson@ed.ac.uk)

The ``theorycovariance`` module deals with constructing, testing and
@@ -32,7 +31,9 @@ Summary
- Theoretical covariance matrices are built according to the various prescriptions
in :ref:`prescrips`.

- The prescription must be one of 3 point, 5 point, 5bar point, 7 point or 9 point. You can specify
  this using ``point_prescription: "x point"`` in the runcard. The translation of this flag
  into the relevant ``theoryids`` is handled by the ``scalevariations`` module in ``validphys``.

- As input you need theories for the relevant scale combinations which
  correspond to the prescription. This information is taken from the