Add a library of process dependent options #1949

scarlehoff · 2024-02-19T23:30:45Z

Right now, how validphys understands the data depends on how the process is defined.

This is done in various different places:

For cuts in filters.py:

nnpdf/validphys2/src/validphys/filters.py

Line 19 in f9e8c4b

KIN_LABEL = {

For labels in the (old) parser:

nnpdf/validphys2/src/validphys/commondataparser.py

Line 17 in f9e8c4b

KINLABEL_LATEX = {

How to create the xq2map depends on the kinematic transformation:

nnpdf/validphys2/src/validphys/plotoptions/kintransforms.py

Line 72 in f9e8c4b

class Kintransform(metaclass=abc.ABCMeta):

There is also, somewhere, a list of process description labels.
Then there is DIS, which can be DIS_NC, DIS_CC or DIS_ALL; sometimes these are treated as equivalent and sometimes they are not.
And so on.

This is the result of things being done over several iterations, under pressure to produce results, so we never had the time to sit down and put everything together in a sensible manner.

Now, with the new commondata format, there is still a lot of stuff based on the old ways (kinematics and results transformations scattered around the code), but we have an opportunity to start doing things well instead of adding yet another point of chaos.
It is not only an opportunity but a necessity, since right now the xQ2 plot is broken for ttbar and there is no process type for DIS+J #1825

Note that some of the previous ones are redundant in the new format. There is no need for results transformations or custom labels, because the person implementing the data can decide which kinematic variables to implement, plot and cut upon. The only important thing is that the variables can be understood for the given process.

My proposal here is the following: create one single library, process_options.py, which will collect all process-dependent options.
Newly implemented data will use it. The only public interface of this module is the Processes enum. When a dataset is loaded, the process type will be read and checked against the accepted variables for that process. If it works, all is ok. If it doesn't, the person implementing the dataset will need to either add a way for the process to understand the new variables or choose a different set of variables.
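
To make the proposal concrete, here is a minimal sketch of how such a check could look. It is not the code in this PR: `ProcessSpec`, `check_dataset` and the accepted variable sets listed for DIS are hypothetical names and values chosen only for illustration; only the `Processes` enum name comes from the description above.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ProcessSpec:
    """Hypothetical container: a description plus the accepted variable sets."""

    description: str
    accepted_variables: tuple  # each entry is a frozenset of variable names

    def accepts(self, variables):
        """True if `variables` contains at least one complete accepted set."""
        given = set(variables)
        return any(required <= given for required in self.accepted_variables)


class Processes(Enum):
    # Illustrative entry only; the variable sets are an assumption.
    DIS = ProcessSpec(
        "Deep Inelastic Scattering",
        (frozenset({"x", "Q2"}), frozenset({"x", "y", "sqrts"})),
    )


def check_dataset(process_type, kinematic_variables):
    """Return the Processes member, or raise if the variables are not understood."""
    try:
        process = Processes[process_type]
    except KeyError:
        raise ValueError(f"Unknown process type: {process_type}") from None
    if not process.value.accepts(kinematic_variables):
        raise ValueError(
            f"{process_type} cannot interpret variables {sorted(kinematic_variables)}"
        )
    return process
```

With this layout, supporting a new set of variables for a process means adding one more `frozenset` to its entry, which is the "add a way for the process to understand the new variables" path.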

I've put a DIS example which covers a few of the situations that we would find.
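
As an illustration of what "a few of the situations" could mean for DIS, the sketch below maps a point to (x, Q2) from two different choices of variables. The function name and the dictionary keys are assumptions, not the ones used in the PR; the only physics input is the standard relation Q2 = x·y·s, valid when masses are neglected.

```python
def dis_xq2map_sketch(kinematics):
    """Return (x, Q2) from a dict of kinematic variables (hypothetical keys).

    Two situations are covered:
      * the dataset provides x and Q2 directly;
      * the dataset provides x, y and sqrts, and Q2 = x * y * s is used
        (masses neglected).
    """
    x = kinematics["x"]
    if "Q2" in kinematics:
        q2 = kinematics["Q2"]
    elif "y" in kinematics and "sqrts" in kinematics:
        q2 = x * kinematics["y"] * kinematics["sqrts"] ** 2
    else:
        raise KeyError("Need either Q2 or both y and sqrts to build the x-Q2 map")
    return x, q2
```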

I haven't put this directly on the reader because this is just a proposal I came up with.

The only other option I can think of is that we embrace the chaos but that would require either restricting new data to the same set of variables that old data used or including the necessary variables for the x-Q2 mapping into the dataset.

scarlehoff · 2024-02-20T17:57:22Z

Ok, this is now ready. Not sure if it will pass the test, it might be missing some str(process_type) but other than that it's ok. Tomorrow I'll add a fake process type (that will be equivalent to doing str).

The idea is that we can move forward without having to wait for people to implement the relevant stuff for every process.
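
One way such a "fake" process type could behave is sketched below, purely as an assumption about the idea and not the code that was added: an unknown process name is wrapped in a placeholder whose `str()` is the raw name, so code that only needs the string keeps working, while anything that needs real process knowledge fails loudly. `PlaceholderProcess` and `resolve_process` are hypothetical names.

```python
class PlaceholderProcess:
    """Hypothetical stand-in for a process type that is not implemented yet."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Equivalent to using the plain process string.
        return self.name

    def xq2map(self, kinematics):
        raise NotImplementedError(f"No x-Q2 map implemented for process {self.name}")


def resolve_process(name, known_processes):
    """Return the implemented process if there is one, else a placeholder."""
    if name in known_processes:
        return known_processes[name]
    return PlaceholderProcess(name)
```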

@andreab1997 thanks for offering to help with this, in principle we need a process type for jets / dijets / ttbar so that the data in master is ok and it would be great if you also add one for herajet / heradijet for @t7phy's dis+j

felixhekhorn · 2024-02-21T09:04:27Z

validphys2/src/validphys/process_options.py

I can already foresee that this will become a module rather than just a file ... i.e. I would move _dis_xq2map and DIS to process_options/dis.py etc.

Eventually we can. The file should be treated as a module indeed.

I'd say it is time 🙃

The problem is that we have a lot of DIS_XYZ that are basically equivalent, so it is not clear what should be its own module and what shouldn't.
There would be a lot of one-line files, which seems confusing to me.

I agree and I wouldn't make one file for one process, but e.g. all the fully inclusive DIS and the ttbar stuff can go together

nono, I mean all the fully inclusive DIS in one file and all the ttbar together in a separate one

ah, okok

right, we can separate by commondata process indeed since they should always branch from there.
I'll do that when I finish creating all the other xq2 maps (it should be possible to automatically copy all the ones that use k1/k2/k3)

I would suggest e.g.:

fully_inclusive_dis.py -> DIS,DIS_NC,DIS_CC,DIS_NCE

herajet.py -> HERAJET, HERADIJET

hqp.py -> HQP_YQ, HQP_YQQ, HQP_PTQ

if that corresponds to CD process (not sure) (names are place holders)

The cd processes should be DIS, Z0, WPWM, TTBAR, JETS, DIJETS

Some might be the same, but that way someone reading the commondata file knows immediately which file they should look in (or maybe options_dis.py; the name of the file is less important).
Right now, the ttb and vector boson ones especially are a mess to find.

I think grouping by cd process is a good idea. And then inside DIJETS one can always do from JETS import HERAJET
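
The grouping discussed in this thread can be sketched as a lookup from process type to the file that would hold it. The group names and their contents are the placeholders suggested above; the `process_options.` package prefix and the `module_for` helper are assumptions made for the example.

```python
# Placeholder grouping taken from the suggestion in this thread.
GROUPS = {
    "fully_inclusive_dis": ("DIS", "DIS_NC", "DIS_CC", "DIS_NCE"),
    "herajet": ("HERAJET", "HERADIJET"),
    "hqp": ("HQP_YQ", "HQP_YQQ", "HQP_PTQ"),
}

# Reverse lookup: process type -> name of the file that defines it.
PROCESS_TO_MODULE = {
    process: module for module, processes in GROUPS.items() for process in processes
}


def module_for(process_type):
    """Return the (hypothetical) dotted module path holding `process_type`."""
    try:
        return f"process_options.{PROCESS_TO_MODULE[process_type]}"
    except KeyError:
        raise ValueError(f"No module registered for {process_type}") from None
```

Someone reading a commondata file with process type HQP_PTQ would then know to look in `process_options/hqp.py`.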

github-actions · 2024-03-01T14:50:13Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-975988919-2024-03-01
Fit Report wrt master: https://vp.nnpdf.science/N2OnQPLcSAWW3qcGKLKzTQ==
Fit Report wrt latest stable reference: https://vp.nnpdf.science/C_igGV4QSGSkNCQM6fTgNA==
Fit Data: https://data.nnpdf.science/fits/NNBOT-975988919-2024-03-01.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

scarlehoff · 2024-03-02T08:22:17Z

The fitbot there acts as a check that old theories can still be used (the fitbot uses theory 200... it will soon be changed to theory 700). There are small changes in the fitbot (while the regression tests have not changed) because there are some datasets for which the precision of the data has changed (e.g., we had 4 digits before and now have 5), but it is only a few LHCB and maybe jets, so it doesn't show up in any regression.

I will submit another fitbot just before the merge to update the reference bot.

This is a comparison to the last baseline (NNPDF40_nnlo_as_01180_qcd) where I've used exactly the same runcard (i.e., I haven't changed the names of the datasets that have been translated automatically by vp) https://vp.nnpdf.science/ClK5YFI-TjCBkzTeewuFow== (since the report is also done with new data, it might be informative to compare to a report done with old data)

And this one is the same fit but this time the names of the runcard have also been updated: https://vp.nnpdf.science/QaBlf8XvSmSe8UWMvzIy3g==

Both are fits with fewer than 60 replicas.

The fits were done with commit 1a8bf48, which corresponds to 7e6599a before the rebase on top of the last batch of comments in the other branch.

RoyStegeman · 2024-03-02T10:18:27Z

Thanks for doing this check. I can't see anything that hints at a bug

scarlehoff added the data toolchain label Feb 19, 2024

scarlehoff requested a review from RoyStegeman February 19, 2024 23:31

scarlehoff marked this pull request as draft February 19, 2024 23:31

scarlehoff force-pushed the add_a_library_of_process_options branch from 8057378 to efc3d74 Compare February 20, 2024 17:48

scarlehoff requested a review from andreab1997 February 20, 2024 17:49

felixhekhorn reviewed Feb 21, 2024

scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 42b3989 to b7b8424 Compare February 22, 2024 11:34

scarlehoff force-pushed the add_a_library_of_process_options branch from efc3d74 to 6c406fd Compare February 22, 2024 11:53

scarlehoff mentioned this pull request Feb 22, 2024

New CommonData Reader #1678

Merged

3 tasks

scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 955f054 to bb57fe2 Compare February 22, 2024 16:15

scarlehoff force-pushed the add_a_library_of_process_options branch from 6c406fd to 980f671 Compare February 22, 2024 16:17

scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from bb57fe2 to 6d05ace Compare February 22, 2024 16:23

scarlehoff force-pushed the add_a_library_of_process_options branch from 980f671 to d233f17 Compare February 22, 2024 16:25

scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 6d05ace to c5bee75 Compare February 23, 2024 17:37

scarlehoff force-pushed the add_a_library_of_process_options branch from d233f17 to 5773393 Compare February 23, 2024 17:37

giacomomagni mentioned this pull request Feb 27, 2024

Drop yamldb as default NNPDF/pineko#156

Closed

scarlehoff force-pushed the add_a_library_of_process_options branch from 1be7e8d to 3aefd4a Compare February 29, 2024 15:49

scarlehoff marked this pull request as ready for review February 29, 2024 15:53

scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 5a5d287 to a626d1b Compare March 1, 2024 11:31

scarlehoff force-pushed the add_a_library_of_process_options branch from d8c7f5d to 1a8bf48 Compare March 1, 2024 11:31

scarlehoff added the run-fit-bot label Mar 1, 2024

scarlehoff and others added 7 commits March 1, 2024 19:15

first attempt

4e510fc

minimal working version

bbacdec

Add jets and dijets

a929df0

Complete implementation of kin map of JETS and DIJETS

58228b3

Add HQP_YQ, HQP_YQQ and HQP_PTQ

c5cbfeb

add herajet and heradijet

09d82da

fix to bcdms

1974192

scarlehoff added 3 commits March 1, 2024 19:15

fix tevatron energy... it is not gevatron...

5e0df7c

updated filters and filters tests

376ec47

3.9 compatibility

7e6599a

scarlehoff force-pushed the add_a_library_of_process_options branch from 1a8bf48 to 7e6599a Compare March 1, 2024 18:16

scarlehoff removed the run-fit-bot label Mar 1, 2024

scarlehoff merged commit 07cf9e1 into final_reader_for_new_commondata_mk2 Mar 3, 2024
8 checks passed

scarlehoff deleted the add_a_library_of_process_options branch March 3, 2024 19:56
