Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a library of process dependent options #1949

Conversation

scarlehoff
Copy link
Member

@scarlehoff scarlehoff commented Feb 19, 2024

Right now, how validphys understands the data depends on how the process is defined.

This is done in various different places:

For cuts in filters.py:

For labels in the (old) parser:

How to create the xq2map depends on the kinematic transformation:

class Kintransform(metaclass=abc.ABCMeta):

And there's somewhere also a list of process description labels.
Then there's DIS which can be DIS_NC, DIS_CC, DIS_ALL but sometimes they are considered equal sometimes they are not.
etc

That was a result of things done in several iterations and the necessity of producing results so we never had the time to sit down and put everything together in a sensible manner.

Now with the new commondata format, while there is a lot of stuff which is based on the old ways (kinematics and results transformations scattered around the code). We have an opportunity to start doing things well instead of adding yet another point of chaos.
Not only we have an opportunity but we have a necessity since right now the xQ2 plot is broken for ttbar and it there is no process type for DIS+J #1825

Note that some of the previous ones are redundant in the new format. No need for results transformations or custom labels because the person implementing the data can decide which kinematics variables to implement, plot and cut upon. The only important thing is the variables can be understood for the given process.

My proposal here is the following: create one single library of process_options.py which will collect all process-dependent options.
Newly implemented data will be using this. The only public interface of this module is the Processes enum. When a dataset is loaded, the process type will be read and check against the accepted variables for that process. If it works, it's all ok. If it doesn't, the person implementing the dataset will need to either add a way for the process to understand the new one or choose a different set of variables.

I've put a DIS example which covers a few of the situations that we would find.

I haven't put this directly on the reader because this is just a proposal I came up with.

The only other option I can think of is that we embrace the chaos but that would require either restricting new data to the same set of variables that old data used or including the necessary variables for the x-Q2 mapping into the dataset.

@scarlehoff scarlehoff marked this pull request as draft February 19, 2024 23:31
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from 8057378 to efc3d74 Compare February 20, 2024 17:48
@scarlehoff
Copy link
Member Author

Ok, this is now ready. Not sure if it will pass the test, it might be missing some str(process_type) but other than that it's ok. Tomorrow I'll add a fake process type (that will be equivalent to doing str).

The idea is that we can move forward without having to wait for people to implement the relevant stuff for every process.

@andreab1997 thanks for offering to help with this, in principle we need a process type for jets / dijets / ttbar so that the data in master is ok and it would be great if you also add one for herajet / heradijet for @t7phy's dis+j

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can already foresee that this will become a module rather then just a file ... i.e. I would move _dis_xq2map and DIS to process_options/dis.py etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually we can. The file should be treated as a module indeed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it is time 🙃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we have a lot of DIS_XYZ that are basically equivalent so it is not clear what should be its own module and that shouldn't.
There would be a lot of files with one line that seems confusing to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I wouldn't make one file for one process, but e.g. all the fully inclusive DIS and the ttbar stuff can go together

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nono, I mean all the fully inclusive DIS in one file and all the ttbar together in a separate one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, okok

right, we can separate by commondata process indeed since they should always branch from there.
I'll do that when I finish creating all other xq2 map (it should be possible to copy automatically all the ones that use k1/k2/k3)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest e.g.:

  • fully_inclusive_dis.py -> DIS,DIS_NC,DIS_CC,DIS_NCE
  • herajet.py -> HERAJET, HERADIJET
  • hqp.py -> HQP_YQ, HQP_YQQ, HQP_PTQ

if that corresponds to CD process (not sure) (names are place holders)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cd processes should be DIS, Z0, WPWM, TTBAR, JETS, DIJETS

some might be the same, but that way someone reading the commondata file knows immediately inside which file they should look (or maybe options_dis.py the name of the file is less important).
Right now, specially the ttb and vector boson ones are a mess to find.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think grouping by cd process is a good idea. And then inside DIJETS one can always do from JETS import HERAJET

@scarlehoff scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 42b3989 to b7b8424 Compare February 22, 2024 11:34
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from efc3d74 to 6c406fd Compare February 22, 2024 11:53
@scarlehoff scarlehoff mentioned this pull request Feb 22, 2024
3 tasks
@scarlehoff scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 955f054 to bb57fe2 Compare February 22, 2024 16:15
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from 6c406fd to 980f671 Compare February 22, 2024 16:17
@scarlehoff scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from bb57fe2 to 6d05ace Compare February 22, 2024 16:23
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from 980f671 to d233f17 Compare February 22, 2024 16:25
@scarlehoff scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 6d05ace to c5bee75 Compare February 23, 2024 17:37
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from d233f17 to 5773393 Compare February 23, 2024 17:37
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from 1be7e8d to 3aefd4a Compare February 29, 2024 15:49
@scarlehoff scarlehoff marked this pull request as ready for review February 29, 2024 15:53
@scarlehoff scarlehoff force-pushed the final_reader_for_new_commondata_mk2 branch from 5a5d287 to a626d1b Compare March 1, 2024 11:31
@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from d8c7f5d to 1a8bf48 Compare March 1, 2024 11:31
@scarlehoff scarlehoff added the run-fit-bot Starts fit bot from a PR. label Mar 1, 2024
Copy link

github-actions bot commented Mar 1, 2024

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff force-pushed the add_a_library_of_process_options branch from 1a8bf48 to 7e6599a Compare March 1, 2024 18:16
@scarlehoff scarlehoff removed the run-fit-bot Starts fit bot from a PR. label Mar 1, 2024
@scarlehoff
Copy link
Member Author

scarlehoff commented Mar 2, 2024

The fitbot there act as a check that old theories can still be used (the fitbot uses theory 200... it will soon be changed to theory 700). There are small changes in the fitbot (while the regression tests have not changed) because there are some dataset for which the precision of the data has changed (e.g., we had 4 digits before and now have 5), but it is only a few LHCB and maybe maybe jets, so it doesn't show up in any regression.

I will submit another fitbot just before the merge to update the reference bot.

This is a comparison to the last baseline (NNPDF40_nnlo_as_01180_qcd) where I've used exactly the same runcard (i.e., I haven't changed the names of the datasets that have been translated automatically by vp) https://vp.nnpdf.science/ClK5YFI-TjCBkzTeewuFow== (since the report is also done with new data, it might be informative to compare to a report done with old data 1)

And this one is the same fit but this time the names of the runcard have also been updated: https://vp.nnpdf.science/QaBlf8XvSmSe8UWMvzIy3g==

In both cases they are <60 replica fits.

The fits have been done with this commit 1a8bf48 which corresponds to this one 7e6599a before the rebase on top of the last batch of comments in the other branch.

@RoyStegeman
Copy link
Member

Thanks for doing this check. I can't see anything that hints at a bug

@scarlehoff scarlehoff merged commit 07cf9e1 into final_reader_for_new_commondata_mk2 Mar 3, 2024
8 checks passed
@scarlehoff scarlehoff deleted the add_a_library_of_process_options branch March 3, 2024 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants