Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write down the versions for both eko and pineko #109

Merged
merged 4 commits into from
Jul 21, 2023
Merged

Conversation

scarlehoff
Copy link
Member

Closes #108

requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning"]
build-backend = "poetry_dynamic_versioning.backend"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I actually lost that there was also a back-end support. Thanks @scarlehoff!

@scarlehoff
Copy link
Member Author

eko doesn't allow for an unknown key in the runcard, is there a "description" or any other free form field that I can use for the pineko version?

@alecandido
Copy link
Member

Nope, there is not...

But EKO should be fully reproducible given its runcards, including the EKO version. Why do you want the Pineko version inside the EKOs?

@scarlehoff
Copy link
Member Author

scarlehoff commented Jul 20, 2023

pineko creates the runcard, so we I want to know which version of pineko created the runcard

@alecandido
Copy link
Member

pineko creates the runcard, so we I want to know which version of pineko created the runcard

But not in the EKO. That information you could write in the theory.

My point is that EKO is already decoupled from that: whatever version of Pineko created the runcard, if it's the same runcard it will be the same EKO (no matter Pineko).

@scarlehoff
Copy link
Member Author

The problem is that in general you might have a theory made with different versions of pineko* and I would like to save that information. The theory, template and eko might be the same but I'd like the operator card to save the pineko version as well.

*because of some ekos being very slow and, eg jets

@alecandido
Copy link
Member

Add a log file to the theory?

My point is that I'd like EKO and the EKOs to know nothing about Pineko, for better isolation.

@scarlehoff
Copy link
Member Author

But then you need a log file per dataset, is that what you meant?

However, I've done a small change so the information is stripped from the operator card before feeding it to eko.

@alecandido
Copy link
Member

But then you need a log file per dataset, is that what you meant?

Not exactly: I meant a single log file, and you log to it adding lines every time (i.e. you cat the log files per-dataset into a single one).

@scarlehoff
Copy link
Member Author

Not exactly: I meant a single log file, and you log to it adding lines every time (i.e. you cat the log files per-dataset into a single one).

You mean, having a sort of a operators.yaml with <dataset_name>: <pineko_version> within the data/operators/<theory number>/ folder?

@scarlehoff
Copy link
Member Author

Actually, I don't think this is a good idea.
This is another file to keep track of though and it would only be tracked "automatically" if people does things in the theories_slim repository. If you generate stuff in a cluster most likely it won't be the case.

I think that if there's no way of keeping the information of the pineko version in the operator card itself is better to drop it. I'd like to keep it though (just like the fktable keeps the information of the pineko version).

@felixhekhorn
Copy link
Contributor

But the current state is quite useless, no? since the "pineko operator card", which is now different then the "eko operator card", is never saved in the FK and so this information is completely local ... what we should do is open eko at the time of convolution, extract the (eko-) op_card and dump into the metadata ...

@alecandido
Copy link
Member

But the current state is quite useless, no? since the "pineko operator card", which is now different then the "eko operator card", is never saved in the FK and so this information is completely local ... what we should do is open eko at the time of convolution, extract the (eko-) op_card and dump into the metadata ...

The convolution is also done by Pineko, so that is yet a further Pineko version.
I think that storing the information for reproducibility is definitely relevant, but we should also pay attention to store sensible information and not too much, otherwise we might end up wondering about "which Pineko version is this one? the one which generated the runcard? or the one which did the convolution?", or something like that.

About the runcard I really believe there is no need to preserve this information through the EKO. And I would not account for the infinite amount of combinations on how you could generate the runcards: either you used a single version of Pineko for the whole process, or we should add some more free form logs, as the log file proposed above. Keeping structured information is not much more helpful for reproducibility, since you then should propagate it consistently and the propagation process itself might change with the Pineko version...

Fully automated = 100% reproducible
Everything else = manually hacked

@scarlehoff
Copy link
Member Author

About the runcard I really believe there is no need to preserve this information through the EKO

Indeed! I don't want to preserve the information beyond the operator card itself. If you look at the changes in the code, you can see that I drop it.

The point is that the operator card is a product of pineko, and as a product it should be possible to reproduce it and for that we need to know the pineko version.

The eko is a product of eko, and so it only needs to know about its own theory card and the version of eko it was produced.

The fktable is a product of pineappl, eko and pineko and so it knows all three.

@felixhekhorn
Copy link
Contributor

In the case of the eko op_card, I think this is actually useful information - also in the light of sim-FONLL we want to see the grids are convoluted with the correct FNS ...

so I still suggest to add below this line

fktable.set_key_value("eko_version", operators.metadata.version)

fktable.set_key_value("eko_operator_card", operators.operator_card)

@scarlehoff
Copy link
Member Author

But this issue deals with the operator card, not with the fktable itself.

@felixhekhorn
Copy link
Contributor

But this issue deals with the operator card, not with the fktable itself.

ok accepted, my comment is out of scope - still I'm not happy with this, because this creates a new configuration file which is specific to pineko: you can not use these cards any longer in regular eko, i.e. without going through the pineko interface

@scarlehoff
Copy link
Member Author

because this creates a new configuration file which is specific to pineko:

Yes, this is why I was asking for a description field or something.
Does eko read the card as a yaml? Maybe I can add it as a comment?

@felixhekhorn
Copy link
Contributor

Yes, this is why I was asking for a description field or something. Does eko read the card as a yaml? Maybe I can add it as a comment?

we currently use this https://github.com/NNPDF/eko/blob/a8ee0a8c90462bfa83a56bcaf3855b7fae461b38/src/eko/io/dictlike.py#L36
so, no - for the moment there is no way to add data beyond the fields of the dataclasses

@alecandido
Copy link
Member

alecandido commented Jul 20, 2023

we currently use this NNPDF/eko@a8ee0a8/src/eko/io/dictlike.py#L36
so, no - for the moment there is no way to add data beyond the fields of the dataclasses

True, but this is happening on top of the dictionary. And the dictionary is actually loaded from YAML, so comments are actually automatically lost.

@@ -153,6 +153,8 @@ def write_operator_card(pineappl_grid, default_card, card_path, tcard):

with open(card_path, "w", encoding="UTF-8") as f:
yaml.safe_dump(operators_card, f)
f.write(f"# {pineko_version=}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this was not what you wanted to write, but:

Suggested change
f.write(f"# {pineko_version=}")
f.write(f"# pineko_version={pineko_version}")

(mainly because of the =, that makes little sense without : and a width)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{pineko_version=} is shorthand for pineko_version={pineko_version}

But I can do pineko_version: {pineko_version} instead if you prefer the : (it's a comment so it doesn't really matter how is it written)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great. I forgot about this option.

The reference to the : was only because = can be used as a format specifier, but the specifiers should be separated by the identifier with the :.

It is fine as it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, it was new in py3.8, that by now it is perfectly fine (py3.7 officially died one month ago).
https://docs.python.org/3/whatsnew/3.8.html#bpo-36817-whatsnew

@scarlehoff scarlehoff merged commit cd14829 into main Jul 21, 2023
@scarlehoff scarlehoff deleted the ekoversion branch July 21, 2023 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write the git tag of the programs when version is not available
3 participants