Eko from theory #1601
Conversation
Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
@felixhekhorn any idea to get the memory consumption of the convolution down? I guess in this case it is ok even if it is a bit slower.
I get the impression a lot of the things here should go in loader.py and be generally accessible from vp as providers.
This exact problem should be addressed here: NNPDF/eko#105, which allows loading a single evolution operator (for a fixed final scale) at a time, instead of loading all of them right at the beginning (as is done at the moment). (Unfortunately @alecandido is quite busy at the moment, so there is not much progress, and we want to delay this until the final acceptance of the eko paper.)
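For illustration, a minimal sketch of the idea behind loading one operator at a time instead of the whole dictionary. The `load_single_operator` helper and the index convention in the contraction are assumptions made for this sketch, not the actual eko API:

```python
import numpy as np

def convolute_lazily(eko_path, initial_pdf, final_q2_values, load_single_operator):
    """Convolute the initial-scale PDF with one evolution operator at a time,
    so only a single operator is ever kept in memory."""
    evolved = {}
    for q2 in final_q2_values:
        operator = load_single_operator(eko_path, q2)  # hypothetical per-scale loader
        # contract the operator indices with the initial-scale PDF (convention assumed)
        evolved[q2] = np.einsum("ajbk,bk->aj", operator, initial_pdf)
        del operator  # drop it before loading the next one
    return evolved
```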
We can have later on a validphys provider that provides the
I'm confused, how would this address the problem? Here we only have one initial scale but many final scales, so it is the opposite situation. (Aren't there any other tricks that we can play? I believe this might only be going beyond the limit of 7 GB by a bit, so any reduction would help.)
I believe it is what you want: say you have two final scales 10 and 100
Then for the evolution from 1.65 to 1e5 with 50 steps of Q we have in memory all 50 operators at any given time?
In current master, yes; after NNPDF/eko#105 becomes available, no.
I guess the difference is that here we are already using things like the loader, albeit in a cumbersome way. So since it is being worked on, we might as well do it nicely.
Implementing things inside validphys and using things from validphys are two different things. In the particular case you mention (the loader, for the theory) it is not even clear how it should be used (see the chicken-and-egg problem in the other PR), since it would try to download a theory that hasn't been built yet!
@andreab1997 @felixhekhorn @alecandido I've noticed it does go beyond 7 GB (getting all the way to 13 GB on my computer), but only for a few moments, and it quickly goes back to 5 GB, where it stays. Do you guys have any idea why this peak could be produced? Maybe there is a way to remove that peak easily.
With yadism I knew: there was a point at which the grid was dumped to PineAPPL. But since the operation is performed by PineAPPLpy, the cross sections are first copied in memory into the new data structure, then dumped. I guess something similar might happen for EKO, most likely with YAML. It will be solved by the usual NNPDF/eko#105.
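To illustrate the doubling (a hedged sketch, not the actual eko or PineAPPLpy code): converting every operator to a plain-Python structure before serialising keeps two full copies alive at once, whereas dumping per operator bounds the extra memory to a single operator.

```python
import yaml

def dump_all_at_once(operators, path):
    # Every array is converted to nested lists first, so a second full copy
    # of all operators exists in memory until the dump finishes.
    payload = {q2: op.tolist() for q2, op in operators.items()}
    with open(path, "w") as stream:
        yaml.safe_dump(payload, stream)

def dump_one_by_one(operators, directory):
    # Serialise each operator separately: at most one extra copy at a time.
    for q2, op in operators.items():
        with open(f"{directory}/operator_{q2}.yaml", "w") as stream:
            yaml.safe_dump({"q2": q2, "operator": op.tolist()}, stream)
```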
Ok, as soon as the computation of the operators for all theories finishes I'll try installing that branch to check whether the spike disappears. That branch is only waiting for the paper publication, right?
Nope, it is definitely more complicated. (The new data structure is self-managing an EKO partially on disk, i.e. it can offload operators as soon as they are computed. But the computation part is still using the huge dictionary with all the operators in memory...)
You can track the full progress in NNPDF/eko#138. However, the moment we merge that one, we will immediately write the new "runner". Most likely it will not take too long, since most of the operations will be exactly the same; we just need to save each computed piece right after computation :)
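A rough sketch of that planned behaviour, where `compute_operator` and `write_operator` are hypothetical stand-ins rather than the eko interface: each operator is written to disk right after it is computed, so only one of them is ever held in memory.

```python
def run_and_offload(final_q2_values, compute_operator, write_operator):
    """Sketch of the idea behind the new runner: offload each operator
    as soon as it is computed instead of accumulating a big dictionary."""
    for q2 in final_q2_values:
        operator = compute_operator(q2)  # heavy object, one at a time
        write_operator(q2, operator)     # persist it to disk immediately
        del operator                     # drop it before computing the next one
```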
What do you mean by runner? I wonder whether there's a way of rewriting this function so that it is a bit less heavy in memory. If not, we just wait for NNPDF/eko#138.
Exactly, that's the very function we're trying to replace; and to be more precise, we want to avoid
I'm afraid we need the full PR (and as you can see that one is highly non-trivial)
It's the place where the actual computation happens (meaning where everything is glued together). However, I guess in your case your concern is more about what is currently called
You are right, the spike is there because it is (most likely) doubling the representation. But the idea is that you'd never worry about the doubling if the object you want to save is small enough in the first place. The current "runner" (the object running the full computation and storage) first computes all the operators, and then saves them all at once.
No, the spike seems to happen (at least on montblanc) during the direct call to nnpdf/n3fit/src/evolven3fit/evolve.py, Line 95 in fd0e203.
Ok, sorry. Then it's just the complementary problem. But, actually, if you call Can you try to explicitly erase the stream? (just manually modify the package in your environment)
They are, once the function is finished. It is really during the function lifetime that the spike occurs. I'll play around with it once all ekos are done. Of course, it might well be a giant red herring since I don't know what actually killed the CI, but I think there's a good chance it was this.
Nope, I mean they should be dropped during function execution: there is a single variable for all the streams, so before loading the new one, the former should be out of scope, and thus dropped.
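A toy illustration of that point (file names and the loop are purely illustrative, not the actual package code): because the same variable is rebound on every iteration, the previous stream loses its last reference before the next file is opened.

```python
import yaml

def load_members(paths):
    loaded = []
    for path in paths:
        # `stream` is rebound on every iteration, so the previous file object
        # is already out of scope (and can be collected) before the next load.
        with open(path) as stream:
            loaded.append(yaml.safe_load(stream))
    return loaded
```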
While this cannot be used as the default evolution code for the PDF, I think it would still be good to have it, so I would merge this into the other branch (because it works) and then try to 1. keep the other as close to master as possible, 2. maybe even change the name to
Ok, I am merging this. Thank you!
We want to allow `evolven3fit` to look inside the relevant theory and, if possible, load the eko needed for the evolution from there. This should actually be the default behaviour.
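A hedged sketch of that default behaviour; the `eko.tar` file name and the loader/compute helpers are assumptions for illustration, not the actual `evolven3fit` interface:

```python
from pathlib import Path

def get_eko_for_theory(theory_path: Path, load_eko, compute_eko):
    """Prefer a precomputed eko shipped with the theory; compute only as a fallback."""
    eko_path = theory_path / "eko.tar"   # assumed location inside the theory folder
    if eko_path.exists():
        return load_eko(eko_path)        # default: reuse the operator from the theory
    return compute_eko(theory_path)      # fallback: compute it from the theory card
```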