Memory saving output #96
Comments
at first I wanted to point out that we should use the byte representation, but then I thought maybe we could actually use this to introduce an approximation: we could use log(Q2) with a fixed precision (say 4 digits) and this way save some computations ...
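A minimal sketch of that idea, assuming the natural logarithm and reading "4 digits" as four decimal places. The helper name `q2_key` is hypothetical and not part of `eko`:

```python
import math


def q2_key(q2: float, digits: int = 4) -> str:
    """Hypothetical helper: label a Q2 value by log(Q2) at fixed precision.

    Two Q2 values whose logarithms agree to `digits` decimals get the
    same label, so an operator computed for one can be reused for the other.
    """
    return f"{math.log(q2):.{digits}f}"


# Nearby scales collapse onto the same key, distant ones do not.
same = q2_key(100.0) == q2_key(100.001)
different = q2_key(100.0) != q2_key(101.0)
print(q2_key(100.0), same, different)
```

Whether the lost accuracy is acceptable depends on how strongly the operators vary with `Q2`; the sketch only shows the keying.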
actually we could even join them only at loading time ...
I'd rather put them in a separate sub-directory (just to keep them out of the way) - or we could even add them as regular operators, since after all they are regular operators leading to exactly the threshold (in the upper scheme)
This is fine, but I wonder if
I was just thinking the other way round: the moment we can merge, we use this to extend.
I thought it was done, but maybe I'm only doing it for yadism...
This I'm not sure: it's easier, because in order to extend you just need to drop more
I was thinking to store patches and matching separately, but maybe you're right and there is no purpose. If we store them as regular operators, we should mark them as thresholds in the metadata.
we have a detailed plan for the output:
Plan
The folder will contain:
@felixhekhorn the part strictly addressing the title is already implemented, and most of the rest will be implemented as a consequence of #138. The only two elements that fall outside #138 and are contained here are:
Should we keep this issue for them?
I still consider the items relevant - maybe we can put them into a new issue (with a clearer title)?
Yes, maybe that's the way. If you open the new issue, feel free to close this one, otherwise I'll do it at some point.
Closed in favor of #193
We realized that most `eko` operations do not depend on multiple `Q2` at the same time.

The full `OperatorGrid` is a rank-5 tensor, by using only one `Q2` at a time we reduce the problem to a rank-4 tensor, with much less memory consumption.

In order to get a usable and flexible structure, a few capabilities are needed for the new structure:
- [...] `Q2` storage: it will be `{q2}.npy`
- [...] `Q2` at a time, and drop them once consumed

This new structure will be the upgraded version of the current `Output` object.

Moreover, a few more things might be implemented as related, and later used to support separate computation of `Q2` elements:
- [...] `OperatorGrid`
- [...] `OperatorGrid` to hold the reference to a new `Output` object, and store everything in there
- [...] `Q2` results (those obtained before combining with threshold operators), together with full results
- [...] `Q2` in `Output`
- [...] `thresholds.npz` (containing all the threshold elements) and `{q2}.part.npy`
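The one-`Q2`-at-a-time storage can be sketched as follows. This is only an illustration under stated assumptions: `save_operators`, `iter_operators` and `toy_operator` are hypothetical names, not the `eko` API, and the operator shape is arbitrary. Only the `{q2}.npy` naming comes from the issue text.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_operators(folder, q2grid, compute):
    """Compute one rank-4 operator per Q2 and store it as `{q2}.npy`.

    Only a single operator is held in memory at any time.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for q2 in q2grid:
        np.save(folder / f"{q2}.npy", compute(q2))


def iter_operators(folder, q2grid):
    """Lazily load operators one Q2 at a time.

    Each array becomes garbage once the consumer moves to the next Q2.
    """
    folder = Path(folder)
    for q2 in q2grid:
        yield q2, np.load(folder / f"{q2}.npy")


def toy_operator(q2):
    # Hypothetical stand-in for the real computation: a rank-4 array.
    return np.full((2, 3, 2, 3), q2)


with tempfile.TemporaryDirectory() as tmp:
    q2grid = [10.0, 100.0]
    save_operators(tmp, q2grid, toy_operator)
    shapes = [op.shape for _, op in iter_operators(tmp, q2grid)]
    stored = sorted(p.name for p in Path(tmp).iterdir())

print(shapes, stored)
```

Stacking all the arrays along a new leading axis would rebuild the rank-5 object; the generator avoids ever doing so.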
The idea is that an object supporting these features can be computed separately, in a completely independent way (1 process for thresholds, plus one for each `Q2`, for example), and then merged together.

In order to make it easier to merge and compute the final one:
- `thresholds.npz` is never removed from the saved output
- [...] `{q2}.part.npy` come later, they can always be consumed
- [...] `optimize()` or `clean()` method, to get rid of it
- [...] `combine()` method
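A possible shape for the merge step is sketched below. It rests on several assumptions that are not in the issue: that combining means contracting a partial operator with one threshold operator over the shared index pair, that the caller decides which threshold applies (the hypothetical `select_threshold` callback), and that the function is a free function rather than a method. The file names `thresholds.npz`, `{q2}.part.npy` and `{q2}.npy` are the ones used above.

```python
import tempfile
from pathlib import Path

import numpy as np


def combine(folder, select_threshold):
    """Turn every `{q2}.part.npy` into a full `{q2}.npy`.

    `thresholds.npz` is read but never removed, so partial results that
    arrive later can still be consumed by calling this again.
    """
    folder = Path(folder)
    with np.load(folder / "thresholds.npz") as thresholds:
        for part in sorted(folder.glob("*.part.npy")):
            q2 = part.name[: -len(".part.npy")]
            key = select_threshold(q2, thresholds.files)
            # Assumed composition rule: contract the inner index pair.
            full = np.einsum("ajbk,bkcl->ajcl", np.load(part), thresholds[key])
            np.save(folder / f"{q2}.npy", full)
            part.unlink()  # the partial result has been consumed


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Toy data: an identity threshold operator leaves the partial unchanged.
    identity = np.eye(6).reshape(2, 3, 2, 3)
    np.savez(folder / "thresholds.npz", mc=identity)
    partial = np.arange(36.0).reshape(2, 3, 2, 3)
    np.save(folder / "10.0.part.npy", partial)

    combine(folder, lambda q2, names: names[0])

    merged = np.load(folder / "10.0.npy")
    leftovers = sorted(p.name for p in folder.iterdir())

print(leftovers)
```

Because each partial file is handled independently, the thresholds process and the per-`Q2` processes can finish in any order; a separate cleanup step would be the place to drop `thresholds.npz` once no more partial results are expected.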