-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Action to return pseudodata in n3fit #650
Conversation
There is the question of whether we want to make these things be proper vp actions. I have no strong feelings at the moment. Would need to sleep on it |
4246d2b
to
0b63e37
Compare
Since vp knows about the replicas and about the seeds it might indeed make sense that what n3fit receives is already the list of seeds. |
n3fit/src/n3fit/fit.py
Outdated
for replica_number in replica: | ||
np.random.seed(trvalseed) | ||
for i in range(replica_number): | ||
trvalseed = np.random.randint(0, pow(2, 31)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we should be using some sort of hmac instead of this old funny contortion @scarrazza @scarlehoff .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could, we do so for the replicas I think (with emphasis on the some sort). That said, it is more complication than we need but if someone wants to have a go it can be a fun short PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that now is a good time on the grounds that we don't want to ever change the random number generation after we start using it for production.
This is an interesting read, and one that raises the question of whether we should be trading some sort of random state object instead: https://numpy.org/doc/1.18/reference/random/index.html @scarlehoff Could you please remind us what is the general status of reproducible random seeds in tensorflow. btw this is somewhat connected to #77 |
Unchanged, if we were fully TF 2 the thing about the seeds should not be an issue but since not all system are still in TF 2.1 (or even TF 2.0) we are still using the compatibility library. RE the random state, ok but not so sure how to implement it (other than using it to generate seeds ofc) because there are several things that we offload, such as the replica generation (libnnpdf) or the initialization of the NN (keras). |
9ac3c5e
to
6970c94
Compare
Ok as it stands, this function i think finds the correct pseudodata, I'm gonna work more on cleaning up the output Also it is unbelievably slow, is this normal? |
b5518c9
to
b8f0387
Compare
@Zaharid @scarlehoff can you please check b8f0387 is equivalent to the previous implementation? I'm working on making this function slightly faster, but I think it does what it's supposed to do |
If the travis check passes it should be! |
a776097
to
ec7fc54
Compare
Ok @wilsonmr the Linux test passed this time, looks like a mac issue |
363f602
to
c763d69
Compare
Gonna move this function to |
7f62bc1
to
d10c793
Compare
Behold parallel code in python. This now works on the order of minutes. @scarlehoff @Zaharid ready for review. LGTM :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think upon adding new functionalities it would be good to add an example of usage (and maybe motivation and whatsnot) to the doocumentation, for instance linking the example from the docstring somewhere down https://docs.nnpdf.science/vp/api.html would be very nice so someone just has to go there and have a quick look (instead of having to swim through the automodules documentation).
validphys2/src/validphys/results.py
Outdated
with mp.Manager() as manager: | ||
L = manager.list() | ||
|
||
NPROC = mp.cpu_count() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a good idea to use the --parallel
flag from validphys to choose whether to use multiprocessing.
In any case, using all cores as default without an easy way for turning that off might have unintended consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha, I was thinking this too. Could get some angry emails from cluster admins. Does the --parallel
flag apply here though? I thought that parallelized dependencies, in either case, I could have it as a function argument if you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess not, but if you have access to the flag (don't know, @Zaharid ?) you can use it to know whether you are allowed to run parallel o no.
(and yes, I was also thinking about angry cluster admins :__ )
Ok good shout on adding to docs. Will add this in the next commit. P.s how difficult would it be to do something similar for old style NNFIT fits? |
This is the number of replicas before applying postfit
We now write the pseudodata for all fitted replicas to a dictionary which is then mapped from by using the FromMCReplica found in the header file
61c0cde
to
2dccb71
Compare
Ok @Zaharid @scarlehoff, I ran a quick 10 rep fit, of which only 1 survived postfit. I then pickle the |
7fbb332
to
9c8e338
Compare
I'm not too sure why the test is raising a Any ideas? Stack exchange said to use |
if it passes locally but not when doing the build then probably it's not installing the file correctly. I think that because it's a Line 38 in 0f1012a
another item for pickle files so that the installation installs the test file |
9c8e338
to
63fed3e
Compare
Ahh yes of course! Thank Mikey! Hopefully it passes this time round Edit: @wilsonmr actually, I guess I can add the pickle file to |
4eab452
to
ebaf0df
Compare
So far I've refactored seed generation. But the goal is to move towards #519 and close #639 along the way.
Also would it make more sense to have this action in vp or n3fit?
Very much a WIP I'll let you know when it's read for review