Refactor preprocessing #1777
Conversation
Please don't remove options that currently exist. Actually, one of the basic runcards does exactly that. I'm OK with the changes, but I would like to keep that option.
This will also remove the possibility of having a fixed functional form (the one we briefly talked about at some point) that @peterkrack is working on.
@Radonirinaunimi Why exactly? You mean only the changes in the separate prepro_join_weights branch?
Sorry if I was unclear, but my comment refers to the changes you propose here:
If I am not mistaken, having trainable weights on a flavor-by-flavor basis is exactly what is required for the fixed functional form. Therefore, if you restrict that option, this may no longer be possible.
OK, so two good reasons to keep this option. Keeping this option and having the weights as vectors will probably make things more complicated rather than simpler, so let's discard the prepro_join_weights branch then.
Thanks @APJansen, this looks good! As far as I'm concerned this can be merged after you've addressed a few minor points.
@RoyStegeman As you saw, I was having some issues with the test; thanks for fixing it. Did you understand what was going wrong? It was passing for me locally but not in the CI after incorporating your comment.
To be honest I don't know; for me this passes locally (of course, since I also generated the …)
Greetings from your nice fit 🤖!
Check the report carefully, and please buy me a ☕, or better, a GPU 😉!
Please check that I didn't make any typos in the suggestions before accepting any of them!
Re the differences: by changing to using np.testing we'll have extra information on the failures; it might have just been numerics.
Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
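For illustration, a minimal example of the extra detail `np.testing` gives over a bare `assert` (hypothetical numbers, not the PR's actual regression values):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0000001])

# A bare `assert (a == b).all()` only reports pass/fail; on a failure,
# np.testing prints the mismatching elements and the max absolute and
# relative differences, which helps decide whether it was just numerics.
np.testing.assert_allclose(a, b, rtol=1e-5)
```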
@RoyStegeman It was passing locally for me before your change, and failing after. I think it's because of tensorflow; I had this issue before. The tensorflow version I have locally doesn't handle seeds properly, i.e. initializing a layer with, say, a given seed doesn't reproduce the same values. Now I changed the seed at Juan's suggestion, which I agree is better, but it will break the tests of course, and if I update the numbers it will probably again pass for me locally but fail in the CI, so could you change them again? Sorry about that.
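As an aside, a hypothetical sketch of the kind of reproducibility check at stake here, assuming `tf.keras` (on a tensorflow version that handles seeds properly, this passes):

```python
import numpy as np
import tensorflow as tf

# Two initializers built with the same seed should produce identical values;
# on an affected tensorflow version, a check like this would fail.
init_a = tf.keras.initializers.GlorotUniform(seed=42)
init_b = tf.keras.initializers.GlorotUniform(seed=42)
np.testing.assert_allclose(init_a(shape=(3, 3)), init_b(shape=(3, 3)))
```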
No problem!
This PR simplifies the preprocessing layer, improving readability without changing any results. I think these changes are harmless and uncontroversial and can easily be merged.
Additional changes proposed
However, my goal was a slightly bigger change that may cause issues, so I put the last commit in a different branch.
What this does is unite all the preprocessing factors into a single alpha vector and a single beta vector, rather than having many individual scalars. This layer is the only place in the model where flavors are treated individually. Each parameter has flavor-dependent min and max values, which go into both the initializer and a constraint, but this can be done with vectors as well (see the sketch below).
The reason this may be controversial is that it doesn't allow setting weights as trainable on a flavor-by-flavor basis, only all alphas and/or all betas. I don't see why this would be necessary, but I think I did see a runcard that does this.
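To make the proposal concrete, here is a minimal sketch of the vectorised idea, assuming TensorFlow/Keras; the class name, the bound values, and the `x**(1 - alpha) * (1 - x)**beta` prefactor layout are illustrative, not the actual n3fit code:

```python
import numpy as np
import tensorflow as tf

# Illustrative per-flavour (min, max) bounds, one row per flavour
# (hypothetical numbers, not the real runcard values).
alpha_limits = np.array([[0.0, 1.0], [0.5, 1.5], [0.0, 2.0]], dtype="float32")
beta_limits = np.array([[1.0, 4.0], [2.0, 5.0], [1.5, 6.0]], dtype="float32")


class ClipToRange(tf.keras.constraints.Constraint):
    """Keep each component of a weight vector within its own (min, max) range."""

    def __init__(self, minimum, maximum):
        self.minimum = tf.constant(minimum)
        self.maximum = tf.constant(maximum)

    def __call__(self, w):
        return tf.clip_by_value(w, self.minimum, self.maximum)


def uniform_in_range(minimum, maximum):
    """Initializer drawing each component uniformly within its own range."""

    def init(shape, dtype=None):
        u = tf.random.uniform(shape, dtype=dtype or tf.float32)
        return minimum + u * (maximum - minimum)

    return init


class VectorPreprocessing(tf.keras.layers.Layer):
    """All alphas as one trainable vector and all betas as another,
    instead of one scalar weight per flavour."""

    def __init__(self, alpha_lims, beta_lims, **kwargs):
        super().__init__(**kwargs)
        self.alpha_lims = alpha_lims
        self.beta_lims = beta_lims

    def build(self, input_shape):
        def vector_weight(name, lims):
            lo, hi = lims[:, 0], lims[:, 1]
            return self.add_weight(
                name=name,
                shape=(len(lims),),
                initializer=uniform_in_range(lo, hi),
                constraint=ClipToRange(lo, hi),
                trainable=True,
            )

        self.alpha = vector_weight("alpha", self.alpha_lims)
        self.beta = vector_weight("beta", self.beta_lims)

    def call(self, x):
        # x: (batch, 1) -> broadcasts against the flavour vectors to (batch, n_flavours)
        return x ** (1 - self.alpha) * (1 - x) ** self.beta
```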
Timing
I did some timing tests as well, creating a model with only the preprocessing layer and training it on random targets for 10_000 epochs.
The timing script is something like:
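The actual script isn't reproduced in the thread; below is a hypothetical sketch of such a timing test, reusing the `VectorPreprocessing` example from above (all names and numbers illustrative):

```python
import time

import numpy as np
import tensorflow as tf

# Random inputs and targets; only the preprocessing layer is trained.
n_points = 100
x = np.random.uniform(1e-5, 1.0, size=(n_points, 1)).astype("float32")
y = np.random.uniform(size=(n_points, len(alpha_limits))).astype("float32")

model = tf.keras.Sequential([VectorPreprocessing(alpha_limits, beta_limits)])
model.compile(optimizer="adam", loss="mse")

start = time.perf_counter()
model.fit(x, y, epochs=10_000, verbose=0)
print(f"10_000 epochs took {time.perf_counter() - start:.1f}s")
```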