Nn layers refactor #1888
Conversation
@scarlehoff @RoyStegeman Does the plan described above look good to you? And just in case it has changed since the last time I asked: do we still want to maintain the dense_per_flavour layer?
Even if we may never end up using it, I'm afraid we should keep supporting it.
Yes, it is the burden of having published the code! But indeed, we have promised backwards compatibility; that doesn't mean that every improvement will affect the entire code.
Ok, no problem, it's working now! The issue I had with the per_flavour layer came from this old TODO about the basis_size coming from the last entry of the nodes list, while it should come from the runcard. I was overwriting [...]. Also, I don't think it makes sense to add the possibility of combining these layers with dropout, and indeed it wasn't possible before, so I just raised an error in that case.
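A minimal sketch of what such a guard could look like (hypothetical function name; the real check lives in n3fit's model-generation code and may be structured differently):

```python
# Hypothetical sketch: reject the unsupported combination described above,
# i.e. the dense_per_flavour layer type together with dropout.
def check_layer_type_and_dropout(layer_type: str, dropout: float) -> None:
    """Raise if the requested layer type cannot be combined with dropout."""
    if layer_type == "dense_per_flavour" and dropout > 0:
        raise ValueError("Dropout is not compatible with the dense_per_flavour layer type")
```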
lgtm, they are all suggestions for style / loops / naming
n3fit/src/n3fit/model_gen.py (outdated)
```python
inits = [
    initializer_generator(initializer_name, replica_seed, i_layer)
    for replica_seed in replica_seeds
]
layers = [
    base_layer_selector(
        layer_type,
        kernel_initializer=init,
        units=nodes_out,
        activation=activation,
        input_shape=(nodes_in,),
        **custom_args,
    )
    for init in inits
]
```
Suggested change:

```diff
-inits = [
-    initializer_generator(initializer_name, replica_seed, i_layer)
-    for replica_seed in replica_seeds
-]
-layers = [
-    base_layer_selector(
-        layer_type,
-        kernel_initializer=init,
-        units=nodes_out,
-        activation=activation,
-        input_shape=(nodes_in,),
-        **custom_args,
-    )
-    for init in inits
-]
+for replica_seed in replica_seeds:
+    init = initializer_generator(replica_seed, i_layer)
+    layers = base_layer_selector(
+        layer_type,
+        kernel_initializer=init,
+        units=nodes_out,
+        activation=activation,
+        input_shape=(nodes_in,),
+        **custom_args,
+    )
```
I think like this it is better for readability (I think you don't need inits later, right? Otherwise of course no).
I've also removed the initializer name.
Agreed, I think I was trying to anticipate how it will look with multi dense layers, but it doesn't matter.
Actually, your revision would also need a layers = [] and a layers.append(layer), which I think makes it ugly again; how about what I have now? If you prefer the regular loop with appending, I'll change it again.
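For reference, a rough sketch of the two alternatives under discussion, using the names from the snippet above (the actual code in model_gen.py may differ in detail):

```python
# Alternative 1: list comprehension, no explicit append.
layers = [
    base_layer_selector(
        layer_type,
        kernel_initializer=initializer_generator(initializer_name, replica_seed, i_layer),
        units=nodes_out,
        activation=activation,
        input_shape=(nodes_in,),
        **custom_args,
    )
    for replica_seed in replica_seeds
]

# Alternative 2: regular loop, which needs the explicit list and append.
layers = []
for replica_seed in replica_seeds:
    init = initializer_generator(initializer_name, replica_seed, i_layer)
    layers.append(
        base_layer_selector(
            layer_type,
            kernel_initializer=init,
            units=nodes_out,
            activation=activation,
            input_shape=(nodes_in,),
            **custom_args,
        )
    )
```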
```python
# ... then apply them to the input to create the models
xs = [layer(x) for layer in list_of_pdf_layers[0]]
for layers in list_of_pdf_layers[1:]:
    if type(layers) is list:
```
Please, add a comment here.
You mean on why the if statement is needed? I added a comment; it's because dropout is shared between layers. I could also remove the if statement and replace dropout_layer with [dropout_layer for _ in range(num_replicas)] or something.
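A hypothetical sketch of that alternative (not the actual code; it just broadcasts the shared dropout layer so every entry of list_of_pdf_layers has the same list structure and the if statement becomes unnecessary):

```python
# Sketch: make the single shared dropout layer look like a per-replica list,
# so each entry of list_of_pdf_layers can be applied uniformly as a list.
list_of_pdf_layers = [
    layers if isinstance(layers, list) else [layers for _ in range(num_replicas)]
    for layers in list_of_pdf_layers
]
```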
Greetings from your nice fit 🤖! Check the report carefully, and please buy me a ☕, or better, a GPU 😉!
Looks very good!
lgtm
Please don't merge this yet!
Thanks, I won't merge yet! When is the next tag expected?
Hopefully once #1901 is merged.
And when the papers are out...
The ... seems to indicate that that will take a while? ;P Maybe it's worth creating a general waiting-for-next-tag branch or something, so that we can keep master as is but also not block further development?
…nerate_dense and generate_dense_per_flavour
Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
Force-pushed from 0475196 to 903c75b.
This PR does two things, both of which should leave everything identical:
1. Pull together the 3 functions that were responsible for generating the neural network layers:
generate_dense_network
generate_dense_per_flavour_network
generate_nn
Only the last one remains; the first two had a lot of overlap. I have also pulled the loop over replicas out of pdf_NN_layer_generator into generate_nn. This is everything up to and including this commit. This PR may be easier to follow commit by commit.
2. Reverse the order of the loops over replicas and layers.
This is the actual point: currently we do, for all replicas, for all layers, create the layer. To accommodate the upcoming multi-replica layers, where one layer contains all replicas, the order needed to change (see the sketch below).
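A schematic sketch of this reversal, using a hypothetical make_layer helper rather than the actual model_gen.py code:

```python
# Before: outer loop over replicas, inner loop over layers;
# each replica gets its own complete stack of layers.
networks = []
for replica_seed in replica_seeds:
    networks.append([make_layer(i_layer, replica_seed) for i_layer in range(n_layers)])

# After: outer loop over layers, inner loop over replicas;
# each layer position holds the per-replica layers, which is the shape
# a future multi-replica (MultiDense) layer can replace with a single object.
list_of_pdf_layers = []
for i_layer in range(n_layers):
    list_of_pdf_layers.append(
        [make_layer(i_layer, replica_seed) for replica_seed in replica_seeds]
    )
```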
Status
I think there are 3 relevant choices to test for. The dense_per_flavour layers aren't compatible with multiple replicas or with dropout (they could be, but in the first case it doesn't pass a check, and in the second case it just wasn't implemented), while the dropout and replicas choices should be independent, so we have:
default layer:
For each of these I have taken a simple runcard, run it for 100 epochs, and compared the last 3 digits of the chi2 between this branch and master (well, replica_axis_first, which is ready to be merged). (Xs mean they pass this test.)
TODO:
- generate_dense_network and generate_dense_per_flavour_network
- MultiDense layer itself is implemented.