Nn layers refactor #1888

Merged: 30 commits into develop_merge_20240119 from nn-layers-refactor, Jan 19, 2024
Conversation

@APJansen (Collaborator) commented Dec 8, 2023

This PR does two things, both of which should leave all results identical:

1. Pull together the 3 functions that were responsible for generating the neural network layers:

  • generate_dense_network
  • generate_dense_per_flavor_network
  • generate_nn

Only the last one remains; the first two had a lot of overlap.
I have also moved the loop over replicas out of pdf_NN_layer_generator into generate_nn.

This is everything up to and including this commit.
This PR may be easier to follow commit by commit.

2. Reverse the order of the loops over replicas and layers

This is the actual point: currently we do, for all replicas, for all layers, create the layer. To accommodate the upcoming multi-replica layers, where one layer contains all replicas, the order needed to change (see the sketch below).
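
A minimal sketch of the loop-order change (illustrative only: build_layer, nodes and replica_seeds are placeholder names, not the actual n3fit code):

nodes = [25, 20, 8]          # output nodes per layer
replica_seeds = [101, 102]   # one seed per replica

def build_layer(seed, i_layer):
    # stand-in for creating one Keras layer for one replica
    return f"layer {i_layer} (seed {seed})"

# Before: for all replicas, for all layers -> one complete network per replica
per_replica = [[build_layer(s, i) for i in range(len(nodes))] for s in replica_seeds]

# After: for all layers, for all replicas -> each entry groups all replica
# copies of one layer, which a future multi-replica layer can then replace
per_layer = [[build_layer(s, i) for s in replica_seeds] for i in range(len(nodes))]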

Status

I think there are 3 relevant choices to test for:

  • dropout or not
  • dense layers or dense_per_flavour layers
  • 1 or multiple replicas

The dense_per_flavour layers aren't compatible with multiple replicas or with dropout (they could be, but the first combination doesn't pass a check, and the second was simply never implemented). The dropout and replica choices should be independent, though, so we have:

default layer:

replicas \ dropout    yes    no
1                      X     X
multiple               X     X
  • dense_per_flavour, 1 replica, no dropout

For each of these I took a simple runcard, ran it for 100 epochs, and compared the last 3 digits of the chi2 between this branch and master (well, replica_axis_first, which is ready to be merged). (An X means the configuration passes this test.)

TODO:

  • Fix the dense_per_flavour case, which gives a size mismatch after training in the single-replica models
  • Either test and fix, or explicitly exclude, the combination of dense_per_flavour and dropout (it was excluded before, but no longer is)
  • Update tests which still use the old generate_dense_network and generate_dense_per_flavour_network
  • Check if a third step is possible without changing numerics: directly concatenating each layer over replicas. Actually, this isn't possible until the MultiDense layer itself is implemented.

@APJansen APJansen added Refactoring n3fit Issues and PRs related to n3fit escience labels Dec 8, 2023
@APJansen APJansen self-assigned this Dec 8, 2023
@APJansen APJansen mentioned this pull request Dec 8, 2023
@APJansen (Collaborator, Author) commented Dec 8, 2023

@scarlehoff @RoyStegeman Does the plan described above look good to you?

And just in case it has changed since the last time I asked: do we still want to maintain the dense_per_flavour layer?

@RoyStegeman (Member):

If combining generate_dense_network and generate_dense_per_flavor_network keeps the code readable, I don't have any objection; these are not functions that are used by scripts outside of executing a full fit. Reversing the loops is also fine (even necessary, as you say).

Even if we may never end up using it, I'm afraid we should keep supporting dense_per_flavour until we release 4.1 (unless @scarlehoff thinks we can get rid of it once the first of the three upcoming papers is on the arXiv? Though it's debatable whether even that would be worth waiting for). In this case it's not so interesting to be able to do parallel fits, though, so if it's convenient not to support multi replicas that would be fine.

@scarlehoff (Member):

> Even if we may never end up using it, I'm afraid we should keep supporting dense_per_flavour until we release 4.1

Yes. It is the burden of having published the code!

but indeed

> if it's convenient not to support multi replicas that would be fine.

We have promised backwards compatibility, but that doesn't mean that every improvement must affect the entire code.

Base automatically changed from replica-axis-first to master December 8, 2023 16:59
@APJansen (Collaborator, Author):

Ok no problem, working now!

The issue I had with the per_flavour layer came from this old TODO about the basis_size coming from the last entry of the nodes list, while it should come from the runcard. I was overwriting nodes[-1] with 1 in this case, but this changed the original list, which was reused for the single-replica model (see the sketch below).
Is it worth fixing this old TODO? There's a basis argument in the runcard, which is a list, so we can just take its length, but I'm not sure how to add a dependence on runcard arguments.
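
A minimal sketch of the aliasing bug described above (illustrative values, not the actual n3fit code):

nodes = [25, 20, 8]

# Buggy: this binds a second name to the *same* list object...
per_flavour_nodes = nodes
per_flavour_nodes[-1] = 1
# ...so the original is mutated too, and anything reusing `nodes`
# (here, the single-replica model) sees the wrong output size
assert nodes == [25, 20, 1]

# Fix: build a modified copy so the original list stays intact
nodes = [25, 20, 8]
per_flavour_nodes = nodes[:-1] + [1]
assert nodes == [25, 20, 8]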

Also, I don't think it makes sense to add the possibility of combining these layers with dropout, and indeed it wasn't possible before, so I just raised an error in that case.

@APJansen APJansen marked this pull request as ready for review December 11, 2023 10:26
@APJansen APJansen added the run-fit-bot Starts fit bot from a PR. label Dec 13, 2023
@scarlehoff (Member) left a comment:

lgtm, they are all suggestions for style / loops / naming

n3fit/src/n3fit/checks.py (review thread, resolved)
n3fit/src/n3fit/model_gen.py (review thread, resolved)
n3fit/src/n3fit/model_gen.py (review thread, resolved)
Comment on lines 759 to 769
inits = [
    initializer_generator(initializer_name, replica_seed, i_layer)
    for replica_seed in replica_seeds
]
layers = [
    base_layer_selector(
        layer_type,
        kernel_initializer=init,
        units=nodes_out,
        activation=activation,
        input_shape=(nodes_in,),
        **custom_args,
    )
    for init in inits
]
@scarlehoff (Member):

Suggested change (replace the two comprehensions above with a plain loop):

for replica_seed in replica_seeds:
    init = initializer_generator(replica_seed, i_layer)
    layers = base_layer_selector(
        layer_type,
        kernel_initializer=init,
        units=nodes_out,
        activation=activation,
        input_shape=(nodes_in,),
        **custom_args,
    )

I think it's more readable like this (I think you don't need inits later, right? Otherwise, of course, keep it as is).
I've also removed the initializer name.

@APJansen (Collaborator, Author):

Agreed, I think I was trying to anticipate how it will look with multi dense layers, but it doesn't matter.

@APJansen (Collaborator, Author):

Actually, your revision would need a layers = [] and a layers.append(layer) (as in the sketch below), which I think makes it ugly again. How about what I have now? If you prefer the regular loop with appending, I'll change it again.
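
For reference, a sketch of what that plain-loop version would look like with the explicit accumulator (same fragment and names as the diff above, so not self-contained):

layers = []
for replica_seed in replica_seeds:
    init = initializer_generator(replica_seed, i_layer)
    layers.append(
        base_layer_selector(
            layer_type,
            kernel_initializer=init,
            units=nodes_out,
            activation=activation,
            input_shape=(nodes_in,),
            **custom_args,
        )
    )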

n3fit/src/n3fit/model_gen.py (review thread, resolved)
# ... then apply them to the input to create the models
xs = [layer(x) for layer in list_of_pdf_layers[0]]
for layers in list_of_pdf_layers[1:]:
    if type(layers) is list:
@scarlehoff (Member):

Please, add a comment here.

@APJansen (Collaborator, Author):

You mean on why the if statement is needed? I added a comment; it's because dropout is shared between layers. I could also remove the if statement and replace dropout_layer with [dropout_layer for _ in range(num_replicas)] or something (see the sketch below).
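
A sketch of that alternative, continuing the excerpt above (it assumes each replica's output in xs is fed through its own layer; num_replicas is an illustrative name):

for layers in list_of_pdf_layers[1:]:
    if not isinstance(layers, list):
        # a single layer (e.g. dropout) shared by all replicas:
        # broadcast it to a per-replica list so no special case is needed
        layers = [layers] * num_replicas
    xs = [layer(x) for layer, x in zip(layers, xs)]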

n3fit/src/n3fit/model_gen.py (review thread, resolved)

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@Radonirinaunimi (Member) left a comment:

Looks very good!

n3fit/src/n3fit/model_gen.py (review thread, resolved)
@scarlehoff scarlehoff removed the run-fit-bot Starts fit bot from a PR. label Dec 15, 2023
@scarlehoff (Member) left a comment:

lgtm

@scarlehoff (Member):

Please don't merge this yet!

@scarlehoff scarlehoff added the waiting for next tag PR which are now completed and are waiting for a next tag label Dec 20, 2023
@APJansen (Collaborator, Author) commented Jan 8, 2024

Thanks, I won't merge yet! When is the next tag expected?

@RoyStegeman (Member):

Hopefully once #1901 is merged

@APJansen APJansen mentioned this pull request Jan 9, 2024
@scarlehoff (Member):

> Hopefully once #1901 is merged

And when the papers are out...

@APJansen (Collaborator, Author):

> And when the papers are out...

The ... seems to indicate that that will take a while? ;P

Maybe it's worth creating a general waiting-for-next-tag branch or something, so that we can keep master as is but also not block further development?

@scarlehoff scarlehoff changed the base branch from master to develop_merge_20240119 January 19, 2024 13:57
@scarlehoff scarlehoff merged commit 64379d9 into develop_merge_20240119 Jan 19, 2024
6 checks passed
@scarlehoff scarlehoff deleted the nn-layers-refactor branch January 19, 2024 14:37
Labels: escience · n3fit · Refactoring · waiting for next tag
4 participants