Models feature/normalization layers #47

Open — wants to merge 16 commits into develop
Conversation

@jakob-schloer (Collaborator) commented Dec 20, 2024

Describe your changes

This PR makes it possible to switch the implementation of Linear and LayerNorm kernels in the config.

At the moment we use the torch.nn implementations for many layers in the Anemoi models, e.g. torch.nn.LayerNorm and torch.nn.Linear. This has the advantage of being available out of the box with torch and being portable across many systems (CPU, AMD and Nvidia GPUs). However, other layer implementations might be more efficient on certain hardware, or we might want to use a custom layer.

This PR adds the following block to config/model/.yaml:

  layer_kernels:
    LayerNorm:
      #_target_: "transformer_engine.pytorch.LayerNorm"
      _target_: "liger_kernel.transformers.rms_norm.LigerRMSNorm"
      #_target_: "torch.nn.LayerNorm" #the default PyTorch implementation
      _partial_: True
      #Any arguments to your chosen function go here e.g.
      #bias: False
    Linear:
      #_target_: "transformer_engine.pytorch.Linear"
      _target_: "torch.nn.Linear"
      _partial_: True

You can pass any parameters to your chosen kernel in the config file after "_partial_: True". Hydra tries to load the desired kernel in "models/encoder_processor_decoder.py". If the desired library isn't available, the code currently falls back to torch.nn.
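
As a rough illustration (not the exact code in "models/encoder_processor_decoder.py", and the fallback handling here is an assumption based on the description above), Hydra can turn each layer_kernels entry into a partially-applied layer factory:

    # Minimal sketch of instantiating the layer_kernels config with Hydra.
    from hydra.utils import instantiate
    from omegaconf import OmegaConf
    import torch.nn as nn

    cfg = OmegaConf.create(
        {
            "LayerNorm": {"_target_": "torch.nn.LayerNorm", "_partial_": True},
            "Linear": {"_target_": "torch.nn.Linear", "_partial_": True},
        }
    )

    layer_kernels = {}
    for name, kernel_cfg in cfg.items():
        try:
            # With _partial_: True, instantiate() returns a functools.partial
            # that still expects the remaining constructor arguments.
            layer_kernels[name] = instantiate(kernel_cfg)
        except Exception:
            # Assumed fallback: use the default torch.nn implementation if the
            # requested library cannot be imported.
            layer_kernels[name] = getattr(nn, name)

    layer_norm = layer_kernels["LayerNorm"](normalized_shape=128)
    linear = layer_kernels["Linear"](in_features=128, out_features=256)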

The calls to torch.nn are then replaced with

- self.layer_norm = nn.LayerNorm(normalized_shape=num_channels)
+ self.layer_norm = layer_kernels['LayerNorm'](normalized_shape=num_channels)

In the future, this syntax could be extended to replace other layers if required.
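
For context, a hedged sketch of how a module can consume the layer_kernels mapping instead of hard-coding torch.nn classes (the class name and arguments below are illustrative, not the actual Anemoi module signatures):

    import torch
    import torch.nn as nn

    class ExampleBlock(nn.Module):
        def __init__(self, num_channels: int, layer_kernels: dict) -> None:
            super().__init__()
            # The factories behave like the torch.nn classes they replace.
            self.layer_norm = layer_kernels["LayerNorm"](normalized_shape=num_channels)
            self.proj = layer_kernels["Linear"](num_channels, num_channels)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.proj(self.layer_norm(x))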

The ConditionalLayerNorm (originally implemented by @ssmmnn11) is for instance a custom normalization layer required by the ensemble-based model.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist before requesting a review

  • I have performed a self-review of my code
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation and docstrings to reflect the changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have ensured that the code is still pip-installable after the changes and runs
  • I have not introduced new dependencies in the inference portion of the model
  • I have run this on single GPU
  • I have run this on multi-GPU or multi-node
  • I have run this on LUMI (or made sure the changes work independently)
  • I have run the Benchmark Profiler against the old version of the code

@mchantry added the training, models and config labels on Dec 20, 2024