layernorm1p fix #7523
Conversation
Signed-off-by: dimapihtar <dpihtar@gmail.com>
for more information, see https://pre-commit.ci
Can you add an option to handle normalization == "layernorm1p"? If we use layernorm1p, this code still crashes.
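A minimal sketch of the handling being requested, assuming the config keys from this PR's diff (the surrounding NeMo code is simplified, and `cfg` is a hypothetical stand-in for `self.cfg`):

```python
# Hypothetical stand-in for self.cfg; keys mirror this PR's diff.
cfg = {'normalization': 'layernorm1p'}

normalization = cfg.get('normalization', 'layernorm')
layernorm_zero_centered_gamma = cfg.get('layernorm_zero_centered_gamma', False)

if normalization == 'layernorm1p':
    # layernorm1p is layernorm whose weight is stored as (1 + gamma),
    # i.e. the zero-centered-gamma variant, so map it accordingly.
    normalization = 'layernorm'
    layernorm_zero_centered_gamma = True
```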
Signed-off-by: dimapihtar <dpihtar@gmail.com>
```diff
@@ -72,6 +72,7 @@ model:
   kv_channels: null # Projection weights dimension in multi-head attention. Set to hidden_size // num_attention_heads if null
   apply_query_key_layer_scaling: False # scale Q * K^T by 1 / layer-number.
   normalization: 'layernorm' # Normalization layer to use. Options are 'layernorm', 'rmsnorm'
+  layernorm_zero_centered_gamma: True
```
I don't think we need this flag. When normalization is 'layernorm1p', we will overwrite this flag to True anyway, so we can hide it from the user to keep it consistent with the previous NeMo activation selection.
```diff
@@ -1483,10 +1483,14 @@ def build_transformer_config(self) -> TransformerConfig:
         activation_func = activation_to_func(activation)

+        normalization = self.cfg.get('normalization', 'layernorm')
+        layernorm_zero_centered_gamma = self.cfg.get('layernorm_zero_centered_gamma', 'False')
```
See the above comment. This can be set to False instead.
Can you change this line to layernorm_zero_centered_gamma = False?
@gshennvm I think it's better to leave it as it is, since we have the layernorm_zero_centered_gamma param in the Launcher configs for improved models. What do you think?
I see, ok with me. Users can customize it if needed
Why do we need both configs, layernorm1p and layernorm_zero_centered_gamma, in the launcher?
@MaximumEntropy given that we're moving to mcore, a user coming from mcore is likely not going to be exposed to the layernorm1p argument. I was thinking maybe we can have both options (i.e., if the user is more used to mcore, they can say LayerNorm + layernorm_zero_centered_gamma). Wdyt?
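A hedged sketch of the equivalence being proposed (`resolve_norm` is a hypothetical helper, not NeMo's actual code; the config keys come from this PR's diff):

```python
def resolve_norm(cfg: dict) -> tuple:
    """Hypothetical helper: resolve the effective normalization settings
    from either spelling of the config."""
    normalization = cfg.get('normalization', 'layernorm')
    zero_centered = bool(cfg.get('layernorm_zero_centered_gamma', False))
    if normalization == 'layernorm1p':
        # NeMo-style spelling implies zero-centered gamma.
        return 'layernorm', True
    return normalization, zero_centered

# The NeMo-style and mcore-style spellings resolve identically:
assert resolve_norm({'normalization': 'layernorm1p'}) == resolve_norm(
    {'normalization': 'layernorm', 'layernorm_zero_centered_gamma': True}
)
```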
Signed-off-by: dimapihtar <dpihtar@gmail.com>
* layernorm1p fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* add layernorm1p to if statement
* config changes
* gpt config changes
* remove layernorm_zero_centered_gamma from gpt config
* change line

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
What does this PR do?
Passes the layernorm_zero_centered_gamma param properly, which fixes the layernorm1p issue.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
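For illustration, a hedged sketch of selecting layernorm1p via config (only the two config keys come from this PR's diff; the nesting under model mirrors the GPT config, and the values are examples):

```python
from omegaconf import OmegaConf

# NeMo-style spelling:
cfg = OmegaConf.create({'model': {'normalization': 'layernorm1p'}})

# Equivalent mcore-style spelling:
cfg_alt = OmegaConf.create(
    {'model': {'normalization': 'layernorm', 'layernorm_zero_centered_gamma': True}}
)
```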
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines contain specific people who can review PRs to various areas.
Additional Information