deepspeed-chat: support explicit configuration of dropout #746

Merged
tjruwase merged 2 commits into deepspeedai:master from mosheisland:3_dropout
Oct 3, 2023

@mosheisland
Contributor

Currently, only the disable_dropout configuration is supported. However, some models (e.g. Bloom) default to dropout=0 in their model config, so a disable-only option cannot turn dropout on for them. This PR therefore adds support for explicit dropout configuration and updates the existing training scripts accordingly.

Change-Id: I5ee96a77ca2b58d9787573a48009e2af36a270b0
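
To illustrate the idea, here is a minimal sketch of explicit dropout configuration. It is not the code from this PR: the helper name `configure_dropout`, the `DROPOUT_KEYS` list, and the `SimpleNamespace` stand-in for a model config are all assumptions made for the example. The sketch overrides every dropout-like field that the config exposes and leaves the model defaults alone when no value is given.

```python
from types import SimpleNamespace

# Hypothetical list of dropout-related attribute names. Real model configs
# may use different or additional names.
DROPOUT_KEYS = ("dropout", "attention_dropout", "hidden_dropout",
                "activation_dropout")


def configure_dropout(model_config, dropout=None):
    """Set each known dropout field on model_config to `dropout`.

    Passing dropout=None keeps the model's own defaults.
    Returns the names of the fields that were overridden.
    """
    if dropout is None:
        return []
    if not 0.0 <= dropout < 1.0:
        raise ValueError(f"dropout must be in [0, 1), got {dropout}")
    changed = []
    for key in DROPOUT_KEYS:
        # Only touch fields the config actually defines.
        if hasattr(model_config, key):
            setattr(model_config, key, dropout)
            changed.append(key)
    return changed


# Stand-in for a config whose dropout fields default to 0 (as described
# for Bloom above); a real run would load the model's actual config.
config = SimpleNamespace(hidden_dropout=0.0, attention_dropout=0.0,
                         vocab_size=1000)
changed = configure_dropout(config, dropout=0.1)
print(changed, config.hidden_dropout, config.attention_dropout)
```

With an explicit value, a model whose config defaults to zero dropout can still be trained with dropout enabled, and passing 0.0 covers the old disable-only behavior.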

@mosheisland
Contributor Author

@microsoft-github-policy-service agree company="Intel"

@mosheisland
Contributor Author

The formatting error is not due to this commit.
"applications/DeepSpeed-Chat/training/utils/ds_utils.py:6:1: F401 'torch' imported but unused"

@tjruwase
Contributor

tjruwase commented Oct 3, 2023

The formatting error is not due to this commit. "applications/DeepSpeed-Chat/training/utils/ds_utils.py:6:1: F401 'torch' imported but unused"

Please check again. I think there is a different error now.

Currently, only disable_dropout configuration is supported.
However, some models (e.g. Bloom) have a default of dropout=0 in model config.
Therefore, modify to support explicit dropout configuration.
Also, update accordingly existing training scripts.

Change-Id: I5ee96a77ca2b58d9787573a48009e2af36a270b0
Signed-off-by: Moshe Island <misland@habana.ai>
tjruwase merged commit 4bf1924 into deepspeedai:master on Oct 3, 2023
mosheisland deleted the 3_dropout branch on October 4, 2023 07:00
hwchen2017 pushed a commit that referenced this pull request Jun 8, 2025
Signed-off-by: Moshe Island <misland@habana.ai>
Co-authored-by: Moshe Island <misland@habana.ai>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>