Commit
Optimizing distributed Adam when running with one work queue (#5560)
* Dist Adam constructs a single param bucket for each GPT layer
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Synchronize dist Adam reduce-scatters before launching model-parallel all-reduces
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure per-layer dist Adam buckets for BERT and T5
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove unused variables
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure GPT with one dist Adam bucket per virtual pipeline stage
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Configure BERT with one dist Adam bucket per virtual pipeline stage
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update Apex commit in Dockerfile
  Need recent updates to Apex distributed Adam optimizer.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove logic for per-virtual-pipeline distopt buckets from T5
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
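
The per-layer bucketing idea in the first bullet can be illustrated with a small, library-free sketch. This is not the code from this commit and does not use the Apex or NeMo APIs: the function name `bucket_params_by_layer`, the `.layers.<index>.` naming pattern, and the sample parameter names are all assumptions made for illustration. It only shows how parameters might be grouped so that each transformer layer ends up in its own bucket, with everything else collected into one shared bucket.

```python
import re
from collections import OrderedDict


def bucket_params_by_layer(param_names):
    """Group parameter names into one bucket per layer (hypothetical helper).

    Assumes layer parameters contain ".layers.<index>." in their names.
    Parameters that do not match go into a single shared bucket.
    """
    layer_pattern = re.compile(r"\.layers\.(\d+)\.")
    buckets = OrderedDict()
    for name in param_names:
        match = layer_pattern.search(name)
        key = f"layer_{int(match.group(1))}" if match else "other"
        buckets.setdefault(key, []).append(name)
    # Buckets are returned in order of first appearance.
    return list(buckets.values())


# Illustrative parameter names, not taken from any real model.
names = [
    "embedding.word_embeddings.weight",
    "encoder.layers.0.attention.weight",
    "encoder.layers.0.mlp.weight",
    "encoder.layers.1.attention.weight",
    "encoder.layers.1.mlp.weight",
    "encoder.final_layernorm.weight",
]
buckets = bucket_params_by_layer(names)
for bucket in buckets:
    print(bucket)
```

In a real distributed optimizer, each such group would presumably be handed to the optimizer as one unit so that its gradient reduce-scatter can be issued once the layer's backward pass completes; how the commit wires this into the Apex optimizer is not shown here.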
Showing 5 changed files with 97 additions and 12 deletions.