This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Fix error with using GPT2 + DistributedDataParallel #3207

Merged 3 commits from gpt2_distributed into master on Oct 20, 2020

Conversation

@moyapchen (Contributor) commented Oct 20, 2020

Patch description

DistributedDataParallel wraps the model and adds an extra `module` name to the module path. Fix this with an if check.
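
For context, DistributedDataParallel keeps the wrapped network on its `.module` attribute, so every submodule and parameter path picks up an extra `module` component. A minimal sketch of the kind of if check described here (the helper name `unwrap_ddp` is illustrative, not ParlAI's actual code):

```
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel


def unwrap_ddp(model: nn.Module) -> nn.Module:
    """Return the underlying network if `model` is wrapped in DistributedDataParallel.

    DDP stores the original network on its `.module` attribute, so attribute
    and state-dict paths gain an extra `module` component; checking for the
    wrapper first keeps path-based lookups working both with and without
    distributed training.
    """
    if isinstance(model, DistributedDataParallel):
        return model.module
    return model
```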

Testing steps

Tested by running GPT2 training with DistributedDataParallel.
Also ran

```
pytest test_gpt2.py::TestDistributed::test_multitask_distributed
```

both with and without the change; the test fails without the change and passes with it.
tests/nightly/gpu/test_gpt2.py (review comments resolved)
@klshuster (Contributor) left a comment


Awesome! As a note to @stephenroller: I standardized the distributed_train_model command and moved it to testing_utils in #2371, so once that lands we can update this to use that as well.

@moyapchen merged commit e7b0e8d into master on Oct 20, 2020
@moyapchen deleted the gpt2_distributed branch on October 20, 2020 at 20:25