
[Fix] Fix distributed training after PyTorch 1.11 removed _sync_params #1816

Merged
merged 2 commits into open-mmlab:master from teamwong111:fix-pt111-dist on Mar 26, 2022

Conversation

teamwong111 (Contributor)

Motivation

PyTorch 1.11 removed the private _sync_params method from torch.nn.parallel.DistributedDataParallel, so MMDistributedDataParallel raises an AttributeError when it tries to call it (see the linked issue at the bottom of this page). This PR restores compatibility.

Modification

Guard the forward-time parameter/buffer synchronization so that _sync_params is only called where it still exists (PyTorch < 1.11 and parrots). A sketch of the change follows.
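
For orientation, here is a minimal sketch of the kind of guard this PR adds. It is not the verbatim diff: the assumption is that torch >= 1.11 replaces the removed _sync_params() with the private _check_sync_bufs_pre_fwd() / _sync_buffers() helpers, and TORCH_VERSION / digit_version are mmcv's version utilities.

    # Hedged sketch of the compatibility guard, not the verbatim mmcv diff.
    from mmcv.utils import TORCH_VERSION, digit_version

    def train_step(self, *inputs, **kwargs):
        if ('parrots' not in TORCH_VERSION
                and digit_version(TORCH_VERSION) >= digit_version('1.11.0')):
            # torch >= 1.11: _sync_params() is gone; buffers are synchronized
            # through the newer private helpers instead (assumption).
            if self._check_sync_bufs_pre_fwd():
                self._sync_buffers()
        else:
            # torch < 1.11 and parrots keep the old attribute and method.
            if (getattr(self, 'require_forward_param_sync', False)
                    and self.require_forward_param_sync):
                self._sync_params()
        # ... scatter inputs and run the wrapped module's train_step ...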

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix potential lint issues.
  • Bug fixes are covered by unit tests; the case that caused the bug should be added to the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with some of those projects, like MMDet or MMCls.
  • CLA has been signed and all committers have signed the CLA in this PR.

@ZwwWayne (Collaborator)

We should involve downstream repos in testing this PR.

Comment on lines +52 to +53
if (getattr(self, 'require_forward_param_sync', False)
and self.require_forward_param_sync):
@zhouzaida (Collaborator), Mar 24, 2022

https://github.com/pytorch/pytorch/blob/50c90a22be3ee6a547ad0222951f2c9f50c02b50/torch/nn/parallel/distributed.py#L277

require_forward_param_sync has been an attribute of DDP since torch 1.2.0, so do we still need getattr here rather than accessing self.require_forward_param_sync directly?

Collaborator

Please @luopeichao have a look.

Contributor

It works in parrots. LGTM.

Contributor

And parrots also supports self.require_forward_param_sync.
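
For reference, the simplification suggested above would collapse the guard to a direct attribute access. This is an illustration of the review comment, not the merged code:

    # Since torch >= 1.2.0 and parrots both define the attribute, the
    # getattr fallback on this branch is arguably redundant.
    if self.require_forward_param_sync:
        self._sync_params()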

@zhouzaida (Collaborator)

val_step should also be modified.

def val_step(self, *inputs, **kwargs):
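
A sketch of the corresponding val_step change, mirroring the train_step guard shown earlier; the same assumptions apply, and this is not the verbatim merged diff:

    def val_step(self, *inputs, **kwargs):
        # Mirror of the train_step guard (sketch; assumes the identical
        # version-dependent sync logic runs before validation).
        if ('parrots' not in TORCH_VERSION
                and digit_version(TORCH_VERSION) >= digit_version('1.11.0')):
            if self._check_sync_bufs_pre_fwd():
                self._sync_buffers()
        else:
            if (getattr(self, 'require_forward_param_sync', False)
                    and self.require_forward_param_sync):
                self._sync_params()
        # ... scatter inputs and run the wrapped module's val_step ...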

zhouzaida merged commit 082dabf into open-mmlab:master on Mar 26, 2022
teamwong111 deleted the fix-pt111-dist branch on April 2, 2022 at 07:40
Successfully merging this pull request may close this issue:

AttributeError: 'MMDistributedDataParallel' object has no attribute '_sync_params'
4 participants