
Avoid configuring SyncBatchNorm when not fitting #9243

Closed

four4fish opened this issue Sep 1, 2021 · 1 comment · Fixed by #11919


four4fish commented Sep 1, 2021

Proposed refactoring or deprecation

Conditionally configure SyncBatchNorm in the distributed plugins

Motivation

This issue is closely related to #6977
Carrying forward discussion from #9096
Related issue in PyTorch: pytorch/pytorch#48988

SyncBatchNorm does not sync statistics when the module is in eval mode. We can therefore decide whether to configure it based on whether we are fitting vs. validating/testing/predicting.
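
A minimal, standalone sketch of the eval-mode behavior (the toy model below is made up for illustration and is not Lightning-specific):

    import torch.nn as nn

    # A toy model containing a BatchNorm layer.
    model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

    # This is what configure_sync_batchnorm does under the hood: every
    # BatchNorm*D layer is replaced with torch.nn.SyncBatchNorm.
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

    # In train() mode, SyncBatchNorm all-reduces batch statistics across processes.
    # In eval() mode it normalizes with the local running stats and performs no
    # cross-process sync, so the conversion buys nothing for eval-only runs.
    model.eval()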

Pitch

Conditionally determine whether to configure SyncBatchNorm in the module here: https://github.com/PyTorchLightning/pytorch-lightning/blob/a451997c4da89be3b1e4f7f79b52015bd32f2ea4/pytorch_lightning/plugins/training_type/ddp.py#L384-L387

Essentially, rewrite this:

        if self.sync_batchnorm:
            self.model = self.configure_sync_batchnorm(self.model)

        # skip wrapping the model if we are not fitting as no gradients need to be exchanged
        trainer_fn = self.lightning_module.trainer.state.fn
        if trainer_fn == TrainerFn.FITTING:
            self.configure_ddp()

as this:

        # skip SyncBatchNorm conversion and DDP wrapping if we are not fitting, as no gradients need to be exchanged
        trainer_fn = self.lightning_module.trainer.state.fn
        if trainer_fn == TrainerFn.FITTING:
            if self.sync_batchnorm:
                self.model = self.configure_sync_batchnorm(self.model)
            self.configure_ddp()

Additional context
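
From the user's side, a rough sketch of the scenario this targets (MyModel and val_loader are placeholders for a user's LightningModule and dataloader; the Trainer arguments reflect the current API and may differ in later releases):

    import pytorch_lightning as pl

    model = MyModel()  # placeholder LightningModule
    trainer = pl.Trainer(gpus=2, accelerator="ddp", sync_batchnorm=True)

    # Today, validate()/test()/predict() still convert the model to SyncBatchNorm
    # even though no statistics are synced in eval mode; with this proposal the
    # conversion would only happen for trainer.fit().
    trainer.validate(model, val_loader)  # placeholder dataloader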


If you enjoy Lightning, check out our other projects! ⚡

  • Metrics: Machine learning metrics for distributed, scalable PyTorch applications.

  • Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning.

  • Bolts: Pretrained SOTA deep learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch.

  • Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers, leveraging PyTorch Lightning, Transformers, and Hydra.

@four4fish four4fish added feature Is an improvement or enhancement help wanted Open to be worked on refactor labels Sep 1, 2021
@ananthsub ananthsub added distributed Generic distributed-related topic good first issue Good for newcomers labels Sep 1, 2021
@ananthsub ananthsub changed the title Avoid converting to batchnorm when not fitting Avoid configuring SyncBatchNorm when not fitting Sep 2, 2021

tchaton commented Sep 3, 2021

Hey @four4fish,

Do you see any situations where users might want to update their BatchNorm stats on the validation dataset in a distributed way? If not, I think this is a good proposal.

Best,
T.C

@four4fish four4fish removed the help wanted Open to be worked on label Sep 8, 2021
@four4fish four4fish self-assigned this Sep 8, 2021
@tchaton tchaton added the let's do it! approved to implement label Sep 10, 2021
@edward-io edward-io self-assigned this Feb 14, 2022
@ananthsub ananthsub added this to the 1.6 milestone Feb 14, 2022
@carmocca carmocca moved this to In Progress in Frameworks Planning Feb 16, 2022
Repository owner moved this from In Progress to Done in Frameworks Planning Mar 12, 2022