
Commit

Merge branch 'master' into refactor/loops/private-meth-fit
carmocca authored Sep 15, 2021
2 parents d1b6ab1 + 200ed9e commit ef64bd7
Showing 38 changed files with 444 additions and 349 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/code_improvement.md
@@ -2,7 +2,7 @@
name: Code improvement
about: Suggest a code improvement, i.e. refactoring, deprecation, etc.
title: ''
labels: enhancement, help wanted, refactors / code health
labels: refactors / code health
assignees: ''
---

2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/documentation.md
@@ -2,7 +2,7 @@
name: Typos and doc fixes
about: Typos and doc fixes
title: ''
labels: typo, documentation
labels: documentation
assignees: ''
---

2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature_request.md
@@ -2,7 +2,7 @@
name: Feature request
about: Suggest an idea for this project
title: ''
labels: enhancement, help wanted
labels: enhancement
assignees: ''
---

74 changes: 33 additions & 41 deletions CHANGELOG.md
@@ -76,7 +76,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
* Refactored `TrainingBatchLoop` and extracted `OptimizerLoop`, splitting off automatic optimization into its own loop ([#9191](https://github.com/PyTorchLightning/pytorch-lightning/pull/9191))
* Removed `TrainingBatchLoop.backward()`; manual optimization now calls directly into `Accelerator.backward()` and automatic optimization handles backward in new `OptimizerLoop` ([#9265](https://github.com/PyTorchLightning/pytorch-lightning/pull/9265))
* Extracted `ManualOptimization` logic from `TrainingBatchLoop` into its own separate loop class ([#9266](https://github.com/PyTorchLightning/pytorch-lightning/pull/9266))
* Added `OutputResult` and `ManualResult` classes ([#9437](https://github.com/PyTorchLightning/pytorch-lightning/pull/9437))
* Added `OutputResult` and `ManualResult` classes ([#9437](https://github.com/PyTorchLightning/pytorch-lightning/pull/9437), [#9424](https://github.com/PyTorchLightning/pytorch-lightning/pull/9424))
* Marked `OptimizerLoop.backward` as protected ([#9514](https://github.com/PyTorchLightning/pytorch-lightning/pull/9514))
* Marked `FitLoop.should_accumulate` as protected ([#9515](https://github.com/PyTorchLightning/pytorch-lightning/pull/9515))


@@ -126,6 +127,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Added `ModelSummary` callback ([#9344](https://github.com/PyTorchLightning/pytorch-lightning/pull/9344))


- Added `PL_RECONCILE_PROCESS` environment variable to enable process reconciliation regardless of cluster environment settings ([#9389](https://github.com/PyTorchLightning/pytorch-lightning/pull/9389))


### Changed

- `pytorch_lightning.loggers.neptune.NeptuneLogger` is now consistent with new [neptune-client](https://github.com/neptune-ai/neptune-client) API ([#6867](https://github.com/PyTorchLightning/pytorch-lightning/pull/6867)).
@@ -186,9 +190,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Executing the `optimizer_closure` is now required when overriding the `optimizer_step` hook ([#9360](https://github.com/PyTorchLightning/pytorch-lightning/pull/9360))


- Pass init args to ShardedDataParallel ([#9483](https://github.com/PyTorchLightning/pytorch-lightning/pull/9483))


### Deprecated

- Deprecated `LightningModule.summarize()` in favor of `pytorch_lightning.utilities.model_summary.summarize()`
@@ -311,66 +312,60 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Removed deprecated properties `DeepSpeedPlugin.cpu_offload*` in favor of `offload_optimizer`, `offload_parameters` and `pin_memory` ([#9244](https://github.com/PyTorchLightning/pytorch-lightning/pull/9244))


- Removed deprecation warnings being called for `on_{task}_dataloader` ([#9279](https://github.com/PyTorchLightning/pytorch-lightning/pull/9279))


### Fixed

- Fixed save/load/resume from checkpoint for DeepSpeed Plugin (
[#8397](https://github.com/PyTorchLightning/pytorch-lightning/pull/8397),
[#8644](https://github.com/PyTorchLightning/pytorch-lightning/pull/8644),
[#8627](https://github.com/PyTorchLightning/pytorch-lightning/pull/8627))


- Fixed `EarlyStopping` running on train epoch end when `check_val_every_n_epoch>1` is set ([#9156](https://github.com/PyTorchLightning/pytorch-lightning/pull/9156))


- Fixed an issue with logger outputs not being finalized correctly after prediction runs ([#8685](https://github.com/PyTorchLightning/pytorch-lightning/pull/8685))


- Fixed the Apex and DeepSpeed plugin closure running after the `on_before_optimizer_step` hook ([#9288](https://github.com/PyTorchLightning/pytorch-lightning/issues/9288))


- Fixed the Native AMP plugin closure not running with manual optimization ([#9288](https://github.com/PyTorchLightning/pytorch-lightning/issues/9288))


- Fixed bug where data-loading functions where not getting the correct running stage passed ([#8858](https://github.com/PyTorchLightning/pytorch-lightning/pull/8858))
- Fixed `move_metrics_to_cpu` moving the loss on cpu while training on device ([#9308](https://github.com/PyTorchLightning/pytorch-lightning/pull/9308))


- Fixed intra-epoch evaluation outputs staying in memory when the respective `*_epoch_end` hook wasn't overridden ([#9261](https://github.com/PyTorchLightning/pytorch-lightning/pull/9261))
- Fixed incorrect main progress bar indicator when resuming training mid-epoch ([#9310](https://github.com/PyTorchLightning/pytorch-lightning/pull/9310))


- Fixed error handling in DDP process reconciliation when `_sync_dir` was not initialized ([#9267](https://github.com/PyTorchLightning/pytorch-lightning/pull/9267))
- Fixed freeing datafetchers during teardown ([#9387](https://github.com/PyTorchLightning/pytorch-lightning/pull/9387))


- Fixed PyTorch Profiler not enabled for manual optimization ([#9316](https://github.com/PyTorchLightning/pytorch-lightning/pull/9316))
- Fixed bug where the training step output needed to be `deepcopy`-ed ([#9349](https://github.com/PyTorchLightning/pytorch-lightning/pull/9349))


- Fixed inspection of other args when a container is specified in `save_hyperparameters` ([#9125](https://github.com/PyTorchLightning/pytorch-lightning/pull/9125))
- Fixed freeing data iterators in loop `on_run_end` ([#9386](https://github.com/PyTorchLightning/pytorch-lightning/pull/9386))


- Fixed `move_metrics_to_cpu` moving the loss on cpu while training on device ([#9308](https://github.com/PyTorchLightning/pytorch-lightning/pull/9308))

- Fixed `BasePredictionWriter` not returning the batch_indices in a non-distributed setting ([#9432](https://github.com/PyTorchLightning/pytorch-lightning/pull/9432))

- Fixed incorrect main progress bar indicator when resuming training mid-epoch ([#9310](https://github.com/PyTorchLightning/pytorch-lightning/pull/9310))

## [1.4.7] - 2021-09-14

- Fixed logging of nan parameters ([#9364](https://github.com/PyTorchLightning/pytorch-lightning/pull/9364))


- Fixed `replace_sampler` missing the batch size under specific conditions ([#9367](https://github.com/PyTorchLightning/pytorch-lightning/pull/9367))
- Pass init args to ShardedDataParallel ([#9483](https://github.com/PyTorchLightning/pytorch-lightning/pull/9483))
- Fixed collision of user argument when using ShardedDDP ([#9512](https://github.com/PyTorchLightning/pytorch-lightning/pull/9512))
- Fixed DeepSpeed crash for RNNs ([#9489](https://github.com/PyTorchLightning/pytorch-lightning/pull/9489))


- Fixed bug where the training step output needed to be `deepcopy`-ed ([#9349](https://github.com/PyTorchLightning/pytorch-lightning/pull/9349))


- Fixed freeing data iterators in loop `on_run_end` ([#9386](https://github.com/PyTorchLightning/pytorch-lightning/pull/9386))

## [1.4.6] - 2021-09-07

- Fixed `BasePredictionWriter` not returning the batch_indices in a non-distributed setting ([#9432](https://github.com/PyTorchLightning/pytorch-lightning/pull/9432))
- Fixed an issues with export to ONNX format when a model has multiple inputs ([#8800](https://github.com/PyTorchLightning/pytorch-lightning/pull/8800))
- Removed deprecation warnings being called for `on_{task}_dataloader` ([#9279](https://github.com/PyTorchLightning/pytorch-lightning/pull/9279))
- Fixed save/load/resume from checkpoint for DeepSpeed Plugin (
[#8397](https://github.com/PyTorchLightning/pytorch-lightning/pull/8397),
[#8644](https://github.com/PyTorchLightning/pytorch-lightning/pull/8644),
[#8627](https://github.com/PyTorchLightning/pytorch-lightning/pull/8627))
- Fixed `EarlyStopping` running on train epoch end when `check_val_every_n_epoch>1` is set ([#9156](https://github.com/PyTorchLightning/pytorch-lightning/pull/9156))
- Fixed an issue with logger outputs not being finalized correctly after prediction runs ([#8333](https://github.com/PyTorchLightning/pytorch-lightning/issues/8333))
- Fixed the Apex and DeepSpeed plugin closure running after the `on_before_optimizer_step` hook ([#9288](https://github.com/PyTorchLightning/pytorch-lightning/issues/9288))
- Fixed the Native AMP plugin closure not running with manual optimization ([#9288](https://github.com/PyTorchLightning/pytorch-lightning/issues/9288))
- Fixed bug where data-loading functions where not getting the correct running stage passed ([#8858](https://github.com/PyTorchLightning/pytorch-lightning/pull/8858))
- Fixed intra-epoch evaluation outputs staying in memory when the respective `*_epoch_end` hook wasn't overridden ([#9261](https://github.com/PyTorchLightning/pytorch-lightning/pull/9261))
- Fixed error handling in DDP process reconciliation when `_sync_dir` was not initialized ([#9267](https://github.com/PyTorchLightning/pytorch-lightning/pull/9267))
- Fixed PyTorch Profiler not enabled for manual optimization ([#9316](https://github.com/PyTorchLightning/pytorch-lightning/pull/9316))
- Fixed inspection of other args when a container is specified in `save_hyperparameters` ([#9125](https://github.com/PyTorchLightning/pytorch-lightning/pull/9125))
- Fixed signature of `Timer.on_train_epoch_end` and `StochasticWeightAveraging.on_train_epoch_end` to prevent unwanted deprecation warnings ([#9347](https://github.com/PyTorchLightning/pytorch-lightning/pull/9347))


- Fixed collision of user argument when using ShardedDDP ([#9512](https://github.com/PyTorchLightning/pytorch-lightning/pull/9512))
- Fixed error reporting in DDP process reconciliation when processes are launched by an external agent ([#9389](https://github.com/PyTorchLightning/pytorch-lightning/pull/9389))


## [1.4.5] - 2021-08-31
@@ -420,9 +415,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Fixed `accelerator=ddp` choice for CPU ([#8645](https://github.com/PyTorchLightning/pytorch-lightning/pull/8645))


- Fixed an issues with export to ONNX format when a model has multiple inputs ([#8800](https://github.com/PyTorchLightning/pytorch-lightning/pull/8800))


## [1.4.0] - 2021-07-27

### Added
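For context on the CHANGELOG entry above stating that executing the `optimizer_closure` is now required when overriding the `optimizer_step` hook ([#9360]): below is a minimal, hedged sketch of a compliant override. The class name, warm-up condition, and base learning rate are illustrative, and the keyword tail of the signature follows the 1.4.x documentation rather than this diff.

```python
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    # Sketch only: when `optimizer_step` is overridden, the closure must be executed,
    # because it wraps `training_step`, `zero_grad` and `backward`. Handing it to
    # `optimizer.step(closure=...)` satisfies that requirement.
    def optimizer_step(
        self,
        epoch,
        batch_idx,
        optimizer,
        optimizer_idx,
        optimizer_closure,
        on_tpu=False,
        using_native_amp=False,
        using_lbfgs=False,
    ):
        # illustrative learning-rate warm-up over the first 500 steps
        if self.trainer.global_step < 500:
            scale = min(1.0, float(self.trainer.global_step + 1) / 500.0)
            for pg in optimizer.param_groups:
                pg["lr"] = scale * 0.001  # 0.001 is an assumed base learning rate
        optimizer.step(closure=optimizer_closure)
```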
Binary file modified docs/source/_static/images/logo.png
1 change: 1 addition & 0 deletions pl_examples/basic_examples/profiler_example.py
@@ -35,6 +35,7 @@
from pytorch_lightning.utilities.cli import LightningCLI

DEFAULT_CMD_LINE = (
"fit",
"--trainer.max_epochs=1",
"--trainer.limit_train_batches=15",
"--trainer.limit_val_batches=15",
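The `"fit"` token added to `DEFAULT_CMD_LINE` matches the subcommand-based `LightningCLI`. The snippet below is only a hedged illustration of the resulting command shape; the script name and the trimmed argument list are assumptions, not part of the diff.

```python
# The default argument tuple now starts with the subcommand; on a shell this is
# roughly equivalent to:
#   python profiler_example.py fit --trainer.max_epochs=1 --trainer.limit_train_batches=15
DEFAULT_CMD_LINE = (
    "fit",
    "--trainer.max_epochs=1",
    "--trainer.limit_train_batches=15",
    "--trainer.limit_val_batches=15",
)
print("python profiler_example.py " + " ".join(DEFAULT_CMD_LINE))
```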
1 change: 1 addition & 0 deletions pyproject.toml
@@ -65,6 +65,7 @@ module = [
"pytorch_lightning.callbacks.pruning",
"pytorch_lightning.loops.optimization.*",
"pytorch_lightning.loops.evaluation_loop",
"pytorch_lightning.trainer.connectors.checkpoint_connector",
"pytorch_lightning.trainer.connectors.logger_connector.*",
"pytorch_lightning.trainer.progress",
"pytorch_lightning.tuner.auto_gpu_select",
2 changes: 1 addition & 1 deletion pytorch_lightning/callbacks/early_stopping.py
@@ -39,7 +39,7 @@ class EarlyStopping(Callback):
Args:
monitor: quantity to be monitored.
min_delta: minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute
change of less than `min_delta`, will count as no improvement.
change of less than or equal to `min_delta`, will count as no improvement.
patience: number of checks with no improvement
after which training will be stopped. Under the default configuration, one check happens after
every training epoch. However, the frequency of validation can be modified by setting various parameters on
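A minimal usage sketch of the clarified wording: with the settings below (the monitored key and thresholds are illustrative), a validation-loss drop of exactly 0.01 still counts as no improvement — the new value has to beat the best one by more than `min_delta`.

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping

# A change of <= 0.01 in "val_loss" counts as no improvement; after 3 such checks
# in a row, training stops.
early_stop = EarlyStopping(monitor="val_loss", mode="min", min_delta=0.01, patience=3)
trainer = Trainer(callbacks=[early_stop])
```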
7 changes: 3 additions & 4 deletions pytorch_lightning/loops/batch/training_batch_loop.py
@@ -123,16 +123,15 @@ def advance(self, batch, batch_idx):

if self.trainer.lightning_module.automatic_optimization:
# in automatic optimization, hand over execution to the OptimizerLoop
optimizers = [optimizer for _, optimizer in self.get_active_optimizers(batch_idx)]
batch_outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
batch_outputs = self.optimizer_loop.run(split_batch, self.get_active_optimizers(batch_idx), batch_idx)
# combine outputs from each optimizer
for k in range(len(batch_outputs)):
self.batch_outputs[k].extend(batch_outputs[k])
else:
# in manual optimization, hand over execution to the ManualOptimization loop
result = self.manual_loop.run(split_batch, batch_idx)
if result is not None and result.loss is not None:
self.batch_outputs[0].append(result.drop_closure_loss())
if result:
self.batch_outputs[0].append(result)

def on_run_end(self) -> None:
self.optimizer_loop._hiddens = None
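A standalone sketch (stand-in parameters and optimizers, not the loop's real state) of the shape change in `advance` above: `get_active_optimizers` yields `(optimizer_idx, optimizer)` pairs, and the `OptimizerLoop` now receives those pairs directly instead of a list of bare optimizers.

```python
from typing import List, Tuple

import torch

params = [torch.nn.Parameter(torch.zeros(1))]
active: List[Tuple[int, torch.optim.Optimizer]] = [
    (0, torch.optim.SGD(params, lr=0.1)),
    (1, torch.optim.Adam(params, lr=0.01)),
]

optimizers_only = [opt for _, opt in active]  # old call: the indices were stripped
pairs = active                                # new call: each index travels with its optimizer
assert [idx for idx, _ in pairs] == [0, 1]
```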
13 changes: 1 addition & 12 deletions pytorch_lightning/loops/epoch/training_epoch_loop.py
@@ -312,18 +312,7 @@ def _prepare_outputs(
opt_outputs = [opt_outputs]

for batch_outputs in opt_outputs:
processed_tbptt_outputs = []

if isinstance(batch_outputs, OutputResult):
batch_outputs = [batch_outputs]

for tbptt_output in batch_outputs:
out = {}
if tbptt_output.loss is not None:
out["loss"] = tbptt_output.loss
out.update(tbptt_output.extra)
processed_tbptt_outputs.append(out)

processed_tbptt_outputs = batch_outputs if isinstance(batch_outputs, list) else [batch_outputs]
# if there was only one tbptt step then we can collapse that dimension
if len(processed_tbptt_outputs) == 1:
processed_tbptt_outputs = processed_tbptt_outputs[0]
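A standalone sketch of the simplified normalization above: outputs are assumed to arrive as plain dictionaries already, so the method only wraps a lone output in a list and collapses the tbptt dimension when there was a single truncated-bptt step. The literal values are made up.

```python
batch_outputs = {"loss": 0.5, "something": 1.0}  # a single tbptt step

processed = batch_outputs if isinstance(batch_outputs, list) else [batch_outputs]
if len(processed) == 1:
    # only one tbptt step, so the extra dimension is collapsed
    processed = processed[0]

assert processed == {"loss": 0.5, "something": 1.0}
```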
27 changes: 24 additions & 3 deletions pytorch_lightning/loops/optimization/closure.py
@@ -13,16 +13,37 @@
# limitations under the License.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar
from typing import Any, Dict, Generic, Optional, TypeVar

from torch import Tensor

from pytorch_lightning.utilities import rank_zero_deprecation
from pytorch_lightning.utilities.apply_func import apply_to_collection
from pytorch_lightning.utilities.exceptions import MisconfigurationException

T = TypeVar("T")


@dataclass
class OutputResult:
...
@staticmethod
def _check_extra_detach_deprecation(extra: Dict[str, Any]) -> Dict[str, Any]:
# TODO: remove with the deprecation removal in v1.6
# this is only here to avoid duplication
def check_fn(v: Tensor) -> Tensor:
if v.grad_fn is not None:
rank_zero_deprecation(
f"One of the returned values {set(extra.keys())} has a `grad_fn`. We will detach it automatically"
" but this behaviour will change in v1.6. Please detach it manually:"
" `return {'loss': ..., 'something': something.detach()}`"
)
return v.detach()
return v

return apply_to_collection(extra, Tensor, check_fn)

def asdict(self) -> Dict[str, Any]:
raise NotImplementedError


class AbstractClosure(ABC, Generic[T]):
@@ -33,7 +54,7 @@ class AbstractClosure(ABC, Generic[T]):
object which later can call it like a function but without requiring to pass in any arguments.
This class provides a simple abstraction making the instance of this class callable like a function while capturing
the :class:`OutputResult` and caching it.
the closure result and caching it.
"""

def __init__(self) -> None:
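A standalone sketch of what the detach check added to `OutputResult` guards against: any tensor in the returned extras that still carries a `grad_fn` is detached (the real helper also emits the deprecation warning shown above and uses `apply_to_collection`). `detach_extras` is an illustrative stand-in, not part of the diff.

```python
from typing import Any, Dict

import torch


def detach_extras(extra: Dict[str, Any]) -> Dict[str, Any]:
    # mirrors the intent of `OutputResult._check_extra_detach_deprecation`
    return {
        k: v.detach() if isinstance(v, torch.Tensor) and v.grad_fn is not None else v
        for k, v in extra.items()
    }


x = torch.ones(1, requires_grad=True) * 2  # carries a grad_fn from the multiplication
extra = detach_extras({"something": x, "count": 3})
assert extra["something"].grad_fn is None and extra["count"] == 3
```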
