
Releases: Lightning-AI/pytorch-lightning

Standard weekly patch release

07 Apr 17:58

[1.2.7] - 2021-04-06

Fixed

  • Fixed a bug with omegaconf and xm.save (#6741)
  • Fixed an issue with IterableDataset when __len__ is not defined (#6828)
  • Sanitize None params during pruning (#6836)
  • Enforce an epoch scheduler interval when using SWA (#6588)
  • Fixed TPU Colab hang issue, post training (#6816)
  • Fixed a bug where TensorBoardLogger would give a warning and not log correctly to a symbolic link save_dir (#6730)

Contributors

@awaelchli, @ethanwharris, @karthikprasad, @kaushikb11, @mibaumgartner, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

30 Mar 14:46

[1.2.6] - 2021-03-30

Changed

  • Changed the behavior of on_epoch_start to run at the beginning of validation & test epoch (#6498)
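
A minimal sketch of the change above, assuming a standard LightningModule subclass: the same hook now fires at the beginning of validation and test epochs, not only training epochs.

```python
import pytorch_lightning as pl

class VerboseModel(pl.LightningModule):
    def on_epoch_start(self):
        # As of 1.2.6, this hook runs at the start of training,
        # validation, and test epochs alike.
        print(f"epoch {self.current_epoch} starting")
```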

Removed

  • Removed legacy code to include step dictionary returns in callback_metrics. Use self.log_dict instead. (#6682)
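
A minimal migration sketch for the removal above, assuming metrics were previously returned from training_step as a step dictionary; compute_loss and compute_acc are hypothetical helpers standing in for your own code:

```python
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        loss = self.compute_loss(batch)  # hypothetical helper
        acc = self.compute_acc(batch)    # hypothetical helper
        # Previously, returning {"loss": loss, "log": {...}} populated
        # callback_metrics implicitly. Log explicitly instead:
        self.log_dict({"train_loss": loss, "train_acc": acc})
        return loss
```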

Fixed

  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed error on TPUs when there was no ModelCheckpoint (#6654)
  • Fixed trainer.test freeze on TPUs (#6654)
  • Fixed a bug where gradients were disabled after calling Trainer.predict (#6657)
  • Fixed bug where no TPUs were detected in a TPU pod env (#6719)

Contributors

@awaelchli, @carmocca, @ethanwharris, @kaushikb11, @rohitgr7, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Weekly patch release - torchmetrics compatibility

24 Mar 15:17

[1.2.5] - 2021-03-23

Changed

  • Added Autocast in validation, test and predict modes for Native AMP (#6565)
  • Updated gradient clipping for the TPU accelerator (#6576)
  • Refactored setup to be typing-friendly (#6590)

Fixed

  • Fixed a bug where all_gather would not work correctly with tpu_cores=8 (#6587)
  • Fixed comparing required versions (#6434)
  • Fixed duplicate logs appearing in console when using the python logging module (#6275)

Contributors

@awaelchli, @Borda, @ethanwharris, @justusschock, @kaushikb11

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

16 Mar 20:29

[1.2.4] - 2021-03-16

Changed

  • Changed the default of find_unused_parameters back to True in DDP and DDP Spawn (#6438)
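
A sketch of how to opt back out of the restored default, assuming the 1.2-era DDPPlugin; disabling find_unused_parameters can speed up DDP when every parameter receives a gradient:

```python
import pytorch_lightning as pl
from pytorch_lightning.plugins import DDPPlugin

# find_unused_parameters defaults to True again as of 1.2.4; pass the
# plugin explicitly to recover the faster setting.
trainer = pl.Trainer(
    gpus=2,
    accelerator="ddp",
    plugins=DDPPlugin(find_unused_parameters=False),
)
```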

Fixed

  • Exposed DeepSpeed loss parameters to allow users to fix loss instability (#6115)
  • Fixed DP reduction with collection (#6324)
  • Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size (#4688)
  • Fixed broadcast to use PyTorch broadcast_object_list and add reduce_decision (#6410)
  • Fixed logger creating directory structure too early in DDP (#6380)
  • Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough (#6460)
  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed an issue with Tuner.scale_batch_size not finding the batch size attribute in the datamodule (#5968)
  • Fixed an exception in the layer summary when the model contains torch.jit scripted submodules (#6511)
  • Fixed the train loop config being run during Trainer.predict (#6541)

Contributors

@awaelchli, @kaushikb11, @Palzer, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

09 Mar 17:28

[1.2.3] - 2021-03-09

Fixed

  • Fixed ModelPruning(make_pruning_permanent=True) pruning buffers getting removed when saved during training (#6073)
  • Fixed _stable_1d_sort to work when n >= N (#6177)
  • Fixed AttributeError when logger=None on TPU (#6221)
  • Fixed PyTorch Profiler with emit_nvtx (#6260)
  • Fixed trainer.test from best_path hanging after calling trainer.fit (#6272)
  • Fixed SingleTPU calling all_gather (#6296)
  • Ensured we check deepspeed/sharded in multi-node DDP (#6297)
  • Ensured LightningOptimizer doesn't delete optimizer hooks (#6305)
  • Resolved a memory leak in evaluation (#6326)
  • Ensured that gradient clipping is only called if the value is greater than 0 (#6330)
  • Fixed Trainer not resetting lightning_optimizers when calling Trainer.fit() multiple times (#6372)

Contributors

@awaelchli, @carmocca, @chizuchizu, @frankier, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

05 Mar 15:12

[1.2.2] - 2021-03-02

Added

  • Added checkpoint parameter to callback's on_save_checkpoint hook (#6072)
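
A minimal sketch of the extended hook, assuming the 1.2-era Callback API: the full checkpoint dictionary is now passed in, so callbacks can inspect what is about to be written to disk.

```python
import pytorch_lightning as pl

class CheckpointInspector(pl.Callback):
    def on_save_checkpoint(self, trainer, pl_module, checkpoint):
        # `checkpoint` is the parameter added in 1.2.2: the state dict
        # about to be saved.
        print("checkpoint keys:", list(checkpoint.keys()))
        return {}  # this callback has no state of its own to persist
```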

Changed

  • Changed the order of backward, step, zero_grad to zero_grad, backward, step (#6147); see the sketch after this list
  • Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale (#6262)
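
To make the reordering concrete, a standalone sketch of the new per-step sequence, using a plain torch model and a dummy loss (both illustrative, not Lightning internals):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batch = torch.randn(8, 4)

# New order in 1.2.2: zero_grad -> backward -> step
optimizer.zero_grad()              # 1. clear stale gradients first
loss = model(batch).pow(2).mean()  # dummy loss for illustration
loss.backward()                    # 2. accumulate fresh gradients
optimizer.step()                   # 3. apply the update
```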

Fixed

  • Fixed epoch level schedulers not being called when val_check_interval < 1.0 (#6075)
  • Fixed multiple early stopping callbacks (#6197)
  • Fixed incorrect usage of detach(), cpu(), to() (#6216)
  • Fixed support for the LBFGS optimizer, which didn't converge in automatic optimization (#6147)
  • Prevented WandbLogger from dropping values (#5931)
  • Fixed an error thrown when using a valid distributed mode in multi-node (#6297)

Contributors

@akihironitta, @borisdayma, @carmocca, @dvolgyes, @SeanNaren, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

24 Feb 17:13

[1.2.1] - 2021-02-23

Fixed

  • Fixed incorrect yield logic for the amp autocast context manager (#6080)
  • Fixed priority of plugin/accelerator when setting distributed mode (#6089)
  • Fixed error message for AMP + CPU incompatibility (#6107)

Contributors

@awaelchli, @SeanNaren, @carmocca

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Pruning & Quantization & SWA

18 Feb 23:04

[1.2.0] - 2021-02-18

Added

  • Added DataType, AverageMethod and MDMCAverageMethod enum in metrics (#5657)
  • Added support for summarized model total params size in megabytes (#5590)
  • Added support for multiple train loaders (#1959)
  • The Accuracy metric now generalizes to Top-k accuracy for (multi-dimensional) multi-class inputs using the top_k parameter (#4838)
  • The Accuracy metric now enables the computation of subset accuracy for multi-label or multi-dimensional multi-class inputs with the subset_accuracy parameter (#4838)
  • Added HammingDistance metric to compute the Hamming distance (loss) (#4838)
  • Added max_fpr parameter to the auroc metric for computing partial AUROC (#3790)
  • Added StatScores metric to compute the number of true positives, false positives, true negatives and false negatives (#4839)
  • Added R2Score metric (#5241)
  • Added LambdaCallback (#5347)
  • Added BackboneLambdaFinetuningCallback (#5377)
  • Accelerator all_gather supports collection (#5221)
  • Added image_gradients functional metric to compute the image gradients of a given input image. (#5056)
  • Added MetricCollection (#4318)
  • Added .clone() method to metrics (#4318)
  • Added IoU class interface (#4704)
  • Added support for tying weights after moving the model to TPU via the on_post_move_to_device hook
  • Added missing val/test hooks in LightningModule (#5467)
  • The Recall and Precision metrics (and their functional counterparts recall and precision) can now be generalized to Recall@K and Precision@K with the use of top_k parameter (#4842)
  • Added ModelPruning Callback (#5618, #5825, #6045)
  • Added PyTorchProfiler (#5560)
  • Added compositional metrics (#5464)
  • Added Trainer method predict(...) for high-performance predictions (#5579); sketched after this list
  • Added on_before_batch_transfer and on_after_batch_transfer data hooks (#3671)
  • Added AUC/AUROC class interface (#5479)
  • Added PredictLoop object (#5752)
  • Added QuantizationAwareTraining callback (#5706, #6040)
  • Added LightningModule.configure_callbacks to enable the definition of model-specific callbacks (#5621)
  • Added dim to PSNR metric for mean-squared-error reduction (#5957)
  • Added proximal policy optimization template to pl_examples (#5394)
  • Added log_graph to CometLogger (#5295)
  • Added possibility for nested loaders (#5404)
  • Added sync_step to Wandb logger (#5351)
  • Added StochasticWeightAveraging callback (#5640)
  • Added LightningDataModule.from_datasets(...) (#5133)
  • Added PL_TORCH_DISTRIBUTED_BACKEND env variable to select backend (#5981)
  • Added a Trainer flag to activate Stochastic Weight Averaging (SWA), Trainer(stochastic_weight_avg=True) (#6038); sketched after this list
  • Added DeepSpeed integration (#5954, #6042)
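
A minimal end-to-end sketch combining two of the additions above, the stochastic_weight_avg flag and Trainer.predict; TinyModel and the synthetic data are illustrative, and the SWA schedule is assumed to follow the 1.2.0 defaults:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class TinyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 4), torch.randn(64, 1)), batch_size=8
)

model = TinyModel()
# New in 1.2.0: Stochastic Weight Averaging behind a single Trainer flag ...
trainer = pl.Trainer(max_epochs=2, stochastic_weight_avg=True)
trainer.fit(model, train_loader)

# ... and a dedicated prediction entry point.
predictions = trainer.predict(model, DataLoader(torch.randn(16, 4), batch_size=8))
```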

Changed

  • The stat_scores metric now calculates stat scores over all classes and gains new parameters, in line with the new StatScores metric (#4839)
  • Changed computer_vision_fine_tunning example to use BackboneLambdaFinetuningCallback (#5377)
  • Changed automatic casting for LoggerConnector metrics (#5218)
  • Changed iou [func] to allow float input (#4704)
  • Metric compute() method will no longer automatically call reset() (#5409)
  • Set PyTorch 1.4 as the minimum requirement; testing and examples also require torchvision>=0.5 and torchtext>=0.5 (#5418)
  • Changed the callbacks argument in Trainer to also allow a single Callback input (#5446); see the sketch after this list
  • Changed the default of find_unused_parameters to False in DDP (#5185)
  • Changed ModelCheckpoint version suffixes to start at 1 (#5008)
  • Progress bar metrics tensors are now converted to float (#5692)
  • Changed the default value for the progress_bar_refresh_rate Trainer argument in Google COLAB notebooks to 20 (#5516)
  • Extended support for purely iteration-based training (#5726)
  • Made LightningModule.global_rank, LightningModule.local_rank and LightningModule.logger read-only properties (#5730)
  • Forced ModelCheckpoint callbacks to run after all others to guarantee all states are saved to the checkpoint (#5731)
  • Refactored Accelerators and Plugins (#5743)
    • Added base classes for plugins (#5715)
    • Added parallel plugins for DP, DDP, DDPSpawn, DDP2 and Horovod (#5714)
    • Precision Plugins (#5718)
    • Added new Accelerators for CPU, GPU and TPU (#5719)
    • Added Plugins for TPU training (#5719)
    • Added RPC and Sharded plugins (#5732)
    • Added missing LightningModule-wrapper logic to new plugins and accelerator (#5734)
    • Moved device-specific teardown logic from training loop to accelerator (#5973)
    • Moved accelerator_connector.py to the connectors subfolder (#6033)
    • Trainer only references accelerator (#6039)
    • Made parallel devices optional across all plugins (#6051)
    • Cleaning (#5948, #5949, #5950)
  • Enabled self.log in callbacks (#5094)
  • Renamed xxx_AVAILABLE as protected (#5082)
  • Unified module names in Utils (#5199)
  • Separated utils: imports & enums (#5256, #5874)
  • Refactor: clean trainer device & distributed getters (#5300)
  • Simplified training phase as LightningEnum (#5419)
  • Updated metrics to use LightningEnum (#5689)
  • Changed the sequence of on_train_batch_end, on_batch_end & on_train_epoch_end, on_epoch_end hooks (#5688)
  • Refactored setup_training and removed test_mode (#5388)
  • Disabled training when num_training_batches is zero due to an insufficient limit_train_batches (#5703)
  • Refactored EpochResultStore (#5522)
  • Updated lr_finder to check for the attribute if not running fast_dev_run (#5990)
  • Made the LightningOptimizer manual optimizer more flexible and exposed toggle_model (#5771)
  • MLFlowLogger now limits parameter value length to 250 characters (#5893)
  • Re-introduced fix for Hydra directory sync with multiple processes (#5993)
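
One of the ergonomic changes above, sketched briefly: the callbacks argument now accepts a bare Callback instance as well as a list.

```python
import pytorch_lightning as pl

class PrintingCallback(pl.Callback):
    def on_train_start(self, trainer, pl_module):
        print("training is starting")

# Before 1.2.0 this had to be callbacks=[PrintingCallback()].
trainer = pl.Trainer(callbacks=PrintingCallback())
```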

Deprecated

  • Function stat_scores_multiple_classes is deprecated in favor of stat_scores (#4839)
  • Moved accelerators and plugins to their legacy package (#5645)
  • Deprecated LightningDistributedDataParallel in favor of new wrapper module LightningDistributedModule (#5185)
  • Deprecated LightningDataParallel in favor of new wrapper module LightningParallelModule (#5670)
  • Renamed utils modules (#5199)
    • argparse_utils >> argparse
    • model_utils >> model_helpers
    • warning_utils >> warnings
    • xla_device_utils >> xla_device
  • Deprecated using 'val_loss' to set the ModelCheckpoint monitor (#6012)
  • Deprecated .get_model() in favor of the explicit .lightning_module property (#6035)
  • Deprecated Trainer attribute accelerator_backend in favor of accelerator (#6034)
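
A migration sketch for the deprecations above; the old spellings still work in 1.2 but emit deprecation warnings:

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Deprecated: relying on 'val_loss' being monitored implicitly.
# Preferred: name the monitored metric explicitly.
checkpoint_cb = ModelCheckpoint(monitor="val_loss")

# Deprecated trainer attributes and their replacements:
#   trainer.get_model()          ->  trainer.lightning_module
#   trainer.accelerator_backend  ->  trainer.accelerator
```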

Removed

  • Removed deprecated checkpoint argument filepath (#5321)
  • Removed deprecated Fbeta, f1_score and fbeta_score metrics (#5322)
  • Removed deprecated TrainResult (#5323)
  • Removed deprecated EvalResult (#5633)
  • Removed LoggerStages (#5673)

Fixed

  • Fixed distributed setting and ddp_cpu only working with num_processes>1 (#5297)
  • Fixed the saved filename in ModelCheckpoint when it already exists (#4861)
  • Fixed DDPHPCAccelerator hangs in DDP construction by calling init_device (#5157)
  • Fixed num_workers for Windows example (#5375)
  • Fixed loading yaml (#5619)
  • Fixed support for custom DataLoaders with DDP if they can be re-instantiated (#5745)
  • Fixed repeated .fit() calls ignoring the max_steps iteration bound (#5936)
  • Fixed throwing MisconfigurationError on unknown mode (#5255)
  • Resolved a bug with finetuning (#5744)
  • Fixed ModelCheckpoint race condition in file existence check (#5155)
  • Fixed some compatibility with PyTorch 1.8 (#5864)
  • Fixed forward cache (#5895)
  • Fixed recursive detach of tensors to CPU (#6007)
  • Fixed passing wrong strings for the scheduler interval not throwing an error (#5923)
  • Fixed wrong requires_grad state after returning None with multiple optimizers (#5738)
  • Fixed missing on_epoch_end hook call at the end of validation and test epochs (#5986)
  • Fixed missing process_dataloader call for TPUSpawn when in distributed mode (#6015)
  • Fixed progress bar flickering by appending 0 to floats/strings (#6009)
  • Fixed synchronization issues with TPU training (#6027)
  • Fixed hparams.yaml saved twice when using TensorBoardLogger (#5953)
  • Fixed basic examples (#5912, #5985)
  • Fixed fairscale compatibility with PyTorch 1.8 (#5996)
  • Ensured process_dataloader is called when tpu_cores > 1 to use Parallel DataLoader (#6015)
  • Attempted SLURM auto resume call when non-shell call fails (#6002)
  • Fixed wrapping optimizers upon assignment (#6006)
  • Fixed allowing hashing of metrics with lists in their state (#5939)

Contributors

@alanhdu, @ananthsub, @awaelchli, @Borda, @borisdayma, @carmocca, @ddrevicky, @deng-cy, @ducthienbui97, @justusschock, @kartik4949, @kaushikb11, @manipopopo, @marload, @neighthan, @peblair, @prampey, @pranjaldatta, @rohitgr7, @SeanNaren, @sid-sundrani, @SkafteNicki, @tadejsv, @tchaton, @teddykoker, @titu1994, @yuntai

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

08 Feb 08:49

[1.1.8] - 2021-02-08

Fixed

  • Separated epoch validation from step validation (#5208)
  • Fixed toggle_optimizers not handling all optimizer parameters (#5775)

Contributors

@ananthsub, @rohitgr7

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

03 Feb 18:10

[1.1.7] - 2021-02-03

Fixed

  • Fixed TensorBoardLogger not closing SummaryWriter on finalize (#5696)
  • Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
  • Fixed num_classes argument in F1 metric (#5663)
  • Fixed log_dir property (#5537)
  • Fixed a race condition in ModelCheckpoint when checking if a checkpoint file exists (#5144)
  • Removed unnecessary intermediate layers in Dockerfiles (#5697)
  • Fixed auto learning rate ordering (#5638)

Contributors

@awaelchli, @guillochon, @noamzilo, @rohitgr7, @SkafteNicki, @sumanthratna

If we forgot someone due to not matching commit email with GitHub account, let us know :]