Releases: Lightning-AI/pytorch-lightning
Standard weekly patch release
[1.2.7] - 2021-04-06
Fixed
- Fixed a bug with omegaconf and `xm.save` (#6741)
- Fixed an issue with `IterableDataset` when `__len__` is not defined (#6828)
- Sanitize `None` params during pruning (#6836)
- Enforce an epoch scheduler interval when using SWA (#6588)
- Fixed TPU Colab hang issue, post training (#6816)
- Fixed a bug where `TensorBoardLogger` would give a warning and not log correctly to a symbolic link `save_dir` (#6730)
Contributors
@awaelchli, @ethanwharris, @karthikprasad, @kaushikb11, @mibaumgartner, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.6] - 2021-03-30
Changed
- Changed the behavior of `on_epoch_start` to run at the beginning of validation & test epoch (#6498)
Removed
- Removed legacy code to include `step` dictionary returns in `callback_metrics`. Use `self.log_dict` instead (#6682); see the sketch below.
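A minimal sketch of the `self.log_dict` replacement; the loss computation is a placeholder, and the model's forward/optimizer definitions are omitted for brevity:

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self(x), y)
        # replaces the removed `step` dictionary return
        self.log_dict({"train_loss": loss})
        return loss
```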
Fixed
- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` (#6398)
- Fixed error on TPUs when there was no `ModelCheckpoint` (#6654)
- Fixed `trainer.test` freeze on TPUs (#6654)
- Fixed a bug where gradients were disabled after calling `Trainer.predict` (#6657)
- Fixed bug where no TPUs were detected in a TPU pod env (#6719)
Contributors
@awaelchli, @carmocca, @ethanwharris, @kaushikb11, @rohitgr7, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Weekly patch release - torchmetrics compatibility
[1.2.5] - 2021-03-23
Changed
- Added Autocast in validation, test and predict modes for Native AMP (#6565)
- Updated gradient clipping for the TPU Accelerator (#6576)
- Refactored `setup` to be typing-friendly (#6590)
Fixed
- Fixed a bug where `all_gather` would not work correctly with `tpu_cores=8` (#6587)
- Fixed comparing required versions (#6434)
- Fixed duplicate logs appearing in console when using the python logging module (#6275)
Contributors
@awaelchli, @Borda, @ethanwharris, @justusschock, @kaushikb11
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.4] - 2021-03-16
Changed
- Changed the default of `find_unused_parameters` back to `True` in DDP and DDP Spawn (#6438); see the sketch below
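For users who relied on the previous default, it can still be disabled explicitly. A minimal sketch, assuming the plugins API introduced in 1.2 and a two-GPU machine (the `gpus`/`accelerator` values are illustrative):

```python
import pytorch_lightning as pl
from pytorch_lightning.plugins import DDPPlugin

# opt back into the faster (but less forgiving) DDP behavior explicitly
trainer = pl.Trainer(
    gpus=2,
    accelerator="ddp",
    plugins=[DDPPlugin(find_unused_parameters=False)],
)
```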
Fixed
- Expose DeepSpeed loss parameters to allow users to fix loss instability (#6115)
- Fixed DP reduction with collection (#6324)
- Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size (#4688)
- Fixed broadcast to use PyTorch `broadcast_object_list` and add `reduce_decision` (#6410)
- Fixed logger creating directory structure too early in DDP (#6380)
- Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough (#6460)
- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` (#6398)
- Fixed an issue with `Tuner.scale_batch_size` not finding the batch size attribute in the datamodule (#5968)
- Fixed an exception in the layer summary when the model contains torch.jit scripted submodules (#6511)
- Fixed the Train loop config check being run during `Trainer.predict` (#6541)
Contributors
@awaelchli, @kaushikb11, @Palzer, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.3] - 2021-03-09
Fixed
- Fixed `ModelPruning(make_pruning_permanent=True)` pruning buffers getting removed when saved during training (#6073)
- Fixed `_stable_1d_sort` to work when `n >= N` (#6177)
- Fixed `AttributeError` when `logger=None` on TPU (#6221)
- Fixed PyTorch Profiler with `emit_nvtx` (#6260)
- Fixed `trainer.test` from `best_path` hanging after calling `trainer.fit` (#6272)
- Fixed `SingleTPU` calling `all_gather` (#6296)
- Ensure we check deepspeed/sharded in multinode DDP (#6297)
- Check `LightningOptimizer` doesn't delete optimizer hooks (#6305)
- Resolve memory leak for evaluation (#6326)
- Ensured gradient clipping is only called if the value is greater than 0 (#6330)
- Fixed `Trainer` not resetting `lightning_optimizers` when calling `Trainer.fit()` multiple times (#6372)
Contributors
@awaelchli, @carmocca, @chizuchizu, @frankier, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.2] - 2021-03-02
Added
- Added `checkpoint` parameter to callback's `on_save_checkpoint` hook (#6072); see the sketch below
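A minimal sketch of a callback using the new argument; the injected key and its value are hypothetical:

```python
import pytorch_lightning as pl


class InjectMetadata(pl.Callback):
    def on_save_checkpoint(self, trainer, pl_module, checkpoint):
        # `checkpoint` is the checkpoint dictionary about to be written;
        # extra entries can be stashed alongside Lightning's own state
        checkpoint["my_metadata"] = {"source": "InjectMetadata callback"}
```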
Changed
- Changed the order of `backward`, `step`, `zero_grad` to `zero_grad`, `backward`, `step` (#6147); see the sketch below
- Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale (#6262)
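In plain PyTorch terms, the new call order looks like this (a sketch; the model, data, and loss are placeholders):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(8, 4)

# before 1.2.2: backward() -> step() -> zero_grad()
# from 1.2.2:   zero_grad() -> backward() -> step()
optimizer.zero_grad()       # clear stale gradients first
loss = model(inputs).sum()  # placeholder loss
loss.backward()
optimizer.step()
```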
Fixed
- Fixed epoch level schedulers not being called when `val_check_interval < 1.0` (#6075)
- Fixed multiple early stopping callbacks (#6197)
- Fixed incorrect usage of `detach()`, `cpu()`, `to()` (#6216)
- Fixed LBFGS optimizer support which didn't converge in automatic optimization (#6147)
- Prevent `WandbLogger` from dropping values (#5931)
- Fixed error thrown when using valid distributed mode in multi node (#6297)
Contributors
@akihironitta, @borisdayma, @carmocca, @dvolgyes, @SeanNaren, @SkafteNicki
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.1] - 2021-02-23
Fixed
- Fixed incorrect yield logic for the amp autocast context manager (#6080)
- Fixed priority of plugin/accelerator when setting distributed mode (#6089)
- Fixed error message for AMP + CPU incompatibility (#6107)
Contributors
@awaelchli, @SeanNaren, @carmocca
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Pruning & Quantization & SWA
[1.2.0] - 2021-02-18
Added
- Added `DataType`, `AverageMethod` and `MDMCAverageMethod` enum in metrics (#5657)
- Added support for summarized model total params size in megabytes (#5590)
- Added support for multiple train loaders (#1959)
- The `Accuracy` metric now generalizes to Top-k accuracy for (multi-dimensional) multi-class inputs using the `top_k` parameter (#4838)
- The `Accuracy` metric now enables the computation of subset accuracy for multi-label or multi-dimensional multi-class inputs with the `subset_accuracy` parameter (#4838)
- Added `HammingDistance` metric to compute the hamming distance (loss) (#4838)
- Added `max_fpr` parameter to `auroc` metric for computing partial AUROC (#3790)
- Added `StatScores` metric to compute the number of true positives, false positives, true negatives and false negatives (#4839)
- Added `R2Score` metric (#5241)
- Added `LambdaCallback` (#5347)
- Added `BackboneLambdaFinetuningCallback` (#5377)
- Accelerator `all_gather` supports collection (#5221)
- Added `image_gradients` functional metric to compute the image gradients of a given input image (#5056)
- Added `MetricCollection` (#4318)
- Added `.clone()` method to metrics (#4318)
- Added `IoU` class interface (#4704)
- Support to tie weights after moving model to TPU via `on_post_move_to_device` hook
- Added missing val/test hooks in `LightningModule` (#5467)
- The `Recall` and `Precision` metrics (and their functional counterparts `recall` and `precision`) can now be generalized to Recall@K and Precision@K with the use of the `top_k` parameter (#4842)
- Added `ModelPruning` callback (#5618, #5825, #6045)
- Added `PyTorchProfiler` (#5560)
- Added compositional metrics (#5464)
- Added Trainer method `predict(...)` for high performance predictions (#5579); see the sketch after this list
- Added `on_before_batch_transfer` and `on_after_batch_transfer` data hooks (#3671)
- Added AUC/AUROC class interface (#5479)
- Added `PredictLoop` object (#5752)
- Added `QuantizationAwareTraining` callback (#5706, #6040)
- Added `LightningModule.configure_callbacks` to enable the definition of model-specific callbacks (#5621)
- Added `dim` to `PSNR` metric for mean-squared-error reduction (#5957)
- Added proximal policy optimization template to pl_examples (#5394)
- Added `log_graph` to `CometLogger` (#5295)
- Added possibility for nested loaders (#5404)
- Added `sync_step` to Wandb logger (#5351)
- Added `StochasticWeightAveraging` callback (#5640)
- Added `LightningDataModule.from_datasets(...)` (#5133)
- Added `PL_TORCH_DISTRIBUTED_BACKEND` env variable to select backend (#5981)
- Added `Trainer` flag to activate Stochastic Weight Averaging (SWA): `Trainer(stochastic_weight_avg=True)` (#6038)
- Added DeepSpeed integration (#5954, #6042)
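To make a few of these additions concrete, here is a minimal sketch combining the new `stochastic_weight_avg` flag, `configure_callbacks`, and `Trainer.predict(...)`. The model, data, and hyperparameters are toy placeholders, and any behavior beyond what the entries above state is an assumption about the 1.2 API:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_callbacks(self):
        # model-specific callbacks (#5621), merged with Trainer(callbacks=...)
        return [pl.callbacks.ModelCheckpoint()]

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


train_loader = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))), batch_size=8
)

model = LitModel()
# Stochastic Weight Averaging via the new Trainer flag (#6038)
trainer = pl.Trainer(max_epochs=2, stochastic_weight_avg=True)
trainer.fit(model, train_loader)

# new prediction API (#5579); by default each batch is forwarded through
# the model (assumption about the 1.2 default predict behavior)
predictions = trainer.predict(model, dataloaders=DataLoader(torch.randn(16, 32)))
```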
Changed
- Changed `stat_scores` metric: it now calculates stat scores over all classes and gains new parameters, in line with the new `StatScores` metric (#4839)
- Changed `computer_vision_fine_tunning` example to use `BackboneLambdaFinetuningCallback` (#5377)
- Changed automatic casting for LoggerConnector `metrics` (#5218)
- Changed `iou` [func] to allow float input (#4704)
- Metric `compute()` method will no longer automatically call `reset()` (#5409)
- Set PyTorch 1.4 as min requirements, also for testing and examples `torchvision>=0.5` and `torchtext>=0.5` (#5418)
- Changed `callbacks` argument in `Trainer` to allow `Callback` input (#5446)
- Changed the default of `find_unused_parameters` to `False` in DDP (#5185)
- Changed `ModelCheckpoint` version suffixes to start at 1 (#5008)
- Progress bar metrics tensors are now converted to float (#5692)
- Changed the default value for the `progress_bar_refresh_rate` Trainer argument in Google COLAB notebooks to 20 (#5516)
- Extended support for purely iteration-based training (#5726)
- Made `LightningModule.global_rank`, `LightningModule.local_rank` and `LightningModule.logger` read-only properties (#5730)
- Forced `ModelCheckpoint` callbacks to run after all others to guarantee all states are saved to the checkpoint (#5731)
- Refactored Accelerators and Plugins (#5743)
  - Added base classes for plugins (#5715)
  - Added parallel plugins for DP, DDP, DDPSpawn, DDP2 and Horovod (#5714)
  - Precision Plugins (#5718)
  - Added new Accelerators for CPU, GPU and TPU (#5719)
  - Added Plugins for TPU training (#5719)
  - Added RPC and Sharded plugins (#5732)
  - Added missing `LightningModule`-wrapper logic to new plugins and accelerator (#5734)
  - Moved device-specific teardown logic from training loop to accelerator (#5973)
  - Moved accelerator_connector.py to the connectors subfolder (#6033)
  - Trainer only references accelerator (#6039)
  - Made parallel devices optional across all plugins (#6051)
  - Cleaning (#5948, #5949, #5950)
- Enabled `self.log` in callbacks (#5094)
- Renamed xxx_AVAILABLE as protected (#5082)
- Unified module names in Utils (#5199)
- Separated utils: imports & enums (#5256, #5874)
- Refactor: clean trainer device & distributed getters (#5300)
- Simplified training phase as `LightningEnum` (#5419)
- Updated metrics to use `LightningEnum` (#5689)
- Changed the sequence of `on_train_batch_end`, `on_batch_end` & `on_train_epoch_end`, `on_epoch_end` hooks (#5688)
- Refactored `setup_training` and removed `test_mode` (#5388)
- Disabled training with zero `num_training_batches` when insufficient `limit_train_batches` (#5703)
- Refactored `EpochResultStore` (#5522)
- Updated `lr_finder` to check for attribute if not running `fast_dev_run` (#5990)
- `LightningOptimizer` manual optimizer is more flexible and exposes `toggle_model` (#5771)
- `MlflowLogger` limits parameter value length to 250 characters (#5893)
- Re-introduced fix for Hydra directory sync with multiple processes (#5993)
Deprecated
- Function `stat_scores_multiple_classes` is deprecated in favor of `stat_scores` (#4839)
- Moved accelerators and plugins to their `legacy` package (#5645)
- Deprecated `LightningDistributedDataParallel` in favor of new wrapper module `LightningDistributedModule` (#5185)
- Deprecated `LightningDataParallel` in favor of new wrapper module `LightningParallelModule` (#5670)
- Renamed utils modules (#5199)
  - `argparse_utils` >> `argparse`
  - `model_utils` >> `model_helpers`
  - `warning_utils` >> `warnings`
  - `xla_device_utils` >> `xla_device`
- Deprecated using `'val_loss'` to set the `ModelCheckpoint` monitor (#6012); see the sketch after this list
- Deprecated `.get_model()` with explicit `.lightning_module` property (#6035)
- Deprecated Trainer attribute `accelerator_backend` in favor of `accelerator` (#6034)
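Following the `'val_loss'` deprecation, the monitored quantity should be named explicitly both when logging and when constructing the callback. A sketch (the validation logic itself is omitted):

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# name the monitored metric explicitly instead of relying on 'val_loss' (#6012)
checkpoint_cb = ModelCheckpoint(monitor="val_loss", mode="min")
trainer = pl.Trainer(callbacks=[checkpoint_cb])

# ...and in the LightningModule's validation_step, log that exact key:
#     self.log("val_loss", loss)
```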
Removed
- Removed deprecated checkpoint argument `filepath` (#5321)
- Removed deprecated `Fbeta`, `f1_score` and `fbeta_score` metrics (#5322)
- Removed deprecated `TrainResult` (#5323)
- Removed deprecated `EvalResult` (#5633)
- Removed `LoggerStages` (#5673)
Fixed
- Fixed distributed setting and `ddp_cpu` only with `num_processes>1` (#5297)
- Fixed the saved filename in `ModelCheckpoint` when it already exists (#4861)
- Fixed `DDPHPCAccelerator` hanging in DDP construction by calling `init_device` (#5157)
- Fixed `num_workers` for Windows example (#5375)
- Fixed loading yaml (#5619)
- Fixed support for custom DataLoaders with DDP when they can be re-instantiated (#5745)
- Fixed repeated `.fit()` calls ignoring the `max_steps` iteration bound (#5936)
- Fixed throwing `MisconfigurationError` on unknown mode (#5255)
- Resolved a bug with finetuning (#5744)
- Fixed `ModelCheckpoint` race condition in file existence check (#5155)
- Fixed some compatibility with PyTorch 1.8 (#5864)
- Fixed forward cache (#5895)
- Fixed recursive detach of tensors to CPU (#6007)
- Fixed passing wrong strings for scheduler interval not throwing an error (#5923)
- Fixed wrong `requires_grad` state after `return None` with multiple optimizers (#5738)
- Fixed adding the `on_epoch_end` hook at the end of the `validation` and `test` epochs (#5986)
- Fixed missing `process_dataloader` call for `TPUSpawn` when in distributed mode (#6015)
- Fixed progress bar flickering by appending 0 to floats/strings (#6009)
- Fixed synchronization issues with TPU training (#6027)
- Fixed `hparams.yaml` saved twice when using `TensorBoardLogger` (#5953)
- Fixed basic examples (#5912, #5985)
- Fixed `fairscale` compatibility with PyTorch 1.8 (#5996)
- Ensured `process_dataloader` is called when `tpu_cores > 1` to use Parallel DataLoader (#6015)
- Attempted SLURM auto resume call when non-shell call fails (#6002)
- Fixed wrapping optimizers upon assignment (#6006)
- Fixed allowing hashing of metrics with lists in their state (#5939)
Contributors
@alanhdu, @ananthsub, @awaelchli, @Borda, @borisdayma, @carmocca, @ddrevicky, @deng-cy, @ducthienbui97, @justusschock, @kartik4949, @kaushikb11, @manipopopo, @marload, @neighthan, @peblair, @prampey, @pranjaldatta, @rohitgr7, @SeanNaren, @sid-sundrani, @SkafteNicki, @tadejsv, @tchaton, @teddykoker, @titu1994, @yuntai
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.7] - 2021-02-03
Fixed
- Fixed `TensorBoardLogger` not closing `SummaryWriter` on `finalize` (#5696)
- Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
- Fixed `num_classes` argument in F1 metric (#5663)
- Fixed `log_dir` property (#5537)
- Fixed a race condition in `ModelCheckpoint` when checking if a checkpoint file exists (#5144)
- Remove unnecessary intermediate layers in Dockerfiles (#5697)
- Fixed auto learning rate ordering (#5638)
Contributors
@awaelchli @guillochon @noamzilo @rohitgr7 @SkafteNicki @sumanthratna
If we forgot someone due to not matching commit email with GitHub account, let us know :]