Releases: Lightning-AI/pytorch-lightning
Standard weekly patch release
[1.1.6] - 2021-01-26
Changed
- Increased TPU check timeout from 20s to 100s (#5598)
- Ignored
step
param in Neptune logger's log_metric method (#5510) - Pass batch outputs to
on_train_batch_end
instead ofepoch_end
outputs (#4369)
Fixed
- Fixed
toggle_optimizer
to resetrequires_grad
state (#5574) - Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
- Fixed an error when logging a progress bar metric with a reserved name (#5620)
- Fixed
Metric
'sstate_dict
not included when child modules (#5614) - Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
- Fixed duplicate logs appearing in console when using the python logging module (#5509)
- Fixed tensor printing in
trainer.test()
(#5138) - Fixed not using dataloader when
hparams
present (#4559)
Contributors
@awaelchli @bryant1410 @lezwon @manipopopo @PiotrJander @psinger @rnett @SeanNaren @swethmandava @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.5] - 2021-01-19
Fixed
- Fixed a visual bug in the progress bar display initialization (#4579)
- Fixed logging
on_train_batch_end
in a callback with multiple optimizers (#5521) - Fixed
reinit_scheduler_properties
with correct optimizer (#5519) - Fixed
val_check_interval
withfast_dev_run
(#5540)
Contributors
@awaelchli, @carmocca, @rohitgr7
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.4] - 2021-01-12
Added
- Add automatic optimization property setter to lightning module (#5169)
Changed
- Changed deprecated
enable_pl_optimizer=True
(#5244)
Fixed
- Fixed
transfer_batch_to_device
for DDP withlen(devices_ids) == 1
(#5195) - Logging only on
not should_accumulate()
during training (#5417) - Resolve interpolation bug with Hydra (#5406)
- Check environ before selecting a seed to prevent warning message (#4743)
Contributors
@ananthsub, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.1.3] - 2021-01-05
Added
- Added a check for optimizer attached to
lr_scheduler
(#5338) - Added support for passing non-existing
filepaths
toresume_from_checkpoint
(#4402)
Changed
- Skip restore from
resume_from_checkpoint
whiletesting
(#5161) - Allowed
log_momentum
for adaptive optimizers inLearningRateMonitor
(#5333) - Disabled checkpointing, earlystopping and logging with
fast_dev_run
(#5277) - Distributed group defaults to
WORLD
ifNone
(#5125)
Fixed
- Fixed
trainer.test
returning non-test metrics (#5214) - Fixed metric state reset (#5273)
- Fixed
--num-nodes
onDDPSequentialPlugin
(#5327) - Fixed invalid value for
weights_summary
(#5296) - Fixed
Trainer.test
not using the latestbest_model_path
(#5161) - Fixed existence check for
hparams
not using underlying filesystem (#5250) - Fixed
LightningOptimizer
AMP bug (#5191) - Fixed casted key to string in
_flatten_dict
(#5354)
Contributors
@8greg8, @haven-jeon, @kandluis, @marload, @rohitgr7, @tadejsv, @tarepan, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Overview
Detail changes
Added
- Support number for logging with
sync_dist=True
(#5080) - Added offset logging step when resuming for Wandb logger (#5050)
Removed
enable_pl_optimizer=False
by default to temporarily fix AMP issues (#5163)
Fixed
- Metric reduction with Logging (#5150)
- Remove nan loss in manual optimization (#5121)
- Un-balanced logging properly supported (#5119)
- Fix hanging in DDP HPC accelerators (#5157)
- Fix saved filename in
ModelCheckpoint
if it already exists (#4861) - Fix reset
TensorRunningAccum
(#5106) - Updated
DALIClassificationLoader
to not use deprecated arguments (#4925) - Corrected call to
torch.no_grad
(#5124)
Contributors
@8greg8, @ananthsub, @borisdayma, @gan3sh500, @rohitgr7, @SeanNaren, @tchaton, @VinhLoiIT
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Overview
Detail changes
Added
- Add a notebook example to reach a quick baseline of ~94% accuracy on CIFAR10 using Resnet in Lightning (#4818)
Changed
Removed
Fixed
- Fixed trainer by default
None
inDDPAccelerator
(#4915) - Fixed
LightningOptimizer
to expose optimizer attributes (#5095) - Do not warn when the
name
key is used in thelr_scheduler
dict (#5057) - Check if optimizer supports closure (#4981)
- Extend LightningOptimizer to exposure underlying Optimizer attributes + update doc (#5095)
- Add deprecated metric utility functions back to functional (#5067, #5068)
- Allow any input in
to_onnx
andto_torchscript
(#4378) - Do not warn when the name key is used in the
lr_scheduler
dict (#5057)
Contributors
@Borda, @carmocca, @hemildesai, @rohitgr7, @s-rog, @tarepan, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Model Parallelism Training and More Logging Options
Overview
Lightning 1.1 is out! You can now train models with twice the parameters and zero code changes with the new sharded model training! We also have a new plugin for sequential model parallelism, more logging options, and a lot of improvements!
Release highlights: https://bit.ly/3gyLZpP
Learn more about sharded training: https://bit.ly/2W3hgI0
Detail changes
Added
- Added "monitor" key to saved
ModelCheckpoints
(#4383) - Added
ConfusionMatrix
class interface (#4348) - Added multiclass AUROC metric (#4236)
- Added global step indexing to the checkpoint name for a better sub-epoch checkpointing experience (#3807)
- Added optimizer hooks in callbacks (#4379)
- Added option to log momentum (#4384)
- Added
current_score
toModelCheckpoint.on_save_checkpoint
(#4721) - Added logging using
self.log
in train and evaluation for epoch end hooks (#4913) - Added ability for DDP plugin to modify optimizer state saving (#4675)
- Added casting to python types for NumPy scalars when logging
hparams
(#4647) - Added
prefix
argument in loggers (#4557) - Added printing of total num of params, trainable and non-trainable params in ModelSummary (#4521)
- Added
PrecisionRecallCurve, ROC, AveragePrecision
class metric (#4549) - Added custom
Apex
andNativeAMP
asPrecision plugins
(#4355) - Added
DALI MNIST
example (#3721) - Added
sharded plugin
for DDP for multi-GPU training memory optimizations (#4773) - Added
experiment_id
to the NeptuneLogger (#3462) - Added
Pytorch Geometric
integration example with Lightning (#4568) - Added
all_gather
method toLightningModule
which allows gradient-based tensor synchronizations for use-cases such as negative sampling. (#5012) - Enabled
self.log
in most functions (#4969) - Added changeable extension variable for
ModelCheckpoint
(#4977)
Changed
- Removed
multiclass_roc
andmulticlass_precision_recall_curve
, useroc
andprecision_recall_curve
instead (#4549) - Tuner algorithms will be skipped if
fast_dev_run=True
(#3903) - WandbLogger does not force wandb
reinit
arg to True anymore and creates a run only when needed (#4648) - Changed
automatic_optimization
to be a model attribute (#4602) - Changed
Simple Profiler
report to order by percentage time spent + num calls (#4880) - Simplify optimization Logic (#4984)
- Classification metrics overhaul (#4837)
- Updated
fast_dev_run
to accept integer representing num_batches (#4629) - Refactored optimizer (#4658)
Deprecated
- Deprecated
prefix
argument inModelCheckpoint
(#4765) - Deprecated the old way of assigning hyper-parameters through
self.hparams = ...
(#4813) - Deprecated
mode='auto'
fromModelCheckpoint
andEarlyStopping
(#4695)
Removed
- Removed
reorder
parameter of theauc
metric (#5004)
Fixed
- Added feature to move tensors to CPU before saving (#4309)
- Fixed
LoggerConnector
to have logged metrics on root device in DP (#4138) - Auto convert tensors to contiguous format when
gather_all
(#4907) - Fixed
PYTHONPATH
for DDP test model (#4528) - Fixed allowing logger to support indexing (#4595)
- Fixed DDP and manual_optimization (#4976)
Contributors
@ananyahjha93, @awaelchli, @blatr, @Borda, @borisdayma, @carmocca, @ddrevicky, @george-gca, @gianscarpe, @irustandi, @janhenriklambrechts, @jeremyjordan, @justusschock, @lezwon, @rohitgr7, @s-rog, @SeanNaren, @SkafteNicki, @tadejsv, @tchaton, @williamFalcon, @zippeurfou
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added casting to python types for numpy scalars when logging
hparams
(#4647) - Added warning when progress bar refresh rate is less than 20 on Google Colab to prevent crashing (#4654)
- Added
F1
class metric (#4656)
Changed
- Consistently use
step=trainer.global_step
inLearningRateMonitor
independently oflogging_interval
(#4376) - Metric states are no longer as default added to
state_dict
(#4685) - Renamed class metric
Fbeta
>>FBeta
(#4656) - Model summary: add 1 decimal place (#4745)
- Do not override
PYTHONWARNINGS
(#4700)
Fixed
- Fixed checkpoint
hparams
dict casting whenomegaconf
is available (#4770) - Fixed incomplete progress bars when total batches not divisible by refresh rate (#4577)
- Updated SSIM metric (#4566)(#4656)
- Fixed batch_arg_name - add
batch_arg_name
to all calls to_adjust_batch_size
bug (#4812) - Fixed
torchtext
data to GPU (#4785) - Fixed a crash bug in MLFlow logger (#4716)
Contributors
@awaelchli, @jonashaag, @jungwhank, @M-Salti, @moi90, @pgagarinov, @s-rog, @Samyak2, @SkafteNicki, @teddykoker, @ydcjeff
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added lambda closure to
manual_optimizer_step
(#4618)
Changed
- Change Metrics
persistent
default mode toFalse
(#4685)
Fixed
- Prevent crash if
sync_dist=True
on CPU (#4626) - Fixed average pbar Metrics (#4534)
- Fixed
setup
callback hook to correctly pass the LightningModule through (#4608) - Allowing decorate model init with saving
hparams
inside (#4662) - Fixed
split_idx
set byLoggerConnector
inon_trainer_init
toTrainer
(#4697)
Contributors
@ananthsub, @Borda, @SeanNaren, @SkafteNicki, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added metrics aggregation in Horovod and fixed early stopping (#3775)
- Added
manual_optimizer_step
which work withAMP Native
andaccumulated_grad_batches
(#4485) - Added
persistent(mode)
method to metrics, to enable and disable metric states being added tostate_dict
(#4482) - Added congratulations at the end of our notebooks (#4555)
Changed
- Changed
fsspec
to tuner (#4458) - Unify sLURM/TorchElastic under backend plugin (#4578, #4580, #4581, #4582, #4583)
Fixed
- Fixed feature-lack in
hpc_load
(#4526) - Fixed metrics states being overridden in DDP mode (#4482)
- Fixed
lightning_getattr
,lightning_hasattr
not finding the correct attributes in datamodule (#4347) - Fixed automatic optimization AMP by
manual_optimization_step
(#4485) - Replace
MisconfigurationException
with warning inModelCheckpoint
Callback (#4560) - Fixed logged keys in mlflow logger (#4412)
- Fixed
is_picklable
by catchingAttributeError
(#4508)
Contributors
@dscarmo, @jtamir, @kazhang, @maxjeblick, @rohitgr7, @SkafteNicki, @tarepan, @tchaton, @tgaddair, @williamFalcon
If we forgot someone due to not matching commit email with GitHub account, let us know :]