All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Support for Lightning and PyTorch `2.5.0`
- FTS support for PyTorch's composable distributed (e.g. `fully_shard`, `checkpoint`) and Tensor Parallelism (TP) APIs
- Support for Lightning's `ModelParallelStrategy`
- Experimental 'Auto' FSDP2 Plan Configuration feature, allowing application of the `fully_shard` API using module name/pattern-based configuration instead of manually inspecting modules and applying the API in `LightningModule.configure_model` (a sketch of the manual approach this replaces appears below)
- FSDP2 'Auto' Plan Convenience Aliases, simplifying use of both composable and non-composable activation checkpointing APIs
- Flexible orchestration of advanced profiling combining multiple complementary PyTorch profilers with FTS `MemProfiler`
- removed support for PyTorch `2.1`
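
For context on the 'Auto' plan configuration above, here is a minimal sketch of the manual FSDP2 wrapping it replaces, assuming a hypothetical `LightningModule` whose `self.model` exposes transformer blocks via a `layers` attribute; the `fully_shard` import path shown is the PyTorch 2.5 one:

```python
from lightning.pytorch import LightningModule
from torch.distributed._composable.fsdp import fully_shard  # PyTorch 2.5 import path


class MyTransformerModule(LightningModule):
    def configure_model(self) -> None:
        # Manually inspect modules and apply the composable FSDP2 API:
        # shard each transformer block individually, then the root module.
        for block in self.model.layers:
            fully_shard(block)
        fully_shard(self.model)
```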
- Support for Lightning and PyTorch `2.4.1`
- Added logic to more robustly condition depth-aligned checkpoint metadata updates to address edge-cases where `current_score` precisely equaled the `best_model_score` at multiple different depths. Resolved #15.
- Support for Lightning and PyTorch `2.4.0`
- Support for Python `3.12`
- Changed the default value of the `frozen_bn_track_running_stats` option of the FTS callback constructor to `True`.
- removed support for PyTorch `2.0`
- removed support for Python `3.8`
- Support for Lightning <= `2.3.3` (includes critical security fixes) and PyTorch <= `2.3.1`
- Support for Lightning <= `2.3.2` and PyTorch <= `2.3.1`
- Support for Lightning and PyTorch `2.3.0`
- Introduced the `frozen_bn_track_running_stats` option to the FTS callback constructor, allowing the user to override the default Lightning behavior that disables `track_running_stats` when freezing BatchNorm layers (see the sketch below). Resolves #13.
- removed support for PyTorch `1.13`
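
A minimal sketch of overriding the BatchNorm freezing behavior via the option above, assuming the `FinetuningScheduler` callback importable from the `finetuning_scheduler` package (other `Trainer` arguments omitted):

```python
from finetuning_scheduler import FinetuningScheduler
from lightning.pytorch import Trainer

trainer = Trainer(
    callbacks=[
        # keep BatchNorm running statistics updating even while the layers are frozen
        FinetuningScheduler(frozen_bn_track_running_stats=True),
    ]
)
```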
- Support for Lightning `2.2.4` and PyTorch `2.2.2`
- Support for Lightning `2.2.1`
- Support for Lightning and PyTorch `2.2.0`
- FTS now inspects any base `EarlyStopping` or `ModelCheckpoint` configuration passed in by the user and applies that configuration when instantiating the required FTS callback dependencies (i.e., `FTSEarlyStopping` or `FTSCheckpoint`). Part of the resolution to #12. See the sketch below.
- updated reference to renamed `FSDPPrecision`
- increased `jsonargparse` minimum supported version to `4.26.1`
- Explicitly `rank_zero_only`-guarded `ScheduleImplMixin.save_schedule` and `ScheduleImplMixin.gen_ft_schedule`. Some codepaths were incorrectly invoking them from non-`rank_zero_only` guarded contexts. Resolved #11.
- Added a note in the documentation indicating more clearly the behavior of FTS when no monitor metric configuration is provided. Part of the resolution to #12.
- removed support for PyTorch `1.12`
- removed legacy FTS examples
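
A minimal sketch of the callback-configuration inspection described above, assuming the `FinetuningScheduler` callback from the `finetuning_scheduler` package (the `monitor`/`patience` values are illustrative):

```python
from finetuning_scheduler import FinetuningScheduler
from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import EarlyStopping, ModelCheckpoint

trainer = Trainer(
    callbacks=[
        FinetuningScheduler(),
        # FTS inspects the user-provided configuration below and applies it when
        # instantiating its FTSEarlyStopping/FTSCheckpoint dependencies
        EarlyStopping(monitor="val_loss", patience=3),
        ModelCheckpoint(monitor="val_loss", save_top_k=2),
    ]
)
```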
- Support for Lightning `2.1.4`
- bumped `sphinx` requirement to `>5.0,<6.0`
- removed deprecated lr `verbose` init param usage
- removed deprecated `tensorboard.dev` references
- Support for Lightning `2.1.3`
- Support for Lightning `2.1.2`
- Explicitly `rank_zero_only`-guarded `ScheduleImplMixin.save_schedule` and `ScheduleImplMixin.gen_ft_schedule`. Some codepaths were incorrectly invoking them from non-`rank_zero_only` guarded contexts. Resolves #11.
- Support for Lightning `2.1.1`
- Support for Lightning and PyTorch `2.1.0`
- Support for Python `3.11`
- Support for simplified scheduled FSDP training with PyTorch >= `2.1.0` and `use_orig_params` set to `True` (see the sketch below)
- Unified different FSDP `use_orig_params` mode code-paths to support saving/restoring full, consolidated OSD (PyTorch versions >= `2.0.0`)
- added support for FSDP `activation_checkpointing_policy` and updated FSDP profiling examples accordingly
- added support for `CustomPolicy` and new implementation of `ModuleWrapPolicy` with FSDP `2.1.0`
- FSDP profiling examples now use a patched version of `FSDPStrategy` to avoid omni-us/jsonargparse#337 with `jsonargparse` < `4.23.1`
- updated `validate_min_wrap_condition` to avoid overly restrictive validation in some `use_orig_params` contexts
- for PyTorch versions < 2.0, when using the FSDP strategy, disabled optimizer state saving/restoration per Lightning-AI/pytorch-lightning#18296
- improved fsdp strategy adapter `no_decay` attribute handling
- `FSDPStrategyAdapter` now uses the `configure_model` hook rather than the deprecated `configure_sharded_model` hook to apply the relevant model wrapping. See Lightning-AI/pytorch-lightning#18004 for more context regarding `configure_sharded_model` deprecation.
- Dropped support for PyTorch `1.11.x`.
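
A minimal sketch of the simplified scheduled FSDP configuration referenced above, assuming the `FinetuningScheduler` callback; `torch.nn.TransformerEncoderLayer` stands in for whatever block class your model shards and checkpoints:

```python
import torch
from finetuning_scheduler import FinetuningScheduler
from lightning.pytorch import Trainer
from lightning.pytorch.strategies import FSDPStrategy

strategy = FSDPStrategy(
    # simplified scheduled fine-tuning path with PyTorch >= 2.1.0
    use_orig_params=True,
    # apply activation checkpointing to the listed module classes
    activation_checkpointing_policy={torch.nn.TransformerEncoderLayer},
)
trainer = Trainer(strategy=strategy, callbacks=[FinetuningScheduler()])
```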
- Support for Lightning 2.0.8 and 2.0.9
- Support for Lightning 2.0.7
- Support for Lightning 2.0.5 and 2.0.6
- Support for PyTorch Lightning 2.0.3 and 2.0.4
- adjusted default example log name
- disabled fsdp 1.x mixed precision tests temporarily until Lightning-AI/pytorch-lightning#17807 is merged
- Beta support for optimizer reinitialization. Resolves #6
- Use structural typing for Fine-Tuning Scheduler supported optimizers with `ParamGroupAddable`
- Support for `jsonargparse` version `4.20.1`
- During schedule phase transitions, the latest LR state will be restored before proceeding with the next phase configuration and execution (mostly relevant to lr scheduler and optimizer reinitialization but also improves configuration when restoring best checkpoints across multiple depths)
- Allow sharded optimizers (e.g. `ZeroRedundancyOptimizer`) to be properly reconfigured if necessary in the context of `enforce_phase0_params` set to `True` (see the sketch below).
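
A minimal sketch of the sharded-optimizer scenario above, assuming a distributed process group is already initialized by the strategy and that `enforce_phase0_params=True` is set on the FTS callback (optimizer hyperparameters are illustrative):

```python
import torch
from lightning.pytorch import LightningModule
from torch.distributed.optim import ZeroRedundancyOptimizer


class MyShardedOptimizerModule(LightningModule):
    def configure_optimizers(self):
        # optimizer state is sharded across ranks; with enforce_phase0_params=True,
        # FTS reconfigures this optimizer so it covers exactly the phase-0 parameters
        return ZeroRedundancyOptimizer(
            self.parameters(), optimizer_class=torch.optim.AdamW, lr=1e-3
        )
```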
- Support for PyTorch Lightning 2.0.1
- Lightning support for `use_orig_params` (#16733)
- Support for PyTorch and PyTorch Lightning 2.0.0!
- New `enforce_phase0_params` feature. FTS ensures the optimizer configured in `configure_optimizers` will optimize the parameters (and only those parameters) scheduled to be optimized in phase `0` of the current fine-tuning schedule (#9). See the sketch below.
- Support for `torch.compile`
- Support for numerous new FSDP options including preview support for some FSDP options coming soon to Lightning (e.g. `use_orig_params`)
- When using FTS with FSDP, support the use of `_FSDPPolicy` `auto_wrap_policy` wrappers (new in PyTorch 2.0.0)
- Extensive testing for FSDP in many newly supported 2.x contexts (including 1.x FSDP compatibility multi-gpu tests)
- Support for strategies that do not have a canonical `strategy_name` but use `_strategy_flag`
- Now that the core Lightning package is `lightning` rather than `pytorch-lightning`, Fine-Tuning Scheduler (FTS) by default depends upon the `lightning` package rather than the standalone `pytorch-lightning` package. If you would like to continue to use FTS with the standalone `pytorch-lightning` package instead, you can still do so (see the README). Resolves #8.
- Fine-Tuning Scheduler (FTS) major version numbers will align with the rest of the PyTorch ecosystem (e.g. FTS 2.x supports PyTorch and Lightning >= 2.0)
- Switched to use `ruff` instead of `flake8` for linting
- Replaced `fsdp_optim_view` with either `fsdp_optim_transform` or `fsdp_optim_inspect` depending on usage context because the transformation is now not always read-only
- Moved Lightning 1.x examples to the `legacy` subfolder and created new FTS/Lightning 2.x examples in the `stable` subfolder
- Removed `training_epoch_end` and `validation_epoch_end` in accord with Lightning
- Removed `DP` strategy support in accord with Lightning
- Removed support for Python `3.7` and PyTorch `1.10` in accord with Lightning
- Adapted loop synchronization during training resume to upstream Lightning changes
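
A minimal sketch of enabling the `enforce_phase0_params` feature above, assuming the `FinetuningScheduler` callback from the `finetuning_scheduler` package (other `Trainer` arguments omitted):

```python
from finetuning_scheduler import FinetuningScheduler
from lightning.pytorch import Trainer

trainer = Trainer(
    callbacks=[
        # ensure the optimizer returned by configure_optimizers optimizes exactly
        # (and only) the parameters scheduled for phase 0 of the fine-tuning schedule
        FinetuningScheduler(enforce_phase0_params=True),
    ]
)
```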
- Support for `pytorch-lightning` 1.9.4 (which may be the final Lightning 1.x release as PyTorch 2.0 will be released tomorrow)
- FSDP Scheduled Fine-Tuning is now supported! See the tutorial in the documentation.
- Introduced `StrategyAdapter`s. If you want to extend Fine-Tuning Scheduler (FTS) to use a custom, currently unsupported strategy or override current FTS behavior in the context of a given training strategy, subclassing `StrategyAdapter` is now a way to do so (see the sketch below). See `FSDPStrategyAdapter` for an example implementation.
- support for `pytorch-lightning` 1.9.0
- decomposed `add_optimizer_groups` to accommodate the corner case where FTS is being used without an lr scheduler configuration; also cleaned up unrequired example testing warning exceptions
- updated the fts repo issue template
- removed PATH adjustments that are no longer necessary due to Lightning-AI/pytorch-lightning#15485
- removed references to the `finetuning-scheduler` conda-forge package (at least temporarily) due to the current unavailability of upstream dependencies (i.e. the `pytorch-lightning` conda-forge package). Installation of FTS via pip within a conda env is the recommended installation approach (both in the interim and in general).
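
A minimal sketch of the `StrategyAdapter` extension point above; the `finetuning_scheduler.strategy_adapters` import path and the idea of overriding adapter hooks are assumptions here, so see `FSDPStrategyAdapter` in the package source for a real implementation:

```python
from finetuning_scheduler.strategy_adapters import StrategyAdapter


class MyCustomStrategyAdapter(StrategyAdapter):
    """Extend FTS to a custom, currently unsupported training strategy (or override
    FTS behavior for an existing strategy) by overriding the relevant adapter hooks."""

    ...  # e.g. customize how scheduled parameter groups are thawed for the strategy
```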
- support for `pytorch-lightning` 1.8.6
- Notify the user when `max_depth` is reached and provide the current training session stopping conditions. Resolves #7.
- set package version ceilings for the examples requirements along with a note regarding their introduction for stability
- promoted PL CLI references to top-level package
- replaced deprecated `Batch` object reference with `LazyDict`
- support for `pytorch-lightning` 1.8.4
- pinned `jsonargparse` dependency to <4.18.0 until #205 is fixed
- support for `pytorch-lightning` 1.8.2
- support for `pytorch-lightning` 1.8.1
- augmented `standalone_tests.sh` to be more robust to false negatives
- added temporary expected `distutils` warning until fixed upstream in PL
- updated `depth` type hint to accommodate updated mypy default config
- bumped full test timeout to be more conservative given a dependent package that is currently slow to install in some contexts (i.e. `grpcio` on MacOS 11 with python `3.10`)
- support for pytorch-lightning 1.8.0
- support for python 3.10
- support for PyTorch 1.13
- support for `ZeroRedundancyOptimizer`
- call to PL `BaseFinetuning.freeze` did not properly hand control of `BatchNorm` module thawing to FTS schedule. Resolves #5.
- fixed codecov config for azure pipeline gpu-based coverage
- Refactored unexpected and expected multi-warning checks to use a single test helper function
- Adjusted multiple FTS imports to adapt to reorganized PL/Lite imports
- Refactored fts-torch collect_env interface to allow for (slow) collect_env evolution on a per-torch version basis
- Bumped required jsonargparse version
- adapted to PL protection of `_distributed_available`
- made callback setup stage arg mandatory
- updated mypy config to align with PL `Trainer` handling
- updated dockerfile defs for PyTorch 1.13 and python 3.10
- updated github actions versions to current versions
- excluded python 3.10 from torch 1.9 testing due to incompatibility
- removed use of deprecated `LightningCLI` `save_config_overwrite` in PL 1.8
- support for pytorch-lightning 1.7.7
- add new temporary HF expected warning to examples
- added HF `evaluate` dependency for examples
- Use HF `evaluate.load()` instead of `datasets.load_metric()`
- support for pytorch-lightning 1.7.6
- added detection of multiple instances of a given callback dependency parent
- add new expected warning to examples
- import fts to work around a PL TypeError via sphinx import; switched to non-TLS pytorch inv object connection due to current certificate issues
- bumped pytorch dependency in docker image to 1.12.1
- support for pytorch-lightning 1.7.1
- added support for `ReduceLROnPlateau` lr schedulers (see the sketch below)
- improved user experience with additional lr scheduler configuration inspection (using an allowlist approach) and enhanced documentation. Expanded use of `allow_untested` to allow use of unsupported/untested lr schedulers
- added initial user-configured optimizer state inspection prior to phase `0` execution, issuing warnings to the user if appropriate. Added associated documentation #4
- pruned test_examples.py from wheel
- removed a few unused internal conditions relating to lr scheduler reinitialization and parameter group addition
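
A minimal sketch of the newly supported `ReduceLROnPlateau` configuration above, using standard Lightning `configure_optimizers` conventions and current `lightning.pytorch` import names (the monitored metric name is illustrative):

```python
import torch
from lightning.pytorch import LightningModule


class MyPlateauModule(LightningModule):
    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
        # the "monitor" key names the logged metric that drives plateau detection
        return {
            "optimizer": optimizer,
            "lr_scheduler": {"scheduler": scheduler, "monitor": "val_loss"},
        }
```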
- support for pytorch-lightning 1.7.0
- switched to src-layout project structure
- increased flexibility of internal package management
- added a patch to examples to allow them to work with torch 1.12.0 despite issue #80809
- added sync for test log calls for multi-gpu testing
- adjusted runif condition for examples tests
- minor type annotation stylistic correction to avoid jsonargparse issue fixed in #148
- streamlined MANIFEST.in directives
- updated docker image dependencies
- disable mypy unused ignore warnings due to variable behavior depending on ptl installation method (e.g. pytorch-lightning vs full lightning package)
- changed full ci testing on mac to use macOS-11 instead of macOS-10.15
- several type-hint mypy directive updates
- unpinned protobuf in requirements as no longer necessary
- updated cuda docker images to use pytorch-lightning 1.7.0, torch 1.12.0 and cuda-11.6
- refactored mock strategy test to use a different mock strategy
- updated pyproject.toml with jupytext metadata bypass configuration for nb test cleanup
- updated ptl external class references for ptl 1.7.0
- narrowed scope of runif test helper module to only used conditions
- updated nb tutorial links to point to stable branch of docs
- unpinned jsonargparse and bumped min version to 4.9.0
- moved core requirements.txt to requirements/base.txt and updated load_requirements and setup to reference the lightning meta package
- update azure pipelines ci to use torch 1.12.0
- renamed instantiate_registered_class method to instantiate_class due to ptl 1.7 deprecation of cli registry functionality
- removed ddp2 support
- removed use of ptl cli registries in examples due to its deprecation
- enhanced support and testing for lr schedulers with lr_lambdas attributes
- accept and automatically convert schedules with non-integer phase keys (that are convertible to integers) to integers
- pinned jsonargparse to be <= 4.10.1 due to regression with PTL cli with 4.10.2
- updated PL links for new lightning-ai github urls
- added a minimum hydra requirement for cli usage (due to omegaconf version incompatibility)
- separated cli requirements
- replace closed compound instances of `finetuning` with the hyphenated compound version `fine-tuning` in textual contexts. (The way language evolves, `fine-tuning` will eventually become `finetuning`, but it seems like the research community prefers the hyphenated form for now.)
- update fine-tuning scheduler logo for hyphenation
- update strategy resolution in test helper module runif
- bump omegaconf version requirement in examples reqs (in addition to extra reqs) due to omegaconf bug
- Enable use of untested strategies with new flag and user warning
- Update various dependency minimum versions
- Minor example logging update
- minor privacy policy link update
- bump omegaconf version requirement due to omegaconf bug
- Bumped latest tested PL patch version to 1.6.4
- Added basic notebook-based example tests and a new ipynb-specific extra
- Updated docker definitions
- Extended multi-gpu testing to include both oldest and latest supported PyTorch versions
- Enhanced requirements parsing functionality
- cleaned up acknowledged warnings in multi-gpu example testing
- Added LR scheduler reinitialization functionality (#2)
- Added advanced usage documentation
- Added advanced scheduling examples
- added notebook-based tutorial link
- enhanced cli-based example hparam logging among other code clarifications
- addressed URI length limit for custom badge
- allow new deberta fast tokenizer conversion warning for transformers >= 4.19
- bumped latest tested PL patch version to 1.6.3
- added multiple badges (docker, conda, zenodo)
- added build status matrix to readme
- bumped latest tested PL patch version to 1.6.2
- updated citation cff configuration to include all version metadata
- removed tag-based trigger for azure-pipelines multi-gpu job
- added conda-forge package
- added docker release and pypi workflows
- additional badges for readme, testing enhancements for oldest/newest pl patch versions
- bumped latest tested PL patch version to 1.6.1, CLI example depends on PL logger fix (#12609)
- Addressed version prefix issue with readme transformation for pypi
- None (initial release)