[RFC] Deprecate `weights_summary` off the Trainer constructor #9043

Comments
@ananthsub the callback idea sounds nice but how do we make it so that a user can disable it? It's the same reason why we have the …
This assumes the weights summary is enabled by default. In this scenario, an incremental approach would be: …
Or, we can make this opt-in: remove the argument from the Trainer constructor entirely, and require that users enable it by instantiating a callback and passing it to the Trainer's `callbacks` argument.
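A minimal sketch of what that opt-in flow could look like, assuming a hypothetical `ModelSummary` callback class (no such callback existed at the time of this RFC) and the `LitModel` from elsewhere in the thread:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelSummary  # hypothetical at the time of this RFC

model = LitModel()  # assumed LightningModule from the thread
# The summary is produced only because the user explicitly registers the callback:
trainer = Trainer(callbacks=[ModelSummary(max_depth=2)])
trainer.fit(model)  # the callback, not the Trainer core, prints the summary
```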
Yes, this was my assumption and I hope we can keep that. I will definitely vote for that, but I am biased since I have been working on that summary, so I prefer if others could comment @PyTorchLightning/core-contributors. I strongly believe everyone working with ML models should be aware of how many parameters they are training with.
What this would likely also introduce is users needing to extend their callback from a particular summary base class in order for the trainer to validate whether it should add the default callback or not. This is the case for the `ModelCheckpoint` callback today. However, the model checkpoint callback is not explicitly designed with extensibility in mind. See prior issues proposing a base interface. This could limit what a custom summary could do in case the base class is too restrictive. I'd also like to avoid dependencies on inheritance wherever possible.
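For illustration, a rough sketch of the inheritance-based check being described, with all names assumed (it mirrors how the Trainer decides whether to add a default checkpoint callback):

```python
# Sketch with assumed names: the Trainer would add its default summary
# callback only if the user hasn't registered one extending the base class.
class ModelSummaryBase:
    """Hypothetical base class that custom summary callbacks must extend."""

class DefaultModelSummary(ModelSummaryBase):
    """Hypothetical built-in summary callback."""

def _configure_model_summary_callback(trainer) -> None:
    if any(isinstance(cb, ModelSummaryBase) for cb in trainer.callbacks):
        return  # a user-provided summary callback takes precedence
    trainer.callbacks.append(DefaultModelSummary())
```

This is exactly the dependency on inheritance the comment cautions against: a custom summary that does not extend the base class would not be recognized.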
Would definitely like to hear @kaushikb11's opinion given the ongoing work on the Rich-based model summary.
@ananthsub I absolutely agree with you on introducing a new callback for the model summary. There are two parts to this issue as you mentioned. Regarding the second part of the proposal, I strongly believe we shouldn't deprecate the existing summary utility. Also, with the current `ModelSummary` utility, users can do:

```python
model = LitModel()
ModelSummary(model, max_depth=1)
```

Hence, we should be supporting both: the existing utility and the new callback.
@kaushikb11 @awaelchli - what do you think of this to mirror what was done for …
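The exact snippet proposed here was lost in extraction, but judging from the later comments about the "enable_" prefix, the flag-based API under discussion presumably looked something like this (flag name is an assumption):

```python
from pytorch_lightning import Trainer

# Flag name assumed from the "enable_" discussion later in this thread.
# The boolean only toggles whether the Trainer registers the default callback:
trainer = Trainer(enable_model_summary=True)   # default summary callback added
trainer = Trainer(enable_model_summary=False)  # no summary callback added
```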
I like that very much. This is the minimal functionality I wish we could keep. Btw, on a side note: I believe the best way forward would be to have a model summary class whose only responsibility is collecting the summary data (like it is now), BUT which does not contain the logic for printing and visualization. I think it would be best if that lived in the callback. This way it will also be easier to customize things like rich logging while keeping the actual model summary untouched. Not sure if @kaushikb11 was already going in that direction, he might have …
@awaelchli I fully agree regarding where the output of the summarization should go. Outputting the summary should live in the callback and not in the utils, as it does today.
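A minimal sketch of that split, assuming a hypothetical `ModelSummaryCallback` name and using the existing `ModelSummary` utility purely for data collection:

```python
from pytorch_lightning.callbacks import Callback
from pytorch_lightning.utilities.model_summary import ModelSummary

class ModelSummaryCallback(Callback):
    """Hypothetical callback: owns presentation, delegates data collection."""

    def __init__(self, max_depth: int = 1):
        self.max_depth = max_depth

    def on_fit_start(self, trainer, pl_module):
        summary = ModelSummary(pl_module, max_depth=self.max_depth)  # data only
        # Presentation lives here, so swapping in a rich renderer (or any
        # other formatter) never touches the summary data structure itself.
        print(str(summary))
```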
Awesome! Let's do this :)
How would we support the following then?

```python
from pytorch_lightning.utilities.model_summary import ModelSummary

model = LitModel()
ModelSummary(model, max_depth=1)
```

IMO, we could have the default string output for the …
@ananthsub would it make sense to drop the "enable_" prefix? It seems redundant because the type is bool anyway. #9664 has the same problem imo.
The callback idea is great: we can call it anywhere, plus it gives the flexibility of configuring `ModelSummary()` …
Proposed refactoring or deprecation

1. Move the model summary into a callback, building on the existing `summarize` utility: https://github.com/PyTorchLightning/pytorch-lightning/blob/8a931732ae5135e3e55d9c7b7031d81837e5798a/pytorch_lightning/utilities/model_summary.py#L437-L439
2. Deprecate `weights_summary` off the Trainer constructor

Motivation

We are auditing the Lightning components and APIs to assess opportunities for improvements. This is a followup to #8478 and #9006.
Why do we want to remove this from the core trainer logic?

- The summarization logic already lives in the standalone utility; the only configuration the Trainer flag controls is the depth (`max_depth`). This gives model summarization more room to grow without cascading changes elsewhere.
- The Trainer produces the summary exactly once, at the start of `fit()`. But users could want to call this potentially multiple times, during each of `trainer.fit()`, `trainer.validate()`, `trainer.test()` or `trainer.predict()`.
- The summary can fail if `example_input_array` is set as a property on the LightningModule. For instance, a model wrapped with FSDP will break because parameters need to be all-gathered across layers across ranks.
- The summarization happens inside the `on_pretrain_routine_start/end` hooks. Would we still need these hooks if the summarization logic was removed from the trainer? Why doesn't this happen in `on_train_start` today? We don't have `on_prevalidation_routine_start/end` hooks: the necessity of these hooks for training isn't clear to me, and further deprecating these hooks could bring greater API clarity & simplification.

https://github.com/PyTorchLightning/pytorch-lightning/blob/8a931732ae5135e3e55d9c7b7031d81837e5798a/pytorch_lightning/trainer/trainer.py#L1103-L1113
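For reference, the standalone utility can already be invoked at any point, independent of the Trainer; a minimal sketch (assuming `LitModel` is a LightningModule as in the comments above):

```python
from pytorch_lightning.utilities.model_summary import summarize

model = LitModel()  # any LightningModule
print(summarize(model, max_depth=1))  # build and render the summary on demand
```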
Pitch
A callback in Lightning naturally fits this extension purpose. It generalizes well across LightningModules, offers great flexibility for when it can be called, and allows users to customize the summarization logic, e.g. to integrate other libraries more easily:
https://github.com/tyleryep/torchinfo
https://github.com/facebookresearch/fvcore/blob/master/fvcore/nn/flop_count.py
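As an illustration of that flexibility, a hedged sketch of a user-defined callback wrapping torchinfo (the class name and hook choice here are assumptions, not part of the proposal):

```python
from pytorch_lightning.callbacks import Callback
from torchinfo import summary  # third-party library linked above

class TorchinfoSummary(Callback):
    """Hypothetical user callback delegating summarization to torchinfo."""

    def on_fit_start(self, trainer, pl_module):
        summary(pl_module)  # prints a layer/parameter breakdown
```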
With this callback available, the summarization logic can be removed from the core Trainer, making it more pluggable:
https://github.com/PyTorchLightning/pytorch-lightning/blob/6604fc1344e1b8a459c45a5a2157aa7fc60d950d/pytorch_lightning/trainer/trainer.py#L1000-L1004
Additional context
The model summary is enabled by default right now. Whether it stays opt-out or becomes opt-in is likely the core issue we have to resolve: #8478 (comment)

Seeking @edenafek's and @tchaton's input on this.