Removed flush_logs_every_n_steps argument from Trainer #13074

Merged · 6 commits · May 25, 2022
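In practice this PR moves flush configuration from the Trainer to the individual logger. A before/after sketch (the logger values are illustrative, not mandated by the PR):

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TensorBoardLogger

    # Before (deprecated in v1.5, removed here):
    # trainer = Trainer(flush_logs_every_n_steps=100)

    # After: configure flushing on the logger itself
    trainer = Trainer(logger=TensorBoardLogger("logs/", flush_secs=120))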
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -120,6 +120,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Removed

+- Removed the deprecated `flush_logs_every_n_steps` argument from the `Trainer` constructor ([#13074](https://github.com/PyTorchLightning/pytorch-lightning/pull/13074))


- Removed the deprecated `process_position` argument from the `Trainer` constructor ([#13071](https://github.com/PyTorchLightning/pytorch-lightning/pull/13071))


24 changes: 0 additions & 24 deletions docs/source/common/trainer.rst
@@ -695,30 +695,6 @@ impact to subsequent runs. These are the changes enabled:
- Disables the Tuner.
- If using the CLI, the configuration file is not saved.

-flush_logs_every_n_steps
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. warning:: ``flush_logs_every_n_steps`` has been deprecated in v1.5 and will be removed in v1.7.
-    Please configure flushing directly in the logger instead.
-
-.. raw:: html
-
-    <video width="50%" max-width="400px" controls
-    poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/flush_logs%E2%80%A8_every_n_steps.jpg"
-    src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/flush_logs_every_n_steps.mp4"></video>
-
-|
-
-Writes logs to disk this often.
-
-.. testcode::
-
-    # default used by the Trainer
-    trainer = Trainer(flush_logs_every_n_steps=100)
-
-See Also:
-    - :doc:`logging <../extensions/logging>`
-
.. _gpus:

gpus
17 changes: 8 additions & 9 deletions docs/source/visualize/logging_advanced.rst
@@ -49,20 +49,19 @@ To change this behaviour, set the *log_every_n_steps* :class:`~pytorch_lightning
Modify flushing frequency
=========================

-Metrics are kept in memory for N steps to improve training efficiency. Every N steps, metrics flush to disk. To change the frequency of this flushing, use the *flush_logs_every_n_steps* Trainer argument.
+Some loggers keep logged metrics in memory for N steps and only periodically flush them to disk to improve training efficiency.
+Every logger handles this a bit differently. For example, here is how to fine-tune flushing for the TensorBoard logger:

.. code-block:: python

-    # faster training, high memory
-    Trainer(flush_logs_every_n_steps=500)
+    # Default used by TensorBoard: write to disk after 10 logging events or every two minutes
+    logger = TensorBoardLogger(..., max_queue=10, flush_secs=120)

-    # slower training, low memory
-    Trainer(flush_logs_every_n_steps=10)
+    # Faster training, more memory used
+    logger = TensorBoardLogger(..., max_queue=100)

-The higher *flush_logs_every_n_steps* is, the faster the model will train but the memory will build up until the next flush.
-The smaller *flush_logs_every_n_steps* is, the slower the model will train but memory will be kept to a minimum.
-
-TODO: chart
+    # Slower training, less memory used
+    logger = TensorBoardLogger(..., max_queue=1)
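Spelled out in full, here is a minimal runnable version of the new docs example; the `save_dir` and the specific values are illustrative, and `max_queue`/`flush_secs` are simply forwarded to TensorBoard's `SummaryWriter`:

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TensorBoardLogger

    # Buffer up to 50 pending events in memory and force a write to disk
    # at least every 60 seconds, whichever comes first.
    logger = TensorBoardLogger(save_dir="logs/", max_queue=50, flush_secs=60)
    trainer = Trainer(logger=logger)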

----

4 changes: 1 addition & 3 deletions pytorch_lightning/loops/epoch/training_epoch_loop.py
@@ -528,9 +528,7 @@ def _should_check_val_fx(self, batch_idx: int, is_last_batch: bool) -> bool:

    def _save_loggers_on_train_batch_end(self) -> None:
        """Flushes loggers to disk."""
-        # this assumes that `batches_that_stepped` was increased before
-        should_flush = self._batches_that_stepped % self.trainer.flush_logs_every_n_steps == 0
-        if should_flush or self.trainer.should_stop:
+        if self.trainer.should_stop:
            for logger in self.trainer.loggers:
                logger.save()
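With the Trainer flag gone, the loop above only force-flushes when training stops. If you depended on periodic flushing independent of any logger setting, a small callback can approximate the old behavior — a sketch only, not part of this PR; the class name and `every_n_steps` parameter are made up here:

    from pytorch_lightning import Callback

    class FlushLoggersEveryNSteps(Callback):
        """Hypothetical helper that mimics the removed Trainer flag."""

        def __init__(self, every_n_steps: int = 100):
            self.every_n_steps = every_n_steps

        def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
            # logger.save() is the same hook the training loop calls above
            if trainer.global_step % self.every_n_steps == 0:
                for logger in trainer.loggers:
                    logger.save()

Attach it with `Trainer(callbacks=[FlushLoggersEveryNSteps(100)])`.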

@@ -43,19 +43,10 @@ def __init__(self, trainer: "pl.Trainer") -> None:
    def on_trainer_init(
        self,
        logger: Union[bool, Logger, Iterable[Logger]],
-        flush_logs_every_n_steps: Optional[int],
        log_every_n_steps: int,
        move_metrics_to_cpu: bool,
    ) -> None:
        self.configure_logger(logger)
-        if flush_logs_every_n_steps is not None:
-            rank_zero_deprecation(
-                f"Setting `Trainer(flush_logs_every_n_steps={flush_logs_every_n_steps})` is deprecated in v1.5 "
-                "and will be removed in v1.7. Please configure flushing in the logger instead."
-            )
-        else:
-            flush_logs_every_n_steps = 100  # original default parameter
-        self.trainer.flush_logs_every_n_steps = flush_logs_every_n_steps
        self.trainer.log_every_n_steps = log_every_n_steps
        self.trainer.move_metrics_to_cpu = move_metrics_to_cpu
        for logger in self.trainer.loggers:
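The removed branch also dropped the old fallback of flushing every 100 steps. Loggers that buffer in-process expose their own knob instead; for instance, `CSVLogger` takes a `flush_logs_every_n_steps` argument (default 100 around this release) — a usage sketch, with an illustrative `save_dir`:

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import CSVLogger

    # flush buffered metric rows to metrics.csv every 100 logged steps
    logger = CSVLogger(save_dir="logs/", flush_logs_every_n_steps=100)
    trainer = Trainer(logger=logger)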
9 changes: 1 addition & 8 deletions pytorch_lightning/trainer/trainer.py
@@ -159,7 +159,6 @@ def __init__(
        limit_test_batches: Optional[Union[int, float]] = None,
        limit_predict_batches: Optional[Union[int, float]] = None,
        val_check_interval: Optional[Union[int, float]] = None,
-        flush_logs_every_n_steps: Optional[int] = None,
        log_every_n_steps: int = 50,
        accelerator: Optional[Union[str, Accelerator]] = None,
        strategy: Optional[Union[str, Strategy]] = None,
@@ -260,12 +259,6 @@
            of train, val and test to find any bugs (ie: a sort of unit test).
            Default: ``False``.

-            flush_logs_every_n_steps: How often to flush logs to disk (defaults to every 100 steps).
-
-                .. deprecated:: v1.5
-                    ``flush_logs_every_n_steps`` has been deprecated in v1.5 and will be removed in v1.7.
-                    Please configure flushing directly in the logger instead.
-
            gpus: Number of GPUs to train on (int) or which GPUs to train on (list or str) applied per node
                Default: ``None``.

@@ -555,7 +548,7 @@ def __init__(

        # init logger flags
        self._loggers: List[Logger]
-        self._logger_connector.on_trainer_init(logger, flush_logs_every_n_steps, log_every_n_steps, move_metrics_to_cpu)
+        self._logger_connector.on_trainer_init(logger, log_every_n_steps, move_metrics_to_cpu)

        # init debugging flags
        self.val_check_interval: Union[int, float]
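Worth noting: `log_every_n_steps`, which remains, controls how often metrics are recorded, while flushing controls how often buffered records reach disk; after this PR the two are configured in different places. For illustration (values arbitrary):

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TensorBoardLogger

    trainer = Trainer(
        log_every_n_steps=50,  # record metrics every 50 steps
        logger=TensorBoardLogger("logs/", flush_secs=120),  # write them out every 2 minutes
    )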
5 changes: 0 additions & 5 deletions tests/deprecated_api/test_remove_1-7.py
@@ -61,11 +61,6 @@ def on_keyboard_interrupt(self, trainer, pl_module):
    trainer.fit(model)


-def test_v1_7_0_flush_logs_every_n_steps_trainer_constructor(tmpdir):
-    with pytest.deprecated_call(match=r"Setting `Trainer\(flush_logs_every_n_steps=10\)` is deprecated in v1.5"):
-        _ = Trainer(flush_logs_every_n_steps=10)
-
-
class BoringCallbackDDPSpawnModel(BoringModel):
    def add_to_queue(self, queue):
        ...
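Since the argument is now entirely gone from the signature, passing it should surface as an ordinary TypeError rather than a deprecation warning — a hypothetical follow-up test, assuming `Trainer` takes no catch-all `**kwargs`:

    import pytest
    from pytorch_lightning import Trainer

    def test_v1_7_0_flush_logs_every_n_steps_removed():
        # the keyword no longer exists on the constructor
        with pytest.raises(TypeError, match="flush_logs_every_n_steps"):
            Trainer(flush_logs_every_n_steps=10)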