Use "Warning" in documentation #29

Merged · 1 commit · Oct 12, 2024
10 changes: 8 additions & 2 deletions README.md
@@ -41,6 +41,7 @@ The scheduled learning rate is dampened by the multiplication of the warmup fact
<p align="center"><img src="https://github.com/Tony-Y/pytorch_warmup/raw/master/examples/emnist/figs/learning_rate.png" alt="Learning rate" width="400"/></p>

#### Approach 1

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tony-Y/colab-notebooks/blob/master/PyTorch_Warmup_Approach1_chaining.ipynb)

When the learning rate schedule uses the global iteration number, the untuned linear warmup can be used
@@ -66,9 +67,12 @@ for epoch in range(1,num_epochs+1):
with warmup_scheduler.dampening():
lr_scheduler.step()
```
Note that the warmup schedule must not be initialized before the initialization of the learning rate schedule.

> [!Warning]
> Note that the warmup schedule must not be initialized before the initialization of the learning rate schedule.
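
For reference, a minimal end-to-end sketch of Approach 1 as described above; the toy model, data, and hyperparameters are illustrative assumptions, not part of this diff:

```python
import torch
import pytorch_warmup as warmup

# Toy model and data, assumed purely for illustration.
model = torch.nn.Linear(10, 1)
dataset = torch.utils.data.TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
num_epochs = 5

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
num_steps = len(dataloader) * num_epochs
# Approach 1: the LR schedule is driven by the global iteration number.
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
# Per the warning above, the warmup schedule is created *after* the LR schedule.
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for x, y in dataloader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        with warmup_scheduler.dampening():
            lr_scheduler.step()
```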

If you want to use learning rate schedule *chaining*, which is supported for PyTorch 1.4 or above, you can simply write the learning rate schedulers as a suite of the `with` statement:

```python
lr_scheduler1 = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
lr_scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
@@ -83,6 +87,7 @@ for epoch in range(1,num_epochs+1):
```
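
The loop body for the chained schedulers is collapsed in this hunk; a hedged sketch of how both schedulers would be stepped inside one `dampening()` suite, reusing the toy setup from the sketch above:

```python
lr_scheduler1 = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
lr_scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for x, y in dataloader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        # Chaining: both schedulers step within a single dampening() context.
        with warmup_scheduler.dampening():
            lr_scheduler1.step()
            lr_scheduler2.step()
```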

If you want to start the learning rate schedule after the end of the linear warmup, delay it by the warmup period:

```python
warmup_period = 2000
num_steps = len(dataloader) * num_epochs - warmup_period
@@ -98,6 +103,7 @@ for epoch in range(1,num_epochs+1):
```
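
A hedged sketch of this delayed-start pattern; the guard on `warmup_scheduler.last_step` is my assumption about the collapsed lines of the hunk above:

```python
warmup_period = 2000
num_steps = len(dataloader) * num_epochs - warmup_period
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period)

for epoch in range(1, num_epochs + 1):
    for x, y in dataloader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        with warmup_scheduler.dampening():
            # Start the cosine schedule only once the warmup period has elapsed.
            if warmup_scheduler.last_step + 1 >= warmup_period:
                lr_scheduler.step()
```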

#### Approach 2

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tony-Y/colab-notebooks/blob/master/PyTorch_Warmup_Approach2_chaining.ipynb)

When the learning rate schedule uses the epoch number, the warmup schedule can be used as follows:
@@ -133,6 +139,7 @@ for epoch in range(1,num_epochs+1):
```
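
A hedged sketch of Approach 2, assuming an epoch-based `MultiStepLR`; dampening the learning rate every iteration while stepping the scheduler only at epoch boundaries is my reading of the collapsed code:

```python
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 5], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for i, (x, y) in enumerate(dataloader):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        if i < len(dataloader) - 1:
            # Apply the warmup dampening without stepping the epoch-based scheduler.
            with warmup_scheduler.dampening():
                pass
    # At the epoch boundary, dampen and step the LR scheduler together.
    with warmup_scheduler.dampening():
        lr_scheduler.step()
```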

#### Approach 3

When you use `CosineAnnealingWarmRestarts`, the warmup schedule can be used as follows:

```python
@@ -216,7 +223,6 @@ lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_s
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
```
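
Returning to Approach 3 above (the hunk here only shows a later fragment of the README): a minimal sketch of combining `CosineAnnealingWarmRestarts` with a warmup schedule. The fractional-epoch `step()` call and the choice of `LinearWarmup` are assumptions, not the README's exact code:

```python
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)
warmup_period = 2000
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period)
iters = len(dataloader)

for epoch in range(num_epochs):
    for i, (x, y) in enumerate(dataloader):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        with warmup_scheduler.dampening():
            # CosineAnnealingWarmRestarts accepts a fractional epoch argument.
            lr_scheduler.step(epoch + i / iters)
```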


## License

MIT License
4 changes: 3 additions & 1 deletion docs/index.rst
@@ -80,7 +80,9 @@ together with :class:`Adam` or its variant (:class:`AdamW`, :class:`NAdam`, etc.
with warmup_scheduler.dampening():
lr_scheduler.step()

Note that the warmup schedule must not be initialized before the initialization of the learning rate schedule.
.. warning::
Note that the warmup schedule must not be initialized before the initialization of the learning rate schedule.

Other approaches can be found in `README <https://github.com/Tony-Y/pytorch_warmup?tab=readme-ov-file#usage>`_.

.. toctree::
4 changes: 2 additions & 2 deletions pytorch_warmup/base.py
@@ -170,7 +170,7 @@ class LinearWarmup(BaseWarmup):
>>> with warmup_scheduler.dampening():
>>> lr_scheduler.step()

Note:
Warning:
The warmup schedule must not be initialized before the initialization of the learning rate schedule.
"""

@@ -218,7 +218,7 @@ class ExponentialWarmup(BaseWarmup):
>>> with warmup_scheduler.dampening():
>>> lr_scheduler.step()

Note:
Warning:
The warmup schedule must not be initialized before the initialization of the learning rate schedule.
"""

2 changes: 1 addition & 1 deletion pytorch_warmup/radam.py
@@ -117,7 +117,7 @@ class RAdamWarmup(BaseWarmup):
>>> with warmup_scheduler.dampening():
>>> lr_scheduler.step()

Note:
Warning:
The warmup schedule must not be initialized before the initialization of the learning rate schedule.
"""

4 changes: 2 additions & 2 deletions pytorch_warmup/untuned.py
@@ -56,7 +56,7 @@ class UntunedLinearWarmup(LinearWarmup):
>>> with warmup_scheduler.dampening():
>>> lr_scheduler.step()

Note:
Warning:
The warmup schedule must not be initialized before the initialization of the learning rate schedule.
"""

@@ -133,7 +133,7 @@ class UntunedExponentialWarmup(ExponentialWarmup):
>>> with warmup_scheduler.dampening():
>>> lr_scheduler.step()

Note:
Warning:
The warmup schedule must not be initialized before the initialization of the learning rate schedule.
"""
