[Hackathon No.13] Add the CyclicLR learning rate scheduler to Paddle #4315

Merged: 12 commits, Jun 8, 2022
3 changes: 2 additions & 1 deletion docs/api/api_label
@@ -1010,7 +1010,8 @@ paddle.optimizer.lr.ReduceOnPlateau .. _api_paddle_optimizer_lr_ReduceOnPlateau:
paddle.optimizer.lr.StepDecay .. _api_paddle_optimizer_lr_StepDecay:
paddle.optimizer.lr.PolynomialDecay .. _api_paddle_optimizer_lr_PolynomialDecay:
paddle.optimizer.lr.NaturalExpDecay .. _api_paddle_optimizer_lr_NaturalExpDecay:
paddle.optimizer.lr.OneCycleLR .. _cn_api_paddle_optimizer_lr_OneCycleLR:
paddle.optimizer.lr.OneCycleLR .. _api_paddle_optimizer_lr_OneCycleLR:
paddle.optimizer.lr.CyclicLR .. _api_paddle_optimizer_lr_CyclicLR:
paddle.regularizer.L1Decay .. _api_paddle_regularizer_L1Decay:
paddle.regularizer.L2Decay .. _api_paddle_regularizer_L2Decay:
paddle.static.InputSpec .. _api_paddle_static_InputSpec:
1 change: 1 addition & 0 deletions docs/api/paddle/optimizer/Overview_cn.rst
@@ -54,3 +54,4 @@ The paddle.optimizer directory contains the APIs for the optimizer algorithms supported by the PaddlePaddle framework
" :ref:`StepDecay <cn_api_paddle_optimizer_lr_StepDecay>` ", "按指定间隔轮数学习率衰减"
" :ref:`MultiplicativeDecay <cn_api_paddle_optimizer_lr_MultiplicativeDecay>` ", "根据lambda函数进行学习率衰减"
" :ref:`OneCycleLR <cn_api_paddle_optimizer_lr_OneCycleLR>` ", "One Cycle学习率衰减"
" :ref:`CyclicLR <cn_api_paddle_optimizer_lr_CyclicLR>` ", "Cyclic学习率衰减"
62 changes: 62 additions & 0 deletions docs/api/paddle/optimizer/lr/CyclicLR_cn.rst
@@ -0,0 +1,62 @@
.. _cn_api_paddle_optimizer_lr_CyclicLR:

CyclicLR
-----------------------------------

Review (Collaborator): Please check the formatting against the preview at http://preview-pr-4315.paddle-docs-preview.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/optimizer/lr/CyclicLR_cn.html ; there are still a few formatting issues, for example: (screenshots omitted)

.. py:class:: paddle.optimizer.lr.CyclicLR(base_learning_rate, max_learning_rate, step_size_up, step_size_down, mode, exp_gamma, scale_fn, scale_mode, last_epoch, verbose)
Review (Collaborator): gamma -> exp_gamma

Provides a strategy that cycles the learning rate between two boundaries at a constant frequency.
Review (Collaborator): Remove the boilerplate opener "该接口" ("This API").
Reply (Author): Fixed.

The strategy treats learning rate adjustment as a sequence of cycles: the learning rate varies between the maximum and minimum learning rates at a constant frequency, following the specified scaling strategy.

Related paper: `Cyclical Learning Rates for Training Neural Networks <https://arxiv.org/abs/1506.01186>`_

Three learning rate scaling strategies are built in:

- **triangular**: a triangular cycle without any amplitude scaling.
- **triangular2**: halves the initial amplitude in every cycle.
- **exp_range**: scales the initial amplitude by an exponential function in every cycle, given by :math:`gamma^{cycle\_iterations}`.

The initial amplitude is defined as ``max_learning_rate - base_learning_rate``, :math:`gamma` is a constant, and :math:`cycle\_iterations` denotes either the cycle count or the iteration count (see ``scale_mode``).
A cycle is defined as :math:`cycle = 1 + floor(epoch / (step\_size\_up + step\_size\_down))`. Note that CyclicLR should have ``step()`` called after every training batch, so ``epoch`` here is equivalent to ``iteration``; both denote the current actual iteration count.
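
As an illustration of the formulas above, the following sketch computes the 'triangular' mode by hand; the helper name and the numeric values are made up for the example, and this is not Paddle's internal implementation.

.. code-block:: python

    def triangular_lr(iteration, base_lr=0.1, max_lr=0.5,
                      step_size_up=500, step_size_down=500):
        # Illustrative sketch of the 'triangular' mode only; the exact
        # boundary handling inside Paddle may differ.
        cycle_size = step_size_up + step_size_down
        pos = iteration % cycle_size
        if pos <= step_size_up:
            scale = pos / step_size_up                            # rising edge
        else:
            scale = 1.0 - (pos - step_size_up) / step_size_down   # falling edge
        return base_lr + (max_lr - base_lr) * scale

    print(triangular_lr(0), triangular_lr(500), triangular_lr(1000))
    # 0.1 (lower boundary), 0.5 (peak), 0.1 (back to the lower boundary)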

Parameters
::::::::::::

- **base_learning_rate** (float) - The initial learning rate, which is also the lower boundary of the learning rate variation.
Review (Collaborator): Align the semantics with the English docs; the same applies to the other parts, the Chinese and English descriptions need to match.
Reply (Author): Fixed.

- **max_learning_rate** (float) - The maximum learning rate. Note that the actual learning rate is the sum of ``base_learning_rate`` and the scaled initial amplitude, so it may never actually reach ``max_learning_rate``.
- **step_size_up** (int) - The number of steps it takes the learning rate to rise from the initial learning rate to the maximum learning rate.
- **step_size_down** (int, optional) - The number of steps it takes the learning rate to fall from the maximum learning rate back to the initial learning rate. If not specified, it defaults to the value of ``step_size_up``.
- **mode** (str, optional) - One of 'triangular', 'triangular2' or 'exp_range'; the corresponding strategies are described above. This argument is ignored when ``scale_fn`` is specified. Default: 'triangular'.
- **exp_gamma** (float, optional) - The constant used by the 'exp_range' scaling function. Default: 1.0.
- **scale_fn** (function, optional) - A function of exactly one argument that satisfies 0 <= scale_fn(x) <= 1 for any input x. When this argument is specified, ``mode`` is ignored. Default: ``False``.
- **scale_mode** (str, optional) - Either 'cycle' or 'iterations', specifying whether the scaling function takes the cycle count or the iteration count as its input.
- **last_epoch** (int, optional) - The epoch number of the previous run; when resuming training, set it to the epoch number of the last run. Default: -1, in which case the initial learning rate is used.
- **verbose** (bool, optional) - If ``True``, prints a message to standard output ``stdout`` at every update. Default: ``False``.

Returns
::::::::::::
A ``CyclicLR`` instance used for adjusting the learning rate.
Review (Collaborator): The `` here needs a space before and after it.
Reply (Author): Done.


Code Examples
::::::::::::

COPY-FROM: paddle.optimizer.lr.CyclicLR
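
The COPY-FROM directive above pulls in the canonical example from the Paddle source. For orientation, here is a minimal usage sketch; the hyper-parameter values are assumptions chosen only for illustration.

.. code-block:: python

    import paddle

    linear = paddle.nn.Linear(10, 10)
    scheduler = paddle.optimizer.lr.CyclicLR(base_learning_rate=0.5,
                                             max_learning_rate=1.0,
                                             step_size_up=15,
                                             step_size_down=5,
                                             verbose=True)
    sgd = paddle.optimizer.SGD(learning_rate=scheduler,
                               parameters=linear.parameters())

    for epoch in range(5):
        for batch_id in range(20):
            x = paddle.uniform([10, 10])
            loss = paddle.mean(linear(x))
            loss.backward()
            sgd.step()
            sgd.clear_grad()
            # CyclicLR is stepped once per batch, after optimizer.step()
            scheduler.step()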

Methods
::::::::::::
step(epoch=None)
'''''''''

The ``step()`` function should be called after the optimizer's ``optimizer.step()`` call. It updates the learning rate according to the epoch count, and the updated learning rate takes effect in the optimizer's next parameter update.

**Parameters**

- **epoch** (int, optional) - The epoch number to use. Default: None; in this case the ``epoch`` count is incremented automatically, starting from -1.

**Returns**

None.

**Code Example**

Refer to the example code above.
4 changes: 3 additions & 1 deletion docs/api/paddle/optimizer/lr/LRScheduler_cn.rst
@@ -7,7 +7,7 @@ LRScheduler

The base class for learning rate schedulers. It defines the common interface of all learning rate adjustment strategies.

So far, 13 strategies have been implemented in Paddle based on this base class:
So far, 14 strategies have been implemented in Paddle based on this base class:

* :code:`NoamDecay`: Noam decay; for the algorithm, refer to `Attention Is All You Need <https://arxiv.org/pdf/1706.03762.pdf>`_. See :ref:`cn_api_paddle_optimizer_lr_NoamDecay`.

@@ -37,6 +37,8 @@

* :code:`OneCycleLR`: One-cycle decay; the learning rate rises to the maximum and then falls back to the minimum. See :ref:`cn_api_paddle_optimizer_lr_OneCycleLR`.

* :code:`CyclicLR`: Cyclic learning rate decay; it treats learning rate adjustment as a sequence of cycles, and the learning rate keeps varying between the minimum and maximum learning rates at a constant frequency. See :ref:`cn_api_paddle_optimizer_lr_CyclicLR`.

You can subclass this base class to implement any learning rate strategy. Import the base class with ``from paddle.optimizer.lr import LRScheduler``; you must override its ``get_lr()`` method, otherwise a ``NotImplementedError`` exception is raised.
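
As a hypothetical sketch of how such a subclass might look (the scheduler name and the halving rule are made up for illustration), only ``get_lr()`` needs to be overridden:

.. code-block:: python

    from paddle.optimizer.lr import LRScheduler

    class HalvingDecay(LRScheduler):
        # Hypothetical scheduler: halve the learning rate every `step_size` epochs.
        def __init__(self, learning_rate, step_size, last_epoch=-1, verbose=False):
            self.step_size = step_size
            super().__init__(learning_rate, last_epoch, verbose)

        def get_lr(self):
            # self.base_lr and self.last_epoch are maintained by the base class
            return self.base_lr * (0.5 ** (self.last_epoch // self.step_size))

    scheduler = HalvingDecay(learning_rate=0.1, step_size=10)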

21 changes: 20 additions & 1 deletion docs/api/paddle/optimizer/lr/OneCycleLR_cn.rst
@@ -35,4 +35,23 @@ OneCycleLR
Code Examples
::::::::::::

COPY-FROM: paddle.optimizer.lr.OneCycleLR

Methods
::::::::::::
step(epoch=None)
'''''''''

The ``step()`` function should be called after the optimizer's ``optimizer.step()`` call. It updates the learning rate according to the epoch count, and the updated learning rate takes effect in the optimizer's next parameter update.

**Parameters**

- **epoch** (int, optional) - The epoch number to use. Default: None; in this case the ``epoch`` count is incremented automatically, starting from -1.

**Returns**

None.

**Code Example**

Refer to the example code above.
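
In addition to the example above, the following is a minimal sketch of the required call order for OneCycleLR; the values are assumptions chosen only for illustration.

.. code-block:: python

    import paddle

    linear = paddle.nn.Linear(10, 10)
    scheduler = paddle.optimizer.lr.OneCycleLR(max_learning_rate=1.0, total_steps=100)
    sgd = paddle.optimizer.SGD(learning_rate=scheduler,
                               parameters=linear.parameters())

    for batch_id in range(100):
        loss = paddle.mean(linear(paddle.uniform([10, 10])))
        loss.backward()
        sgd.step()
        sgd.clear_grad()
        scheduler.step()   # called after optimizer.step(), once per batch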
5 changes: 4 additions & 1 deletion docs/api_guides/low_level/layers/learning_rate_scheduler.rst
@@ -57,4 +57,7 @@
For the related API Reference, please refer to :ref:`cn_api_paddle_optimizer_lr_MultiplicativeDecay`

* :code:`OneCycleLR`: One-cycle decay; the learning rate rises to the maximum and then falls back to the minimum.
For the related API Reference, please refer to :ref:`cn_api_paddle_optimizer_lr_OneCycleLR`

* :code:`CyclicLR`: The learning rate cycles between the minimum and maximum learning rates at a constant frequency, following the specified scaling strategy.
For the related API Reference, please refer to :ref:`cn_api_paddle_optimizer_lr_CyclicLR`
@@ -41,3 +41,6 @@ The following content describes the APIs related to the learning rate scheduler:

* :code:`ReduceOnPlateau`: Adjust the learning rate according to a monitored metric (in general, the loss), and decay the learning rate when the monitored metric plateaus. For related API Reference please refer to :ref:`api_paddle_optimizer_lr_ReduceOnPlateau`

* :code:`OneCycleLR`: One-cycle decay. That is, the learning rate first increases to the maximum learning rate, and then decreases to a minimum learning rate that is much smaller than the initial learning rate. For related API Reference please refer to :ref:`cn_api_paddle_optimizer_lr_OneCycleLR`

* :code:`CyclicLR`: Cyclic decay. That is, the learning rate cycles between the minimum and maximum learning rates at a constant frequency, following the specified scaling method. For related API Reference please refer to :ref:`api_paddle_optimizer_lr_CyclicLR`