
[2.0API] Reconstruct all API related to LR Scheduler, unify dygraph and static #26550

Merged

Conversation

Contributor

@zhwesky2010 zhwesky2010 commented Aug 21, 2020

PR types

New features

PR changes

APIs

Describe

Reconstruct all APIs related to the LR scheduler. There are 12 kinds of _LRScheduler classes in total:

  1. Unify dygraph mode to update the learning rate manually via the .step() function. Users should update the learning rate manually by calling step() (see the dygraph sketch after this list).

  2. Unify static graph mode with dygraph mode. Users should update the learning rate manually by calling step() after executor.run(); every executor.run() will feed the Python float value of the lr_scheduler into the global learning_rate variable.
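A minimal dygraph sketch of the flow described above, assembled from the snippets reviewed below; the class path paddle.optimizer.lr_scheduler.NoamLR and the parameters argument name follow the review comments and may differ from the released API:

import numpy as np
import paddle

paddle.disable_static()
linear = paddle.nn.Linear(10, 10)
scheduler = paddle.optimizer.lr_scheduler.NoamLR(d_model=0.01, warmup_steps=100)
sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameters=linear.parameters())
for epoch in range(5):
    x = paddle.to_tensor(np.random.uniform(-1, 1, [10, 10]).astype("float32"))
    loss = paddle.reduce_mean(linear(x))
    loss.backward()
    sgd.minimize(loss)    # optimize with the current learning rate
    scheduler.step()      # manually update the learning rate once per epoch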


Chinese documentation

PaddlePaddle/docs#2459


English documentation

(screenshots of the English API documentation)

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

"""
self.keys = ['last_epoch', 'last_lr']

def set_dict(self, state_dict):
Contributor

As mentioned last time, please add an alias: set_state_dict.
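For reference, a minimal hypothetical sketch of such an alias (not the exact code merged in this PR):

class _LRScheduler(object):
    def __init__(self):
        # attributes saved and restored through the state dict
        self.keys = ['last_epoch', 'last_lr']
        self.last_epoch = -1
        self.last_lr = 0.0

    def set_dict(self, state_dict):
        # restore every attribute listed in self.keys from the saved dict
        for key in self.keys:
            if key in state_dict:
                self.__dict__[key] = state_dict[key]

    # alias requested in this comment
    set_state_dict = set_dict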

Contributor Author

Done.



Args:
d$_{model}$(int): The dimensionality of input and output feature vector of model. It is a python float number.
Contributor

Why is it written as d$_{model}$ here?

Contributor Author

So that model renders as a subscript in the documentation.

d$_{model}$(int): The dimensionality of input and output feature vector of model. It is a python float number.
warmup_steps(Variable|int): The number of warmup steps. A super parameter. It is a python float number
learning_rate (float): The initial learning rate. It is a python float number. Default: 1.0.
last_epoch (int, optional): If ``True``, prints a message to stdout for each update. Default: -1, means initial learning rate.
Contributor

Looking at the PyTorch implementation, last_epoch means that when you want to resume training you can set the epoch to resume from and the learning rate is computed accordingly; when it is -1, the learning rate defaults to the initial learning rate.
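For illustration, a small hedged sketch of that behaviour using the NoamLR class from this PR (the d_model and warmup_steps values are made up):

import paddle

# default last_epoch=-1: the schedule starts from the initial learning rate
fresh = paddle.optimizer.lr_scheduler.NoamLR(d_model=512, warmup_steps=4000)

# resuming training: epoch 99 is treated as already finished, so the scheduler
# continues the schedule from epoch 100 instead of restarting it
resumed = paddle.optimizer.lr_scheduler.NoamLR(
    d_model=512, warmup_steps=4000, last_epoch=99)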

Contributor Author

Done


Args:
d$_{model}$(int): The dimensionality of input and output feature vector of model. It is a python float number.
warmup_steps(Variable|int): The number of warmup steps. A super parameter. It is a python float number
Contributor

Variable->Tensor

Contributor Author

Done

last_epoch=last_epoch, verbose=verbose)

def get_lr(self):

Contributor

This line can be removed.

Contributor Author

Done

learning_rate (float): The initial learning rate. It is a python float number.
gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * decay_rate`` .
It should be less than 1.0. Default: 0.1.
last_epoch (int, optional): If ``True``, prints a message to stdout for each update. Default: -1, means initial learning rate.
Contributor

Same as above.

Contributor Author

Done

lr_var = self._global_learning_rate()
# only create global lr_var once
if not isinstance(lr_var, framework.Variable):
print("create global learning rate")
Contributor

Please remove this log line.

Contributor Author

Done

persistable=True,
stop_gradient=True,
dtype='float32' if self._dtype is None else self._dtype)
main_prog = framework.default_main_program()
Contributor

Why is main_program used here? Would there be a problem if it were not main_program?

Contributor Author

The attribute has to be set in whichever program the optimizer op belongs to. For a program with this attribute set, every executor run will feed the corresponding float learning rate into the matching Variable -> forward -> backward -> optimize; it follows the optimize op.
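For illustration, a hedged sketch of that static-graph flow, based on the snippets reviewed in this PR (paddle.static.nn.fc and the lr_scheduler class path are assumptions and may differ from the final API):

import numpy as np
import paddle

paddle.enable_static()
main_prog = paddle.static.Program()
start_prog = paddle.static.Program()
with paddle.static.program_guard(main_prog, start_prog):
    x = paddle.static.data(name='x', shape=[None, 10], dtype='float32')
    loss = paddle.reduce_mean(paddle.static.nn.fc(x, 10))
    scheduler = paddle.optimizer.lr_scheduler.NoamLR(d_model=0.01, warmup_steps=100)
    sgd = paddle.optimizer.SGD(learning_rate=scheduler)
    sgd.minimize(loss)   # the optimize op is added to main_prog, so the lr is fed there

exe = paddle.static.Executor(paddle.CPUPlace())
exe.run(start_prog)
for epoch in range(5):
    exe.run(main_prog, feed={'x': np.random.randn(4, 10).astype('float32')})
    scheduler.step()     # the new float lr is fed into the lr Variable on the next run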

@zhwesky2010 zhwesky2010 changed the title Reconstruct all API related to lr scheduler, unify dygraph and static [2.0API] Reconstruct all API related to lr scheduler, unify dygraph and static Aug 22, 2020
@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Aug 22, 2020
@PaddlePaddle PaddlePaddle unlocked this conversation Aug 22, 2020
def step(self, epoch=None):
"""
step should be called after 'minimize' . It will Update the learning rate in optimizer according to 'epoch'.
The new learning rate will take effect on next optimize operation.
Contributor

Update->update

Contributor

minimize -> step; the optimizer will also call the step function going forward.

Contributor Author

Done

learning_rate = 0.1

Args:
learning_rate (float): The initial learning rate. It is a python float number.
Contributor

learning_rate does not seem to be in the __init__ parameter list.

Contributor Author

Done

decay_steps(int): The decay step size. It determines the decay cycle.
end_lr(float, optional): The minimum final learning rate. Default: 0.0001.
power(float, optional): Power of polynomial. Default: 1.0.
cycle(bool, optional): If set true, decay the learning rate every decay_steps. Default: False.
Contributor

The explanation of cycle here is wrong; see the description in PolynomialDecay.


class LinearLrWarmup(_LRScheduler):
"""

Contributor

A description of this learning rate schedule is missing here; the previous API had an explanation.

Contributor Author

Done

@zhwesky2010 zhwesky2010 changed the title [2.0API] Reconstruct all API related to lr scheduler, unify dygraph and static [2.0API] Reconstruct all API related to LR Scheduler, unify dygraph and static Aug 23, 2020
Contributor

@wawltor wawltor left a comment

LGTM

paddle.disable_static()
x = np.random.uniform(-1, 1, [10, 10]).astype("float32")
linear = paddle.nn.Linear(10, 10)
scheduler = paddle.optimizer.NoamLR(d_model=0.01, warmup_steps=100, verbose=True)
Contributor

Distinguish it from the Optimizer:
paddle.optimizer.lr_scheduler.NoamLR

Contributor Author

Ok

out = linear(x)
loss = paddle.reduce_mean(out)
out.backward()
sgd.minimize(loss)
Contributor

The old style still works, but in dygraph mode the new style is recommended:
sgd.step()
sgd.clear_grad()
Although minimize in static graph mode and minimize in dygraph mode share the same name, they differ quite a bit:

  1. In static graph mode, minimize is called only once; in dygraph mode it is called repeatedly.
  2. Static graph mode requires a loss argument; dygraph mode does not.
    That is why a new step function was added for dygraph mode.

Contributor Author

Currently sgd and most other optimizers do not support step yet.

x = paddle.to_tensor(x)
out = linear(x)
loss = paddle.reduce_mean(out)
out.backward()
Contributor

loss.backward()

Contributor Author

OK

x = np.random.uniform(-1, 1, [10, 10]).astype("float32")
linear = paddle.nn.Linear(10, 10)
scheduler = paddle.optimizer.NoamLR(d_model=0.01, warmup_steps=100, verbose=True)
sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameter_list=linear.parameters())
Contributor

The optimizer should use the new parameter name:
parameter_list -> parameters
#26288

Contributor Author

The documentation will be updated uniformly in the next PR.

main_prog = paddle.static.Program()
start_prog = paddle.static.Program()
with paddle.static.program_guard(main_prog, start_prog):
    x = paddle.static.data(name='x', shape=[-1, 4, 5])
Contributor

shape=[None, 4, 5]

Contributor Author

Ok

scheduler = paddle.optimizer.NoamLR(d_model=0.01, warmup_steps=100, verbose=True)
sgd = paddle.optimizer.SGD(learning_rate=scheduler)
sgd.minimize(loss)
lr_var = sgd._global_learning_rate()
Contributor

Why does an internal function need to be called here?

Contributor Author

Done, removed.

'x': np.random.randn(3, 4, 5).astype('float32'),
'y': np.random.randn(3, 4, 5).astype('float32')
},
fetch_list=lr_var.name)
Contributor

Why does lr_var need to be fetched here? The returned out does not seem to be used anywhere.

Contributor Author

Done, removed.

self._parameter_list = list(
    parameter_list) if parameter_list is not None else None
self._name = name
if framework.in_dygraph_mode():
    if not isinstance(learning_rate, float) and \
            not isinstance(learning_rate, LearningRateDecay):
        if not isinstance(learning_rate,
Contributor

Why is the paddle.fluid.optimizer.py file modified here rather than paddle.optimizer.optimizer.py?
The runtime behavior of code written for version 1.8 would change.

Contributor Author

The new optimizer module does not support most optimizers yet; the colleagues migrating the optimizers have been notified to move the optimizer behavior from fluid into paddle.optimizer.

This is a backward-compatible upgrade: code written for 1.8 keeps its behavior, while the new logic is also supported.

@zhwesky2010
Contributor Author

Documentation changes will be fixed uniformly in the next PR.

Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

Merge it first; the example code will be updated in the next PR.

Contributor

@jzhang533 jzhang533 left a comment

LGTM
Will have a follow-up PR.

@zhwesky2010 zhwesky2010 merged commit 407de03 into PaddlePaddle:develop Aug 24, 2020

Args:
learning_rate (float): The initial learning rate. It is a python float number.
gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
Contributor

Looking at __init__, this seems to be a required parameter, right?

gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
It should be less than 1.0. Default: 0.1.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

optional

Contributor Author

Done


Args:
learning_rate (float): The initial learning rate. It is a python float number.
gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
Contributor

Is gamma optional?

Contributor Author

Yes.

gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
It should be less than 1.0. Default: 0.1.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same as above: optional is missing.

learning_rate (float): The initial learning rate. It is a python float number.
lr_lambda (function): A function which computes a factor by ``epoch`` , and then multiply the initial learning rate by this factor.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same as above: optional is missing.

Contributor Author

All done.

warmup_steps(int): The number of warmup steps. A super parameter. It is a python int number
learning_rate (float): The initial learning rate. It is a python float number. Default: 1.0.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same: optional is missing.

values(list): A list of learning rate values that will be picked during different epoch boundaries.
The type of element in the list is python float.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same: optional is missing.

cycle(bool, optional): Whether the learning rate rises again. If True, then the learning rate will rise when it decrease
to ``end_lr`` . If False, the learning rate is monotone decreasing. Default: False.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same: optional is missing.

Contributor Author

All done.

change of ``loss`` is ``threshold`` . Default: ``'rel'`` .
cooldown (int, optional): The number of epochs to wait before resuming normal operation. Default: 0.
min_lr (float, optional): The lower bound of the learning rate after reduction. Default: 0.
epsilon (float, optional): Minimal decay applied to lr. If the difference between new and old lr is smaller than eps, the update is
Contributor

smaller than epsilon

gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
It should be less than 1.0. Default: 0.1.
last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
verbose (bool): If ``True``, prints a message to stdout for each update. Default: ``False`` .
Contributor

Same as above: optional is missing.

@jzhang533
Contributor

Every lr scheduler has a verbose parameter; that does not seem very necessary, does it?

@zhwesky2010
Contributor Author

Every lr scheduler has a verbose parameter; that does not seem very necessary, does it?

This feature actually seems quite useful.
