Update annotation of layers.py and faq #4329

Merged · merged 7 commits · Sep 25, 2017
Changes from 2 commits
35 changes: 32 additions & 3 deletions doc/faq/index_cn.rst
@@ -440,7 +440,6 @@ The content of a model parameter file saved by PaddlePaddle consists of a 16-byte header and the network parameters

* :code:`paddle.layer.lstmemory`, :code:`paddle.layer.grumemory`, and :code:`paddle.layer.recurrent` do not apply the activation to their output in the usual way, so the first approach, setting :code:`drop_rate` in these layers, cannot be used to add dropout to them. To use dropout with these layers, take the second approach instead, i.e. add a :code:`paddle.layer.dropout` layer (a minimal sketch follows).
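As an illustration (not part of this diff), a minimal sketch of the second approach under the v2 Python API; the layer names and sizes here are assumptions made for the example:

.. code-block:: python

    # Illustrative sketch: apply dropout to an lstmemory output with an
    # explicit paddle.layer.dropout layer, since drop_rate cannot be set
    # inside lstmemory/grumemory/recurrent. Names and sizes are assumed.
    import paddle.v2 as paddle

    data = paddle.layer.data(name="word",
                             type=paddle.data_type.dense_vector_sequence(128))
    proj = paddle.layer.fc(input=data, size=512)  # lstmemory expects 4x its size
    lstm = paddle.layer.lstmemory(input=proj, size=128)
    lstm_drop = paddle.layer.dropout(input=lstm, dropout_rate=0.5)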


22. How to set learning_rate_schedule
Contributor

  • Suggest retitling this to "如何设置学习率退火 (learning rate annealing)", i.e. "How to set learning rate annealing".
  • Please make the underline "------" equal to or longer than the title.

Contributor Author

Done

--------------------------------------

@@ -454,7 +453,7 @@ The content of a model parameter file saved by PaddlePaddle consists of a 16-byte header and the network parameters
        learning_rate_decay_b=0.75,
        learning_rate_schedule="poly",)

- PaddlePaddle currently supports 5 kinds of learning_rate_schedule; these 5 schedules and their corresponding learning-rate formulas are listed below
+ PaddlePaddle currently supports 8 kinds of learning_rate_schedule; these 8 schedules and their corresponding learning-rate formulas are listed below (a plain-Python sketch of the formulas follows the list)

* "constant"

@@ -464,6 +463,12 @@ PaddlePaddle currently supports 5 kinds of learning_rate_schedule; these 5 learning_rate_schedu

lr = learning_rate * pow(1 + learning_rate_decay_a * num_samples_processed, -learning_rate_decay_b)

where num_samples_processed is the number of samples processed so far; the same applies below.

* "caffe_poly"

lr = learning_rate * pow(1.0 - num_samples_processed / learning_rate_decay_a, learning_rate_decay_b)

* "exp"

lr = learning_rate * pow(learning_rate_decay_a, num_samples_processed / learning_rate_decay_b)
@@ -476,4 +481,28 @@ PaddlePaddle currently supports 5 kinds of learning_rate_schedule; these 5 learning_rate_schedu

lr = max(learning_rate - learning_rate_decay_a * num_samples_processed, learning_rate_decay_b)

- where num_samples_processed is the total number of samples processed so far.
* "manual"

This is a learning_rate_schedule that takes piecewise values according to the number of samples trained. With this schedule, the user sets the piecewise decay-factor function through the :code:`learning_rate_args` parameter, and the current learning rate is the product of the configured :code:`learning_rate` and the current decay factor. Taking the Adam algorithm as an example, the code is as follows:
Contributor

  • Suggest rephrasing as "这是一种按已训练样本数分段取值的学习率退火方法,用户通过参数……", i.e. describing this as a learning rate annealing method that takes piecewise values by the number of samples trained.

Contributor Author

Done


.. code-block:: python

    optimizer = paddle.optimizer.Adam(
        learning_rate=1e-3,
        learning_rate_schedule="manual",
        learning_rate_args="1000:1.0,2000:0.9,3000:0.8",)

In this example, when the number of samples trained is at most 1000, the learning rate is :code:`1e-3 * 1.0`; when it is greater than 1000 and at most 2000, the learning rate is :code:`1e-3 * 0.9`; when it is greater than 2000, the learning rate is :code:`1e-3 * 0.8`.

* "pass_manual"

This is a learning_rate_schedule that takes piecewise values according to the number of passes trained. With this schedule, the user sets the piecewise decay-factor function through the :code:`learning_rate_args` parameter, and the current learning rate is the product of the configured :code:`learning_rate` and the current decay factor. Taking the Adam algorithm as an example, the code is as follows:
Contributor

  • Suggest rephrasing as "这是一种按已训练pass数分段取值的学习率退火方法,用户通过参数 ……", i.e. describing this as a learning rate annealing method that takes piecewise values by the number of passes trained.

Contributor Author

Done


.. code-block:: python

    optimizer = paddle.optimizer.Adam(
        learning_rate=1e-3,
        learning_rate_schedule="pass_manual",
        learning_rate_args="1:1.0,2:0.9,3:0.8",)

In this example, when the number of passes trained is at most 1, the learning rate is :code:`1e-3 * 1.0`; when it is greater than 1 and at most 2, the learning rate is :code:`1e-3 * 0.9`; when it is greater than 2, the learning rate is :code:`1e-3 * 0.8`.
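
To make the formulas in this list concrete, here is a plain-Python sketch (an illustration built from the definitions above, not PaddlePaddle's implementation); decay_a and decay_b stand in for learning_rate_decay_a and learning_rate_decay_b, and args follows the learning_rate_args format:

.. code-block:: python

    # Plain-Python sketch of the schedules listed above; an illustration,
    # not PaddlePaddle's implementation. decay_a/decay_b stand in for
    # learning_rate_decay_a/learning_rate_decay_b; n is num_samples_processed
    # (or the pass count for "pass_manual").
    def scheduled_lr(schedule, lr, n, decay_a=0.0, decay_b=0.0, args=""):
        if schedule == "constant":
            return lr
        if schedule == "poly":
            return lr * pow(1 + decay_a * n, -decay_b)
        if schedule == "caffe_poly":
            return lr * pow(1.0 - float(n) / decay_a, decay_b)
        if schedule == "exp":
            return lr * pow(decay_a, float(n) / decay_b)
        if schedule == "linear":
            return max(lr - decay_a * n, decay_b)
        if schedule in ("manual", "pass_manual"):
            # args is e.g. "1000:1.0,2000:0.9,3000:0.8"; past the last
            # boundary, this sketch keeps the last decay factor.
            factor = 1.0
            for segment in args.split(","):
                boundary, factor_str = segment.split(":")
                factor = float(factor_str)
                if n <= int(boundary):
                    break
            return lr * factor
        raise ValueError("unknown learning_rate_schedule: " + schedule)

    # Example: matches the "manual" walkthrough above.
    print(scheduled_lr("manual", 1e-3, 1500,
                       args="1000:1.0,2000:0.9,3000:0.8"))  # 1e-3 * 0.9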