Update the annotation of layers.py #4158

Merged
merged 12 commits into from
Sep 20, 2017

Conversation


@ranqiu92 ranqiu92 commented Sep 18, 2017

No description provided.

@ranqiu92 ranqiu92 requested a review from lcy-seso September 18, 2017 09:42

@lcy-seso lcy-seso left a comment

Thanks for the work.

:param bias_attr: Bias parameter attribute. True if no bias.
:param bias_attr: The Bias Attribute. If no bias, then pass False or
something not type of ParameterAttribute. None will get a
default Bias.

  • If this parameter is set to False, no bias is defined. If it is set to None, a bias with default initialization settings is defined.
  • Please check what the default setting of the bias parameter is and add it to the doc.

please fix this.

@@ -5790,7 +5792,7 @@ def sum_cost(input, name=None, layer_attr=None):

:param input: The first input layer.
:type input: LayerOutput.
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please delete the sentence "It is not necessary."

@@ -5835,7 +5837,7 @@ def huber_regression_cost(input,
:type input: LayerOutput.
:param label: The input label.
:type input: LayerOutput.
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please change the sentence "It is not necessary." into "It is not required."
The same below.

@@ -5593,7 +5595,7 @@ def rank_cost(left,
:param weight: The weight affects the cost, namely the scale of cost.
It is an optional argument.
:type weight: LayerOutput
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please change the sentence "It is not necessary." into "It is not required."
The same below.

@@ -5658,7 +5660,7 @@ def lambda_cost(input,
than the size of a list, the algorithm will sort the
entire list of get gradient.
:type max_sort_size: int
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please change the sentence "It is not necessary." into "It is not required."
The same below.

@@ -5702,7 +5704,7 @@ def cross_entropy(input,
:type input: LayerOutput.
:param label: The input label.
:type input: LayerOutput.
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please change the sentence "It is not necessary." into "It is not required."
The same below.

@@ -5750,7 +5752,7 @@ def cross_entropy_with_selfnorm(input,
:type input: LayerOutput.
:param label: The input label.
:type input: LayerOutput.
:param name: The name of this layers. It is not necessary.
:param name: The name of this layer. It is not necessary.

please change the sentence "It is not necessary." into "It is not required."
The same below.

@ranqiu92

@lcy-seso Done


def load_parameter(file_name, h, w):
    with open(file_name, 'rb') as f:
        f.read(16) # skip header.

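  1. Swap the order of items 16 and 17 so that 17 comes first.
  2. Explain the format of the pretrained parameters loaded here: why does the saved model need skip_header? This step is not required in every case, so do not leave it here without any explanation.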

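17. PaddlePaddle's parameter storage format and conversion to and from plain text
----------------------------------------------------------------------------------

The binary parameter file saved by PaddlePaddle consists of two parts: 16 bits of header information and the network parameters. In the header, the first position is fixed at 0, the second position is 4, or 8 when double precision is used, and the third position records the total number of values.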

@lcy-seso lcy-seso Sep 20, 2017

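  1. "The first position is fixed at 0, the second position is 4": documentation cannot be written this way. Explain what these two positions mean.
  2. The first position holds the Paddle version information. It is reserved by Paddle and users do not need to modify it; writing 0 is generally fine. The second position is the number of bytes each floating-point value occupies: 4 for float precision, 8 for double precision. Because models saved during training generally use float precision, the second position is usually fixed at 4.
  3. Please reword the content above in your own way rather than copying it verbatim.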

@@ -361,7 +340,7 @@ The binary parameter file saved by PaddlePaddle consists of 16 bits of header information and network parameters
fmt="%.6f", delimiter=",")


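When converting plaintext parameters into model parameters that PaddlePaddle can load, first write the header according to the parameter size, then write the actual network parameters. The following example converts a randomly generated matrix into model parameters that PaddlePaddle can load:
When converting plaintext parameters into model parameters that PaddlePaddle can load, first write the header according to the data type and parameter size, then write the actual network parameters. The following example converts a randomly generated matrix into model parameters that PaddlePaddle can load: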

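Revise the first sentence slightly to:

When converting plaintext parameters into model parameters that PaddlePaddle can load, first construct the header, then write the network parameters. Below, a randomly generated matrix is converted into model parameters that PaddlePaddle can load.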


Done

---------------------------------------------------------

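The binary parameter file saved by PaddlePaddle consists of two parts: 16 bits of header information and the network parameters. In the header, the first position is fixed at 0, the second position is 4, or 8 when double precision is used, and the third position records the total number of values
The model parameter file saved by PaddlePaddle consists of two parts: a 16-byte header and the network parameters. In the header, bytes 1 to 4 hold the PaddlePaddle version information; bytes 5 to 8 hold the number of bytes each parameter occupies, which is 4 when the saved network parameters are of type float and 8 when they are of type double; bytes 9 to 16 hold the total number of saved parameters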
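
For illustration only (this is not code from the PR), the 16-byte header described in the revised sentence could be built and parsed with Python's `struct` module roughly as follows. The little-endian byte order, the exact integer types, and the helper names `pack_header` and `unpack_header` are assumptions made for this sketch; the text above only fixes the three field widths (4, 4, and 8 bytes).

```python
import struct

# Assumed layout: little-endian int32 version, uint32 bytes-per-value,
# uint64 value count. Only the field widths come from the description above.
HEADER_FORMAT = "<iIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 4 + 8 = 16 bytes


def pack_header(value_size, count, version=0):
    """Build a 16-byte header; the version field defaults to 0."""
    if value_size not in (4, 8):
        raise ValueError("value_size must be 4 (float) or 8 (double)")
    return struct.pack(HEADER_FORMAT, version, value_size, count)


def unpack_header(raw):
    """Return (version, value_size, count) from the first 16 bytes."""
    return struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])


# Header for a hypothetical 3 x 5 float matrix.
header = pack_header(value_size=4, count=3 * 5)
version, value_size, count = unpack_header(header)
```

Using an explicit format string with a byte-order prefix keeps the header at 16 bytes regardless of the platform's native integer sizes.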

  • Bytes 1 to 4 hold the PaddlePaddle version information; in most cases they can simply be filled with 0. @luotao How is this version information defined when MKLDNN is used?


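With MKLDNN it stays the same as in the original Paddle, that is, it is simply filled with 0. Parameters saved with MKLDNN still use Paddle's original format; the conversion is done automatically inside the program. A model trained this way can therefore be used directly on a non-MKLDNN build.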


Done


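When restoring the binary parameters saved by PaddlePaddle to plain text, first skip the header of the PaddlePaddle model parameter file, then extract the network parameters. An example follows:
When restoring the model parameters saved by PaddlePaddle to plain text, you can load the network parameters with a :code:`numpy.array` of the corresponding data type; the header of the PaddlePaddle model parameter file must be skipped at this point. In general, the model parameters saved by PaddlePaddle are of type float, so :code:`dtype=float32` is generally set when using :code:`numpy.array` . An example follows: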

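  • "must be skipped at this point" --> "can be skipped at this point"
  • Revise the sentence "In general, the model parameters saved by PaddlePaddle are of type float" slightly to:
    • If PaddlePaddle was not compiled with double precision specified, computation uses float precision by default and the saved parameters are also of type float. In that case, :code:numpy.array is generally used with :code:dtype=float32 . An example follows: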
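
As a hedged sketch of the float versus double point raised here, a loader could read the bytes-per-value field from the header and pick the NumPy dtype from it instead of hard-coding `float32`. The function name `parameters_from_bytes`, the little-endian byte order, and the integer field types are assumptions for illustration and are not part of the PR.

```python
import struct

import numpy as np


def parameters_from_bytes(raw, height, width):
    """Decode a parameter file held in memory into a (height, width) array."""
    # Assumed header layout: int32 version, uint32 bytes-per-value,
    # uint64 value count, all little-endian.
    _version, value_size, count = struct.unpack("<iIQ", raw[:16])
    # 4 bytes per value means float, 8 bytes per value means double.
    dtype = {4: np.dtype("<f4"), 8: np.dtype("<f8")}[value_size]
    # Skip the 16-byte header and read exactly `count` values.
    values = np.frombuffer(raw, dtype=dtype, count=count, offset=16)
    return values.reshape(height, width)


# Build a small double-precision payload by hand and decode it again.
payload = struct.pack("<iIQ", 0, 8, 6) + np.arange(6, dtype="<f8").tobytes()
matrix = parameters_from_bytes(payload, 2, 3)
```

A file on disk can be decoded the same way by reading it in binary mode and passing the bytes to the function.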


Done

@@ -371,3 +350,23 @@ The binary parameter file saved by PaddlePaddle consists of 16 bits of header information and network parameters
param = np.float32(np.random.rand(height, width))
with open(param_file, "w") as fparam:
    fparam.write(header + param.tostring())
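
The opposite direction, from comma-separated plain text to the binary layout, might look like the sketch below. It is not the PR's code: `text_to_parameter_bytes` is a hypothetical helper, and the little-endian header layout (int32 version, uint32 bytes-per-value, uint64 count) is an assumption consistent with the 16-byte description in this thread.

```python
import io
import struct

import numpy as np


def text_to_parameter_bytes(text, dtype="<f4"):
    """Convert comma-separated plaintext values into header + raw values."""
    # ndmin=2 keeps a single row or column as a 2-D matrix.
    matrix = np.loadtxt(io.StringIO(text), delimiter=",", dtype=dtype, ndmin=2)
    # Version is filled with 0; the other two fields come from the data.
    header = struct.pack("<iIQ", 0, matrix.dtype.itemsize, matrix.size)
    return header + matrix.tobytes()


payload = text_to_parameter_bytes("0.5,1.5\n2.5,3.5\n")
```

When writing `payload` to disk, opening the file in binary mode (`"wb"`) avoids newline translation and works on both Python 2 and Python 3.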

17. How to load pretrained embedding parameters

How to load pretrained embedding parameters --> How to load pretrained parameters


Done

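17. How to load pretrained embedding parameters
-----------------------------------------------

Set the embedding parameter attribute :code:`is_static=True` so that the embedding parameters stay unchanged during training, load the pretrained parameters from the model file into a :code:`numpy.array`, and after creating parameters, use :code:`parameters.set()` to load the pretrained parameters. The first 16 bytes of a model parameter file saved by PaddlePaddle are the header, so users must start from the 17th byte when loading the parameters into a :code:`numpy.array`.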

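  • Split this into two items, 1 and 2, rather than putting everything in one long paragraph.
  • For the layer that should load pretrained parameters, set param_attr=***. Taking the embedding layer as an example, the code is as follows:
    .. code-block:: python
    ...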


Done

return np.fromfile(f, dtype=np.float32).reshape(h, w)


emb_para = paddle.attr.Param(name='emb', initial_std=0., is_static=True)

paddle.attr.Param(is_static=True)
The other two parameters are not required; delete them here.


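Sorry, name is required. Remove initial_std=0: this item is about loading pretrained parameters, where initial_std has no practical effect. To keep the documentation simple and direct, delete it here.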


Done

@@ -325,9 +325,9 @@ pip install python/dist/paddle*.whl && pip install ../paddle/dist/py_paddle*.whl
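16. PaddlePaddle's parameter storage format and conversion to and from plain text
----------------------------------------------------------------------------------

The model parameter file saved by PaddlePaddle consists of two parts: a 16-byte header and the network parameters. In the header, bytes 1 to 4 hold the PaddlePaddle version information; bytes 5 to 8 hold the number of bytes each parameter occupies, which is 4 when the saved network parameters are of type float and 8 when they are of type double; bytes 9 to 16 hold the total number of saved parameters.
The model parameter file saved by PaddlePaddle consists of two parts: a 16-byte header and the network parameters. In the header, bytes 1 to 4 hold the PaddlePaddle version information, which in most cases can simply be filled with 0; bytes 5 to 8 hold the number of bytes each parameter occupies, which is 4 when the saved network parameters are of type float and 8 when they are of type double; bytes 9 to 16 hold the total number of saved parameters.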

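According to @luotao's comment, MKLDNN does not currently use the version information field.
As for the sentence "bytes 1 to 4 hold the PaddlePaddle version information, which in most cases can simply be filled with 0": in documentation we should not leave users with ambiguous descriptions. Just change it to the following:

Bytes 1 to 4 hold the PaddlePaddle version information; please fill them with 0.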

@@ -340,7 +340,7 @@ The model parameter file saved by PaddlePaddle consists of a 16-byte header and network parameters
fmt="%.6f", delimiter=",")


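When converting plaintext parameters into model parameters that PaddlePaddle can load, first write the header according to the data type and parameter size, then write the actual network parameters. The following example converts a randomly generated matrix into model parameters that PaddlePaddle can load:
When converting plaintext parameters into model parameters that PaddlePaddle can load, first construct the header, then write the network parameters. Below, a randomly generated matrix is converted into model parameters that PaddlePaddle can load.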

The code below stores a random matrix as model parameters that PaddlePaddle can load.

@lcy-seso lcy-seso merged commit fe84517 into PaddlePaddle:develop Sep 20, 2017
3 participants