[Docathon][Add CN Doc No.17] #6369
Conversation
Thanks for contributing to the PaddlePaddle documentation. The docs preview is building; it will be available once the Docs-New job finishes. Preview link: http://preview-pr-6369.paddle-docs-preview.paddlepaddle.org.cn/documentation/docs/zh/api/index_cn.html
Returns
::::::::::::
- Tensor|tuple: If ``cache_kvs`` is None, a Tensor with the same shape and data type as ``x``, representing the output of the transformer layers. If ``cache_kvs`` is not None, the tuple (output, cache_kvs), where output is the output of the transformer layers and cache_kvs is updated in place in the input ``cache_kvs``.
Widely used terms like "Transformer" should be kept in English rather than translated into renderings such as "变压器层" (literally, an electrical-transformer layer).
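To make the in-place ``cache_kvs`` behavior described above concrete, here is a minimal NumPy sketch (not the actual Paddle kernel; the `[2, batch, num_head, max_seq_len, head_dim]` cache layout and the helper name are assumptions for illustration). The point is that the caller's buffer is written at the current step and that same buffer is handed back:

```python
import numpy as np

# Illustrative sketch only: a KV cache laid out as
# [2, batch, num_head, max_seq_len, head_dim] (assumed layout),
# updated in place at the current decode step and returned as-is.
def update_cache_kv(cache_kv, k, v, time_step):
    cache_kv[0, :, :, time_step, :] = k  # write the new key slice
    cache_kv[1, :, :, time_step, :] = v  # write the new value slice
    return cache_kv                      # same buffer, no new allocation

cache = np.zeros((2, 1, 4, 16, 32), dtype=np.float32)  # max_seq_len = 16
k = np.random.rand(1, 4, 32).astype(np.float32)        # [batch, num_head, head_dim]
v = np.random.rand(1, 4, 32).astype(np.float32)
out = update_cache_kv(cache, k, v, time_step=0)
assert out is cache  # "updated in place": the output aliases the input buffer
```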
Parameters
::::::::::::
- **x** (Tensor) - The input tensor. It can be a 3-D tensor with data type float16 or float32 and shape `[batch\_size, sequence\_length, d\_model]`.
- **ln_scales** (list(Tensor)|tuple(Tensor)) - The weight tensors of the attention layer normalization, with shape `[d\_model]`.
Better: "the weight tensors of the layer-normalization layer inside the attention mechanism"; the same wording applies to the entries below.
- **qkv_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the attention qkv computation, with shape `[3, num\_head, dim\_head]`.
- **linear_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the attention linear layer, with shape `[num\_head * dim\_head, d\_model]`.
- **linear_biases** (list(Tensor)|tuple(Tensor)|None) - The bias tensors of the attention linear layer, with shape `[d\_model]`.
- **ffn_ln_scales** (list(Tensor)|tuple(Tensor)) - The weight tensors of the feed-forward layer normalization, with shape `[d\_model]`.
Better: "the weight tensors of the layer-normalization layer inside the feed-forward block".
- **ffn_ln_biases** (list(Tensor)|tuple(Tensor)) - The bias tensors of the feed-forward layer normalization, with shape `[d\_model]`.
- **ffn1_weights** (list(Tensor)|tuple(Tensor)) - The weight tensors of the first feed-forward linear layer, with shape `[d\_model, dim\_feedforward]`.
Better: "the weight tensors of the first linear-transformation layer inside the feed-forward block"; the same wording applies to the entries below.
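For orientation, here is a minimal NumPy sketch of the block these `ffn*` parameters describe, assuming the pre-layer-norm variant with a GELU activation (the fused op's normalization placement and activation are configurable, so treat this as illustrative rather than the op's exact computation):

```python
import numpy as np

def layer_norm(x, scale, bias, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps) * scale + bias

def gelu(x):  # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Feed-forward block, pre-layer-norm variant (assumed):
#   out = x + ffn2(gelu(ffn1(layer_norm(x))))
def ffn_block(x, ln_scale, ln_bias, w1, b1, w2, b2):
    h = layer_norm(x, ln_scale, ln_bias)  # ffn_ln_scales / ffn_ln_biases
    h = gelu(h @ w1 + b1)                 # ffn1: [d_model, dim_feedforward]
    return x + (h @ w2 + b2)              # ffn2: [dim_feedforward, d_model], plus residual

d_model, d_ff = 8, 32
x = np.random.rand(2, 4, d_model).astype(np.float32)
out = ffn_block(
    x,
    np.ones(d_model, np.float32), np.zeros(d_model, np.float32),
    np.random.rand(d_model, d_ff).astype(np.float32), np.zeros(d_ff, np.float32),
    np.random.rand(d_ff, d_model).astype(np.float32), np.zeros(d_model, np.float32),
)
print(out.shape)  # (2, 4, 8)
```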
- **seq_lens** (Tensor, optional) - The sequence lengths of this batch, with shape `[bsz]`. Default: None.
- **rotary_embs** (Tensor, optional) - The RoPE embeddings used for the rotary computation, with shape `[2, bsz, 1, seq\_len, head\_dim]`. Default: None.
- **time_step** (Tensor, optional) - The time-step tensor of the generation model, used in the decode stage to denote the time step, i.e. the real sequence length of CacheKV. Its shape is `[1]` and it must be placed on CPUPlace. Default: None.
- **attn_mask** (Tensor, optional) - Used in multi-head attention to prevent attending to some unwanted positions, usually paddings or subsequent positions. Its shape is `[batch_size, 1, sequence_length, sequence_length]`. Default: None.
The Chinese here is a slightly stiff literal translation of the English; it can stay for now, but the trailing "attention" is a verb, so it would read better as "用于多头注意力层中防止对某些不需要的位置(通常是填充或后续位置)进行注意".
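Putting the parameter list together, a usage sketch of `paddle.incubate.nn.functional.fused_multi_transformer` for a single layer is shown below. Shapes follow the descriptions above; the `qkv_weights` layout `[3, num_head, dim_head, d_model]`, the positional argument order, and the CUDA-only note are taken from the English API docs and should be treated as assumptions:

```python
# Usage sketch only; requires a CUDA build of Paddle (the fused op is GPU-only).
import paddle
import paddle.incubate.nn.functional as incubate_f

batch, seq_len, d_model, num_head, dim_head = 2, 4, 128, 4, 32

x = paddle.rand((batch, seq_len, d_model), dtype="float32")

# Attention block parameters (one list entry per layer; a single layer here).
ln_scale = paddle.rand((d_model,), dtype="float32")
ln_bias = paddle.rand((d_model,), dtype="float32")
qkv_weight = paddle.rand((3, num_head, dim_head, d_model), dtype="float32")  # assumed layout
qkv_bias = paddle.rand((3, num_head, dim_head), dtype="float32")
linear_weight = paddle.rand((num_head * dim_head, d_model), dtype="float32")
linear_bias = paddle.rand((d_model,), dtype="float32")

# Feed-forward block parameters.
ffn_ln_scale = paddle.rand((d_model,), dtype="float32")
ffn_ln_bias = paddle.rand((d_model,), dtype="float32")
ffn1_weight = paddle.rand((d_model, 4 * d_model), dtype="float32")
ffn1_bias = paddle.rand((4 * d_model,), dtype="float32")
ffn2_weight = paddle.rand((4 * d_model, d_model), dtype="float32")
ffn2_bias = paddle.rand((d_model,), dtype="float32")

# Attention mask: [batch_size, 1, sequence_length, sequence_length].
attn_mask = paddle.rand((batch, 1, seq_len, seq_len), dtype="float32")

# cache_kvs is None here, so a single Tensor of shape
# [batch_size, sequence_length, d_model] comes back.
out = incubate_f.fused_multi_transformer(
    x,
    [ln_scale], [ln_bias],
    [qkv_weight], [qkv_bias],
    [linear_weight], [linear_bias],
    [ffn_ln_scale], [ffn_ln_bias],
    [ffn1_weight], [ffn1_bias],
    [ffn2_weight], [ffn2_bias],
    attn_mask=attn_mask,
)
print(out.shape)  # [2, 4, 128]
```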
@Courtesy-Xs Done
LGTM
…_cn.rst Co-authored-by: zachary sun <70642955+sunzhongkai588@users.noreply.github.com>
LGTM
PR types
Others
PR changes
Docs
Description
Chinese documentation addition task
#6193
New Chinese doc:
English doc link:
@JamesLim-sy @sunzhongkai588