[Typing][C-85] Add type annotations for python/paddle/incubate/nn/layer/fused_transformer.py
#67178
Conversation
Your PR was submitted successfully. Thank you for contributing to the open-source project!
Also, the example code fails in CI, and it fails on my machine as well:

```python
import paddle
from paddle.incubate.nn import FusedMultiTransformer

paddle.device.set_device('gpu')

# encoder input: [batch_size, src_len, d_model]
enc_input = paddle.rand((2, 4, 128))
# self attention mask: [batch_size, 1, src_len, src_len]
attn_mask = paddle.rand((2, 1, 4, 4))
encoder_layers = FusedMultiTransformer(128, 2, 512, num_layers=1)
enc_output = encoder_layers(enc_input, attn_mask)
print(enc_output.shape)
```

@SigureMo should we ask the dev team to take a look? Or just skip it, or file an issue?
There is only an fp16 kernel now: Paddle/paddle/fluid/operators/fused/fused_multi_transformer_op.cu, lines 841 to 852 in efdd967

It looks like #64125 made this change. How about trying fp16 input?
That doesn't work. If we only change the inputs, the layer still reads the default dtype internally:

```python
import paddle
from paddle.incubate.nn import FusedMultiTransformer

paddle.device.set_device('gpu')

# encoder input: [batch_size, src_len, d_model]
enc_input = paddle.rand((2, 4, 128)).astype('float16')
# self attention mask: [batch_size, 1, src_len, src_len]
attn_mask = paddle.rand((2, 1, 4, 4)).astype('float16')
encoder_layers = FusedMultiTransformer(128, 2, 512, num_layers=1)
enc_output = encoder_layers(enc_input, attn_mask)
print(enc_output.shape)
```

which leads to a dtype mismatch:

```
ValueError: (InvalidArgument) The type of data we are trying to retrieve (float16) does not match the type of data (float32) currently contained in the container.
```

If we set the default dtype instead:

```python
import paddle
from paddle.incubate.nn import FusedMultiTransformer

paddle.device.set_device('gpu')
paddle.set_default_dtype('float16')

# encoder input: [batch_size, src_len, d_model]
enc_input = paddle.rand((2, 4, 128))
# self attention mask: [batch_size, 1, src_len, src_len]
attn_mask = paddle.rand((2, 1, 4, 4))
encoder_layers = FusedMultiTransformer(128, 2, 512, num_layers=1)
enc_output = encoder_layers(enc_input, attn_mask)
print(enc_output.shape)
```

it still fails at runtime:

```
OSError: (External) Error in Flash-Attention, detail information is: `is_sm8x || is_sm90` check failed at /paddle/third_party/flashattn/csrc/capi/flash_attn.cu:681
[Hint: Expected status == true, but received status:0 != true:1.] (at /paddle/paddle/phi/kernels/gpu/flash_attn_utils.h:360)
[operator < fused_multi_transformer > error]
```

The unit tests only cover static graph mode, so there is no reference to fall back on 🫠
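For context, the `is_sm8x || is_sm90` check in the error above means flash attention only runs on GPUs of compute capability 8.x (Ampere) or 9.0 (Hopper). A minimal sketch of that capability gate, assuming the check accepts exactly those versions (the helper name is hypothetical; the real check lives in flash-attn's CUDA code):

```python
def flash_attention_supported(major: int, minor: int) -> bool:
    """Mirror the `is_sm8x || is_sm90` gate from the error message.

    Hypothetical helper: accepts any SM 8.x and exactly SM 9.0.
    """
    return major == 8 or (major == 9 and minor == 0)


# Turing (sm_75) fails the gate; Ampere (sm_80) passes it.
print(flash_attention_supported(7, 5))  # False
print(flash_attention_supported(8, 0))  # True
```

This would explain why the snippet fails on an older card even with a correct fp16 setup.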
I've reported this to the relevant engineers. Let's skip it for now; it shouldn't block the task on our side.
```python
import paddle
from paddle.incubate.nn import FusedMultiTransformer

paddle.device.set_device('gpu')
paddle.set_default_dtype('float16')

# encoder input: [batch_size, src_len, d_model]
enc_input = paddle.rand((2, 4, 128)).astype('float16')
# self attention mask: [batch_size, 1, src_len, src_len]
attn_mask = paddle.rand((2, 1, 4, 4))
encoder_layers = FusedMultiTransformer(128, 2, 512, num_layers=1)
enc_output = encoder_layers(enc_input, attn_mask)
print(enc_output.shape)
```

The relevant engineers report that this snippet runs for them. My local build was compiled without flash attention; shall we try it on CI? If it still fails, skip it for now.
CI failed ~ @enkilee let's mark the example code as SKIP; the stated reason can just be that it requires a special build.
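The skip mechanism can be illustrated with the standard-library `doctest` module: an example tagged `# doctest: +SKIP` is parsed but never executed, so a GPU-only snippet cannot fail the run. (This is a generic `doctest` sketch; Paddle's own sample-code checker has its own skip syntax.)

```python
import doctest

# A docstring whose second example would crash, but is tagged +SKIP.
src = '''
>>> 1 + 1
2
>>> raise RuntimeError("needs a flash-attention build")  # doctest: +SKIP
'''

parser = doctest.DocTestParser()
test = parser.get_doctest(src, {}, "fused_transformer_example", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

# The skipped example is never attempted, so nothing fails.
print(results.failed)  # 0
```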
Got it.
LGTM ~ 🤟
…yer/fused_transformer.py` (PaddlePaddle#67178)
PR Category
User Experience
PR Types
Improvements
Description
Add type hint annotations for public APIs
C-85 python/paddle/incubate/nn/layer/fused_transformer.py
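As a rough illustration of the kind of annotations such a PR adds (the class and parameter names below are a hypothetical stub, not the actual signatures in fused_transformer.py), the constructor arguments receive PEP 484 hints, with `from __future__ import annotations` keeping them as strings at runtime:

```python
from __future__ import annotations

import inspect


class FusedMultiTransformerStub:
    """Hypothetical stub showing the annotation style, not Paddle's real class."""

    def __init__(
        self,
        embed_dim: int,
        num_heads: int,
        dim_feedforward: int,
        num_layers: int = 1,
    ) -> None:
        self.embed_dim = embed_dim
        self.num_heads = num_heads


# Under PEP 563 (the future import), annotations are stored as strings.
sig = inspect.signature(FusedMultiTransformerStub.__init__)
print(sig.parameters["num_layers"].annotation)  # the string 'int'
```

Annotating public APIs this way lets type checkers and IDEs validate user code without changing runtime behavior.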
@megemini