[hybrid] seed and dropout op support force-cpu #35820
paddle/fluid/operators/dropout_op.cc
Outdated
framework::OpKernelType GetKernelTypeForVar(
    const std::string& var_name, const Tensor& tensor,
    const framework::OpKernelType& expected_kernel_type) const override {
  if (var_name == "Seed" && platform::is_cpu_place(tensor.place())) {
Is the is_cpu_place check still needed here?
Fixed.
paddle/fluid/operators/dropout_op.cc
Outdated
    const std::string& var_name, const Tensor& tensor,
    const framework::OpKernelType& expected_kernel_type) const override {
  if (var_name == "Seed" && platform::is_cpu_place(tensor.place())) {
    VLOG(10) << "var_name:" << var_name << " need not to transform";
Suggest that the VLOG message name the op it is in, e.g. "does not need to transform in dropout op".
Done.
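The two review threads above concern how the dropout op treats its Seed input. A minimal standalone sketch of that decision follows. `Place`, `KernelType`, `KernelTypeForVar`, and `NeedsTransform` are hypothetical stand-ins, not Paddle classes, and the rule that a transfer happens only when the per-variable kernel type differs from the expected one is an assumption made for illustration. The log text follows the reviewer's suggested wording.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-ins for the framework's place and kernel-type classes.
enum class Place { kCPU, kGPU };

struct KernelType {
  Place place;
  bool operator==(const KernelType& other) const {
    return place == other.place;
  }
};

// Mirrors the shape of GetKernelTypeForVar: returning the expected kernel
// type for "Seed" signals that the tensor is used wherever it already lives.
KernelType KernelTypeForVar(const std::string& var_name, Place tensor_place,
                            const KernelType& expected) {
  if (var_name == "Seed") {
    std::clog << "var_name:" << var_name
              << " does not need to transform in dropout op\n";
    return expected;
  }
  // Every other input reports the place its tensor actually lives on.
  return KernelType{tensor_place};
}

// Assumed rule: a data transfer is scheduled only when the kernel type
// reported for the variable differs from the one the kernel expects.
bool NeedsTransform(const std::string& var_name, Place tensor_place,
                    const KernelType& expected) {
  return !(KernelTypeForVar(var_name, tensor_place, expected) == expected);
}
```

Under these assumptions, `NeedsTransform("Seed", Place::kCPU, KernelType{Place::kGPU})` is false, so a CPU-resident seed is left in place, while any other CPU input feeding a GPU kernel is still transferred.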
@@ -39,6 +39,11 @@ class SeedOpMaker : public framework::OpProtoAndCheckerMaker {
  void Make() override {
    AddOutput("Out", "The output of seed op.");
    AddAttr<int>("seed", "Dropout random seed.").SetDefault(0);
    AddAttr<bool>("force_cpu",
Please check whether this op is used in inference; you may need to add AddCheckpoint to keep inference compatible.
Added AddCheckpoint.
If inference does not need it, shouldn't AsExtra() be added as well? That is the newly introduced convention.
Confirmed, and AsExtra() has been added.
paddle/fluid/operators/seed_op.cu
Outdated
    platform::DeviceContextPool::Instance();
auto &dev_ctx = *pool.Get(context.GetPlace());
out->mutable_data<T>(platform::CPUPlace(),
                     framework::proto::VarType::SIZE_T);
What is this SIZE_T for?
Removed.
attrs={
    'seed': seed,
    'op_device': op_device,
    'force_cpu': True
Add a brief comment explaining why this is set to True.
Comment added.
LGTM
LGTM
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] fix seed ci failed issue
* add AsExtra for force_cpu of seed op
… RandomSeedGenerator (#36682)
* Revert "Add fused_dropout wrapper to ease use. (#36185) (#36640)" This reverts commit 05d7e2f.
* [hybrid] seed and dropout op support force-cpu (#35820)
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] fix seed ci failed issue
* add AsExtra for force_cpu of seed op
* Add fused_dropout wrapper to ease use. (#36185)
* [hybrid] static model parallel dropout support deterministic RandomSeedGenerator (#36228)
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: Li Min <11663212+limin2021@users.noreply.github.com>
PR types
Performance optimization
PR changes
OPs
Describe
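Add an optional force_cpu attribute to the Seed op. When it is enabled, the Seed op places its output (Out) on the CPU place, so Dropout can read its Seed input directly from the CPU. This avoids the seed first being copied from CPU to GPU and then copied back from GPU to CPU inside dropout, which cost an extra synchronous copy.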
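The copy pattern described here can be illustrated with a small standalone model. This is only a sketch: `SeedTensor`, `CopyCounter`, `RunSeedOp`, and `ReadSeedInDropout` are hypothetical names, no real device memory is involved, and the model simply counts the host/device transfers that the description says occur with and without force_cpu.

```cpp
#include <cassert>

// Hypothetical model of where a tensor lives; not a Paddle type.
enum class Place { kCPU, kGPU };

struct SeedTensor {
  int value;
  Place place;
};

// Counts simulated cross-device copies.
struct CopyCounter {
  int copies = 0;
};

// Simulated seed op: the seed value originates on the host. Without
// force_cpu the output is moved to the device, which costs one copy.
SeedTensor RunSeedOp(int seed, bool force_cpu, CopyCounter* counter) {
  SeedTensor out{seed, Place::kCPU};
  if (!force_cpu) {
    out.place = Place::kGPU;
    ++counter->copies;  // host -> device
  }
  return out;
}

// Simulated dropout: it needs the seed value on the host, so a
// device-resident seed must be copied back first.
int ReadSeedInDropout(const SeedTensor& seed, CopyCounter* counter) {
  if (seed.place == Place::kGPU) {
    ++counter->copies;  // device -> host
  }
  return seed.value;
}
```

In this model, running the seed op and then dropout with force_cpu disabled records two copies, and with force_cpu enabled it records none, which matches the round trip the description aims to eliminate.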