
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast #42320

Merged
merged 2 commits into PaddlePaddle:develop from FlyingQianMM:develop_place on Apr 28, 2022

Conversation

FlyingQianMM
Contributor

PR types

Bug fixes

PR changes

OPs

Describe

When using multiple GPU devices to train a model, an error was raised as below:

NotImplementedError: 

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::imperative::BasicEngine::Execute()
1   paddle::imperative::PreparedOp::Run(paddle::imperative::NameVariableWrapperMap const&, paddle::imperative::NameVariableWrapperMap const&, paddle::framework::AttributeMap const&, paddle::framework::AttributeMap const&)
2   phi::KernelImpl<void (*)(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, paddle::optional<phi::DenseTensor const&>, paddle::optional<phi::DenseTensor const&>, int, phi::DenseTensor*, phi::DenseTensor*, phi::DenseTensor*), &(void phi::MultiplyDoubleGradKernel<float, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, paddle::optional<phi::DenseTensor const&>, paddle::optional<phi::DenseTensor const&>, int, phi::DenseTensor*, phi::DenseTensor*, phi::DenseTensor*))>::Compute(phi::KernelContext*)
3   void phi::MultiplyDoubleGradKernel<float, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, paddle::optional<phi::DenseTensor const&>, paddle::optional<phi::DenseTensor const&>, int, phi::DenseTensor*, phi::DenseTensor*, phi::DenseTensor*)
4   void phi::funcs::ElemwiseGradComputeWithBroadcast<float, phi::MulGradDX<float>, phi::MulGradDY<float>, float>(phi::GPUContext const&, phi::DDim const&, phi::DDim const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, int, phi::DenseTensor*, phi::DenseTensor*, phi::MulGradDX<float>, phi::MulGradDY<float>)
5   paddle::platform::DeviceContextPool::Get(phi::Place const&)
6   phi::enforce::EnforceNotMet::EnforceNotMet(phi::ErrorSummary const&, char const*, int)
7   phi::enforce::GetCurrentTraceBackString[abi:cxx11](bool)

----------------------
Error Message Summary:
----------------------
UnimplementedError: Place Place(gpu:0) is not supported. Please check that your paddle compiles with WITH_GPU, WITH_XPU, WITH_IPU, WITH_MLU or WITH_ASCEND_CL option or check that your train process set the correct device id if you use Executor. (at /ssd2/liuquanxiang/binary_search_pr/PaddleGAN_py37_102_docker_citest/paddle/paddle/fluid/platform/device_context.cc:139)

We found that the cause is that the device id of a default-initialized Place() must be set to the current device id before it is used to fetch the GPUContext needed by LimitGridDim in ElemwiseGradBroadcast; otherwise the DeviceContextPool lookup fails. This PR fixes that bug.

ZzSean
ZzSean previously approved these changes Apr 27, 2022
Contributor

@ZzSean ZzSean left a comment


LGTM

FlyingQianMM added a commit to FlyingQianMM/Paddle that referenced this pull request Apr 27, 2022
@FlyingQianMM FlyingQianMM merged commit 22d3c56 into PaddlePaddle:develop Apr 28, 2022
@FlyingQianMM FlyingQianMM deleted the develop_place branch April 28, 2022 01:00
XiaoguangHu01 pushed a commit that referenced this pull request Apr 28, 2022