[PHI decoupling]decouple tensor_utils of TensorCopy #50264

engineer1109 · 2023-02-06T12:59:18Z

PR types

Others

PR changes

Others

Describe

Decouple TensorCopy and other tensor_utils functions.

paddle-bot · 2023-02-06T12:59:24Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

engineer1109 · 2023-02-07T03:26:08Z

This partially reduces the coupling,
but the following headers are still required:

#include "paddle/fluid/memory/memcpy.h"
#include "paddle/fluid/platform/place.h"

The approval check needs an exemption.

YuanRisheng · 2023-02-07T12:47:08Z

paddle/phi/common/scalar.cc

@@ -14,9 +14,12 @@ limitations under the License. */

 #include "paddle/phi/common/scalar.h"

-#include "paddle/fluid/framework/tensor_util.h"
+#include "paddle/fluid/platform/place.h"


You could try removing this header. If you need to check which kind of place it is (for example, paddle::platform::is_same_place(tensor_in.place(), cpu_place)), you can refer to this code:
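The code the reviewer pointed to is not preserved in this thread. As a separate illustration, the sketch below shows the kind of place check that avoids the fluid helper, modeled on the `place().GetType() == phi::AllocationType::XPU` comparison that appears in interpolate_function.h later in this review. The `sketch` namespace, its `Place` and `AllocationType` types, and `IsCpuPlace` are stand-ins invented for this example, not Paddle's definitions.

```cpp
#include <cassert>

// Stand-in types for illustration only; these are NOT Paddle's definitions.
namespace sketch {

enum class AllocationType { CPU, GPU, XPU };

class Place {
 public:
  explicit Place(AllocationType type) : type_(type) {}
  AllocationType GetType() const { return type_; }

 private:
  AllocationType type_;
};

// Hypothetical helper: returns true when the place refers to host memory.
// It compares the allocation type directly instead of calling a
// fluid-side helper such as is_same_place.
inline bool IsCpuPlace(const Place& place) {
  return place.GetType() == AllocationType::CPU;
}

}  // namespace sketch
```

In real phi code the same comparison would presumably be written against `phi::AllocationType`, so the fluid header is only needed if something else in the file depends on it.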

YuanRisheng · 2023-02-07T12:51:43Z

paddle/phi/kernels/gpu/interpolate_kernel.cu

-          *out_size, paddle::platform::CPUPlace(), &sizes);
+      phi::DeviceContextPool& pool = phi::DeviceContextPool::Instance();
+      auto dev_ctx = pool.Get(out_size->place());
+      phi::Copy(*dev_ctx, *out_size, phi::CPUPlace(), true, &sizes);


It looks like dev_ctx is already passed in as a function parameter here. Is it still necessary to fetch one from the pool?

YuanRisheng · 2023-02-07T12:52:08Z

paddle/phi/kernels/gpu/interpolate_kernel.cu

-          *out_size, paddle::platform::CPUPlace(), &sizes);
+      phi::DeviceContextPool& pool = phi::DeviceContextPool::Instance();
+      auto dev_ctx = pool.Get(out_size->place());
+      phi::Copy(*dev_ctx, *out_size, phi::CPUPlace(), true, &sizes);


YuanRisheng · 2023-02-07T12:52:37Z

paddle/phi/kernels/gpu/interpolate_kernel.cu

-          *out_size, paddle::platform::CPUPlace(), &sizes);
+      phi::DeviceContextPool& pool = phi::DeviceContextPool::Instance();
+      auto dev_ctx = pool.Get(out_size->place());
+      phi::Copy(*dev_ctx, *out_size, phi::CPUPlace(), true, &sizes);


YuanRisheng · 2023-02-07T12:56:19Z

paddle/phi/kernels/funcs/interpolate_function.h

@@ -96,15 +95,16 @@ inline std::vector<int> get_new_shape(
 #ifdef PADDLE_WITH_XPU
    if (tensor->place().GetType() == phi::AllocationType::XPU) {
      DenseTensor temp;
-      paddle::framework::TensorCopySync(*tensor, phi::CPUPlace(), &temp);
+      phi::Copy<phi::CPUContext>(
+          phi::CPUContext(), *tensor, phi::CPUPlace(), true, &temp);


There is probably a bug here: a temporary CPUContext object is created.

engineer1109 · 2023-02-09T07:48:34Z

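More than 20 files are modified, so PR-CI-APPROVAL needs an exemption.
PR-CI-Coverage (C++ coverage) needs an exemption.
PR-CI-OP-benchmark keeps timing out because too many operators are affected; it may need an exemption.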

YuanRisheng · 2023-02-09T08:25:19Z

paddle/phi/api/lib/data_transform.cc

@@ -188,7 +186,13 @@ inline phi::DenseTensor TransDataPlace(const phi::DenseTensor& tensor,
  // But the embarrassment is that this solution this solution makes training
  // slower.
  phi::DenseTensor out;
-  paddle::framework::TensorCopySync(tensor, dst_place, &out);
+  phi::DeviceContext* dev_ctx;
+  if (dst_place.GetType() != AllocationType::CPU) {


Why is this check needed here? Would there be any problem with getting dev_ctx directly from dst_place?

Previously, the fluid test test_yolo_loss on Windows failed when dev_ctx was obtained directly from dst_place.
dst_place may be CPU while src_place is CUDA, in which case the context place ends up being CPU.
That eventually triggers the exception thrown at line 120 of https://github.com/PaddlePaddle/Paddle/blob/701888a09cb6a89f136c92f96b85f440cad2e198/paddle/phi/core/tensor_utils.cc:
Context place error, excepted GPUPlace, but actually CPUPlace
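The selection rule described in this reply can be sketched as follows. This is a minimal illustration with stand-in types, not the code from data_transform.cc: only the `dst_place.GetType() != AllocationType::CPU` condition is taken from the diff, and the fallback to the source place in the other branch is an assumption based on the explanation above. `sketch::Place` and `SelectContextPlace` are hypothetical names.

```cpp
#include <cassert>

// Stand-in types for illustration only; these are NOT Paddle's definitions.
namespace sketch {

enum class AllocationType { CPU, GPU };

struct Place {
  AllocationType type;
  AllocationType GetType() const { return type; }
};

// Hypothetical helper: picks the place whose device context should drive
// the copy. If the destination is a device, use the destination. Otherwise
// fall back to the source place (assumed), so that a GPU -> CPU copy is
// still issued on the GPU context rather than on a CPU context.
inline Place SelectContextPlace(const Place& src_place,
                                const Place& dst_place) {
  if (dst_place.GetType() != AllocationType::CPU) {
    return dst_place;
  }
  return src_place;
}

}  // namespace sketch
```

With this rule, a CUDA source copied to a CPU destination selects the CUDA place, which is the case that failed when the context was looked up from dst_place alone.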

@YuanRisheng

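More than 20 files are modified, so PR-CI-APPROVAL needs an exemption.
PR-CI-Coverage (C++ coverage) needs an exemption.
PR-CI-OP-benchmark keeps timing out because too many operators are affected; it may need an exemption.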

@engineer1109 op-benchmark tests performance. The changes to copy may cause performance problems; rerun it a few more times and see whether it passes.

@YuanRisheng It has been running for a week and has timed out again. https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/7756128/job/21674196

It looks fine. I will ask the relevant colleagues to take another look next week.

engineer1109 · 2023-02-10T04:48:13Z

Resolve conflict

engineer1109 · 2023-02-10T04:54:31Z

Someone else is updating custom_device_test.cc, so the changes to that file have been dropped.

zyfncg · 2023-02-10T12:49:04Z

paddle/phi/common/scalar.cc

-    framework::TensorCopySync(tensor_in, cpu_place, &tensor);
+    phi::DeviceContextPool& pool = phi::DeviceContextPool::Instance();
+    auto dev_ctx = pool.Get(tensor_in.place());
+    phi::Copy(*dev_ctx, tensor_in, cpu_place, false, &tensor);


A synchronous copy is recommended here.
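The fourth argument of `phi::Copy` in the diff above is the blocking flag, and the suggestion is to pass `true` instead of `false`. The sketch below illustrates why that matters when the host reads the result right after the copy, which is presumably the situation in scalar.cc. It uses a toy queue in place of a device stream; `sketch::FakeStream` and `sketch::Copy` are hypothetical stand-ins and do not reproduce Paddle's implementation.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Stand-in types for illustration only; these are NOT Paddle's definitions.
namespace sketch {

// Toy model of a device stream: work is queued and only runs on Wait().
class FakeStream {
 public:
  void Enqueue(std::function<void()> task) { tasks_.push(std::move(task)); }

  // Runs every queued task in order, like a stream synchronization.
  void Wait() {
    while (!tasks_.empty()) {
      tasks_.front()();
      tasks_.pop();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

// Queues a copy of src into *dst. With blocking == true the call waits
// for the queued copy to finish before returning; with blocking == false
// *dst is not valid until the caller synchronizes the stream.
inline void Copy(FakeStream* stream, int src, bool blocking, int* dst) {
  stream->Enqueue([src, dst] { *dst = src; });
  if (blocking) {
    stream->Wait();
  }
}

}  // namespace sketch
```

In this model, a caller that reads `*dst` immediately after a non-blocking `Copy` sees the old value, while a blocking `Copy` guarantees the new one.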

@zyfncg scalar.cc has been corrected.

fix X remove TensorCopy codestyle add fluid memory header fix symbol fix cmake fix cmake fix context fix header fix place fix context fix context fix context fix code fix custom context fix custom context fix copy fix data_transform fix style remove changes of custom fix scalar

engineer1109 · 2023-02-13T02:05:45Z

@zyfncg @YuanRisheng Are there any remaining issues?

ZzSean

LGTM for CI-OP-Benchmark

engineer1109 · 2023-02-13T06:14:24Z

@luotao1 What is the status of this PR now?

luotao1 · 2023-02-13T07:33:50Z

The Coverage pipeline has been exempted.

XiaoguangHu01

LGTM

engineer1109 · 2023-02-14T13:24:18Z

Refer to #47615.

paddle-bot bot added contributor External developers status: proposed labels Feb 6, 2023

engineer1109 force-pushed the tensorutil2 branch from 5b09efd to 0906f6e Compare February 7, 2023 01:10

luotao1 assigned luotao1 and YuanRisheng Feb 7, 2023

engineer1109 force-pushed the tensorutil2 branch 3 times, most recently from b791a5f to 8235d06 Compare February 7, 2023 03:24

engineer1109 force-pushed the tensorutil2 branch 5 times, most recently from 61cf74c to 88d0fae Compare February 7, 2023 09:38

YuanRisheng reviewed Feb 7, 2023

engineer1109 force-pushed the tensorutil2 branch 9 times, most recently from 91e864f to 701888a Compare February 9, 2023 01:35

YuanRisheng reviewed Feb 9, 2023

engineer1109 force-pushed the tensorutil2 branch from 701888a to 6e398fa Compare February 10, 2023 04:46

engineer1109 force-pushed the tensorutil2 branch from 6e398fa to 861b9aa Compare February 10, 2023 04:53

YuanRisheng previously approved these changes Feb 10, 2023

zyfncg reviewed Feb 10, 2023

engineer1109 dismissed YuanRisheng’s stale review via 2e45dda February 12, 2023 09:12

engineer1109 force-pushed the tensorutil2 branch from 861b9aa to 2e45dda Compare February 12, 2023 09:12

YuanRisheng approved these changes Feb 13, 2023

ZzSean approved these changes Feb 13, 2023

zyfncg approved these changes Feb 13, 2023

XiaoguangHu01 approved these changes Feb 14, 2023

YuanRisheng merged commit 057cdb9 into PaddlePaddle:develop Feb 14, 2023

luotao1 mentioned this pull request Feb 15, 2023

[phi] Decoupled phi from fluid tracking issue #47615

Closed

This was referenced Feb 22, 2023

Remove utils in phi (with fluid) #50697

Closed

【Hackathon No.68】Remove utils in phi #50833

Merged
