Add CPU/GPU Memcpy in memory folder #2970
Conversation
Some comments
@@ -43,10 +43,26 @@ namespace platform {
// For more details, please check https://stackoverflow.com/a/43870188/724872.
#define UNLIKELY(condition) __builtin_expect(static_cast<bool>(condition), 0)

template <typename T>
Fix the special case of PADDLE_ENFORCE:

PADDLE_ENFORCE(condition, "hello world");  // OK with the old implementation
PADDLE_ENFORCE(condition);                 // Failed with the old implementation. But it's addressed now.
paddle/memory/memory.cc (Outdated)
const void* src, size_t num,
          cudaStream_t stream) {
  platform::GpuMemcpyAsync(dst, src, num, cudaMemcpyHostToDevice, stream);
}
Here, cudaMemcpyDeviceToHost, cudaMemcpyHostToDevice, and cudaMemcpyDeviceToDevice are used. But there is also the cudaMemcpyKind value cudaMemcpyDefault:

cudaMemcpyDefault = 4
Direction of the transfer is inferred from the pointer values. Requires unified virtual addressing.

Can we use cudaMemcpyDefault to simplify the code?
Sounds good. But how can we specialize one function to support both cases, (CPUPlace, GPUPlace) and (GPUPlace, CPUPlace)? One way to achieve that is std::enable_if, but it would produce a lot of noisy code.
@@ -31,7 +32,7 @@ int GetCurrentDeviceId();
void SetDeviceId(int device_id);

//!Get the memory usage of current GPU device.
void GpuMemoryUsage(size_t& available, size_t& total);
void GpuMemoryUsage(size_t &available, size_t &total);
Here, we should unify the code style. In `size_t& available, size_t& total;`, the `&` and `*` should be placed next to the type.
I didn't change this; clang-format takes care of it.
platform::CPUPlace src_place,
          const void* src, size_t num,
          cudaStream_t stream) {
  platform::SetDeviceId(dst_place.device);
Maybe we should use platform::GPUPlaceGuard here.
I think it's unnecessary to use the guard to implicitly roll back the device id. For a GPU device, it's better to set the device id explicitly.
LGTM
#ifndef PADDLE_ONLY_CPU
template <typename DstPlace, typename SrcPlace>
void Copy(DstPlace, void* dst, SrcPlace, const void* src, size_t num,
          cudaStream_t stream);
It would be great to add a comment telling users when they would call this second form of Copy.
Sorry for the late reply; we were on duty yesterday. Yes, I will annotate this function. Thanks.
#ifndef PADDLE_ONLY_CPU

template <typename... Args>
inline void throw_on_error(cudaError_t e, const Args&... args) {
inline typename std::enable_if<sizeof...(Args) != 0, void>::type throw_on_error(
Awesome!
@@ -42,6 +43,18 @@ size_t GpuMinChunkSize();
//! Get the maximum chunk size for GPU buddy allocator.
size_t GpuMaxChunkSize();

//! Copy memory from address src to dst asynchronously.
void GpuMemcpyAsync(void *dst, const void *src, size_t count,
Should we move these copying functions into a new source file, say copy.{h,cc}? I am not sure; just mentioning it.