hmem/cuda: Add dmabuf fd ops functions #9498

shijin-aws · 2023-10-31T18:10:18Z

Implement the get_dmabuf_fd API for cuda interface.

shijin-aws · 2023-10-31T18:19:08Z

The build check failed with:

src/hmem_cuda.c: In function ‘cuda_get_dmabuf_fd’:
src/hmem_cuda.c:702:21: error: ‘struct <anonymous>’ has no member named ‘cuMemGetHandleForAddressRange’
  702 |  cuda_ret = cuda_ops.cuMemGetHandleForAddressRange(
      |                     ^
src/hmem_cuda.c:705:7: error: ‘CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD’ undeclared (first use in this function)
  705 |       CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD,
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I think this symbol is only available in newer CUDA libraries. I need to add a macro defined in configure.ac.

sunkuamzn · 2023-10-31T22:00:55Z

src/hmem_cuda.c

+	if (!cuda_is_dmabuf_supported())
+		return -FI_EOPNOTSUPP;
+
+	aligned_ptr = (uintptr_t) ofi_get_page_start(addr, host_page_size);


What happens when the DMA buf region spans several pages? Is offset always relative to the closest page?

The offset is always relative to the head of the first page.
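To make the answer above concrete, here is a minimal sketch of the page arithmetic. The helpers `page_start` and `dmabuf_offset` are hypothetical stand-ins (the patch itself calls ofi_get_page_start), and the 4 KiB page size used in the note below is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ofi_get_page_start(): round addr down to the
 * start of the page that contains it. page_size must be a power of two. */
static uintptr_t page_start(uintptr_t addr, uintptr_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/* Offset of addr from the head of the first page of the region. It depends
 * only on addr, not on how many pages the region goes on to span. */
static uint64_t dmabuf_offset(uintptr_t addr, uintptr_t page_size)
{
	return (uint64_t)(addr - page_start(addr, page_size));
}
```

With a 4096-byte page, an address of 0x10000123 yields page start 0x10000000 and offset 0x123, whether the registered region covers one page or several.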

sunkuamzn · 2023-10-31T22:02:15Z

src/hmem_cuda.c

+	CUdevice dev;
+	int is_supported = 0;
+
+	if (cuda_attr.device_count <= 1)


Is it fair to not even check DMA buf support for single GPU instances? It makes sense for GDR but we may use DMA buf with EFA NIC for HPC applications.

I think this line needs to be removed. I was copying something from detect_p2p_support.
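A rough sketch of a support check without the early return for single-GPU hosts follows. Everything here is hypothetical: the callback type stands in for a per-device driver query (for example cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED), and requiring every device to report support is an assumption, not something this thread confirms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback standing in for a per-device driver query.
 * Returns 0 on success and stores 0 or 1 in *is_supported. */
typedef int (*dmabuf_attr_query_fn)(int device, int *is_supported);

/* Report dmabuf support only if every device reports it (an assumption).
 * There is no shortcut for device_count <= 1, so a single-GPU host is
 * queried like any other. */
static bool all_devices_support_dmabuf(int device_count,
				       dmabuf_attr_query_fn query)
{
	assert(query != NULL);

	if (device_count <= 0)
		return false;

	for (int dev = 0; dev < device_count; dev++) {
		int is_supported = 0;

		if (query(dev, &is_supported) != 0 || !is_supported)
			return false;
	}
	return true;
}

/* Mock queries, used only to exercise the sketch. */
static int mock_query_yes(int device, int *is_supported)
{
	(void)device;
	*is_supported = 1;
	return 0;
}

static int mock_query_gpu1_no(int device, int *is_supported)
{
	*is_supported = (device != 1);
	return 0;
}
```

Passing the query as a callback keeps the sketch buildable without a CUDA toolkit; real code would call the driver directly.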

j-xiong · 2023-11-01T00:20:33Z

Intel CI failure seems to be random (rxm/verbs with mpichtestsuite):

17:02:29  Unexpected output in idup_comm_gen: [mpiexec@cb13.cluster] APPLICATION TIMED OUT, TIMEOUT = 180s
17:02:29  Program idup_comm_gen exited without No Errors

wenduwan

Minor comments

wenduwan · 2023-11-01T01:51:36Z

include/ofi_hmem.h

@@ -191,6 +191,9 @@ int cuda_dev_reg_copy_from_hmem(uint64_t handle, void *dest, const void *src,
 bool cuda_is_ipc_enabled(void);
 int cuda_get_ipc_handle_size(size_t *size);
 bool cuda_is_gdrcopy_enabled(void);
+bool cuda_is_dmabuf_supported(void);
+int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
+			uint64_t *offset);


wenduwan · 2023-11-01T01:54:39Z

configure.ac

+			[[#include <cuda.h>]])
+
+	AC_CHECK_DECL([CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED],
+			[have_cuda_dmabuf_parameters_support=1],


nit: have_cuda_dmabuf_parameters_support -> have_cuda_device_dmabuf_support

No, they are not the same. It's some parameters for dmabuf support.

This parameter is removed in the latest revision

wenduwan · 2023-11-01T01:55:29Z

configure.ac

@@ -666,6 +679,10 @@ AS_IF([test x"$with_cuda" != x"no" && test -n "$with_cuda" && test "$have_cuda"

 AC_DEFINE_UNQUOTED([HAVE_CUDA], [$have_cuda], [CUDA support])

+AS_IF([ test x"$have_cuda" = x"1" && test x"$have_cuda_mem_get_handle_for_address_range" = x"1" && test x"$have_cuda_dmabuf_parameters_support" = x"1" ],


Line length

I did it a different way. This line is removed.

wenduwan · 2023-11-01T01:56:12Z

include/ofi_hmem.h

@@ -191,6 +191,9 @@ int cuda_dev_reg_copy_from_hmem(uint64_t handle, void *dest, const void *src,
 bool cuda_is_ipc_enabled(void);
 int cuda_get_ipc_handle_size(size_t *size);
 bool cuda_is_gdrcopy_enabled(void);
+bool cuda_is_dmabuf_supported(void);
+int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,


Curious why not uint64_t size -> size_t size?

Because it's the type of size in ibv_reg_dmabuf_mr. We just keep it consistent.

wenduwan · 2023-11-01T02:02:59Z

src/hmem_cuda.c

+	cuda_ret = cuda_ops.cuMemGetHandleForAddressRange(
+						(void *)fd,
+						aligned_ptr, aligned_size,
+						CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD,


Is it safe to assume that this symbol exists at this point?

This symbol is inside the HAVE_CUDA_DMABUF guard. If the symbol isn't found by configure.ac, this code won't be compiled.

But I think I need to check two symbols, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD and CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED. Currently I only check one of them. I will update the PR.
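The guard described above can be sketched as follows. The function name `example_get_dmabuf_fd` is hypothetical, ENOSYS and EIO from errno.h stand in for the library's own error codes, and the enabled branch assumes addr and size are already page aligned. HAVE_CUDA_DMABUF defaults to 0 here so the sketch builds without a CUDA toolkit; in the real build configure sets it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Normally provided by configure after it finds the required symbols. */
#ifndef HAVE_CUDA_DMABUF
#define HAVE_CUDA_DMABUF 0
#endif

#if HAVE_CUDA_DMABUF
#include <cuda.h>
#endif

/* Hypothetical wrapper showing where the guarded symbols may appear. */
static int example_get_dmabuf_fd(const void *addr, uint64_t size, int *fd,
				 uint64_t *offset)
{
	assert(fd != NULL && offset != NULL);
#if HAVE_CUDA_DMABUF
	/* Compiled only when configure found the dmabuf symbols in <cuda.h>.
	 * Assumes addr and size are already page aligned. */
	CUresult rc = cuMemGetHandleForAddressRange(
			fd, (CUdeviceptr)(uintptr_t)addr, size,
			CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);

	*offset = 0;
	return rc == CUDA_SUCCESS ? 0 : -EIO;
#else
	/* Older CUDA headers: the symbols are never referenced. */
	(void)addr;
	(void)size;
	return -ENOSYS;
#endif
}
```

Because the symbols are referenced only inside the `#if` branch, a build against older headers never sees them and cannot hit the compile error quoted at the top of this thread.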

wenduwan · 2023-11-01T02:05:05Z

src/hmem_cuda.c

+}
+
+int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
+			uint64_t *offset)


wenduwan · 2023-11-01T02:05:27Z

src/hmem_cuda.c

+ *         -FI_EIO upon CUDA API error
+ */
+int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
+			uint64_t *offset)


j-xiong · 2023-11-01T18:26:48Z

The new version does look cleaner.

shijin-aws · 2023-11-01T22:20:48Z

@sunkuamzn made a good catch that I should calculate the page-aligned address and length from the base address of the CUDA allocation. I updated it in the latest revision.
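A minimal sketch of that calculation, assuming the base address and total size of the allocation are already known (how they are obtained is outside the sketch). The struct and function names are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

struct dmabuf_range {
	uintptr_t aligned_ptr;	/* page-aligned start of the exported range */
	uint64_t aligned_size;	/* length of the exported range in bytes */
	uint64_t offset;	/* position of addr inside the exported range */
};

/* base and total_size describe the whole allocation that contains addr.
 * page_size must be a power of two. */
static struct dmabuf_range
dmabuf_range_from_base(uintptr_t base, uint64_t total_size,
		       uintptr_t addr, uintptr_t page_size)
{
	struct dmabuf_range r;
	uintptr_t last_byte, page_end;

	assert(total_size > 0);
	assert(addr >= base && addr - base < total_size);
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	r.aligned_ptr = base & ~(page_size - 1);
	last_byte = base + (uintptr_t)total_size - 1;
	page_end = last_byte | (page_size - 1);	/* last byte of the last page */
	r.aligned_size = (uint64_t)(page_end - r.aligned_ptr) + 1;
	r.offset = (uint64_t)(addr - r.aligned_ptr);
	return r;
}
```

For a base of 0x20000100, a total size of 0x2000, and 4 KiB pages, the exported range starts at 0x20000000 and is 0x3000 bytes long; an address of 0x20001200 then has offset 0x1200, which can exceed one page.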

darrylabbate · 2023-11-02T03:30:13Z

src/hmem_cuda.c

+#if HAVE_CUDA_DMABUF
+#define CUDA_DRIVER_FUNCS_DEF(_)	\
+	_(cuGetErrorName)		\
+	_(cuGetErrorString)		\
+	_(cuPointerGetAttribute)	\
+	_(cuPointerSetAttribute)	\
+	_(cuDeviceCanAccessPeer)	\
+	_(cuMemGetAddressRange)		\
+	_(cuMemGetHandleForAddressRange) \
+	_(cuDeviceGetAttribute)		\
+	_(cuDeviceGet)
+#else
 #define CUDA_DRIVER_FUNCS_DEF(_)	\
 	_(cuGetErrorName)		\
 	_(cuGetErrorString)		\
 	_(cuPointerGetAttribute)	\
 	_(cuPointerSetAttribute)	\
 	_(cuDeviceCanAccessPeer)	\
 	_(cuMemGetAddressRange)
+#endif /* HAVE_CUDA_DMABUF */


It's simpler to conditionally define a dedicated macro for the dmabuf-specific functions and have CUDA_DRIVER_FUNCS_DEF(_) unconditionally inherit them. Otherwise, you need to maintain the function list for both conditions, which defeats the purpose of the macros.

#if HAVE_CUDA_DMABUF
#define CUDA_DRIVER_DMABUF_FUNCS_DEF(_)	\
	_(cuMemGetHandleForAddressRange)	\
	_(cuDeviceGetAttribute)		\
	_(cuDeviceGet)
#else
#define CUDA_DRIVER_DMABUF_FUNCS_DEF(_)
#endif

#define CUDA_DRIVER_FUNCS_DEF(_)	\
	_(cuGetErrorName)		\
	/* ... */			\
	CUDA_DRIVER_DMABUF_FUNCS_DEF(_)

Thanks. Updated

I moved cuDeviceGetAttribute and cuDeviceGet out of CUDA_DRIVER_DMABUF_FUNCS_DEF because they are not necessarily used for dmabuf and should be generally available in older CUDA versions.
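The resulting layout can be sketched as a small X-macro example. The macro and struct names below are shortened, hypothetical versions of the ones in the patch, the function list is abbreviated, and using the name table for symbol lookup is an assumption about how such a list would be consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef HAVE_CUDA_DMABUF
#define HAVE_CUDA_DMABUF 0	/* normally set by configure */
#endif

/* dmabuf-only entries live in their own list, empty when unsupported. */
#if HAVE_CUDA_DMABUF
#define DMABUF_FUNCS_DEF(_)	\
	_(cuMemGetHandleForAddressRange)
#else
#define DMABUF_FUNCS_DEF(_)
#endif

/* The main list is written once and inherits the conditional entries. */
#define DRIVER_FUNCS_DEF(_)	\
	_(cuGetErrorName)	\
	_(cuDeviceGetAttribute)	\
	_(cuDeviceGet)		\
	DMABUF_FUNCS_DEF(_)

/* One expansion declares a table of generic function pointers. */
#define AS_FIELD(name) void (*name)(void);
struct driver_ops { DRIVER_FUNCS_DEF(AS_FIELD) };

/* Another expansion lists the symbol names, e.g. for a lookup loop. */
#define AS_NAME(name) #name,
static const char *const driver_func_names[] = { DRIVER_FUNCS_DEF(AS_NAME) };

static size_t driver_func_count(void)
{
	return sizeof(driver_func_names) / sizeof(driver_func_names[0]);
}

static const char *driver_func_name(size_t i)
{
	assert(i < driver_func_count());
	return driver_func_names[i];
}
```

Adding a function to either list updates both the struct and the name table, so the two cannot drift apart.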

darrylabbate · 2023-11-02T03:42:29Z

src/hmem_cuda.c

 	.use_ipc              = false,
 	.driver_handle        = NULL,
 	.runtime_handle       = NULL,
-	.nvml_handle          = NULL
+	.nvml_handle          = NULL,
+	.dmabuf_supported     = false


Style nit: initialize .dmabuf_supported after .use_ipc

shijin-aws · 2023-11-02T23:49:57Z

bot:aws:retest

Implement the get_dmabuf_fd API for cuda interface. Signed-off-by: Shi Jin <sjina@amazon.com>

shijin-aws · 2023-11-03T16:48:43Z

@darrylabbate does the new revision look good to you?

darrylabbate · 2023-11-03T17:49:24Z

Sure

shijin-aws requested a review from j-xiong October 31, 2023 18:10

shijin-aws added the for-1.20.x label Oct 31, 2023

sunkuamzn reviewed Oct 31, 2023

shijin-aws force-pushed the cuda_dmabuf branch from 666a575 to 4ad9cde Compare October 31, 2023 23:14

shijin-aws requested a review from sunkuamzn October 31, 2023 23:17

j-xiong approved these changes Nov 1, 2023

wenduwan approved these changes Nov 1, 2023

shijin-aws force-pushed the cuda_dmabuf branch from 4ad9cde to 8bf56d5 Compare November 1, 2023 17:53

shijin-aws force-pushed the cuda_dmabuf branch from 8bf56d5 to 1aa068d Compare November 1, 2023 22:20

sunkuamzn approved these changes Nov 1, 2023

shijin-aws force-pushed the cuda_dmabuf branch 2 times, most recently from c1b2f4a to de1b7e3 Compare November 2, 2023 00:32

darrylabbate reviewed Nov 2, 2023

shijin-aws force-pushed the cuda_dmabuf branch 2 times, most recently from ccebc4a to 875de77 Compare November 2, 2023 04:57

shijin-aws requested a review from darrylabbate November 2, 2023 23:49

hmem/cuda: Add dmabuf fd ops functions

f7d813a

shijin-aws force-pushed the cuda_dmabuf branch from 875de77 to f7d813a Compare November 2, 2023 23:56

j-xiong merged commit ec6bdb6 into ofiwg:main Nov 3, 2023
8 checks passed
