Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hmem/cuda: Add dmabuf fd ops functions #9498

Merged
merged 1 commit into from
Nov 3, 2023
Merged

Conversation

shijin-aws
Copy link
Contributor

Implement the get_dmabuf_fd API for cuda interface.

@shijin-aws
Copy link
Contributor Author

shijin-aws commented Oct 31, 2023

Build check failed as

src/hmem_cuda.c: In function ‘cuda_get_dmabuf_fd’:
src/hmem_cuda.c:702:21: error: ‘struct <anonymous>’ has no member named ‘cuMemGetHandleForAddressRange’
  702 |  cuda_ret = cuda_ops.cuMemGetHandleForAddressRange(
      |                     ^
src/hmem_cuda.c:705:7: error: ‘CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD’ undeclared (first use in this function)
  705 |       CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD,
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I think this bit is only available in newer CUDA library. I need to add a macro defined in configure.ac

src/hmem_cuda.c Outdated
if (!cuda_is_dmabuf_supported())
return -FI_EOPNOTSUPP;

aligned_ptr = (uintptr_t) ofi_get_page_start(addr, host_page_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when the DMA buf region spans several pages? Is offset always relative to the closest page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The offset is always relative to the head of the first page

src/hmem_cuda.c Outdated
CUdevice dev;
int is_supported = 0;

if (cuda_attr.device_count <= 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it fair to not even check DMA buf support for single GPU instances? It makes sense for GDR but we may use DMA buf with EFA NIC for HPC applications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line need to be removed. I was copying something from detect_p2p_support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@j-xiong
Copy link
Contributor

j-xiong commented Nov 1, 2023

Intel CI failure seems to be random (rxm/verbs with mpichtestsuite):

17:02:29  Unexpected output in idup_comm_gen: [mpiexec@cb13.cluster] APPLICATION TIMED OUT, TIMEOUT = 180s
17:02:29  Program idup_comm_gen exited without No Errors

Copy link
Contributor

@wenduwan wenduwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

@@ -191,6 +191,9 @@ int cuda_dev_reg_copy_from_hmem(uint64_t handle, void *dest, const void *src,
bool cuda_is_ipc_enabled(void);
int cuda_get_ipc_handle_size(size_t *size);
bool cuda_is_gdrcopy_enabled(void);
bool cuda_is_dmabuf_supported(void);
int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
uint64_t *offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

configure.ac Outdated
[[#include <cuda.h>]])

AC_CHECK_DECL([CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED],
[have_cuda_dmabuf_parameters_support=1],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit have_cuda_dmabuf_parameters_support -> have_cuda_device_dmabuf_support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they are not the same. It's some parameters for dmabuf support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter is removed in the latest revision

configure.ac Outdated
@@ -666,6 +679,10 @@ AS_IF([test x"$with_cuda" != x"no" && test -n "$with_cuda" && test "$have_cuda"

AC_DEFINE_UNQUOTED([HAVE_CUDA], [$have_cuda], [CUDA support])

AS_IF([ test x"$have_cuda" = x"1" && test x"$have_cuda_mem_get_handle_for_address_range" = x"1" && test x"$have_cuda_dmabuf_parameters_support" = x"1" ],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line length

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do it in a different way. This line is removed

@@ -191,6 +191,9 @@ int cuda_dev_reg_copy_from_hmem(uint64_t handle, void *dest, const void *src,
bool cuda_is_ipc_enabled(void);
int cuda_get_ipc_handle_size(size_t *size);
bool cuda_is_gdrcopy_enabled(void);
bool cuda_is_dmabuf_supported(void);
int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why not uint64_t size -> size_t size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because its the type of size in ibv_reg_dmabuf_mr. We just make it consistent

cuda_ret = cuda_ops.cuMemGetHandleForAddressRange(
(void *)fd,
aligned_ptr, aligned_size,
CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to assume that this symbol exists at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this symbol is inside HAVE_CUDA_DMABUF. If the symbol isn't found in configure.ac it won't be compiled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think I need to check 2 symbols, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD and CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED. Currently I only checked 1 of them. I will update the PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

src/hmem_cuda.c Outdated
}

int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
uint64_t *offset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

src/hmem_cuda.c Outdated
* -FI_EIO upon CUDA API error
*/
int cuda_get_dmabuf_fd(void *addr, uint64_t size, int *fd,
uint64_t *offset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@j-xiong
Copy link
Contributor

j-xiong commented Nov 1, 2023

The new version does look cleaner.

@shijin-aws
Copy link
Contributor Author

@sunkuamzn made a good catch that I should calculate the page aligned address and length from the base address of cuda allocation. I updated it in the latest revision

@shijin-aws shijin-aws force-pushed the cuda_dmabuf branch 2 times, most recently from c1b2f4a to de1b7e3 Compare November 2, 2023 00:32
src/hmem_cuda.c Outdated
Comment on lines 57 to 76
#if HAVE_CUDA_DMABUF
#define CUDA_DRIVER_FUNCS_DEF(_) \
_(cuGetErrorName) \
_(cuGetErrorString) \
_(cuPointerGetAttribute) \
_(cuPointerSetAttribute) \
_(cuDeviceCanAccessPeer) \
_(cuMemGetAddressRange) \
_(cuMemGetHandleForAddressRange) \
_(cuDeviceGetAttribute) \
_(cuDeviceGet)
#else
#define CUDA_DRIVER_FUNCS_DEF(_) \
_(cuGetErrorName) \
_(cuGetErrorString) \
_(cuPointerGetAttribute) \
_(cuPointerSetAttribute) \
_(cuDeviceCanAccessPeer) \
_(cuMemGetAddressRange)
#endif /* HAVE_CUDA_DMABUF */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's simpler to conditionally define a dedicated macro for the dmabuf-specific functions and have CUDA_DRIVER_FUNCS_DEF(_) unconditionally inherit them. Otherwise, you need to maintain the function list for both conditions, which defeats the purpose of the macros.

#if HAVE_CUDA_DMABUF
#define CUDA_DRIVER_DMABUF_FUNCS_DEF(_) \
  _(cuMemGetHandleForAddressRange)      \
  _(cuDeviceGetAttribute)               \
  _(cuDeviceGet)
#else
#define CUDA_DRIVER_DMABUF_FUNCS_DEF(_)
#endif

#define CUDA_DRIVER_FUNCS_DEF(_)  \
  _(cuGetErrorName)               \
  /* ... */                       \
  CUDA_DRIVER_DMABUF_FUNCS_DEF(_)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated

Copy link
Contributor Author

@shijin-aws shijin-aws Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved cuDeviceGetAttribute and cuDeviceGet from CUDA_DRIVER_DMABUF_FUNCS_DEF because they are not necessarily used for dmabuf and should be generally available for older cuda versions

src/hmem_cuda.c Outdated
Comment on lines 112 to 116
.use_ipc = false,
.driver_handle = NULL,
.runtime_handle = NULL,
.nvml_handle = NULL
.nvml_handle = NULL,
.dmabuf_supported = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: initialize .dmabuf_supported after .use_ipc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@shijin-aws shijin-aws force-pushed the cuda_dmabuf branch 2 times, most recently from ccebc4a to 875de77 Compare November 2, 2023 04:57
@shijin-aws
Copy link
Contributor Author

bot:aws:retest

Implement the get_dmabuf_fd API for cuda interface.

Signed-off-by: Shi Jin <sjina@amazon.com>
@shijin-aws
Copy link
Contributor Author

@darrylabbate is the new revision looks good to you?

@darrylabbate
Copy link
Member

Sure

@j-xiong j-xiong merged commit ec6bdb6 into ofiwg:main Nov 3, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants