[SYCL] [L0] Add user APIs to import / release host memory from USM. #9294
CUDA and HIP LGTM. However, I wonder if this has to be a oneapi-wide extension, would it not be better suited as
Although the implementation has been fleshed out for L0, the intent is to allow other platforms to do what is necessary to accelerate host<->device transfers within the range of the prepare / release.
sycl/doc/extensions/experimental/sycl_ext_oneapi_usm_prepare_release_device_copy.asciidoc
sycl/plugins/cuda/pi_cuda.cpp
return result;
}

pi_result piextUSMImport(void *ptr, size_t size, pi_context context) {
We have something similar in CUDA: cudaHostRegister and cudaHostUnregister. Should we expand the definition of the extension to cover the functionality offered by all backends?
Moreover, the L0 APIs are driver-experimental extensions, and when merged into the spec, they would likely be much different. I don't think that we should define a SYCL extension until the L0 interfaces are at least defined in the L0 spec.
It is not quite equivalent to the CUDA cudaHostRegister APIs. Those make the address range accessible from host and device and, in addition, pin the host memory to physical memory. We should add "pinning" as a separate extension when it is supported by L0.
Thanks @rdeodhar. Now that we are moving to the UR, all PI interfaces should be designed taking into account all interfaces. Or maybe something like UR extensions would help here. I will bring this up in the UR WGs.
> Moreover, the L0 APIs are driver-experimental extensions, and when merged into the spec, they would likely be much different. I don't think that we should define a SYCL extension until the L0 interfaces are at least defined in the L0 spec.
Note that this is also an experimental SYCL extension. We can replace it with a more general API later if the L0 API changes.
@gmlueck : Even if this is an experimental extension, we need UR interfaces for it.
I agree. I was only responding to your comment that I quoted, where you seemed to be implying that we should not have SYCL APIs that expose experimental L0 features.
@rdeodhar Documentation on how to support experimental features is still being written by @kbenzie, but the draft PR is here: oneapi-src/unified-runtime#546
As this is experimental the UR experimental feature can look identical to the PI feature if that is the quickest way to keep this PR moving.
You can also ask us (@alycm, @jandres742, @kbenzie) any questions and we'll help you out.
Failure in post-commit: build failed on Linux and macOS.
This change adds a SYCL interface to the Level Zero APIs zexDriverImportExternalPointer and zexDriverReleaseImportedPointer. These functions are used for importing host memory into USM for the duration of data transfer to increase bandwidth.
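The import-for-the-duration-of-a-transfer pattern described above can be sketched as a scope guard. This is only an illustration under stated assumptions: `Context`, `import_host_range`, `release_host_range`, and `ScopedImport` are hypothetical stand-ins that merely record which host ranges are imported. They are not the SYCL extension API, the PI entry points, or the Level Zero functions, and no real driver is called.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a runtime context: it only tracks which host
// ranges are currently imported (base pointer -> size in bytes).
struct Context {
  std::map<const void *, std::size_t> imported;
};

// Hypothetical import hook. A real backend would hand the range to the
// driver here. Returns false for an empty range or a repeated import.
inline bool import_host_range(Context &ctx, const void *ptr, std::size_t size) {
  if (ptr == nullptr || size == 0)
    return false;
  return ctx.imported.emplace(ptr, size).second;
}

// Hypothetical release hook. Returns true if the range had been imported.
inline bool release_host_range(Context &ctx, const void *ptr) {
  return ctx.imported.erase(ptr) == 1;
}

// Scope guard: imports on construction and releases on destruction, so the
// range stays imported only while the copies inside the scope run.
class ScopedImport {
public:
  ScopedImport(Context &ctx, const void *ptr, std::size_t size)
      : ctx_(ctx), ptr_(ptr), active_(import_host_range(ctx, ptr, size)) {}
  ~ScopedImport() {
    if (active_)
      release_host_range(ctx_, ptr_);
  }
  ScopedImport(const ScopedImport &) = delete;
  ScopedImport &operator=(const ScopedImport &) = delete;
  bool active() const { return active_; }

private:
  Context &ctx_;
  const void *ptr_;
  bool active_;
};

// Returns {ranges imported while the guard is alive,
//          ranges imported after the guard is destroyed}.
inline std::pair<std::size_t, std::size_t> demo() {
  Context ctx;
  std::vector<float> host(1024, 1.0f);
  std::size_t during = 0;
  {
    ScopedImport guard(ctx, host.data(), host.size() * sizeof(float));
    during = ctx.imported.size();
    // Host<->device copies that should benefit from the import go here.
  }
  return {during, ctx.imported.size()};
}

} // namespace sketch
```

Tying the release to scope exit keeps the import limited to the transfer window, which matches the stated intent of the prepare / release pair. A real implementation would also need to decide what happens when the import fails, which this sketch only reports through `active()`.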