-
Notifications
You must be signed in to change notification settings - Fork 124
[UR] Proposal of new API for memory object properties #571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UR] Proposal of new API for memory object properties #571
Conversation
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These files are automatically generated from a YAML description of the API. Please read the Contributing Guide for infomation about how to go about making changes.
This reverts commit a5c6888.
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
Thanks for pointing to that, updated. |
kbenzie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our script has hidden an error, I'm working on a fix for this bug.
In the file scripts/core/PROG.rst in the Kernels section, the example code should be updated to match the entry point as defined in the spec.
#580 has fixed this bug. Please rebase and rerun the |
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
done! |
| - $X_RESULT_ERROR_INVALID_KERNEL_ARGUMENT_INDEX | ||
| --- #-------------------------------------------------------------------------- | ||
| type: struct | ||
| desc: "Properties for for $xKernelSetArgMemObj." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L0 doesnt set these memory properties per kernel, but per object. Do we have that functionality in CUDA or other adapters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jandres742 AFAIK L0 plugin is intended to use access mode to get a proper memory handle (per object) to set it as exact kernel argument (per kernel). Other plugins now just ignore this property.
What L0 API you mentioned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The closest L0 has is:
https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=mostly#_CPPv418ze_memory_advice_t
but not WRITE_ONLY, READ_ONLY, READ_WRITE.
AFAIK L0 plugin is intended to use access mode to get a proper memory handle (per object) to set it as exact kernel argument (per kernel).
could you elaborate on that? How the L0 plugin gets the memory handle based on access mode?
enumerator ZE_MEMORY_ADVICE_SET_READ_MOSTLY
hint that memory will be read from frequently and written to rarely
enumerator ZE_MEMORY_ADVICE_CLEAR_READ_MOSTLY
removes the affect of [ZE_MEMORY_ADVICE_SET_READ_MOSTLY](https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=mostly#ze__api_8h_1a54593bb72c476f6c3ac90909ae126cdfaf2a7ce987fb033859d05bc5dd812b1f5)
enumerator ZE_MEMORY_ADVICE_SET_PREFERRED_LOCATION
hint that the preferred memory location is the specified device
enumerator ZE_MEMORY_ADVICE_CLEAR_PREFERRED_LOCATION
removes the affect of [ZE_MEMORY_ADVICE_SET_PREFERRED_LOCATION](https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=mostly#ze__api_8h_1a54593bb72c476f6c3ac90909ae126cdfaf56067d28d73977bb30cc6a35d89520c)
enumerator ZE_MEMORY_ADVICE_SET_NON_ATOMIC_MOSTLY
hints that memory will mostly be accessed non-atomically
enumerator ZE_MEMORY_ADVICE_CLEAR_NON_ATOMIC_MOSTLY
removes the affect of [ZE_MEMORY_ADVICE_SET_NON_ATOMIC_MOSTLY](https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=mostly#ze__api_8h_1a54593bb72c476f6c3ac90909ae126cdfa62a34e1d7d43897cd3d9318474ba88fd)
enumerator ZE_MEMORY_ADVICE_BIAS_CACHED
hints that memory should be cached
enumerator ZE_MEMORY_ADVICE_BIAS_UNCACHED
hints that memory should be not be cached
enumerator ZE_MEMORY_ADVICE_FORCE_UINT32[¶](https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=mostly#_CPPv4N18ze_memory_advice_t29ZE_MEMORY_ADVICE_FORCE_UINT32E)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarified with @smaslov-intel and yes, we are actually already implementing this in L0 adapter.
@kbenzie : about the advantages: if the argument is only R, we avoid having extra copies needed to update the argument if it were RW or W, so there's actually advantages on doing this. @smaslov-intel would correct me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay cool. It also feels like other adapters could potentially also make use of this information if/when they support multiple devices per context.
|
Before we actually make this change, does this actaully produce a measurable performance improvement? |
kbenzie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the API change is looking fine. There is still an open question about the benefits of this change.
lets wait for @smaslov-intel who submitted this request to SYCL RT. |
I had requested this to optimize SYCL buffer accesses in context with multiple devices. Currently we have to assume that buffer accesses from kernel are always read-write https://github.com/intel/llvm/blob/3024161f1b7b213dc0c5e6c0e879acb0fb83205f/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero_kernel.cpp#L683, which invalidates memory representing the buffer on different devices, resulting in redundant copies. They can take noticeable time for large buffers, but I don't have any measured benefits. |
|
@kbenzie, @jandres742, are you ok to proceed with the new API based on the explanation above? |
@smaslov-intel : I'm not following how the adapters (at least L0) would provide support for this. Could you elaborate? is there something we need to add to L0? does the support exist already on OCL or CUDA? |
@jandres742 : The link that I provided above is how we (start) handling a buffer migration (it is inside L0 adapter/plugin), Feel free to contact me for details |
veselypeta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT. But can you ensure that all the kernel tests still compile.
|
@kbenzie Hi, I am trying to rebase changes to solve merge conflicts and script for spec generation silently fails now. It also fails on main branch without my changes. Is it something known? Could you advise how to solve it? |
I just checkout out your branch and successfully built it locally, so I think its a local configuration hiccup you have. My hunch is that your Python virtual environment was not enabled when |
I am able to build my branch locally too. The problem for me is with main. So I guess there is a change in configuration after code base I use for branch. |
Oh, I missed that. I have just checked out Is your local |
yes, I updated it today. |
I'm at a bit of a loss then as I'm unable to reproduce this locally. My last idea would be to try configuring a new build directory on the |
|
I've had a similar problem in the past. I was missing some sphinx dependency, and solved it with We don't check in the generated docs, so running sphinx is not strictly necessary. It's just useful to make sure that the docs are still being built correctly. If you continue to have issues with your setup, you could also just disable this by adding |
will try it, thank you! |
it works, thanks a lot! |
|
tests are updated, rebased/merged with main. Ready to final review and merge. |
aarongreig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
kbenzie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last tweak to naming, I think this is ready to go after that change though.
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
scripts/core/common.yml
Outdated
| desc: $x_exp_command_buffer_desc_t | ||
| - name: EXP_SAMPLER_MIP_PROPERTIES | ||
| desc: $x_exp_sampler_mip_properties_t | ||
| - name: MEM_OBJ_PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to fix the build
| - name: MEM_OBJ_PROPERTIES | |
| - name: KERNEL_ARG_MEM_OBJ_PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks bee29ae
No description provided.