[OpenCL] Implement urEnqueueUSMMemcpy2D and allow large fill patterns.#976
[OpenCL] Implement urEnqueueUSMMemcpy2D and allow large fill patterns.#976aarongreig wants to merge 2 commits intooneapi-src:adaptersfrom
Conversation
source/adapters/opencl/enqueue.cpp
Outdated
| auto DeleteCallback = [](cl_event, cl_int, void *pUserData) { | ||
| delete[] static_cast<uint64_t *>(pUserData); | ||
| }; | ||
| CL_RETURN_ON_FAILURE( |
There was a problem hiding this comment.
We should call delete[] HostBuffer if this call fails. Otherwise there is a potential memory leak. Probably we need to wait until clEnqueueWriteBuffer is finished to do that.
There was a problem hiding this comment.
added here and in the usm one
source/adapters/opencl/usm.cpp
Outdated
| auto Info = new DeleteCallbackInfo{USMFree, CLContext, HostBuffer}; | ||
|
|
||
| auto DeleteCallback = [](cl_event, cl_int, void *pUserData) { | ||
| static_cast<DeleteCallbackInfo *>(pUserData)->execute(); |
There was a problem hiding this comment.
While this works, wouldn't it be more intuitive to specify a destructor in DeleteCallbackInfo and then calling delete on the object (instead of using delete this)
There was a problem hiding this comment.
agreed, this looks a lot nicer
source/adapters/opencl/usm.cpp
Outdated
| cl_context CLContext; | ||
| void *HostBuffer; | ||
| void execute() { | ||
| USMFree(CLContext, HostBuffer); |
There was a problem hiding this comment.
I'm a bit concerned with the fact that CLContext is required to call USMFree. What will happen if the user calls UrContextRelease after this function returns?
Did you try to use clEnqueueMemcpyINTEL() with a host pointer allocated with new? That would avoid having to use HostMemAlloc / USMFree. I'm under the impression that that might work but not 100%.
There was a problem hiding this comment.
Since I was adding a destructor anyway I gave the struct a proper constructor with does a retain on the context (with matching Release in the destructor), should prevent the released context scenario. Using a new pointer would probably work but it seems sketchier.
source/adapters/opencl/usm.cpp
Outdated
| Events.data(), cl_adapter::cast<cl_event *>(phEvent)); | ||
| } | ||
| for (const auto &E : Events) { | ||
| clReleaseEvent(E); |
There was a problem hiding this comment.
Probably should add CL_RETURN_ON_ERROR to this call as well. Same for line 441
There was a problem hiding this comment.
Added here, I'm going to leave 441 as it is to prioritise returning the error from the memcpy
Normally OpenCL limits fill type operations to a max pattern size of 128, this patch includes a workaround to extend that.
e3ee65a to
603dcfb
Compare
Normally OpenCL limits fill type operations to a max pattern size of 128, this patch includes a workaround to extend that.