[OpenCL] Implement urEnqueueUSMMemcpy2D and allow large fill patterns.#976

Closed
aarongreig wants to merge 2 commits into oneapi-src:adapters from aarongreig:aaron/cl2DUSMOps

@aarongreig
Contributor

Normally OpenCL limits fill-type operations to a maximum pattern size of 128 bytes; this patch includes a workaround to extend that.
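Judging by the review fragments in this thread, the workaround appears to stage the repeated pattern in host memory and copy it to the device rather than calling the native fill. The following is a minimal sketch of that idea under that assumption, not the code from the patch; `expandPattern`, `needsStagingFill` and `kMaxNativePatternSize` are hypothetical names, and no OpenCL call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// OpenCL's native fill calls accept pattern sizes up to 128 bytes.
constexpr std::size_t kMaxNativePatternSize = 128;

// True when a pattern is too large for the native fill and the
// staging-buffer path would be needed instead.
bool needsStagingFill(std::size_t patternSize) {
  return patternSize > kMaxNativePatternSize;
}

// Hypothetical helper: tile `pattern` across a host staging buffer of
// `size` bytes. An adapter could then copy this buffer to the device.
std::vector<unsigned char> expandPattern(const void *pattern,
                                         std::size_t patternSize,
                                         std::size_t size) {
  // Assumed precondition: the fill size is a multiple of the pattern size.
  assert(patternSize > 0 && size % patternSize == 0);
  std::vector<unsigned char> staging(size);
  for (std::size_t offset = 0; offset < size; offset += patternSize) {
    std::memcpy(staging.data() + offset, pattern, patternSize);
  }
  return staging;
}
```

The staging buffer has to outlive the enqueued copy, which is why the review discussion focuses on when and how it is freed.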

@aarongreig aarongreig requested a review from a team as a code owner October 20, 2023 10:50
auto DeleteCallback = [](cl_event, cl_int, void *pUserData) {
delete[] static_cast<uint64_t *>(pUserData);
};
CL_RETURN_ON_FAILURE(
Contributor

We should call delete[] HostBuffer if this call fails. Otherwise there is a potential memory leak. Probably we need to wait until clEnqueueWriteBuffer is finished to do that.

Contributor Author

Added here and in the USM one.
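The ownership rule discussed above can be sketched as follows: if the enqueue fails, no completion callback will ever run, so the buffer is freed immediately; otherwise ownership passes to the callback. This is an illustration only. `Status`, `stagedWrite`, `PendingCleanup` and the `liveBuffers` counter are hypothetical stand-ins, and the enqueue is injected as a function so the sketch runs without an OpenCL device.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

using Status = int;            // stand-in for cl_int
constexpr Status kSuccess = 0; // stand-in for CL_SUCCESS

int liveBuffers = 0; // instrumentation for this sketch only

uint64_t *allocStaging(std::size_t count) {
  ++liveBuffers;
  return new uint64_t[count];
}

void freeStaging(uint64_t *buffer) {
  delete[] buffer;
  --liveBuffers;
}

// Completion callback: frees the staging buffer it was handed.
void stagingDeleteCallback(void *userData) {
  freeStaging(static_cast<uint64_t *>(userData));
}

// Models a callback registered on the command's event.
struct PendingCleanup {
  void (*callback)(void *) = nullptr;
  void *userData = nullptr;
  void fire() {
    if (callback) callback(userData);
    callback = nullptr;
  }
};

using Enqueue = std::function<Status(const uint64_t *)>;

Status stagedWrite(std::size_t count, const Enqueue &enqueue,
                   PendingCleanup &cleanup) {
  uint64_t *hostBuffer = allocStaging(count);
  Status status = enqueue(hostBuffer);
  if (status != kSuccess) {
    // The command was never queued, so no callback will run: free now.
    freeStaging(hostBuffer);
    return status;
  }
  // The command is in flight: hand ownership to the completion callback.
  cleanup.callback = stagingDeleteCallback;
  cleanup.userData = hostBuffer;
  return kSuccess;
}
```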

auto Info = new DeleteCallbackInfo{USMFree, CLContext, HostBuffer};

auto DeleteCallback = [](cl_event, cl_int, void *pUserData) {
static_cast<DeleteCallbackInfo *>(pUserData)->execute();
Contributor

While this works, wouldn't it be more intuitive to specify a destructor in DeleteCallbackInfo and then call delete on the object (instead of using delete this)?

Contributor Author

Agreed, this looks a lot nicer.

cl_context CLContext;
void *HostBuffer;
void execute() {
USMFree(CLContext, HostBuffer);
Contributor

I'm a bit concerned that CLContext is required to call USMFree. What will happen if the user calls urContextRelease after this function returns?

Did you try to use clEnqueueMemcpyINTEL() with a host pointer allocated with new? That would avoid having to use HostMemAlloc / USMFree. I'm under the impression that it might work, but I'm not 100% sure.

Contributor Author

Since I was adding a destructor anyway, I gave the struct a proper constructor which does a retain on the context (with a matching release in the destructor); that should prevent the released-context scenario. Using a new pointer would probably work, but it seems sketchier.
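A minimal sketch of the shape described here, assuming a constructor that retains the context and a destructor that frees the buffer and then releases it. `FakeContext`, `retainContext`, `releaseContext`, `fakeUSMFree` and `onEventComplete` are hypothetical stand-ins for the OpenCL context, its retain/release calls, the USM free entry point and the event callback, so the sketch runs without OpenCL; the real struct in the patch may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted cl_context.
struct FakeContext {
  int refCount = 1;
  int freedBuffers = 0;
};
void retainContext(FakeContext *ctx) { ++ctx->refCount; }
void releaseContext(FakeContext *ctx) { --ctx->refCount; }

// Stand-in for the USM free entry point, which needs a live context.
using FreeFn = void (*)(FakeContext *, void *);

void fakeUSMFree(FakeContext *ctx, void *ptr) {
  assert(ctx->refCount > 0); // freeing through a released context is a bug
  ++ctx->freedBuffers;
  (void)ptr;
}

struct DeleteCallbackInfo {
  DeleteCallbackInfo(FreeFn usmFree, FakeContext *ctx, void *hostBuffer)
      : USMFree(usmFree), Context(ctx), HostBuffer(hostBuffer) {
    retainContext(Context); // keep the context alive until cleanup
  }
  ~DeleteCallbackInfo() {
    USMFree(Context, HostBuffer);
    releaseContext(Context); // matches the retain in the constructor
  }
  DeleteCallbackInfo(const DeleteCallbackInfo &) = delete;
  DeleteCallbackInfo &operator=(const DeleteCallbackInfo &) = delete;

  FreeFn USMFree;
  FakeContext *Context;
  void *HostBuffer;
};

// Same role as the event callback: ownership ends with a plain delete,
// so no `delete this` is needed inside the struct.
void onEventComplete(void *userData) {
  delete static_cast<DeleteCallbackInfo *>(userData);
}
```

With this layout the user can drop their own context reference before the command finishes, and the free in the destructor still sees a live context.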

Events.data(), cl_adapter::cast<cl_event *>(phEvent));
}
for (const auto &E : Events) {
clReleaseEvent(E);
Contributor

Probably should add CL_RETURN_ON_FAILURE to this call as well. Same for line 441.

Contributor Author

Added here. I'm going to leave line 441 as it is, to prioritise returning the error from the memcpy.
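One way to express that priority in a single helper is sketched below. It is an alternative illustration, not what the patch does: it releases every event, remembers the first release failure, and still returns an earlier copy error in preference. `Status`, `releaseEvent`, `releaseAll` and `kReleaseFailed` are hypothetical names, and events are modelled as plain integers.

```cpp
#include <vector>

using Status = int;                    // stand-in for cl_int
constexpr Status kSuccess = 0;         // stand-in for CL_SUCCESS
constexpr Status kReleaseFailed = -1;  // hypothetical error code

// Hypothetical stand-in for clReleaseEvent: negative handles fail.
Status releaseEvent(int event) {
  return event < 0 ? kReleaseFailed : kSuccess;
}

// Release every event and report the first release failure, but let an
// error from the copy itself take priority over any release error.
Status releaseAll(const std::vector<int> &events, Status copyStatus) {
  Status releaseStatus = kSuccess;
  for (int event : events) {
    Status status = releaseEvent(event);
    if (releaseStatus == kSuccess) {
      releaseStatus = status; // keep only the first failure
    }
  }
  return copyStatus != kSuccess ? copyStatus : releaseStatus;
}
```

Unlike an early return from inside the loop, this variant never skips releasing the remaining events.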

Contributor

@fabiomestre fabiomestre left a comment

LGTM 👍

@aarongreig aarongreig added the conformance Conformance test suite issues. label Nov 6, 2023
@aarongreig aarongreig closed this Nov 6, 2023