How to provide both C and OpenCL filter implementations #55

Open
tfarago opened this issue Dec 6, 2014 · 0 comments

tfarago commented Dec 6, 2014

I would like to implement a filter on the GPU (namely transpose), but a C implementation already exists. I thought I could make C vs. OpenCL a property, but there is the UFO_TASK_MODE_GPU flag, which I see on many OpenCL-based filters. Does it just mean that the filter won't be executed on OpenCL CPU devices? If so, I can simply not set the flag and should be good to go. Anyway, @matze, can you tell me what the "best practices" are here?

Sure, one could say that if the filter can run on any device, I could implement it only in OpenCL, since it can run on the CPU as well. But then there is unnecessary host <-> device transfer overhead. Say I have a read -> transpose -> write pipeline: there I would of course use the pure C version, because otherwise I would need to transfer the data "up" and "down" again. And if we assume that a transpose is almost as fast as copying the data, running it on an OpenCL CPU device is pretty pointless. If I remember correctly, we could "map" the buffer or use CL_MEM_USE_HOST_PTR and avoid transferring back and forth, but I don't know whether the framework supports that already.
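To illustrate why a transpose costs about as much as a copy, here is a minimal sketch of a host-side transpose in plain C. The function name `transpose_host` and the row-major `float` layout are assumptions for illustration, not the existing filter's code. Each element is loaded once and stored once, so any extra host <-> device transfer would roughly double or triple the memory traffic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the framework's implementation.
 * Transposes a row-major matrix with `height` rows and `width` columns
 * from `src` into `dst`, which ends up with `width` rows and `height`
 * columns. The two buffers must not overlap. */
void transpose_host(const float *src, float *dst, size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            /* One load and one store per element, like a plain copy. */
            dst[x * height + y] = src[y * width + x];
        }
    }
}
```

Calling `transpose_host(src, dst, 3, 2)` on a 2 x 3 input fills `dst` with the 3 x 2 result.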

For optimizing graph execution, it might be interesting to have only one task, with a property or something similar that users can set to make their graph efficient. This property could be available to the scheduler as well, which could then even optimize the graph for the user automatically (minimize data copying).
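A rough sketch of how such a choice could be made automatically, assuming a simplified model in which data lives either in host memory or in device memory and every change of location costs one copy. All names here (`Location`, `Backend`, `count_transfers`, `choose_backend`) are hypothetical and not part of the framework.

```c
#include <assert.h>

/* Hypothetical types for illustration only. */
typedef enum { LOC_HOST, LOC_DEVICE } Location;
typedef enum { BACKEND_C, BACKEND_OPENCL } Backend;

/* Count the host <-> device copies needed when the input lives at `in`,
 * the task runs with `backend`, and the consumer expects data at `out`. */
int count_transfers(Location in, Backend backend, Location out)
{
    assert(backend == BACKEND_C || backend == BACKEND_OPENCL);

    /* Assumption: the C backend works on host memory, OpenCL on device memory. */
    Location work = (backend == BACKEND_C) ? LOC_HOST : LOC_DEVICE;
    return (in != work) + (work != out);
}

/* Pick the backend with fewer copies; ties go to the C implementation. */
Backend choose_backend(Location in, Location out)
{
    int c_copies = count_transfers(in, BACKEND_C, out);
    int cl_copies = count_transfers(in, BACKEND_OPENCL, out);
    return (cl_copies < c_copies) ? BACKEND_OPENCL : BACKEND_C;
}
```

For the read -> transpose -> write pipeline, both neighbours are on the host, so this model picks the C version with zero copies instead of the OpenCL version with two.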
