GPU support #2

Open
kazuyukitanimura opened this issue Mar 22, 2011 · 0 comments

  1. Explicit data transfer between main memory and GPU memory
    This is a mistake I made when I started this project. I believed that code that works on the CPU would work exactly the same way on the GPU (I thought that was the claim of OpenCL). In fact, that is misleading. In narray.c, there is this line:
    ary->buffer = clCreateBuffer(context, CL_MEM_READ_WRITE|CL_MEM_USE_HOST_PTR, na_sizeof[type]*total, ary->ptr, NULL);
    The specification does not explain this clearly, but CL_MEM_USE_HOST_PTR works fine for CPU main memory and not for GPU memory (the newest version of OpenCL might be different). For GPU memory, one has to explicitly transfer the data from main memory to GPU memory and keep track of the pointers. To do this, I might have to rewrite all the code that I added to the original NArray, since I kept using the main memory pointers :P
  2. Overhead
    The data transfer overhead is quite expensive. In normal OpenCL programming, one can keep reusable data in GPU memory; however, the Ruby interpreter (or virtual machine) runs in main memory, so all the data in GPU memory has to be sent back to main memory after each OpenCL NArray method is executed. This causes a lot of data traffic between the two memories and makes it difficult to get a performance gain.
  3. Applications
    Applications that use this project get a nice performance gain only if they need to compute on large NArray instances (since more data can be processed in parallel). However, considering the large overhead mentioned above, I don't know of any application that needs to compute on NArray instances that large. I will get excited and more motivated once I know that there are specific applications that need OpenCL NArray.