parallel_for equivalent in Ginkgo #1190
-
Hello, we have the following (mathematical) problem: we want to set up an RBF system matrix in order to solve for the coefficients. We also want to accelerate this with CUDA (Ginkgo), not only by solving the linear system on the GPU but also by assembling the kernel matrix there. Our question is whether Ginkgo offers a way (similar to parallel_for in Kokkos) to fill such a matrix on the GPU, i.e., to evaluate each RBF entry on the GPU side without having to write custom CUDA kernels ourselves and pass an array pointer to them. It may be that we just haven't found this feature yet, so please excuse us if this question has already been answered. Thank you and have a great day! Best regards
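To make the request concrete, here is a minimal host-side sketch of the per-entry work we have in mind, assuming a Gaussian RBF and a dense row-major matrix. The names `for_each_entry`, `gaussian_rbf`, and `fill_rbf_matrix` are hypothetical and only for illustration; they are not Ginkgo or Kokkos API. The loop helper runs serially here and merely stands in for whatever parallel_for-style facility would map each (row, col) pair to a GPU thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a parallel_for over a 2D index range.
// It runs serially on the host; a GPU backend would instead assign
// each (row, col) pair to its own thread.
template <typename Body>
void for_each_entry(std::size_t rows, std::size_t cols, Body body)
{
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t col = 0; col < cols; ++col) {
            body(row, col);
        }
    }
}

// Gaussian RBF phi(r) = exp(-(eps * r)^2), chosen only as an example.
inline double gaussian_rbf(double r, double eps)
{
    return std::exp(-(eps * r) * (eps * r));
}

// Fills the dense n x n matrix A(i, j) = phi(||x_i - x_j||) in row-major
// order. `points` stores n points of dimension `dim` contiguously.
std::vector<double> fill_rbf_matrix(const std::vector<double>& points,
                                    std::size_t dim, double eps)
{
    assert(dim > 0 && points.size() % dim == 0);
    const std::size_t n = points.size() / dim;
    std::vector<double> matrix(n * n);
    for_each_entry(n, n, [&](std::size_t row, std::size_t col) {
        // Squared Euclidean distance between point `row` and point `col`.
        double dist2 = 0.0;
        for (std::size_t d = 0; d < dim; ++d) {
            const double diff = points[row * dim + d] - points[col * dim + d];
            dist2 += diff * diff;
        }
        matrix[row * n + col] = gaussian_rbf(std::sqrt(dist2), eps);
    });
    return matrix;
}
```

Each entry depends only on its own row and column index, so the body of the lambda is exactly the part we would like to run on the device without writing a CUDA kernel by hand.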
Replies: 1 comment 11 replies
-
We already have such a framework in place; it is just not exposed to users yet because we are not certain it is final. For an example, see #938, or more specifically stencil_kernel.cpp. I'd be happy to pick this PR up again if it's important to you.