Many thanks to the developers of this amazing package! I thought I would submit a list of gotchas and issues I ran into trying to get my kernel to work. Most of these are known, and I will try to provide links where appropriate.
I realize this may be more appropriate in the documentation, but unfortunately it is a very fluid situation and I couldn't figure out where I should contribute my usage notes.
The power operator ^ does not work in kernel code. Use CUDAnative.pow() instead.
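A rough sketch of what I mean (the kernel and array names are mine, not from the package), assuming CUDAnative.pow accepts two Float32 arguments:

```julia
using CUDAnative, CuArrays

function square_kernel(out, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(out)
        # out[i] = x[i]^2 would not compile in device code; use the intrinsic instead
        out[i] = CUDAnative.pow(x[i], 2.0f0)
    end
    return nothing
end

x = CuArray(rand(Float32, 1024))
out = similar(x)
@cuda threads=256 blocks=4 square_kernel(out, x)
```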
Ordinary array constructors do not work inside kernels, e.g. [1, 2, 3], Array([1, 2, 3]), etc.
In theory you could use StaticArrays.jl (MArray, MVector, etc.). Unfortunately a bug may cause intermittent memory faults; see JuliaGPU/CUDAnative.jl#340 and JuliaGPU/CuArrays.jl#278. I hope to see this resolved soon, though the current #master branch of this package does not compile on my Linux Mint 19 machine.
What does work for small arrays is tuples, e.g. (arr[1,2], arr[2,2], arr[3,2]) instead of arr[:,2]. I'm also seeing MUCH better performance with tuples than with MVectors for this sort of thing.
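A rough sketch of the pattern (hypothetical kernel/array names), reading a small fixed-size column as a tuple of scalar loads rather than a slice:

```julia
function col_norm_kernel(out, arr)
    j = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if j <= size(arr, 2)
        # arr[:, j] would not compile in device code; scalar reads into a tuple do
        col = (arr[1, j], arr[2, j], arr[3, j])
        out[j] = CUDAnative.sqrt(col[1]*col[1] + col[2]*col[2] + col[3]*col[3])
    end
    return nothing
end
```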
reshape() does not work, and you cannot do array slicing in a kernel. However, you can use view(): instead of arr[:,j], write view(arr, :, j).
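For example (again with made-up names), summing a column through a view instead of a slice:

```julia
function col_sum_kernel(out, arr)
    j = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if j <= size(arr, 2)
        col = view(arr, :, j)   # arr[:, j] would fail; a view is just an index wrapper
        s = 0.0f0
        for i in 1:size(col, 1)
            s += col[i]
        end
        out[j] = s
    end
    return nothing
end
```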
Instead of sqrt(), use CUDAnative.sqrt().
There is no real support for complex exponentials, so instead of exp(1im * theta), use CUDAnative.cos(theta) + 1im * CUDAnative.sin(theta).
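Putting the two intrinsics points together, a rough sketch (hypothetical names; out is assumed to be a CuArray{ComplexF32}):

```julia
function phase_kernel(out, theta)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(out)
        t = theta[i]
        # instead of exp(1im * t), build the value from the sin/cos device intrinsics
        out[i] = ComplexF32(CUDAnative.cos(t), CUDAnative.sin(t))
    end
    return nothing
end
```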
Putting @device_code_warntype in front of your @cuda call is quite useful for ironing out type instabilities, which must be quashed in your kernel code.
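For example, using the hypothetical kernel from the sketch above:

```julia
@device_code_warntype @cuda threads=256 blocks=4 phase_kernel(out, theta)
```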
Stack traces for kernel code that does not compile are often not quite right. I think this will be resolved shortly; see JuliaGPU/CUDAnative.jl#306.
Since the affordable Nvidia GPUs have nerfed double-precision support, you generally want Float32 for your floating-point arrays. Remember to actually create Float32 arrays and avoid mixing Float64 with Float32; the same applies to complex types. You can also force 32-bit floating-point constants with the 1.0f0 syntax, which avoids a bunch of spurious type instabilities.
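A rough sketch of what I mean (hypothetical names), keeping everything Float32 so no Float64 sneaks in:

```julia
x = CuArray(rand(Float32, 1024))   # explicitly Float32; plain rand(1024) would be Float64
out = similar(x)

function scale_kernel(out, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(x)
        # 0.5 would be a Float64 literal and promote the result; 0.5f0 stays Float32
        out[i] = 0.5f0 * x[i] + 1.0f0
    end
    return nothing
end

@cuda threads=256 blocks=4 scale_kernel(out, x)
```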