I like to compare CUDA and CPU results. NVIDIA's NVVM defaults to fma=1 behind the scenes. I can control this with PyCUDA, but not with Numba's cuda.jit. The following exposes the fma option. Note: there is a blank line at the end of the diff. It is a joy to see the cuda.jit results match the CPU.
Exposing the fma option:
Use case: https://github.com/marioroy/mandelbrot-python
Disabling fma allows me to witness cuda.jit match the CPU results. E.g. launch `mandel_kernel.py` and press the letter `x`. That will auto zoom to location 2. The total of the RGB values then matches the CPU's.
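For anyone wondering why fma changes the output at all, here is a minimal CPU-only sketch. It does not use Numba or CUDA. It only emulates the single rounding of a fused multiply-add with exact rational arithmetic, and the helper names `fma_emulated` and `mul_then_add` are hypothetical, made up for this illustration.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: form a*b + c exactly, then round once."""
    # Fraction(float) is exact, so the only rounding happens in float(...).
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def mul_then_add(a: float, b: float, c: float) -> float:
    """Plain floating point: the product is rounded, then the sum is rounded."""
    return a * b + c


# Inputs chosen so the exact product (1 - 2**-60) is not representable
# as a double and rounds to 1.0 when the multiply is done separately.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate = mul_then_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
fused = fma_emulated(a, b, c)     # keeps the -2**-60 term

print("separate multiply and add:", separate)
print("emulated fused multiply-add:", fused)
print("results differ:", separate != fused)
```

A difference this small can grow over many iterations of an escape-time loop, which would explain why pixel totals only line up with the CPU once fma is off on the GPU side.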