AVX ? #24
Comments
Good idea!
@mborgerding How about supporting SIMD via ISPC (https://github.com/ispc/ispc)? ISPC is an open-source compiler supported by Intel that can generate SSE, AVX, or NEON code from a single copy of the source.
I would second that option. ISPC is great, and we use it internally for some SIMD-friendly parts of our code. It lets you write the code once and have it scale to the host CPU automatically (no need to rewrite the code for each new instruction set to be supported).
That's the first I've heard of ISPC. At first glance it looks promising!
I've been looking at the existing forks of kissfft on GitHub and found this commit implementing AVX
@mborgerding I've added initial ISPC support (not optimized, I just ported some key functions to ISPC) and got some interesting benchmark data on my MacBook Pro 2017: ======timing test (type=double)
Interesting indeed! Can you push your fork?
Sure, let me clean up my code first. Sorry about the data above: I double-checked the benchmark data for the non-ISPC version, and the difference should not be attributed to ISPC. Let me try and see whether any ISPC optimization helps.
Hi,
considering the great improvement from the SIMD variant, what about an AVX version that would process the lines 8 at a time? Do you see that as doable?
Actually, I've just compared KissFFT/SIMD against the AVX version of MuFFT on a 1024x1024 2D grid, and both currently take more or less the same amount of time. Hence my feeling that an AVX version of KissFFT would outperform the MuFFT AVX implementation (or even AVX-512)...
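To make the "8 lines at a time" idea concrete, here is a rough sketch of what I have in mind, not actual KissFFT code. It is plain portable C rather than AVX intrinsics: each complex value holds one sample from each of 8 independent lines, and a radix-2 butterfly is applied lane by lane with a shared twiddle factor. The names `cpx8`, `LANES`, and `butterfly8` are made up for illustration. The assumption is that an AVX build would map each 8-float array onto one 256-bit register, or that the compiler could vectorize the lane loop.

```c
#include <stddef.h>

#define LANES 8  /* 8 single-precision floats fill one 256-bit register */

/* Hypothetical type: one complex sample from each of 8 independent lines,
 * stored as separate real and imaginary lane arrays. */
typedef struct {
    float r[LANES];
    float i[LANES];
} cpx8;

/* Radix-2 butterfly applied to 8 lines at once, in place:
 *   a' = a + w*b
 *   b' = a - w*b
 * The twiddle w = (wr, wi) is the same for every lane, because all 8 lines
 * are at the same position in their own transform. */
static void butterfly8(cpx8 *a, cpx8 *b, float wr, float wi)
{
    for (size_t k = 0; k < LANES; ++k) {
        /* t = w * b for lane k (complex multiply) */
        float tr = b->r[k] * wr - b->i[k] * wi;
        float ti = b->r[k] * wi + b->i[k] * wr;

        b->r[k] = a->r[k] - tr;
        b->i[k] = a->i[k] - ti;
        a->r[k] += tr;
        a->i[k] += ti;
    }
}
```

For a 2D grid, the caller would load the same column index from 8 consecutive rows into one `cpx8`, so every butterfly advances 8 row transforms together.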