does openblas on armv7 support neon simd extension speedup? #1483

Closed
wl3b10s opened this issue Mar 9, 2018 · 5 comments

Comments

@wl3b10s

wl3b10s commented Mar 9, 2018

Does OpenBLAS on ARMv7 use the NEON SIMD extension for speedup?

If not, will it be supported in the future?

Or is there a similar BLAS library that supports NEON on ARMv7?

Thanks.

@martin-frbg
Collaborator

As far as I know, the ARMv7 implementation (mainly) uses VFP instructions in its inline assembly, due to fundamental limitations in the ARMv7 implementation of NEON. ARMv8 uses NEON/ASIMD.

@wl3b10s
Author

wl3b10s commented Mar 12, 2018

@martin-frbg Do you mean that NEON on ARMv7 does not deliver high performance, or that most ARMv7 processors do not support the NEON extension?

@martin-frbg
Collaborator

See the comments by wernsaar in #562 (softfp support was added recently, but nothing else has changed since then as far as I know).

@martin-frbg
Collaborator

So while I cannot find anything supporting the original claim that NEON was deprecated on ARMv7, it does still seem to be limited to single precision (where it can process four float values in parallel).
A more general improvement would therefore probably come from using the FMA instructions provided by VFPv4 (as suggested earlier in #1127).
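
To make the distinction above concrete, here is a minimal sketch of an AXPY-style loop (`y += alpha * x`). It is not OpenBLAS code: the function names `saxpy_sketch` and `daxpy_sketch` are hypothetical, and the kernels in the library are hand-written assembly. The single-precision version uses NEON intrinsics to handle four floats per iteration when the compiler targets NEON, and falls back to scalar code otherwise. The double-precision version stays scalar, since ARMv7 NEON has no double-precision lanes; on a VFPv4 target the compiler may contract the multiply-add into a fused instruction, depending on flags such as `-ffp-contract`.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

/* Hypothetical helper: y[i] += alpha * x[i] in single precision. */
void saxpy_sketch(size_t n, float alpha, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
#if SKETCH_HAVE_NEON
    /* Process four floats per iteration in a 128-bit NEON register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);
        float32x4_t vy = vld1q_f32(y + i);
        vy = vmlaq_n_f32(vy, vx, alpha);  /* vy + vx * alpha, per lane */
        vst1q_f32(y + i, vy);
    }
#endif
    /* Scalar tail, or the whole loop when NEON is not available. */
    for (; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Hypothetical helper: y[i] += alpha * x[i] in double precision.
 * ARMv7 NEON cannot vectorize doubles, so this stays scalar; on a
 * VFPv4 target the compiler may emit a fused multiply-add here. */
void daxpy_sketch(size_t n, double alpha, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

Calling `saxpy_sketch(6, 2.0f, x, y)` exercises both the four-wide path (on a NEON build) and the two-element scalar tail.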

@sandwichmaker

My understanding is that NEON is an optional extension on ARMv7, so it has to be enabled explicitly on the compiler command line.

https://community.arm.com/tools/b/blog/posts/arm-cortex-a-processors-and-gcc-command-lines
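
Because NEON is optional on ARMv7, one way to check what a given build actually targets is to inspect the ACLE predefined macros. The sketch below is only an illustration: the function name `describe_arm_simd` is hypothetical, and the result reflects the compiler flags used for the build (for example GCC's `-march=armv7-a -mfpu=neon-vfpv4 -mfloat-abi=hard`), not what the CPU reports at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a one-line summary of the ARM features
 * the compiler was told to target, e.g. "arm_arch=7 neon=1 fma=1".
 * Returns the value of snprintf (characters that would be written). */
int describe_arm_simd(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

#if defined(__ARM_ARCH)
    int arch = __ARM_ARCH;      /* 7 for ARMv7, 8 for ARMv8 */
#else
    int arch = 0;               /* not built for an ARM target */
#endif

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int neon = 1;               /* NEON/Advanced SIMD enabled */
#else
    int neon = 0;
#endif

#if defined(__ARM_FEATURE_FMA)
    int fma = 1;                /* hardware fused multiply-add (e.g. VFPv4) */
#else
    int fma = 0;
#endif

    return snprintf(buf, len, "arm_arch=%d neon=%d fma=%d", arch, neon, fma);
}
```

On a non-ARM host this reports `arm_arch=0 neon=0`, which makes it easy to confirm that a cross-compile picked up the intended `-mfpu` setting.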
