Detect AVX2 support at runtime #71
Comments
@jamesjer - is it hard to have 2 flavours of the package on Fedora, one with AVX2 support, and one without? And when the package gets installed on a box,
We have had some discussions about doing that on the Fedora mailing list. So far the idea has failed to gain traction. It is technically doable, but the distribution does not currently support that approach.
I thought something like this is done for openblas, but perhaps I'm mixing this up.
I took a look at the openblas spec file. I don't see anything of the sort happening. In fact, I took a peek inside the openblas 0.3.26 tarball, and they're doing runtime CPU detection with cpuid, just as I'm proposing here. :-) Is that approach interesting at all?
I heard that SIMDe is a viable approach to emulate AVX etc if these are not available. Although I've no idea how this can play out on the level of binaries.
Linux distributions must build for the lowest common denominator CPU. For the Fedora Linux distribution, the original x86_64 is still supported, meaning we cannot build msolve with AVX2 support. Would you consider detecting AVX2 support at runtime instead of at compile time?
One way that could be done is to add this code somewhere in src/neogb:
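As a rough illustration of what such a detection routine could look like (a sketch under stated assumptions, not necessarily the code proposed in this issue): the flag name `have_avx2` comes from the issue text, while `detect_avx2` and `read_xcr0` are hypothetical helper names. The sketch uses `__get_cpuid` and `__get_cpuid_count` from `<cpuid.h>`, and it additionally checks that the operating system saves the YMM registers, a step the issue does not mention.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#define SKETCH_X86 1
#endif

/* Nonzero once detect_avx2() has confirmed AVX2 is usable. */
static int have_avx2 = 0;

#ifdef SKETCH_X86
/* Hypothetical helper: read extended control register 0 via XGETBV. */
static uint64_t read_xcr0(void)
{
    uint32_t eax, edx;
    __asm__ volatile ("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    return ((uint64_t)edx << 32) | eax;
}
#endif

/* Hypothetical helper: set have_avx2 from CPUID; call once at startup. */
static void detect_avx2(void)
{
    have_avx2 = 0;
#ifdef SKETCH_X86
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1: ECX bit 27 is OSXSAVE, ECX bit 28 is AVX. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return;
    if (!(ecx & (1u << 27)) || !(ecx & (1u << 28)))
        return;

    /* XCR0 bits 1 and 2: the OS saves XMM and YMM state. */
    if ((read_xcr0() & 0x6u) != 0x6u)
        return;

    /* Leaf 7, subleaf 0: EBX bit 5 is AVX2. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return;
    have_avx2 = (ebx & (1u << 5)) != 0;
#endif
}
```

On non-x86 targets the sketch simply leaves `have_avx2` at zero. With gcc and clang, `__builtin_cpu_supports("avx2")` is a shorter alternative that performs a comparable check.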
That works for gcc and clang. If you want to support other compilers, the code might get a little more complex. With `have_avx2` available, code like this:

would be transformed into this:
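To make the shape of that transformation concrete, here is a minimal sketch on a made-up example. It assumes the existing code selects the AVX2 path with a compile-time guard such as `#ifdef HAVE_AVX2` (an assumption about the code base), and it replaces that guard with a runtime test of `have_avx2`. The functions `add_u32`, `add_u32_scalar`, `add_u32_avx2` and `init_have_avx2` are hypothetical and do not come from msolve. The sketch marks the vector function with `__attribute__((target("avx2")))` so it compiles without extra flags; passing the AVX2 flag for the whole build would be the alternative.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#endif

static int have_avx2 = 0;

/* Hypothetical initializer; uses the gcc/clang builtin for brevity. */
static void init_have_avx2(void)
{
#ifdef SKETCH_X86
    have_avx2 = __builtin_cpu_supports("avx2") ? 1 : 0;
#endif
}

/* Portable fallback: element-wise sum of two arrays. */
static void add_u32_scalar(uint32_t *dst, const uint32_t *a,
                           const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#ifdef SKETCH_X86
/* AVX2 variant: eight 32-bit lanes per iteration, scalar tail. */
static __attribute__((target("avx2")))
void add_u32_avx2(uint32_t *dst, const uint32_t *a,
                  const uint32_t *b, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(dst + i),
                            _mm256_add_epi32(va, vb));
    }
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
#endif

/* Before: "#ifdef HAVE_AVX2 ... #else ... #endif" picked one path at
 * compile time. After: both paths are built and chosen at run time. */
static void add_u32(uint32_t *dst, const uint32_t *a,
                    const uint32_t *b, size_t n)
{
#ifdef SKETCH_X86
    if (have_avx2) {
        add_u32_avx2(dst, a, b, n);
        return;
    }
#endif
    add_u32_scalar(dst, a, b, n);
}
```

A caller would run `init_have_avx2()` once at startup and then use `add_u32` everywhere. The branch on `have_avx2` is cheap and highly predictable, though in a hot inner loop it may be preferable to hoist the test out of the loop or dispatch through a function pointer.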
On x86_64 platforms, `-mavx2` could then always be passed to the compiler, since the AVX2 code is not executed if `__get_cpuid` indicates the CPU doesn't support AVX2. That would let you throw away most or all of several files in the m4 directory.

I can open a PR if you like the idea.